AOSP Native Source Basics: A Hands-On Look at the Futex Synchronization Mechanism

Futex Synchronization in Practice

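Background:

When studying AOSP native modules, or when analyzing ANR traces, you keep running into one call: "futex". So what exactly is futex? Let's take it apart today.
In concurrent programming, locks are the basic tool for thread safety. But if a synchronization primitive such as a mutex or semaphore had to go through the kernel on every operation, each lock and unlock would pay the cost of a system call, even when no other thread is competing for it.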

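// A plain mutex as callers see it. If lock and unlock were implemented purely
// in the kernel, these two calls would cost two system calls even without contention.
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_lock(&mutex);
    pthread_mutex_unlock(&mutex);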

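To solve this problem, the Linux kernel introduced the futex (Fast Userspace muTEX) mechanism, which has become a cornerstone of modern high-performance concurrent programming. The pthread mutexes in glibc and in Android's bionic are themselves built on futex.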

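What is Futex?

Futex stands for "fast userspace mutex".

Futex is a synchronization mechanism split between user space and the kernel. The synchronizing parties share a 32-bit variable, the futex word, and access it atomically: threads of one process can simply share a global variable, while separate processes place it in shared memory created with, for example, mmap. When a thread or process tries to enter or leave the critical section, it first inspects the futex word. If there is no contention, it only modifies the futex word and makes no system call. Only when the futex word shows that there is contention does it make a system call so the kernel can do the corresponding work (wait or wake up).

Run man 2 futex to read the futex introduction and manual page: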

FUTEX(2)                    Linux Programmer's Manual                    FUTEX(2)

NAME
       futex - fast user-space locking

SYNOPSIS
       #include <linux/futex.h>
       #include <sys/time.h>

       int futex(int *uaddr, int futex_op, int val,
                 const struct timespec *timeout,   /* or: uint32_t val2 */
                 int *uaddr2, int val3);

       Note: There is no glibc wrapper for this system call; see NOTES.

DESCRIPTION
       The futex() system call provides a method for waiting until a certain condition becomes true. It is typically used as a
       blocking construct in the context of shared-memory synchronization. When using futexes, the majority of the
       synchronization operations are performed in user space. A user-space program employs the futex() system call only when
       it is likely that the program has to block for a longer time until the condition becomes true. Other futex() operations
       can be used to wake any processes or threads waiting for a particular condition.

       A futex is a 32-bit value, referred to below as a futex word, whose address is supplied to the futex() system call.
       (Futexes are 32 bits in size on all platforms, including 64-bit systems.) All futex operations are governed by this
       value. In order to share a futex between processes, the futex is placed in a region of shared memory, created using
       (for example) mmap(2) or shmat(2). (Thus, the futex word may have different virtual addresses in different processes,
       but these addresses all refer to the same location in physical memory.) In a multithreaded program, it is sufficient
       to place the futex word in a global variable shared by all threads.

       When executing a futex operation that requests to block a thread, the kernel will block only if the futex word has the
       value that the calling thread supplied (as one of the arguments of the futex() call) as the expected value of the futex
       word. The loading of the futex word's value, the comparison of that value with the expected value, and the actual
       blocking will happen atomically and will be totally ordered with respect to concurrent operations performed by other
       threads on the same futex word. Thus, the futex word is used to connect the synchronization in user space with the
       implementation of blocking by the kernel. Analogously to an atomic compare-and-exchange operation that potentially
       changes shared memory, blocking via a futex is an atomic compare-and-block operation.

       One use of futexes is for implementing locks. The state of the lock (i.e., acquired or not acquired) can be represented
       as an atomically accessed flag in shared memory. In the uncontended case, a thread can access or modify the lock state
       with atomic instructions, for example atomically changing it from not acquired to acquired using an atomic
       compare-and-exchange instruction. (Such instructions are performed entirely in user mode, and the kernel maintains no
       information about the lock state.) On the other hand, a thread may be unable to acquire a lock because it is already
       acquired by another thread. It then may pass the lock's flag as a futex word and the value representing the acquired
       state as the expected value to a futex() wait operation. This futex() operation will block if and only if the lock is
       still acquired (i.e., the value in the futex word still matches the "acquired state"). When releasing the lock, a
       thread has to first reset the lock state to not acquired and then execute a futex operation that wakes threads blocked
       on the lock flag used as a futex word (this can be further optimized to avoid unnecessary wake-ups). See futex(7) for
       more detail on how to use futexes.

       Besides the basic wait and wake-up futex functionality, there are further futex operations aimed at supporting more complex use cases.

       Note that no explicit initialization or destruction is necessary to use futexes; the kernel maintains a futex (i.e.,
       the kernel-internal implementation artifact) only while operations such as FUTEX_WAIT, described below, are being
       performed on a particular futex word.

 Arguments
       The uaddr argument points to the futex word.  On all platforms, futexes are four-byte integers that must be aligned on a four-byte boundary.  The operation to perform on the futex  is  specified  in
       the futex_op argument; val is a value whose meaning and purpose depends on futex_op.

       The remaining arguments (timeout, uaddr2, and val3) are required only for certain of the futex operations described below.  Where one of these arguments is not required, it is ignored.

       For  several blocking operations, the timeout argument is a pointer to a timespec structure that specifies a timeout for the operation.  However,  notwithstanding the prototype shown above, for some
       operations, the least significant four bytes of this argument are instead used as an integer whose meaning is determined by the operation.  For these operations, the kernel casts the  timeout  value
       first to unsigned long, then to uint32_t, and in the remainder of this page, this argument is referred to as val2 when interpreted in this fashion.

       Where it is required, the uaddr2 argument is a pointer to a second futex word that is employed by the operation.

       The interpretation of the final integer argument, val3, depends on the operation.

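Parameter notes:

uaddr: the address of the futex word in user space (in shared memory when used across processes), a 32-bit integer aligned on a four-byte boundary;

futex_op: the operation to perform, for example:

FUTEX_WAIT:
Atomically checks whether the value at uaddr is still val. If it is, the caller sleeps until a FUTEX_WAKE or the timeout, that is, it is put on the wait queue associated with uaddr. If the value is not val, the call returns immediately with the error EAGAIN.

FUTEX_WAKE:
Wakes at most val of the waiters sleeping on uaddr.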

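Advanced Futex features

1 Private futexes (FUTEX_PRIVATE_FLAG)
For synchronization inside a single process, the private flag improves performance:

#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

// Private futex operations (process-local, so the kernel can take a cheaper path).
// <linux/futex.h> already defines both macros; they are repeated here to show how they are composed.
#ifndef FUTEX_WAIT_PRIVATE
#define FUTEX_WAIT_PRIVATE   (FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAKE_PRIVATE   (FUTEX_WAKE | FUTEX_PRIVATE_FLAG)
#endif

// Sleep while *uaddr still equals expected
int futex_wait_private(uint32_t *uaddr, uint32_t expected) {
    return syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

// Wake at most count waiters; returns the number actually woken
int futex_wake_private(uint32_t *uaddr, int count) {
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_PRIVATE, count, NULL, NULL, 0);
}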

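2 Waiting with a timeout

// For FUTEX_WAIT the timeout is relative. If it expires before a wake-up,
// the call returns -1 with errno set to ETIMEDOUT.
int futex_wait_timeout(uint32_t *uaddr, uint32_t expected, long milliseconds) {
    struct timespec timeout = {
        .tv_sec = milliseconds / 1000,
        .tv_nsec = (milliseconds % 1000) * 1000000L
    };

    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &timeout, NULL, 0);
}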

The official demo code from the man page:

/* futex_demo.c

          Usage: futex_demo [nloops]
                           (Default: 5)

          Demonstrate the use of futexes in a program where parent and child
          use a pair of futexes located inside a shared anonymous mapping to
          synchronize access to a shared resource: the terminal. The two
          processes each write 'num-loops' messages to the terminal and employ
          a synchronization protocol that ensures that they alternate in
          writing messages.
       */
#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/futex.h>
#include <sys/time.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                       } while (0)

static int *futex1, *futex2, *iaddr;

static int
futex(int *uaddr, int futex_op, int val,
     const struct timespec *timeout, int *uaddr2, int val3)
{
   /* No glibc wrapper exists, so invoke the system call directly */
   return syscall(SYS_futex, uaddr, futex_op, val,
                  timeout, uaddr2, val3);
}

/* Acquire the futex pointed to by 'futexp': wait for its value to
  become 1, and then set the value to 0. */

static void
fwait(int *futexp)
{
   int s;

   /* atomic_compare_exchange_strong(ptr, oldval, newval)
      atomically performs the equivalent of:

          if (*ptr == *oldval)
              *ptr = newval;
          else
              *oldval = *ptr;

      It returns true if the test yielded true and *ptr was updated. */

   while (1) {
      /* Is the futex available? */
       int one = 1;    /* not const: overwritten when the exchange fails */
       if (atomic_compare_exchange_strong(futexp, &one, 0))
           break;      /* Yes */

       /* Futex is not available; wait */

       s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
       if (s == -1 && errno != EAGAIN)
           errExit("futex-FUTEX_WAIT");
   }
}

/* Release the futex pointed to by 'futexp': if the futex currently
  has the value 0, set its value to 1 and then wake any futex waiters,
  so that if the peer is blocked in fwait(), it can proceed. */

static void
fpost(int *futexp)
{
   int s;

   /* atomic_compare_exchange_strong() was described in comments above */

   int zero = 0;       /* not const: overwritten when the exchange fails */
   if (atomic_compare_exchange_strong(futexp, &zero, 1)) {
       s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
       if (s  == -1)
           errExit("futex-FUTEX_WAKE");
   }
}

int
main(int argc, char *argv[])
{
   pid_t childPid;
   int j, nloops;

   setbuf(stdout, NULL);

   nloops = (argc > 1) ? atoi(argv[1]) : 5;

   /* Create a shared anonymous mapping that will hold the futexes.
      Since the futexes are being shared between processes, we
      subsequently use the "shared" futex operations (i.e., not the
      ones suffixed "_PRIVATE") */

   iaddr = mmap(NULL, sizeof(int) * 2, PROT_READ | PROT_WRITE,
               MAP_ANONYMOUS | MAP_SHARED, -1, 0);
   if (iaddr == MAP_FAILED)
       errExit("mmap");

   futex1 = &iaddr[0];
   futex2 = &iaddr[1];

   *futex1 = 0;        /* State: unavailable */
   *futex2 = 1;        /* State: available */

   /* Create a child process that inherits the shared anonymous
      mapping */

   childPid = fork();
   if (childPid == -1)
       errExit("fork");

   if (childPid == 0) {        /* Child */
       for (j = 0; j < nloops; j++) {
           fwait(futex1);
           printf("Child  (%ld) %d\n", (long) getpid(), j);
           fpost(futex2);
       }

       exit(EXIT_SUCCESS);
   }

   /* Parent falls through to here */

   for (j = 0; j < nloops; j++) {
       fwait(futex2);
       printf("Parent (%ld) %d\n", (long) getpid(), j);
       fpost(futex1);
   }

   wait(NULL);

   exit(EXIT_SUCCESS);
}

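futex_demo.c relies on one more key function, atomic_compare_exchange_strong.
atomic_compare_exchange_strong(ptr, oldval, newval) is the standard C11 function (declared in <stdatomic.h>; C++ offers the equivalent in <atomic>) for atomic compare-and-exchange. The comment in the demo already spells it out:
if the value ptr points to equals the value oldval points to, the value at ptr is updated to newval and the function returns true. Otherwise nothing is stored through ptr, the current value is copied into *oldval, and the function returns false.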

Building and running:

Compile:

gcc   futex_demo.c -o futex_man

Output:

test@test:~/demos/futex$ ./futex_man 
Parent (201299) 0
Child  (201300) 0
Parent (201299) 1
Child  (201300) 1
Parent (201299) 2
Child  (201300) 2
Parent (201299) 3
Child  (201300) 3
Parent (201299) 4
Child  (201300) 4

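Performance advantage over traditional synchronization

// A lock whose every operation is a system call (the pre-futex model).
// pthread_mutex_t only illustrates the call pattern here: on current Linux and
// Android it is itself built on futex, so an uncontended lock or unlock does
// not actually enter the kernel.

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void traditional_lock() {
    pthread_mutex_lock(&mutex);   // kernel-only model: one system call
    // critical section
    pthread_mutex_unlock(&mutex); // kernel-only model: another system call
}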


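// Futex lock (stays entirely in user space when there is no contention)

   while (1) {
       int one = 1;
       // Fast path: a single atomic operation. If it succeeds we hold the
       // futex and leave without ever entering the kernel.
       if (atomic_compare_exchange_strong(futexp, &one, 0))
           break;      /* Yes */

       /* Futex is not available; wait */
       // Slow path: enter the kernel and sleep while the value is still 0
       s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
   }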
