Background:
When studying AOSP native modules or analyzing ANR problems, you keep running into a function called "futex". So what exactly is a futex? This post takes it apart.
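In concurrent programming, locks are the basic tool for thread safety. A lock that has to go through the kernel on every operation is expensive, though: each acquire and release costs a system call and a user/kernel mode switch, even when no other thread is competing for the lock.
// The mutex API as application code sees it. Whether these calls enter
// the kernel depends on how the lock is implemented underneath.
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&mutex);
pthread_mutex_unlock(&mutex);
To remove that cost, the Linux kernel introduced the futex (Fast Userspace muTEX) mechanism, which has become a cornerstone of high-performance concurrent programming. On Linux, the pthread mutexes in glibc and in Android's bionic are themselves built on futex, which is why futex keeps showing up in native stack traces.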
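What is a futex?
Futex stands for "fast userspace mutex".
A futex is a synchronization mechanism split between user space and the kernel. The cooperating processes first share a region of memory, for example through mmap, and the futex variable lives in that shared region and is accessed with atomic operations. When a process tries to enter or leave the critical section, it first inspects the futex variable. If there is no contention, it only modifies the futex variable and makes no system call. If the value of the futex variable shows that there is contention, the process still has to make a system call to do the corresponding work (wait or wake up).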
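The fast path and slow path described above can be sketched as a small lock. This is a minimal sketch in the style of the well-known three-state futex mutex, not the glibc or bionic implementation; the names demo_mutex, demo_futex, demo_mutex_lock and demo_mutex_unlock are hypothetical and exist only for this illustration.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock type for illustration.
   0 = unlocked, 1 = locked, 2 = locked and there may be waiters. */
typedef struct { _Atomic uint32_t state; } demo_mutex;

/* Thin wrapper around the raw system call (glibc provides none). */
static long demo_futex(_Atomic uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void demo_mutex_lock(demo_mutex *m)
{
    uint32_t c = 0;

    /* Fast path: 0 -> 1 with one atomic instruction, no system call. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: mark the lock as contended, then sleep in the kernel
       for as long as the futex word still reads 2. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        demo_futex(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void demo_mutex_unlock(demo_mutex *m)
{
    /* Fast path: if the old value was 1, nobody is waiting and no
       system call is needed. Only a contended lock wakes a waiter. */
    if (atomic_exchange(&m->state, 0) == 2)
        demo_futex(&m->state, FUTEX_WAKE, 1);
}
```

With a single thread, demo_mutex_lock() and demo_mutex_unlock() never reach syscall(); the kernel gets involved only once a second thread finds the word already non-zero.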
Run man futex to see the description and the manual for futex:
FUTEX(2) Linux Programmer's Manual FUTEX(2)
NAME
futex - fast user-space locking
SYNOPSIS
#include <linux/futex.h>
#include <sys/time.h>
int futex(int *uaddr, int futex_op, int val,
const struct timespec *timeout, /* or: uint32_t val2 */
int *uaddr2, int val3);
Note: There is no glibc wrapper for this system call; see NOTES.
DESCRIPTION
The futex() system call provides a method for waiting until a certain condition becomes true. It is typically used as a blocking construct in the context of shared-memory synchronization. When
using futexes, the majority of the synchronization operations are performed in user space. A user-space program employs the futex() system call only when it is likely that the program has to block
for a longer time until the condition becomes true. Other futex() operations can be used to wake any processes or threads waiting for a particular condition.
A futex is a 32-bit value, referred to below as a futex word, whose address is supplied to the futex() system call. (Futexes are 32 bits in size on all platforms, including 64-bit systems.) All
futex operations are governed by this value. In order to share a futex between processes, the futex is placed in a region of shared memory, created using (for example) mmap(2) or shmat(2). (Thus,
the futex word may have different virtual addresses in different processes, but these addresses all refer to the same location in physical memory.) In a multithreaded program, it is sufficient to
place the futex word in a global variable shared by all threads.
When executing a futex operation that requests to block a thread, the kernel will block only if the futex word has the value that the calling thread supplied (as one of the arguments of the futex()
call) as the expected value of the futex word. The loading of the futex word's value, the comparison of that value with the expected value, and the actual blocking will happen atomically and will
be totally ordered with respect to concurrent operations performed by other threads on the same futex word. Thus, the futex word is used to connect the synchronization in user space with the
implementation of blocking by the kernel. Analogously to an atomic compare-and-exchange operation that potentially changes shared memory, blocking via a futex is an atomic compare-and-block operation.
One use of futexes is for implementing locks. The state of the lock (i.e., acquired or not acquired) can be represented as an atomically accessed flag in shared memory. In the uncontended case, a
thread can access or modify the lock state with atomic instructions, for example atomically changing it from not acquired to acquired using an atomic compare-and-exchange instruction. (Such
instructions are performed entirely in user mode, and the kernel maintains no information about the lock state.) On the other hand, a thread may be unable to acquire a lock because it is already
acquired by another thread. It then may pass the lock's flag as a futex word and the value representing the acquired state as the expected value to a futex() wait operation. This futex() operation
will block if and only if the lock is still acquired (i.e., the value in the futex word still matches the "acquired state"). When releasing the lock, a thread has to first reset the lock state to
not acquired and then execute a futex operation that wakes threads blocked on the lock flag used as a futex word (this can be further optimized to avoid unnecessary wake-ups). See futex(7) for more
detail on how to use futexes.
Besides the basic wait and wake-up futex functionality, there are further futex operations aimed at supporting more complex use cases.
Note that no explicit initialization or destruction is necessary to use futexes; the kernel maintains a futex (i.e., the kernel-internal implementation artifact) only while operations such as
FUTEX_WAIT, described below, are being performed on a particular futex word.
Arguments
The uaddr argument points to the futex word. On all platforms, futexes are four-byte integers that must be aligned on a four-byte boundary. The operation to perform on the futex is specified in
the futex_op argument; val is a value whose meaning and purpose depends on futex_op.
The remaining arguments (timeout, uaddr2, and val3) are required only for certain of the futex operations described below. Where one of these arguments is not required, it is ignored.
For several blocking operations, the timeout argument is a pointer to a timespec structure that specifies a timeout for the operation. However, notwithstanding the prototype shown above, for some
operations, the least significant four bytes of this argument are instead used as an integer whose meaning is determined by the operation. For these operations, the kernel casts the timeout value
first to unsigned long, then to uint32_t, and in the remainder of this page, this argument is referred to as val2 when interpreted in this fashion.
Where it is required, the uaddr2 argument is a pointer to a second futex word that is employed by the operation.
The interpretation of the final integer argument, val3, depends on the operation.
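Parameter notes:
uaddr: the address of the futex word in user space (in shared memory when it is used across processes); it holds a four-byte-aligned 32-bit integer counter;
futex_op: the operation type, for example:
FUTEX_WAIT:
Atomically checks whether the counter at uaddr still equals val. If it does, the caller sleeps until a FUTEX_WAKE or a timeout, that is, it is put on the wait queue that corresponds to uaddr. If it does not, the call returns immediately with the error EAGAIN.
FUTEX_WAKE:
Wakes at most val waiters that are blocked on uaddr.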
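The two operations can be probed without any second thread, because both have a non-blocking outcome. The sketch below assumes two hypothetical helpers, demo_try_wait and demo_wake, that wrap the raw system call: FUTEX_WAIT fails with EAGAIN when the word no longer holds the expected value, and FUTEX_WAKE returns the number of waiters it actually woke.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the caller slept and was woken,
   otherwise the errno value reported by FUTEX_WAIT.
   Caution: if *uaddr == expected and nobody calls FUTEX_WAKE,
   this blocks forever (no timeout is passed). */
static int demo_try_wait(uint32_t *uaddr, uint32_t expected)
{
    if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0) == -1)
        return errno;
    return 0;
}

/* Hypothetical helper: wakes at most max_waiters sleepers on uaddr and
   returns how many were woken (or -1 on error). */
static long demo_wake(uint32_t *uaddr, int max_waiters)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, max_waiters, NULL, NULL, 0);
}
```

For a word that currently holds 1, demo_try_wait(&word, 0) comes back with EAGAIN instead of sleeping, and demo_wake(&word, 1) returns 0 because nothing is queued on that address.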
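Advanced futex features
1 Private futexes (FUTEX_PRIVATE_FLAG)
For synchronization inside a single process, the private flag lets the kernel take a cheaper path:
// Private futex operations (for use within one process; the kernel can optimize them)
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>
// <linux/futex.h> already defines both macros; they are repeated here to show how they are composed
#define FUTEX_WAIT_PRIVATE (FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAKE_PRIVATE (FUTEX_WAKE | FUTEX_PRIVATE_FLAG)
int futex_wait_private(uint32_t *uaddr, uint32_t expected) {
    return syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}
int futex_wake_private(uint32_t *uaddr, int count) {
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_PRIVATE, count, NULL, NULL, 0);
}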
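2 Waiting with a timeout
// For FUTEX_WAIT the timeout is a relative interval, not an absolute time
int futex_wait_timeout(uint32_t *uaddr, uint32_t expected, long milliseconds) {
    struct timespec timeout = {
        .tv_sec = milliseconds / 1000,
        .tv_nsec = (milliseconds % 1000) * 1000000L
    };
    // Returns -1 with errno set to ETIMEDOUT if the interval expires first
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &timeout, NULL, 0);
}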
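Because the FUTEX_WAIT timeout is relative, a caller that retries after a signal (EINTR) would restart the full interval each time. One way around that, sketched here under the assumption that a fixed overall deadline is wanted, is to compute the deadline once on CLOCK_MONOTONIC and pass only the remaining time on each retry. The function name demo_wait_deadline is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: waits on uaddr for at most 'ms' milliseconds in
   total. Returns 0 if woken, otherwise an errno value such as
   ETIMEDOUT or EAGAIN. */
static int demo_wait_deadline(uint32_t *uaddr, uint32_t expected, long ms)
{
    struct timespec now, deadline, left;

    /* Fix the deadline once, on the monotonic clock. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    for (;;) {
        /* Remaining time = deadline - now, normalized. */
        clock_gettime(CLOCK_MONOTONIC, &now);
        left.tv_sec = deadline.tv_sec - now.tv_sec;
        left.tv_nsec = deadline.tv_nsec - now.tv_nsec;
        if (left.tv_nsec < 0) {
            left.tv_sec -= 1;
            left.tv_nsec += 1000000000L;
        }
        if (left.tv_sec < 0)
            return ETIMEDOUT;

        if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
                    &left, NULL, 0) == 0)
            return 0;           /* woken (possibly spuriously) */
        if (errno != EINTR)
            return errno;       /* ETIMEDOUT, EAGAIN, ... */
        /* EINTR: loop and wait only for what is left. */
    }
}
```

A return value of 0 only means the thread was woken; the caller still has to re-check the futex word, since wake-ups can be spurious.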
The official demo program from the man page:
/* futex_demo.c
Usage: futex_demo [nloops]
(Default: 5)
Demonstrate the use of futexes in a program where parent and child
use a pair of futexes located inside a shared anonymous mapping to
synchronize access to a shared resource: the terminal. The two
processes each write 'num-loops' messages to the terminal and employ
a synchronization protocol that ensures that they alternate in
writing messages.
*/
#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/futex.h>
#include <sys/time.h>
#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

static int *futex1, *futex2, *iaddr;

/* glibc provides no wrapper for futex(), so call it through syscall() */
static int
futex(int *uaddr, int futex_op, int val,
      const struct timespec *timeout, int *uaddr2, int val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val,
                   timeout, uaddr2, val3);
}
/* Acquire the futex pointed to by 'futexp': wait for its value to
   become 1, and then set the value to 0. */
static void
fwait(int *futexp)
{
    int s;

    /* atomic_compare_exchange_strong(ptr, oldval, newval)
       atomically performs the equivalent of:

           if (*ptr == *oldval)
               *ptr = newval;

       It returns true if the test yielded true and *ptr was updated.
       On failure it stores the current *ptr into *oldval, so 'one' is
       a plain int that is re-initialized on every pass. */
    while (1) {
        /* Is the futex available? */
        int one = 1;
        if (atomic_compare_exchange_strong(futexp, &one, 0))
            break;      /* Yes */

        /* Futex is not available; wait */
        s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
        if (s == -1 && errno != EAGAIN)
            errExit("futex-FUTEX_WAIT");
    }
}
/* Release the futex pointed to by 'futexp': if the futex currently
   has the value 0, set its value to 1 and then wake any futex waiters,
   so that if the peer is blocked in fwait(), it can proceed. */
static void
fpost(int *futexp)
{
    int s;

    /* atomic_compare_exchange_strong() was described in comments above */
    int zero = 0;
    if (atomic_compare_exchange_strong(futexp, &zero, 1)) {
        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
        if (s == -1)
            errExit("futex-FUTEX_WAKE");
    }
}
int
main(int argc, char *argv[])
{
    pid_t childPid;
    int j, nloops;

    setbuf(stdout, NULL);

    nloops = (argc > 1) ? atoi(argv[1]) : 5;

    /* Create a shared anonymous mapping that will hold the futexes.
       Since the futexes are being shared between processes, we
       subsequently use the "shared" futex operations (i.e., not the
       ones suffixed "_PRIVATE") */
    iaddr = mmap(NULL, sizeof(int) * 2, PROT_READ | PROT_WRITE,
                 MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (iaddr == MAP_FAILED)
        errExit("mmap");

    futex1 = &iaddr[0];
    futex2 = &iaddr[1];

    *futex1 = 0;        /* State: unavailable */
    *futex2 = 1;        /* State: available */

    /* Create a child process that inherits the shared anonymous
       mapping */
    childPid = fork();
    if (childPid == -1)
        errExit("fork");

    if (childPid == 0) {        /* Child */
        for (j = 0; j < nloops; j++) {
            fwait(futex1);
            printf("Child (%ld) %d\n", (long) getpid(), j);
            fpost(futex2);
        }

        exit(EXIT_SUCCESS);
    }

    /* Parent falls through to here */
    for (j = 0; j < nloops; j++) {
        fwait(futex2);
        printf("Parent (%ld) %d\n", (long) getpid(), j);
        fpost(futex1);
    }

    wait(NULL);

    exit(EXIT_SUCCESS);
}
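One more key function in futex_demo.c is atomic_compare_exchange_strong.
atomic_compare_exchange_strong(ptr, oldval, newval) is the standard C11 function (declared in <stdatomic.h>; C++ offers an equivalent in <atomic>) for an atomic compare-and-exchange, and the comment in the code already describes it clearly:
If the value that ptr points to equals the value that oldval points to, the value at ptr is updated to newval and the function returns true. Otherwise nothing is stored to ptr, the value currently at ptr is copied into *oldval, and the function returns false.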
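Both outcomes of the compare-and-exchange can be observed in isolation. The sketch below wraps the call in a hypothetical helper, demo_cas, that reports whether the swap happened and which value was actually found; the type cas_result is likewise made up for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical result type: did the swap happen, and which value was
   found in *ptr at the time of the comparison? */
typedef struct {
    bool swapped;
    int seen;
} cas_result;

static cas_result demo_cas(atomic_int *ptr, int expected, int desired)
{
    cas_result r;

    /* On success *ptr becomes 'desired' and 'expected' is unchanged.
       On failure *ptr is untouched and 'expected' is overwritten with
       the value that was actually there. */
    r.swapped = atomic_compare_exchange_strong(ptr, &expected, desired);
    r.seen = expected;
    return r;
}
```

Starting from a value of 1, demo_cas(&v, 1, 0) swaps and leaves 0 behind; a second demo_cas(&v, 1, 5) fails, leaves the 0 in place, and reports 0 in seen. That overwrite of the expected value is the reason the variable passed as oldval has to be writable.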
Building and running:
Compile with:
gcc futex_demo.c -o futex_man
The output looks like this:
test@test:~/demos/futex$ ./futex_man
Parent (201299) 0
Child (201300) 0
Parent (201299) 1
Child (201300) 1
Parent (201299) 2
Child (201300) 2
Parent (201299) 3
Child (201300) 3
Parent (201299) 4
Child (201300) 4
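Performance advantage compared with traditional synchronization
// A lock that calls into the kernel on every operation pays two system
// calls per critical section. Note that pthread_mutex on Linux (glibc,
// bionic) is not such a lock: it is built on futex, so the uncontended
// calls below already stay in user space.
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void traditional_lock() {
    pthread_mutex_lock(&mutex);   // enters the kernel only under contention
    // critical section
    pthread_mutex_unlock(&mutex); // enters the kernel only if waiters may exist
}
// Futex-based lock (the loop body of fwait() in futex_demo.c): without contention it runs entirely in user space
// Try the atomic compare-and-exchange first; if it succeeds, the lock is
// taken and we leave the loop without entering the kernel
if (atomic_compare_exchange_strong(futexp, &one, 0))
    break; /* Yes */
/* Futex is not available; wait */
// Only now do we enter the kernel
s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);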