Per-CPU memory for user space

The kernel makes extensive use of per-CPU data as a way to avoid contention between processors and improve scalability. Using the same technique in user space is harder, since a process has little control over which CPU it is running on at any given time. That hasn't stopped Mathieu Desnoyers from trying; in the memory-management track of the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit, he presented a proposal for how user-space per-CPU memory could work.

Desnoyers started by saying that his objective is to help user-space developers make better use of restartable sequences, which facilitate some types of access to per-CPU data by aborting a critical section, so that it can be restarted, if the thread is preempted or migrated partway through. User-space applications generally use thread-local storage for this kind of code, but that becomes inefficient if there are more threads running than CPUs to run on. Thread-local storage must also be defined statically, making it inflexible to work with, and it can slow down thread creation if the area is large.
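
To make the thread-local approach concrete, here is a minimal sketch of an event counter built on thread-local storage. It is not code from the talk; the names `local_hits`, `total_hits`, and `count_with_tls()`, as well as the `MAX_THREADS` cap of 64, are assumptions made purely for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 64  /* hypothetical cap, chosen only for this sketch */

/* One private copy per thread; the variable must be declared statically. */
static _Thread_local long local_hits;

/* Shared total, updated once per thread rather than once per event. */
static atomic_long total_hits;

static void *worker(void *arg)
{
    assert(arg != NULL);
    long n = *(const long *)arg;

    for (long i = 0; i < n; i++)
        local_hits++;               /* private data: no locks or atomics */

    /* Fold this thread's private count into the shared total. */
    atomic_fetch_add(&total_hits, local_hits);
    return NULL;
}

/*
 * Run nr_threads workers that each count per_thread events.
 * Returns the combined total, or -1 on a bad argument or thread failure.
 */
long count_with_tls(int nr_threads, long per_thread)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;

    if (nr_threads < 1 || nr_threads > MAX_THREADS)
        return -1;

    atomic_store(&total_hits, 0);

    for (; started < nr_threads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &per_thread) != 0)
            break;
    }

    /* per_thread stays valid because every started thread is joined here. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == nr_threads ? atomic_load(&total_hits) : -1;
}
```

Calling `count_with_tls(8, 1000)` returns 8000. Each thread gets its own copy of `local_hits`, so the amount of storage grows with the number of threads rather than the number of CPUs; on a machine with fewer CPUs than threads, more copies exist than can ever be in use at once.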

So he would like to provide true per-CPU data as an alternative. One way not to do that, he said, is to structure per-CPU data in an array indexed by the CPU