NUMA and node interleaving

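Enabling Node Interleaving can degrade performance on some operating systems and interferes with the OS's own memory placement (Solaris, for example, reports that MPO is disabled). The option hides non-uniform memory access by spreading memory across the CPUs, at the cost of overall performance.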


Chris Linton-Ford writes:
> "Enable Node Interleaving": initially disabled, when turned on the
> operating system booted, but I got the following error message:
> "MPO disabled because memory is interleaved"

You definitely don't want "Node Interleaving."  As a BIOS option, it
wins a special award for being the most misleading misfeature since
plug-n-play.

Opteron systems have memory attached to each CPU, but a global address
space.  When you access memory that's not local to your CPU, you incur
a penalty going over the internal "hypertransport" between the CPUs.

On OSes (such as Solaris) that are aware of NUMA hardware, that's not
a problem.  The OS automatically optimizes the allocation of memory
based on where the task is running (or vice-versa) so that you get
good locality.

On OSes that are ignorant of this sort of hardware, naive testing can
produce strange results.  You run your test once and it runs fast.
Run it again, and it's slow.  Run again, it's fast again.  It all
depends on where Wind^Wthat OS puts your application, and it's
unpredictable.

This is where "Node Interleaving" comes in.  Turning that misfeature
on causes memory to be mapped page-by-page in a round-robin fashion
among the CPUs instead of being contiguous.  In other words, every 4K
boundary talks to a different CPU in the system.  This averages out
the costs, making everything run equally poorly, and making your naive
OS and benchmarks seem to be stable.

The OS can't keep track of that sort of addressing mess, so it causes
MPO to be turned off.  You don't want this on Solaris.
 
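NUMA (non-uniform memory access) is a computer architecture in which several processors, each with its own memory subsystem, share one address space. Access latency depends on which CPU touches which memory, so some care is needed to keep performance up. Each CPU and its attached memory are grouped into a NUMA node.

In code, NUMA nodes are usually managed through a NUMA API. On Linux this is libnuma: numa_alloc_onnode() allocates memory on a given node, and numa_run_on_node() restricts the current thread to the CPUs of a given node.

Here is an example that uses the NUMA API (link with -lnuma):

```
#include <numa.h>
#include <stdio.h>

int main(void) {
    // libnuma calls are only defined when numa_available() succeeds
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available\n");
        return 1;
    }
    // Get the number of NUMA nodes in the system
    int num_nodes = numa_max_node() + 1;
    printf("Number of NUMA nodes: %d\n", num_nodes);

    // Allocate 1 MB on NUMA node 0
    size_t size = 1024 * 1024;
    int *mem = numa_alloc_onnode(size, 0);
    if (mem == NULL)
        return 1;
    printf("Allocated memory on NUMA node 0\n");

    // Bind the current thread to NUMA node 1 (fails on a single-node system)
    if (numa_run_on_node(1) != 0) {
        perror("numa_run_on_node");
        numa_free(mem, size);
        return 1;
    }
    printf("Running on NUMA node 1\n");

    // Remote access: the thread runs on node 1, the memory lives on node 0
    *mem = 42;
    printf("Value read from node 0 memory: %d\n", *mem);
    numa_free(mem, size);
    return 0;
}
```

In this example, numa_max_node() gives the number of NUMA nodes, and numa_alloc_onnode() allocates 1 MB on node 0. numa_run_on_node() then binds the current thread to node 1, which accesses the memory allocated on node 0. This is a cross-node (remote) access: it works, but it pays the interconnect penalty described above. For best performance, allocate memory on the same node the thread runs on.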