NUMA and node interleaving

License: CC 4.0 BY-SA

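Enabling Node Interleaving can cause degraded performance and memory-handling errors on some operating systems. The feature masks the effects of non-uniform memory access by spreading memory across the CPUs, but it sacrifices overall performance.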

Chris Linton-Ford writes:
> "Enable Node Interleaving": initially disabled, when turned on the
> operating system booted, but I got the following error message:
> "MPO disabled because memory is interleaved"

You definitely don't want "Node Interleaving." As a BIOS option, it
wins a special award for being the most misleading misfeature since
plug-n-play.

Opteron systems have memory attached to each CPU, but share a single
global address space. When you access memory that's not local to your
CPU, you incur a penalty going over the internal HyperTransport links
between the CPUs.

On OSes (such as Solaris) that are aware of NUMA hardware, that's not
a problem. The OS automatically optimizes the allocation of memory
based on where the task is running (or vice-versa) so that you get
good locality.

On OSes that are ignorant of this sort of hardware, naive testing can
produce strange results. You run your test once and it runs fast.
Run it again, and it's slow. Run again, it's fast again. It all
depends on where Wind^Wthat OS puts your application, and it's
unpredictable.
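
The run-to-run swings can be illustrated with a toy cost model. This is
only a sketch: the latency figures are hypothetical placeholders, not
measured Opteron numbers, and `run_cost_ns` is a name made up for the
example.

```python
# Toy model: hypothetical latencies, not measured Opteron numbers.
LOCAL_NS = 60    # assumed cost of a local memory access
REMOTE_NS = 100  # assumed cost of an access that crosses HyperTransport


def run_cost_ns(task_node, memory_node, accesses=1_000_000):
    """Total memory cost of one run when all data sits on memory_node."""
    per_access = LOCAL_NS if task_node == memory_node else REMOTE_NS
    return per_access * accesses


# Same program, same data; only the placement differs between runs.
fast = run_cost_ns(task_node=0, memory_node=0)
slow = run_cost_ns(task_node=1, memory_node=0)
print(fast, slow)
```

Nothing about the program changes between the two calls, only where the
task happens to land relative to its memory.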

This is where "Node Interleaving" comes in. Turning that misfeature
on causes memory to be mapped page-by-page in a round-robin fashion
among the CPUs instead of being contiguous. In other words, each
successive 4K page lives on memory attached to a different CPU in the
system. This averages out the costs, making everything run equally
poorly, and making your naive OS and benchmarks seem to be stable.
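
A minimal sketch of the two layouts, assuming a hypothetical two-CPU
box with 1 GiB attached to each CPU. The function names are invented
for illustration, and real BIOS mappings may differ in detail.

```python
PAGE_SIZE = 4096  # the 4K granularity mentioned above


def node_contiguous(addr, num_nodes, bytes_per_node):
    """Node that owns addr when each node holds one contiguous range."""
    return min(addr // bytes_per_node, num_nodes - 1)


def node_interleaved(addr, num_nodes):
    """Node that owns addr when pages are dealt out round-robin."""
    return (addr // PAGE_SIZE) % num_nodes


# Four consecutive pages on the assumed 2-node, 1 GiB-per-node system.
addrs = [page * PAGE_SIZE for page in range(4)]
contiguous = [node_contiguous(a, 2, 1 << 30) for a in addrs]
interleaved = [node_interleaved(a, 2) for a in addrs]
print(contiguous)   # [0, 0, 0, 0]
print(interleaved)  # [0, 1, 0, 1]
```

With the interleaved layout, about half the pages a task touches on this
two-node box are remote no matter where the task runs, which is why
every run ends up paying roughly the same averaged cost.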

The OS can't keep track of that sort of addressing mess, so Solaris
turns off MPO (Memory Placement Optimization), which is the message
you saw. You don't want this on Solaris.