Hot and cold pages

This post looks at a memory-management optimization in the Linux kernel: distinguishing hot pages from cold pages. The processor cache holds recently accessed data, and the kernel maintains per-CPU hot and cold free-page lists to take advantage of that. When a page is freed, it goes onto the hot list if it is likely still in the cache, and onto the cold list otherwise. Allocations prefer hot pages, but in some cases, such as buffers for DMA read operations, a cold page is the better choice. The scheme also reduces lock contention, and benchmarks showed performance gains of roughly 1% to 12%.


```c
void __free_pages(struct page *page, unsigned int order)
{
	if (put_page_testzero(page)) {	/* drop our reference; proceed only if _count has reached 0 */
		if (order == 0)		/* a single page frame goes back to the per-CPU page cache */
			free_hot_cold_page(page, false);
		else			/* larger blocks go back to the buddy allocator */
			__free_pages_ok(page, order);
	}
}
```

`__free_pages()` hands the page to `free_hot_cold_page()` when `order` is 0. What "cold page" and "hot page" mean is explained below; see:

  1. https://lwn.net/Articles/14768/
  2. https://blog.youkuaiyun.com/u012489236/article/details/107397096
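
For orientation, here is a condensed sketch of what `free_hot_cold_page()` does in the kernel era that matches the `__free_pages()` above (3.x, where hot vs. cold is a bool and both kinds share one per-CPU list): a hot page goes to the head of the list, a cold page to the tail, and once the list exceeds its high watermark a batch of pages is handed back to the buddy allocator. This is a simplification, not the verbatim source; migratetype indexing and some locking details are omitted.

```c
/* Condensed sketch of free_hot_cold_page(): free one order-0 page into
 * the current CPU's per-CPU list (real code picks a list by migratetype). */
static void free_hot_cold_page(struct page *page, bool cold)
{
	struct zone *zone = page_zone(page);
	struct per_cpu_pages *pcp;
	unsigned long flags;

	local_irq_save(flags);
	pcp = &this_cpu_ptr(zone->pageset)->pcp;

	if (!cold)
		list_add(&page->lru, &pcp->lists[0]);		/* hot: reuse it first */
	else
		list_add_tail(&page->lru, &pcp->lists[0]);	/* cold: queue it last */
	pcp->count++;

	/* Keep the list bounded: past the high watermark, return a whole
	 * batch of pages to the buddy allocator in one go. */
	if (pcp->count >= pcp->high)
		free_pcppages_bulk(zone, pcp->batch, pcp);

	local_irq_restore(flags);
}
```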

One generally thinks of a system’s RAM as being the fastest place to keep data. But memory is slow; the real speed comes from working out of the onboard cache in the processor itself. Much effort has, over the years, gone into trying to optimize the kernel’s cache behavior and avoiding the need to go to main memory. The new page allocation system is just another step in that direction.

The processor cache contains memory which has been accessed recently. The kernel often has a good idea of which pages have seen recent accesses and are thus likely to be present in cache. The hot-n-cold patch tries to take advantage of that information by adding two per-CPU free page lists (for each memory zone). When a processor frees a page that is suspected to be “hot” (i.e. represented in that processor’s cache), it gets pushed onto the hot list; others go onto the cold list. The lists have high and low limits; after all, if the hot list grows larger than the processor’s cache, the chances of those pages actually being hot start to get pretty small.
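
When the patch went in (2.5.45, and early 2.6 kernels), that per-CPU state looked roughly like the structures below, defined in `include/linux/mmzone.h`. Fields are trimmed here, and later kernels merged the two lists into a single list used from both ends:

```c
/* Approximate early-2.6 layout; each zone keeps one per_cpu_pageset per CPU. */
struct per_cpu_pages {
	int count;		/* pages currently on the list */
	int low;		/* refill from the buddy allocator below this */
	int high;		/* drain back to the buddy allocator above this */
	int batch;		/* how many pages to move per refill/drain */
	struct list_head list;	/* the free pages themselves */
};

struct per_cpu_pageset {
	struct per_cpu_pages pcp[2];	/* index 0: hot pages, 1: cold pages */
};
```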

When the kernel needs a page of memory, the new allocator normally tries to get that page from the processor’s hot list. Even if the page is simply going to be overwritten, it’s still better to use a cache-warm page. Interestingly, though, there are times when it makes sense to use a cold page instead. If the page is to be used for DMA read operations, it will be filled by the device performing the operation and the cache will be invalidated anyway. So 2.5.45 includes a new GFP_COLD page allocation flag for the situations where using a cold page makes more sense.
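
In mainline the flag ended up as `__GFP_COLD` (it lasted until around 4.15, when the hot/cold distinction on the allocation side was dropped). A typical use, sketched here with a hypothetical `rx_buffer_alloc()` helper, was requesting a cache-cold page for a buffer a device is about to DMA into:

```c
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical helper: the device will DMA into this page, so the CPU
 * gains nothing from a cache-warm page and a cold one is preferred. */
static struct page *rx_buffer_alloc(void)
{
	return alloc_page(GFP_KERNEL | __GFP_COLD);
}
```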

The use of per-CPU page lists also cuts down on lock contention, which also helps performance. When pages must be moved between the hot/cold lists and the main memory allocator, they are transferred in multi-page chunks, which also cuts down on lock contention and makes things go faster.
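
The refill side of that batching is sketched below, loosely following the kernel's `rmqueue_bulk()` (argument lists and migratetype handling trimmed): the zone lock is taken once, and up to `count` order-0 pages are pulled from the buddy allocator in a single pass, rather than paying for the lock on every page.

```c
/* Simplified sketch of batched refill for a per-CPU list. */
static int rmqueue_bulk_sketch(struct zone *zone, unsigned long count,
			       struct list_head *list)
{
	unsigned long i;

	spin_lock(&zone->lock);
	for (i = 0; i < count; i++) {
		/* __rmqueue() hands out one block from the buddy free lists;
		 * order 0 means a single page. */
		struct page *page = __rmqueue(zone, 0);

		if (!page)
			break;
		list_add_tail(&page->lru, list);
	}
	spin_unlock(&zone->lock);
	return i;	/* pages actually obtained */
}
```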

Andrew Morton has benchmarked this patch, and included a number of results with one of the patchsets. Performance benefits vary from a mere 1-2% on the all-important kernel compilation time to 12% on the SDET test. That was enough, apparently, to convince Linus.

### Hot vs. Cold in the PCIe L2 power state

When discussing PCIe link states, L2 is a low-power state in which the link stays connected but runs in a power-saving mode. On Intel Bay Trail / J1900 / N2940 platforms, compatibility issues between the Linux kernel and the hardware can affect how these states behave.

#### Power-mode overview

- **Hot mode**: In L2, "hot" means the device has entered a lower-power mode but retains a relatively high wake-up capability, so when the host asks to reactivate the link the recovery time is short. To support that fast exit, some circuitry must stay active to monitor signal changes and respond quickly, so although overall consumption drops, it remains higher in the hot state than in the cold one.
- **Cold mode**: "Cold" is a deeper sleep in which most functional units are switched off for maximum power savings. Correspondingly, returning to normal operation takes longer, because more initialization is needed to rebuild the full communication environment.

#### Comparison

| Property              | Hot mode | Cold mode   |
|-----------------------|----------|-------------|
| Wake-up speed         | Faster   | Slower      |
| Power savings         | Moderate | Significant |
| Functions kept active | More     | Very few    |

Note that the details vary with each vendor's design, and the level of support in the operating system and its drivers also determines how much benefit these two sub-states actually deliver.

```bash
# Example: inspect the PCIe power-management settings reported on this system
lspci -vvv | grep "Power"
```

The command above gives a first look at the power-management configuration of the PCI Express components in the system.