Hot and cold pages

This post looks at a memory-management optimization in the Linux kernel: distinguishing hot pages from cold pages. The processor cache holds recently accessed data, and the kernel maintains per-CPU hot and cold free-page lists to take advantage of that. When a page is freed, it goes onto the hot list if it is likely still in the cache, and onto the cold list otherwise. Allocations prefer hot pages, but in some cases, such as buffers for DMA read operations, a cold page is the better choice. The scheme also reduces lock contention, and benchmarks showed performance gains of roughly 1% to 12%.


```c
void __free_pages(struct page *page, unsigned int order)
{
	if (put_page_testzero(page)) {	/* drop our reference; proceed only if _count has reached 0 */
		if (order == 0)		/* a single page frame goes back to the per-CPU page cache */
			free_hot_cold_page(page, false);
		else			/* larger blocks go back to the buddy allocator */
			__free_pages_ok(page, order);
	}
}
```

`__free_pages()` hands the page to `free_hot_cold_page()` when `order` is 0. What "cold page" and "hot page" mean is explained below; see:

  1. https://lwn.net/Articles/14768/
  2. https://blog.youkuaiyun.com/u012489236/article/details/107397096
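
For orientation, here is a condensed sketch of what `free_hot_cold_page()` does in the kernel era that matches the `__free_pages()` above (3.x, where hot vs. cold is a bool and both kinds share one per-CPU list): a hot page goes to the head of the list, a cold page to the tail, and once the list exceeds its high watermark a batch of pages is handed back to the buddy allocator. This is a simplification, not the verbatim source; migratetype indexing and some locking details are omitted.

```c
/* Condensed sketch of free_hot_cold_page(): free one order-0 page into
 * the current CPU's per-CPU list (real code picks a list by migratetype). */
static void free_hot_cold_page(struct page *page, bool cold)
{
	struct zone *zone = page_zone(page);
	struct per_cpu_pages *pcp;
	unsigned long flags;

	local_irq_save(flags);
	pcp = &this_cpu_ptr(zone->pageset)->pcp;

	if (!cold)
		list_add(&page->lru, &pcp->lists[0]);		/* hot: reuse it first */
	else
		list_add_tail(&page->lru, &pcp->lists[0]);	/* cold: queue it last */
	pcp->count++;

	/* Keep the list bounded: past the high watermark, return a whole
	 * batch of pages to the buddy allocator in one go. */
	if (pcp->count >= pcp->high)
		free_pcppages_bulk(zone, pcp->batch, pcp);

	local_irq_restore(flags);
}
```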

One generally thinks of a system’s RAM as being the fastest place to keep data. But memory is slow; the real speed comes from working out of the onboard cache in the processor itself. Much effort has, over the years, gone into trying to optimize the kernel’s cache behavior and avoiding the need to go to main memory. The new page allocation system is just another step in that direction.

The processor cache contains memory which has been accessed recently. The kernel often has a good idea of which pages have seen recent accesses and are thus likely to be present in cache. The hot-n-cold patch tries to take advantage of that information by adding two per-CPU free page lists (for each memory zone). When a processor frees a page that is suspected to be “hot” (i.e. represented in that processor’s cache), it gets pushed onto the hot list; others go onto the cold list. The lists have high and low limits; after all, if the hot list grows larger than the processor’s cache, the chances of those pages actually being hot start to get pretty small.
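
When the patch went in (2.5.45, and early 2.6 kernels), that per-CPU state looked roughly like the structures below, defined in `include/linux/mmzone.h`. Fields are trimmed here, and later kernels merged the two lists into a single list used from both ends:

```c
/* Approximate early-2.6 layout; each zone keeps one per_cpu_pageset per CPU. */
struct per_cpu_pages {
	int count;		/* pages currently on the list */
	int low;		/* refill from the buddy allocator below this */
	int high;		/* drain back to the buddy allocator above this */
	int batch;		/* how many pages to move per refill/drain */
	struct list_head list;	/* the free pages themselves */
};

struct per_cpu_pageset {
	struct per_cpu_pages pcp[2];	/* index 0: hot pages, 1: cold pages */
};
```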

When the kernel needs a page of memory, the new allocator normally tries to get that page from the processor’s hot list. Even if the page is simply going to be overwritten, it’s still better to use a cache-warm page. Interestingly, though, there are times when it makes sense to use a cold page instead. If the page is to be used for DMA read operations, it will be filled by the device performing the operation and the cache will be invalidated anyway. So 2.5.45 includes a new GFP_COLD page allocation flag for the situations where using a cold page makes more sense.
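
In mainline the flag ended up as `__GFP_COLD` (it lasted until around 4.15, when the hot/cold distinction on the allocation side was dropped). A typical use, sketched here with a hypothetical `rx_buffer_alloc()` helper, was requesting a cache-cold page for a buffer a device is about to DMA into:

```c
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical helper: the device will DMA into this page, so the CPU
 * gains nothing from a cache-warm page and a cold one is preferred. */
static struct page *rx_buffer_alloc(void)
{
	return alloc_page(GFP_KERNEL | __GFP_COLD);
}
```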

The use of per-CPU page lists also cuts down on lock contention, which also helps performance. When pages must be moved between the hot/cold lists and the main memory allocator, they are transferred in multi-page chunks, which also cuts down on lock contention and makes things go faster.
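
The refill side of that batching is sketched below, loosely following the kernel's `rmqueue_bulk()` (argument lists and migratetype handling trimmed): the zone lock is taken once, and up to `count` order-0 pages are pulled from the buddy allocator in a single pass, rather than paying for the lock on every page.

```c
/* Simplified sketch of batched refill for a per-CPU list. */
static int rmqueue_bulk_sketch(struct zone *zone, unsigned long count,
			       struct list_head *list)
{
	unsigned long i;

	spin_lock(&zone->lock);
	for (i = 0; i < count; i++) {
		/* __rmqueue() hands out one block from the buddy free lists;
		 * order 0 means a single page. */
		struct page *page = __rmqueue(zone, 0);

		if (!page)
			break;
		list_add_tail(&page->lru, list);
	}
	spin_unlock(&zone->lock);
	return i;	/* pages actually obtained */
}
```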

Andrew Morton has benchmarked this patch, and included a number of results with one of the patchsets. Performance benefits vary from a mere 1-2% on the all-important kernel compilation time to 12% on the SDET test. That was enough, apparently, to convince Linus.

### Hot vs. Cold in the PCIe L2 power state

When discussing PCIe link states, L2 is a low-power state in which the link stays connected but runs in a power-saving mode. On Intel Bay Trail / J1900 / N2940 platforms, compatibility issues between the Linux kernel and the hardware can affect how these states behave.

#### Power-mode overview

- **Hot mode**: In L2, "hot" means the device has entered a lower-power mode but retains a relatively high wake-up capability, so when the host asks to reactivate the link the recovery time is short. To support that fast exit, some circuitry must stay active to monitor signal changes and respond quickly, so although overall consumption drops, it remains higher in the hot state than in the cold one.
- **Cold mode**: "Cold" is a deeper sleep in which most functional units are switched off for maximum power savings. Correspondingly, returning to normal operation takes longer, because more initialization is needed to rebuild the full communication environment.

#### Comparison

| Property              | Hot mode | Cold mode   |
|-----------------------|----------|-------------|
| Wake-up speed         | Faster   | Slower      |
| Power savings         | Moderate | Significant |
| Functions kept active | More     | Very few    |

Note that the details vary with each vendor's design, and the level of support in the operating system and its drivers also determines how much benefit these two sub-states actually deliver.

```bash
# Example: inspect the PCIe power-management settings reported on this system
lspci -vvv | grep "Power"
```

The command above gives a first look at the power-management configuration of the PCI Express components in the system.