The Kernel Samepage Merging Process

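This article explains how the Kernel Samepage Merging (KSM) service works: its use of red-black trees for efficient lookup of stable and unstable pages, its restriction of scanning to anonymous pages, the read-only nature of merged pages, how userspace applications register candidate regions for merging, and how writes to merged pages are handled.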


http://alouche.net/blog/2011/07/18/the-kernel-samepage-merging-process/

KSM, simply put, is a kernel service daemon (ksmd) that scans pages to find those with duplicate content and merges them, thereby reducing memory usage. The code used as examples in this post can be found in mm/ksm.c in the kernel source.

Before continuing, it is important to keep in mind that:

  • KSM uses red-black trees for the stable and unstable trees. Lookups cost O(log n) per tree, since the height of a red-black tree can never exceed 2 log2(n + 1), with n being the number of nodes.
  • KSM only scans anonymous pages. File-backed pages, such as HugePages (hugetlbfs), are not scanned and cannot be merged by KSM. Transparent Huge Pages are handled differently: as of Red Hat Enterprise Linux 6.1, KSM will break a THP up into small pages if shareable 4K pages are found, and only if the system is running out of memory.
  • Merged pages are read-only, as they are CoW protected.
  • Userspace applications can register candidate regions for merging through the madvise() system call. We will not tackle the KSM API details in this post.
  • Because of the CoW protection, a write to a merged page by an application raises a page fault, and the kernel's copy-on-write handling gives the writing application its own private copy of the page. When KSM itself needs to undo a merge, it forces the same kind of copy through the break_cow() routine shown below.

static void break_cow(struct rmap_item *rmap_item)
{
        struct mm_struct *mm = rmap_item->mm;
        unsigned long addr = rmap_item->address;
        struct vm_area_struct *vma;

        /*
         * It is not an accident that whenever we want to break COW
         * to undo, we also need to drop a reference to the anon_vma.
         */
        put_anon_vma(rmap_item->anon_vma);

        down_read(&mm->mmap_sem);
        if (ksm_test_exit(mm))
                goto out;
        vma = find_vma(mm, addr);
        if (!vma || vma->vm_start > addr)
                goto out;
        if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
                goto out;
        break_ksm(vma, addr);
out:
        up_read(&mm->mmap_sem);
}
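
To make the madvise() bullet above more concrete, here is a minimal userspace sketch of how an application might register a region as a merge candidate. The helper name map_mergeable() and the fill byte are made up for illustration; only mmap(), madvise() and the MADV_MERGEABLE flag are real interfaces. The call can legitimately fail, for instance on a kernel built without CONFIG_KSM, so the sketch reports that instead of treating it as fatal.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS and MADV_MERGEABLE */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, fill it with
 * one byte value so that every page has identical content, and ask the
 * kernel to treat the region as a KSM candidate.
 *
 * Returns the mapping, or NULL if mmap() fails. *advised is set to 1 when
 * madvise(MADV_MERGEABLE) was accepted and to 0 when it was refused.
 */
static void *map_mergeable(size_t len, int *advised)
{
        void *buf;

        *advised = 0;
        buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
                return NULL;

        memset(buf, 0x5a, len);
#ifdef MADV_MERGEABLE
        if (madvise(buf, len, MADV_MERGEABLE) == 0)
                *advised = 1;
#endif
        return buf;
}
```

Registering a region only makes its pages eligible; whether and when they are actually merged depends on ksmd being enabled and on the scan reaching them.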

With this in mind, here is a step-by-step summary of how KSM works.

KSM scans pages and decides whether each one could be merged; such pages are referred to as “candidate” pages. In short, a candidate page that does not exist in the stable tree is added as a node to the unstable tree, but we will get to this later in the post. To determine whether a page has changed, KSM relies on a 32-bit checksum of the page content, which is stored in the page's rmap_item (the oldchecksum field) and compared again on the next scan.

static u32 calc_checksum(struct page *page)
{
         u32 checksum;
         void *addr = kmap_atomic(page, KM_USER0);
         checksum = jhash2(addr, PAGE_SIZE / 4, 17);
         kunmap_atomic(addr, KM_USER0);
         return checksum;
}

In other words, KSM finds page X, computes a checksum and stores it in the rmap_item tracking page X. On the next scan, if the checksum of page X has not changed, the page is considered a candidate.
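
The "unchanged since the last scan" test can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: struct page_record, sketch_checksum() and page_is_stable() are hypothetical names, the page size is fixed at 4096 bytes, and FNV-1a stands in for the kernel's jhash2().

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical per-page bookkeeping, loosely modelled on rmap_item. */
struct page_record {
        uint32_t oldchecksum;   /* checksum seen on the previous scan */
};

/* FNV-1a over the page; only a stand-in for the kernel's jhash2(). */
static uint32_t sketch_checksum(const unsigned char *page)
{
        uint32_t h = 2166136261u;
        size_t i;

        for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
                h ^= page[i];
                h *= 16777619u;
        }
        return h;
}

/*
 * Returns 1 if the page content matches the checksum remembered from the
 * previous scan, so the page may be treated as a candidate. Returns 0 if
 * it differs; the new checksum is then remembered for the next scan.
 */
static int page_is_stable(struct page_record *rec, const unsigned char *page)
{
        uint32_t checksum = sketch_checksum(page);

        if (rec->oldchecksum != checksum) {
                rec->oldchecksum = checksum;
                return 0;
        }
        return 1;
}
```

A page therefore has to survive two consecutive scans with the same content before it is considered, which keeps frequently written pages out of the trees.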

For each candidate page, KSM searches the stable tree, which contains the already merged pages, comparing the candidate against each visited node with memcmp_pages().

static int memcmp_pages(struct page *page1, struct page *page2)
{
        char *addr1, *addr2;
        int ret;

        addr1 = kmap_atomic(page1, KM_USER0);
        addr2 = kmap_atomic(page2, KM_USER1);
        ret = memcmp(addr1, addr2, PAGE_SIZE);
        kunmap_atomic(addr2, KM_USER1);
        kunmap_atomic(addr1, KM_USER0);
        return ret;
}

The result of the comparison drives the tree walk as follows:

ret = memcmp_pages(page, tree_page);

if (ret < 0) {
        put_page(tree_page);
        node = node->rb_left;
} else if (ret > 0) {
        put_page(tree_page);
        node = node->rb_right;
} else
        return tree_page;

Following this requires an understanding of how a binary search tree works in general, and more specifically of how a red-black tree works.

The stable tree is walked left if the candidate page compares less than the page in the current node, and right if it compares greater. If both pages are identical, the candidate page is merged into the stable page and the candidate's own copy is freed.

The stable tree search function is referenced at http://lxr.free-electrons.com/source/mm/ksm.c#L985
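
The walk described above can be sketched in userspace with a plain binary search tree keyed on page content. This is a simplification under stated assumptions: struct tree_node and tree_search() are hypothetical names, pages are 4096-byte buffers, and the red-black rebalancing and page reference counting done by the kernel are left out.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical node: a plain binary search tree ordered by page content. */
struct tree_node {
        const unsigned char *page;
        struct tree_node *left, *right;
};

/*
 * Walk the tree comparing `page` byte-wise against each visited node.
 * Returns the identical page already in the tree, or NULL if none exists.
 */
static const unsigned char *tree_search(const struct tree_node *node,
                                        const unsigned char *page)
{
        while (node) {
                int ret = memcmp(page, node->page, SKETCH_PAGE_SIZE);

                if (ret < 0)
                        node = node->left;      /* candidate sorts lower */
                else if (ret > 0)
                        node = node->right;     /* candidate sorts higher */
                else
                        return node->page;      /* identical content */
        }
        return NULL;
}
```

Using the full page content as the key means no separate hash table is needed: two pages land on the same node exactly when they are byte-for-byte identical.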

If the candidate page is not found in the stable tree, its checksum is recomputed to determine whether its content has changed since the previous scan. If it has changed, the page is ignored for this pass; if not, the search continues in the unstable tree, in the same manner as in the stable tree. The unstable_tree_search_insert() function can be seen at http://lxr.free-electrons.com/source/mm/ksm.c#L1078.

While searching the unstable tree, KSM creates a new node in this tree if the candidate page is unique:

rmap_item->address |= UNSTABLE_FLAG;
rmap_item->address |= (ksm_scan.seqnr & SEQNR_MASK);
rb_link_node(&rmap_item->node, parent, new);
rb_insert_color(&rmap_item->node, &root_unstable_tree);

If the candidate is not unique, that is, the unstable tree already contains a page with identical content, the two pages are merged and the resulting page is moved to the stable tree.

Once the KSM scan is done, the unstable tree is destroyed and rebuilt from scratch on the next iteration.
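
The search-or-insert behaviour and the per-scan reset can be sketched together. As before, this is a userspace illustration with assumptions stated up front: struct unstable_tree, unstable_search_insert() and unstable_reset() are hypothetical names, the tree is an unbalanced binary search tree backed by a small fixed pool, and the promotion of a matched pair to the stable tree is left to the caller.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_MAX_NODES 64

struct unstable_node {
        const unsigned char *page;
        struct unstable_node *left, *right;
};

/* Hypothetical unstable tree; a fixed pool makes the reset trivial. */
struct unstable_tree {
        struct unstable_node pool[SKETCH_MAX_NODES];
        size_t used;
        struct unstable_node *root;
};

/*
 * Returns a page with identical content if one is already in the tree.
 * Otherwise inserts `page` as a new node (if the pool has room) and
 * returns NULL.
 */
static const unsigned char *unstable_search_insert(struct unstable_tree *t,
                                                   const unsigned char *page)
{
        struct unstable_node **link = &t->root;

        while (*link) {
                int ret = memcmp(page, (*link)->page, SKETCH_PAGE_SIZE);

                if (ret < 0)
                        link = &(*link)->left;
                else if (ret > 0)
                        link = &(*link)->right;
                else
                        return (*link)->page;
        }
        if (t->used < SKETCH_MAX_NODES) {
                struct unstable_node *n = &t->pool[t->used++];

                n->page = page;
                n->left = n->right = NULL;
                *link = n;
        }
        return NULL;
}

/* Forget every node; called once a full scan has completed. */
static void unstable_reset(struct unstable_tree *t)
{
        t->used = 0;
        t->root = NULL;
}
```

Throwing the tree away after each full scan is what makes it acceptable for its pages to be writable: a page that changed after being inserted may sit in the wrong position, but only until the next rebuild.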

I hope that was informative.

Cheers,
