[linux kernel] [VM management] Linux overcommit handling mechanism

License: CC 4.0 BY-SA

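This article describes the three overcommit handling modes supported by the Linux kernel: heuristic overcommit, always overcommit, and never overcommit. Each mode suits a different scenario: typical system use, scientific applications, and guaranteeing that memory allocations will be available in the future, respectively. It also covers how to set the policy via sysctl and lists some gotchas.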

The Linux kernel supports the following overcommit handling modes:

0 - Heuristic overcommit handling. Obvious overcommits of
address space are refused. Used for a typical system. It
ensures a seriously wild allocation fails while allowing
overcommit to reduce swap usage. root is allowed to
allocate slightly more memory in this mode. This is the
default.

1 - Always overcommit. Appropriate for some scientific
applications. Classic example is code using sparse arrays
and just relying on the virtual memory consisting almost
entirely of zero pages.

2 - Don’t overcommit. The total address space commit
for the system is not permitted to exceed swap + a
configurable amount (default is 50%) of physical RAM.
Depending on the amount you use, in most situations
this means a process will not be killed while accessing
pages but will receive errors on memory allocation as
appropriate.

Useful for applications that want to guarantee their memory
allocations will be available in the future without having
to initialize every page.
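
The mode 2 limit described above can be illustrated with a small calculation. This is a minimal sketch in Python that assumes the simplified formula from the text (swap plus a percentage of physical RAM); the function name `commit_limit_kb` is hypothetical, and the kernel's own calculation includes further details, such as huge page reservations, that are ignored here.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int = 50) -> int:
    """Approximate mode 2 commit limit in KiB: swap + ratio% of physical RAM.

    Simplified from the description in the text; not the kernel's exact code.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


# Example: 16 GiB of RAM, 4 GiB of swap, default ratio of 50.
ram = 16 * 1024 * 1024   # KiB
swap = 4 * 1024 * 1024   # KiB
print(commit_limit_kb(ram, swap))  # 12582912 KiB, i.e. 12 GiB
```

With these numbers, allocations start failing once the total committed address space would exceed 12 GiB, even though the machine has 20 GiB of RAM plus swap.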

The overcommit policy is set via the sysctl 'vm.overcommit_memory'.

The overcommit amount can be set via 'vm.overcommit_ratio' (percentage)
or 'vm.overcommit_kbytes' (absolute value).
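
The sysctls named above are also exposed as files under /proc/sys/vm, so the current settings can be inspected without any special tooling. The following is a rough sketch; the helper name `read_overcommit_settings` and the `MODES` table are illustrative, not part of any kernel or library API.

```python
from pathlib import Path

# Human-readable names for the three modes described in the text.
MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "don't overcommit",
}


def read_overcommit_settings(sysctl_dir: str = "/proc/sys/vm") -> dict:
    """Return the overcommit sysctls found under sysctl_dir as integers.

    Files that do not exist (for example on a non-Linux system) are skipped.
    """
    settings = {}
    for name in ("overcommit_memory", "overcommit_ratio", "overcommit_kbytes"):
        path = Path(sysctl_dir) / name
        if path.is_file():
            settings[name] = int(path.read_text().strip())
    return settings


if __name__ == "__main__":
    current = read_overcommit_settings()
    mode = current.get("overcommit_memory")
    print(current, MODES.get(mode, "unknown"))
```

Changing a value requires root, for example with `sysctl -w vm.overcommit_memory=2`.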

The current overcommit limit and amount committed are viewable in
/proc/meminfo as CommitLimit and Committed_AS respectively.
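
As a quick way to compare the two values, the /proc/meminfo text can be parsed directly. This is a minimal sketch; the function name `parse_commit_status` is hypothetical and the sample figures are made up for illustration.

```python
def parse_commit_status(meminfo_text: str) -> dict:
    """Extract CommitLimit and Committed_AS (in kB) from /proc/meminfo text."""
    wanted = {"CommitLimit", "Committed_AS"}
    status = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Each line looks like "CommitLimit:    12386000 kB".
            status[key] = int(rest.split()[0])
    return status


# Sample input with made-up numbers; on Linux, pass open("/proc/meminfo").read().
sample = """MemTotal:       16384000 kB
CommitLimit:    12386000 kB
Committed_AS:    9000000 kB
"""
status = parse_commit_status(sample)
print(status["CommitLimit"] - status["Committed_AS"])  # remaining headroom in kB
```

Note that CommitLimit is only enforced in mode 2; in modes 0 and 1 Committed_AS may exceed it.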

Gotchas

Stack growth in C programs performs an implicit mremap. If you want absolute guarantees and run close to the edge, you MUST mmap your stack at the largest size you think you will need. For typical stack usage this does not matter much, but it is a corner case to keep in mind if you need hard guarantees.

In mode 2 the MAP_NORESERVE flag is ignored.

How It Works

Overcommit accounting is based on the following rules:

For a file backed map:
    SHARED or READ-only - 0 cost (the file is the map not swap)
    PRIVATE WRITABLE - size of mapping per instance

For an anonymous or /dev/zero map:
    SHARED - size of mapping
    PRIVATE READ-only - 0 cost (but of little use)
    PRIVATE WRITABLE - size of mapping per instance

Additional accounting:
    Pages made writable copies by mmap
    shmfs memory drawn from the same pool
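
The per-mapping rules above can be written as a small decision function. This is an illustrative sketch only: the name `commit_charge` and its parameters are hypothetical, and the "additional accounting" items (writable copies and shmfs) are not modeled.

```python
def commit_charge(size: int, file_backed: bool, shared: bool, writable: bool) -> int:
    """Commit charge in bytes for one mapping instance, per the rules in the text."""
    if file_backed:
        # Shared or read-only file maps cost nothing: the file backs the pages.
        if shared or not writable:
            return 0
        # Private writable file maps may need a private copy of every page.
        return size
    # Anonymous or /dev/zero maps.
    if shared:
        return size
    # Private anonymous: charged only when writable.
    return size if writable else 0


# A private writable anonymous mapping of 1 MiB is charged in full.
print(commit_charge(1 << 20, file_backed=False, shared=False, writable=True))  # 1048576
```

Because private writable mappings are charged per instance, mapping the same file privately and writably in two processes is charged twice.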

Status

o We account mmap memory mappings
o We account mprotect changes in commit
o We account mremap changes in size
o We account brk
o We account munmap
o We report the commit status in /proc
o Account and check on fork
o Review stack handling/building on exec
o SHMfs accounting
o Implement actual limit enforcement