Linux Memory Management: Page Reclaim

Article link: https://blog.youkuaiyun.com/zhouhuacai/article/details/78152268

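When Linux runs low on memory, the kernel frees memory through page reclaim, which relies on the LRU algorithm and the second-chance algorithm. LRU manages active and inactive pages on linked lists, while second chance also takes into account whether a page has been accessed recently. In addition, the kswapd kernel thread reclaims pages when memory is short, keeping the system stable.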


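Overview

Wherever memory pages are allocated, they must also be released. This happens in two ways: the owner frees the memory explicitly, or the kernel reclaims the pages itself.

When memory is plentiful, the Linux kernel uses as much of it as possible for the file cache (page cache) to improve system performance. When memory becomes tight, file cache pages are either discarded (if clean) or written back to the block device (if dirty), and the physical memory is then freed. This costs some performance.

The kernel also moves rarely used memory out to the swap partition to free memory; this mechanism is called swapping. Together, these mechanisms are known as page reclaim.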

Page reclaim algorithms

The page replacement algorithms used in the Linux kernel are mainly LRU and second chance.

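The LRU algorithm

LRU stands for least recently used. When memory runs short, the least recently used pages become the candidates for being swapped out.
The LRU algorithm is implemented with linked lists, split into an active LRU list and an inactive LRU list. Pages move back and forth between the two lists.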
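
The two-list arrangement described above can be sketched in user-space C. This is only an illustrative model, not kernel code: the names struct page_node, struct lru_list, lru_add_head, lru_del_tail, lru_deactivate_oldest and lru_reclaim_one are all made up for this example, and the real kernel keeps separate lists for anonymous and file pages with far more bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page on an LRU list (not struct page). */
struct page_node {
    int id;
    struct page_node *prev, *next;
};

/* A doubly linked list: new pages enter at head, old pages leave at tail. */
struct lru_list {
    struct page_node *head;
    struct page_node *tail;
    size_t count;
};

/* Put a page at the head of a list (the "most recent" end). */
static void lru_add_head(struct lru_list *l, struct page_node *p)
{
    assert(l != NULL && p != NULL);
    p->prev = NULL;
    p->next = l->head;
    if (l->head)
        l->head->prev = p;
    else
        l->tail = p;
    l->head = p;
    l->count++;
}

/* Detach and return the page at the tail (the oldest), or NULL if empty. */
static struct page_node *lru_del_tail(struct lru_list *l)
{
    assert(l != NULL);
    struct page_node *p = l->tail;
    if (!p)
        return NULL;
    l->tail = p->prev;
    if (l->tail)
        l->tail->next = NULL;
    else
        l->head = NULL;
    p->prev = p->next = NULL;
    l->count--;
    return p;
}

/* Move the oldest active page to the head of the inactive list. */
static struct page_node *lru_deactivate_oldest(struct lru_list *active,
                                               struct lru_list *inactive)
{
    struct page_node *p = lru_del_tail(active);
    if (p)
        lru_add_head(inactive, p);
    return p;
}

/* Reclaim candidate: the oldest page on the inactive list. */
static struct page_node *lru_reclaim_one(struct lru_list *inactive)
{
    return lru_del_tail(inactive);
}
```

In this model a page is only ever reclaimed from the tail of the inactive list, so a page has to age through the active list first before it becomes a candidate.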

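The second-chance algorithm

As the LRU algorithm shows, when the system is short of memory, pages at the tail of the LRU list leave the list and are swapped out. When the system needs those pages again, they are placed back at the head of the list. This design has an obvious weakness: it does not consider how frequently a page is used when choosing what to swap out. Even a frequently used page is swapped out if it happens to sit at the tail of the list.

The second-chance algorithm addresses this weakness. It picks a candidate page the same way LRU does, but it adds an accessed bit to each page and checks that bit before evicting. If the bit is 0, the page is evicted. If the bit is 1, the page gets a second chance and the next page is considered instead. When a page is given its second chance, its accessed bit is cleared to 0; if the page is accessed again in the meantime, the bit is set back to 1. A page that is used often therefore keeps its accessed bit at 1 and is not evicted.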
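
The accessed-bit check can be sketched as a small C function. This is a simplified model under stated assumptions, not the kernel's implementation: struct sc_page and second_chance_pick are hypothetical names, the pages sit in an array ordered from oldest (index 0) to newest, and nothing accesses the pages while the scan runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor carrying only an id and an accessed bit. */
struct sc_page {
    int id;
    bool referenced;   /* set to true whenever the page is accessed */
};

/*
 * Scan from the oldest page and return the index of the page to evict,
 * or -1 if there are no pages. A page whose accessed bit is set has the
 * bit cleared and is skipped: that is its second chance.
 */
static int second_chance_pick(struct sc_page *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    if (n == 0)
        return -1;

    /* One full pass clears every bit, so a victim is found by index n. */
    for (size_t i = 0; i < 2 * n; i++) {
        struct sc_page *p = &pages[i % n];
        if (!p->referenced)
            return (int)(i % n);
        p->referenced = false;
    }
    return -1; /* not reached in this single-threaded model */
}
```

If every page has its bit set, the first pass clears them all and the scan wraps around to evict the oldest page, so the function degrades to plain oldest-first eviction in that case.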

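The kswapd kernel thread

The Linux kernel has a very important kernel thread, kswapd, which reclaims pages periodically and whenever memory runs low.
During initialization, the kernel creates one kernel thread named "kswapd%d" for each NUMA memory node in the system.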

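kswapd is woken up when the number of free pages falls below the low watermark (written PAGE_LOW in the original post; kernel sources of this era call it WMARK_LOW). The wake-up is issued from the allocator's slow path, __alloc_pages_slowpath(), through the call to wake_all_kswapd().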

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
    struct zonelist *zonelist, enum zone_type high_zoneidx,
    nodemask_t *nodemask, struct zone *preferred_zone,
    int migratetype)
{
    const gfp_t wait = gfp_mask & __GFP_WAIT;
    struct page *page = NULL;
    int alloc_flags;
    unsigned long pages_reclaimed = 0;
    unsigned long did_some_progress;
    bool sync_migration = false;
    bool deferred_compaction = false;
    bool contended_compaction = false;

    /*
     * In the slowpath, we sanity check order to avoid ever trying to
     * reclaim >= MAX_ORDER areas which will never succeed. Callers may
     * be using allocators in order of preference for an area that is
     * too large.
     */
    if (order >= MAX_ORDER) {
        WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
        return NULL;
    }

    /*
     * GFP_THISNODE (meaning __GFP_THISNODE, __GFP_NORETRY and
     * __GFP_NOWARN set) should not cause reclaim since the subsystem
     * (f.e. slab) using GFP_THISNODE may choose to trigger reclaim
     * using a larger set of nodes after it has established that the
     * allowed per node queues are empty and that nodes are
     * over allocated.
     */
    if (IS_ENABLED(CONFIG_NUMA) &&
            (gfp_mask & GFP_THISNODE) == GFP_THISNODE)
        goto nopage;

restart:
    if (!(gfp_mask & __GFP_NO_KSWAPD))
        wake_all_kswapd(order, zonelist, high_zoneidx,
                        zone_idx(preferred_zone)); /* wake up the kswapd threads */

    /*
     * OK, we're below the kswapd watermark and have kicked background
     * reclaim. Now things get more complex, so set up alloc_flags according
     * to how we want to proceed.
     */
    alloc_flags = gfp_to_alloc_flags(gfp_mask);

    /*
     * Find the true preferred zone if the allocation is unconstrained by
     * cpusets.
     */
    if (!(alloc_flags & ALLOC_CPUSET) && !nodemask)
        first_zones_zonelist(zonelist, high_zoneidx, NULL,
                    &preferred_zone);

rebalance:
    /* This is the last chance, in general, before the goto nopage. */
    page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
            high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
            preferred_zone, migratetype);
    if (page)
        goto got_pg;

    /* Allocate without watermarks if the context allows */
    if (alloc_flags & ALLOC_NO_WATERMARKS) {
        /*
         * Ignore mempolicies if ALLOC_NO_WATERMARKS on the grounds
         * the allocation is high priority and these type of
         * allocations are system rather than user orientated
         */
        zonelist = node_zonelist(numa_node_id(), gfp_mask);

        page = __alloc_pages_high_priority(gfp_mask, order,
                zonelist, high_zoneidx, nodemask,
                preferred_zone, migratetype);
        if (page) {
            goto got_pg;
        }
    }

    /* Atomic allocations - we can't balance anything */
    if (!wait)
        goto nopage;

    /* Avoid recursion of direct reclaim */