Linux Memory Management (21): OOM

This article dissects Linux's OOM (Out of Memory) mechanism: what triggers an OOM, which factors influence it, and how the kill path works, ending with a worked example of diagnosing an OOM.


Series: Linux Memory Management

Keywords: OOM, oom_adj, oom_score, badness

 

To raise memory utilization, the Linux kernel over-commits memory: it hands out more virtual memory than physically exists. When physical memory then becomes critically short, the OOM (Out of Memory) mechanism is triggered to reclaim memory by killing processes.

The mechanism watches for processes with large memory footprints, especially ones that consume a lot of memory very quickly, and kills them to keep the system from exhausting memory entirely.

1. About OOM

When the kernel detects that the system is short on memory, out_of_memory() is triggered on the allocation path. It calls select_bad_process() to choose a 'bad' process to kill; the judgment of how 'bad' each process is comes from oom_badness().

Every Linux process has its own OOM weight, exposed via the legacy /proc/&lt;pid&gt;/oom_adj (range -17 to +15; -17 disables OOM killing for the process). The higher the value, the more likely the process is to be killed. Modern kernels internally use /proc/&lt;pid&gt;/oom_score_adj, which ranges from -1000 to +1000.
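The kernel keeps the two interfaces consistent by scaling one range onto the other. Below is a minimal user-space sketch of that scaling as I understand it from fs/proc/base.c (the constants are the kernel's; the helper function is mine, for illustration only):

```python
OOM_DISABLE = -17        # legacy "never kill" value
OOM_ADJUST_MAX = 15
OOM_SCORE_ADJ_MIN, OOM_SCORE_ADJ_MAX = -1000, 1000

def oom_adj_to_score_adj(oom_adj: int) -> int:
    """Scale a legacy oom_adj value onto the oom_score_adj range."""
    if oom_adj == OOM_ADJUST_MAX:          # +15 maps straight to the maximum
        return OOM_SCORE_ADJ_MAX
    # linear scaling, truncating toward zero like C integer division
    return int(oom_adj * OOM_SCORE_ADJ_MAX / -OOM_DISABLE)

print(oom_adj_to_score_adj(-17))  # -1000 (disabled)
print(oom_adj_to_score_adj(0))    # 0
print(oom_adj_to_score_adj(15))   # 1000
```

Writing either file adjusts the same underlying per-process value, so a tool that only knows the old interface still works.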

OOM is analyzed below from several angles:

  • What triggers an OOM?
  • Which parameters influence OOM behavior?
  • How does the OOM code flow work?
  • A worked OOM example.

2. OOM trigger path

On the memory allocation path, when memory runs short the kernel first turns to kswapd and memory compaction; only in the extreme case does it trigger OOM to obtain more memory.

After memory reclaim fails, __alloc_pages_may_oom() is the entry point into OOM, but the main work is done in out_of_memory().

Since Linux manages memory in pages, every allocation passes through __alloc_pages_nodemask().

alloc_pages
  ->_alloc_pages
    ->__alloc_pages_nodemask
      ->__alloc_pages_slowpath-------------------------memory is already short here; reclaim and compaction run, and OOM is the last resort.
        ->__alloc_pages_may_oom------------------------entry to OOM, including a set of bail-out checks.
          ->out_of_memory------------------------------core of OOM.
            ->select_bad_process-----------------------choose the most 'bad' process.
              ->oom_evaluate_task
                ->oom_badness--------------------------score how 'bad' each candidate is.
            ->oom_kill_process-------------------------kill the selected process.

The other entry is do_page_fault(): if handle_mm_fault() returns VM_FAULT_OOM, the fault handler falls into pagefault_out_of_memory().

asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
                              unsigned long mmu_meh)
{
...
good_area:
...
        fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0);
        if (unlikely(fault & VM_FAULT_ERROR)) {
                if (fault & VM_FAULT_OOM)-------------------------------------------handle_mm_fault() returned VM_FAULT_OOM; jump to the out_of_memory label.
                        goto out_of_memory;
                else if (fault & VM_FAULT_SIGBUS)
                        goto do_sigbus;
                else if (fault & VM_FAULT_SIGSEGV)
                        goto bad_area;
                BUG();
        }
        if (fault & VM_FAULT_MAJOR)
                tsk->maj_flt++;
        else
                tsk->min_flt++;

        up_read(&mm->mmap_sem);
        return;
...
out_of_memory:
        pagefault_out_of_memory();
        return;
...
}


void pagefault_out_of_memory(void)
{
    struct oom_control oc = {
        .zonelist = NULL,
        .nodemask = NULL,
        .memcg = NULL,
        .gfp_mask = 0,
        .order = 0,------------------------------------------------------------------a single page (order 0).
    };
...
    out_of_memory(&oc);
}

 

3. Kernel parameters affecting OOM

See the OOM section of Linux Memory Management (23): memory sysfs nodes and tools.
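For quick reference, the sysctl knobs most often involved, shown as a /etc/sysctl.conf-style fragment with their usual default values (check your kernel's Documentation/sysctl/vm.txt before relying on them):

```
vm.panic_on_oom = 0              # 0: kill a task; 1 or 2: panic the system instead
vm.oom_kill_allocating_task = 0  # 1: kill the allocating task instead of scanning
vm.oom_dump_tasks = 1            # dump per-task memory usage when OOM fires
vm.overcommit_memory = 0         # 0: heuristic overcommit, 1: always, 2: strict
vm.overcommit_ratio = 50         # percentage of RAM used when overcommit_memory=2
```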

4. OOM code analysis

The core data structure of OOM is struct oom_control, defined in include/linux/oom.h.

struct oom_control {
    /* Used to determine cpuset */
    struct zonelist *zonelist;

    /* Used to determine mempolicy */
    nodemask_t *nodemask;

    /* Memory cgroup in which oom is invoked, or NULL for global oom */
    struct mem_cgroup *memcg;

    /* Used to determine cpuset and node locality requirement */
    const gfp_t gfp_mask;----------------------------------allocation mask at the time of the failure.

    /*
     * order == -1 means the oom kill is required by sysrq, otherwise only
     * for display purposes.
     */
    const int order;---------------------------------------order of the failed allocation.

    /* Used by oom implementation, do not set */
    unsigned long totalpages;
    struct task_struct *chosen;----------------------------the process currently chosen by OOM.
    unsigned long chosen_points;---------------------------the highest badness score so far.
};

__alloc_pages_may_oom() is the OOM entry on the allocation path; before actually invoking OOM it checks for several special cases.

static inline struct page *
__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
    const struct alloc_context *ac, unsigned long *did_some_progress)
{
    struct oom_control oc = {---------------------------------------------------------OOM control parameters.
        .zonelist = ac->zonelist,
        .nodemask = ac->nodemask,
        .memcg = NULL,
        .gfp_mask = gfp_mask,
        .order = order,
    };
    struct page *page;

    *did_some_progress = 0;

    /*
     * Acquire the oom lock.  If that fails, somebody else is
     * making progress for us.
     */
    if (!mutex_trylock(&oom_lock)) {
        *did_some_progress = 1;
        schedule_timeout_uninterruptible(1);
        return NULL;
    }

    page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
                    ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);-----------------------------retry once against the high watermark to confirm the OOM path is really needed.
    if (page)
        goto out;

    if (!(gfp_mask & __GFP_NOFAIL)) {----------------------------------------------__GFP_NOFAIL allocations are not allowed to fail; everything below applies only to allocations that may fail.
        /* Coredumps can quickly deplete all memory reserves */
        if (current->flags & PF_DUMPCORE)
            goto out;
        /* The OOM killer will not help higher order allocs */
        if (order > PAGE_ALLOC_COSTLY_ORDER)---------------------------------------failed allocations with order above 3 do not trigger OOM.
            goto out;
        /* The OOM killer does not needlessly kill tasks for lowmem */
        if (ac->high_zoneidx < ZONE_NORMAL)
            goto out;
        if (pm_suspended_storage())
            goto out;

        /* The OOM killer may not free memory on a specific node */
        if (gfp_mask & __GFP_THISNODE)
            goto out;
    }
    /* Exhausted what can be done so it's blamo time */
    if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {-------------after all the checks above, OOM handling is still required: call out_of_memory().
        *did_some_progress = 1;

        if (gfp_mask & __GFP_NOFAIL) {
            page = get_page_from_freelist(gfp_mask, order,
                    ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);-------------------------for __GFP_NOFAIL allocations, relax the criteria from ALLOC_WMARK_HIGH|ALLOC_CPUSET to ALLOC_NO_WATERMARKS|ALLOC_CPUSET.
            /*
             * fallback to ignore cpuset restriction if our nodes
             * are depleted
             */
            if (!page)
                page = get_page_from_freelist(gfp_mask, order,
                    ALLOC_NO_WATERMARKS, ac);--------------------------------------if that still fails, lower the bar again, from ALLOC_NO_WATERMARKS|ALLOC_CPUSET to plain ALLOC_NO_WATERMARKS. Standards keep dropping just to succeed.
        }
    }
out:
    mutex_unlock(&oom_lock);
    return page;
}

out_of_memory() is the core of the OOM mechanism. It has two parts: first pick the most 'bad' process, then kill it.

bool out_of_memory(struct oom_control *oc)
{
    unsigned long freed = 0;
    enum oom_constraint constraint = CONSTRAINT_NONE;

    if (oom_killer_disabled)----------------------------------------------------set in freeze_processes() to disable OOM and cleared in thaw_processes() to re-enable it, so OOM is not allowed while processes are being frozen.
        return false;

    if (!is_memcg_oom(oc)) {
        blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
        if (freed > 0)
            /* Got some memory back in the last second. */
            return true;
    }

    /*
     * If current has a pending SIGKILL or is exiting, then automatically
     * select it.  The goal is to allow it to allocate so that it may
     * quickly exit and free its memory.
     */
    if (task_will_free_mem(current)) {----------------------------------------if the current task is already exiting and about to release its memory, make it the OOM victim and wake the OOM reaper to harvest its memory.
        mark_oom_victim(current);
        wake_oom_reaper(current);
        return true;
    }

    /*
     * The OOM killer does not compensate for IO-less reclaim.
     * pagefault_out_of_memory lost its gfp context so we have to
     * make sure exclude 0 mask - all other users should have at least
     * ___GFP_DIRECT_RECLAIM to get here.
     */
    if (oc->gfp_mask && !(oc->gfp_mask & (__GFP_FS|__GFP_NOFAIL)))-----------if the allocation mask is non-zero but contains neither __GFP_FS nor __GFP_NOFAIL, bail out without killing anything.
        return true;

    constraint = constrained_alloc(oc);--------------------------------------returns CONSTRAINT_NONE when CONFIG_NUMA is not set.
    if (constraint != CONSTRAINT_MEMORY_POLICY)
        oc->nodemask = NULL;
    check_panic_on_oom(oc, constraint);--------------------------------------check sysctl_panic_on_oom and whether OOM was triggered by sysrq, to decide whether to panic.

    if (!is_memcg_oom(oc) && sysctl_oom_kill_allocating_task &&--------------if sysctl_oom_kill_allocating_task is set, kill the task that is currently allocating memory when memory is exhausted.
        current->mm && !oom_unkillable_task(current, NULL, oc->nodemask) &&
        current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
        get_task_struct(current);
        oc->chosen = current;
        oom_kill_process(oc, "Out of memory (oom_kill_allocating_task)");
        return true;
    }

    select_bad_process(oc);-------------------------------------------------walk all processes and their threads to find a suitable candidate.
    /* Found nothing?!?! Either we hang forever, or we panic. */
    if (!oc->chosen && !is_sysrq_oom(oc) && !is_memcg_oom(oc)) {------------no suitable candidate, and OOM was not triggered by sysrq: panic.
        dump_header(oc, NULL);
        panic("Out of memory and no killable processes...\n");
    }
    if (oc->chosen && oc->chosen != (void *)-1UL) {
        oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
                 "Memory cgroup out of memory");----------------------------kill the selected process.
        schedule_timeout_killable(1);
    }
    return !!oc->chosen;
}

select_bad_process() evaluates each process through oom_evaluate_task(); process 1 (init), kernel threads, and low-scoring processes are skipped.

static void select_bad_process(struct oom_control *oc)
{
    if (is_memcg_oom(oc))
        mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
    else {
        struct task_struct *p;

        rcu_read_lock();
        for_each_process(p)----------------------------------------------iterate over every process and its threads in the system.
            if (oom_evaluate_task(p, oc))
                break;
        rcu_read_unlock();
    }

    oc->chosen_points = oc->chosen_points * 1000 / oc->totalpages;
}

static int oom_evaluate_task(struct task_struct *task, void *arg)
{
    struct oom_control *oc = arg;
    unsigned long points;

    if (oom_unkillable_task(task, NULL, oc->nodemask))-------------------skip tasks that cannot be killed, such as process 1 and kernel threads.
        goto next;

    if (!is_sysrq_oom(oc) && tsk_is_oom_victim(task)) {
        if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags))
            goto next;
        goto abort;
    }

    if (oom_task_origin(task)) {
        points = ULONG_MAX;
        goto select;
    }

    points = oom_badness(task, NULL, oc->nodemask, oc->totalpages);------score the process task.
    if (!points || points < oc->chosen_points)---------------------------only the highest score survives this filter, so the top-scoring process is selected; everything else is skipped.
        goto next;

    /* Prefer thread group leaders for display purposes */
    if (points == oc->chosen_points && thread_group_leader(oc->chosen))
        goto next;
select:
    if (oc->chosen)
        put_task_struct(oc->chosen);
    get_task_struct(task);
    oc->chosen = task;--------------------------------------------------update the chosen process and the current highest score.
    oc->chosen_points = points;
next:
    return 0;
abort:
    if (oc->chosen)
        put_task_struct(oc->chosen);
    oc->chosen = (void *)-1UL;
    return 1;
}
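Stripped of locking and memcg handling, the selection above reduces to "score every killable task, keep the maximum". A user-space sketch with hypothetical (name, points) records, where 0 stands in for an unkillable task:

```python
# Hypothetical task list; a score of 0 means oom_badness() declared it unkillable.
tasks = [("init", 0), ("sshd", 12), ("xxxxx", 39000), ("sh", 150)]

def select_bad_process(tasks):
    chosen, chosen_points = None, 0
    for name, points in tasks:
        if points == 0 or points < chosen_points:  # skip unkillable / lower scores
            continue
        chosen, chosen_points = name, points       # new highest score wins
    return chosen, chosen_points

print(select_bad_process(tasks))  # ('xxxxx', 39000)
```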

oom_badness() computes and returns the score of a given process; it is the core of the core. The final result combines the process's current memory usage with its oom_score_adj.

unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
              const nodemask_t *nodemask, unsigned long totalpages)
{
    long points;
    long adj;

    if (oom_unkillable_task(p, memcg, nodemask))-------------------------------skip processes that cannot be killed.
        return 0;

    p = find_lock_task_mm(p);
    if (!p)
        return 0;

    /*
     * Do not even consider tasks which are explicitly marked oom
     * unkillable or have been already oom reaped or the are in
     * the middle of vfork
     */
    adj = (long)p->signal->oom_score_adj;--------------------------------------fetch the process's oom_score_adj.
    if (adj == OOM_SCORE_ADJ_MIN ||
            test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
            in_vfork(p)) {
        task_unlock(p);
        return 0;--------------------------------------------------------------if the process's oom_score_adj is OOM_SCORE_ADJ_MIN, return 0, which tells OOM that this process does not take part in the 'bad' contest.
    }

    /*
     * The baseline for the badness score is the proportion of RAM that each
     * task's rss, pagetable and swap space use.
     */
    points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
        atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);-----------------points sums up memory usage: RSS, swap space used (swap file or device), and the memory held by page tables.
    task_unlock(p);

    /*
     * Root processes get 3% bonus, just like the __vm_enough_memory()
     * implementation used by LSMs.
     */
    if (has_capability_noaudit(p, CAP_SYS_ADMIN))------------------------------tasks with CAP_SYS_ADMIN (typically root) get a 3% discount on their score.
        points -= (points * 3) / 100;

    /* Normalize to oom_score_adj units */
    adj *= totalpages / 1000;--------------------------------------------------here oom_score_adj takes effect: a negative oom_score_adj lowers points, making the process less likely to be chosen.
    points += adj;-------------------------------------------------------------add the normalized adj to points to form the final score.

    /*
     * Never return 0 for an eligible task regardless of the root bonus and
     * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
     */
    return points > 0 ? points : 1;
}
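The arithmetic above can be replayed in user space. A sketch of the 4.9 formula, all quantities in pages (the helper name and inputs are mine for illustration, not a kernel API):

```python
OOM_SCORE_ADJ_MIN = -1000

def badness(rss, swapents, nr_ptes, nr_pmds, oom_score_adj, totalpages,
            has_cap_sys_admin=False):
    """Replay oom_badness(): memory footprint plus normalized oom_score_adj."""
    if oom_score_adj == OOM_SCORE_ADJ_MIN:     # opted out of OOM killing
        return 0
    points = rss + swapents + nr_ptes + nr_pmds
    if has_cap_sys_admin:                      # 3% discount for privileged tasks
        points -= points * 3 // 100
    points += oom_score_adj * (totalpages // 1000)
    return max(points, 1)                      # eligible tasks never score 0

# 100000 resident pages, default oom_score_adj, on a 1M-page machine
print(badness(100000, 0, 100, 4, 0, 1 << 20))  # 100104
```

Note how a sufficiently negative oom_score_adj can pull even a large footprint down to the floor score of 1, which is exactly how services protect themselves from the OOM killer.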

oom_kill_process() kills the highest-scoring process, possibly sacrificing one of its children in its place.

static void oom_kill_process(struct oom_control *oc, const char *message)
{
    struct task_struct *p = oc->chosen;
    unsigned int points = oc->chosen_points;
    struct task_struct *victim = p;
    struct task_struct *child;
    struct task_struct *t;
    struct mm_struct *mm;
    unsigned int victim_points = 0;
    static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
                          DEFAULT_RATELIMIT_BURST);
    bool can_oom_reap = true;

    task_lock(p);
    if (task_will_free_mem(p)) {------------------------------------------if the task is already exiting (and not coredumping), mark it TIF_MEMDIE, wake the reaper thread to harvest its memory, and return.
        mark_oom_victim(p);
        wake_oom_reaper(p);
        task_unlock(p);
        put_task_struct(p);
        return;
    }
    task_unlock(p);

    if (__ratelimit(&oom_rs))
        dump_header(oc, p);----------------------------------------------before killing, dump the stack, system memory info, and the memory usage of every process.

    pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
        message, task_pid_nr(p), p->comm, points);-----------------------print the name, pid and score of the process about to be killed.

    /*
     * If any of p's children has a different mm and is eligible for kill,
     * the one with the highest oom_badness() score is sacrificed for its
     * parent.  This attempts to lose the minimal amount of work done while
     * still freeing memory.
     */
    read_lock(&tasklist_lock);
    for_each_thread(p, t) {-----------------------------------------------iterate over the threads of the process
        list_for_each_entry(child, &t->children, sibling) {
            unsigned int child_points;

            if (process_shares_mm(child, p->mm))
                continue;
            /*
             * oom_badness() returns 0 if the thread is unkillable
             */
            child_points = oom_badness(child,
                oc->memcg, oc->nodemask, oc->totalpages);-----------------score each child
            if (child_points > victim_points) {---------------------------the highest-scoring child becomes victim, with score victim_points.
                put_task_struct(victim);
                victim = child;
                victim_points = child_points;
                get_task_struct(victim);
            }
        }
    }
    read_unlock(&tasklist_lock);

    p = find_lock_task_mm(victim);
    if (!p) {
        put_task_struct(victim);
        return;
    } else if (victim != p) {
        get_task_struct(p);
        put_task_struct(victim);
        victim = p;
    }

    /* Get a reference to safely compare mm after task_unlock(victim) */
    mm = victim->mm;
    atomic_inc(&mm->mm_count);
    /*
     * We should send SIGKILL before setting TIF_MEMDIE in order to prevent
     * the OOM victim from depleting the memory reserves from the user
     * space under its control.
     */
    do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);--------------send SIGKILL to the victim.
    mark_oom_victim(victim);-----------------------------------------------mark TIF_MEMDIE: killed by OOM.
    pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
        task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
        K(get_mm_counter(victim->mm, MM_ANONPAGES)),
        K(get_mm_counter(victim->mm, MM_FILEPAGES)),
        K(get_mm_counter(victim->mm, MM_SHMEMPAGES)));---------------------memory info of the killed process.
    task_unlock(victim);

    /*
     * Kill all user processes sharing victim->mm in other thread groups, if
     * any.  They don't get access to memory reserves, though, to avoid
     * depletion of all memory.  This prevents mm->mmap_sem livelock when an
     * oom killed thread cannot exit because it requires the semaphore and
     * its contended by another thread trying to allocate memory itself.
     * That thread will now get access to memory reserves since it has a
     * pending fatal signal.
     */
    rcu_read_lock();
    for_each_process(p) {--------------------------------------------------also handle other processes sharing the victim's mm
        if (!process_shares_mm(p, mm))
            continue;
        if (same_thread_group(p, victim))
            continue;
        if (is_global_init(p)) {
            can_oom_reap = false;
            set_bit(MMF_OOM_SKIP, &mm->flags);
            pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n",
                    task_pid_nr(victim), victim->comm,
                    task_pid_nr(p), p->comm);
            continue;
        }
        /*
         * No use_mm() user needs to read from the userspace so we are
         * ok to reap it.
         */
        if (unlikely(p->flags & PF_KTHREAD))-----------------------------skip kernel threads.
            continue;
        do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
    }
    rcu_read_unlock();

    if (can_oom_reap)
        wake_oom_reaper(victim);

    mmdrop(mm);---------------------------------------------------------drop the reference on mm taken above; the mm_struct is freed when the last reference goes.
    put_task_struct(victim);--------------------------------------------drop the reference on the victim's task_struct.
}

dump_header() records the OOM scene and helps find the cause of the OOM.

static void dump_header(struct oom_control *oc, struct task_struct *p)
{
    nodemask_t *nm = (oc->nodemask) ? oc->nodemask : &cpuset_current_mems_allowed;

    pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
        current->comm, oc->gfp_mask, &oc->gfp_mask,
        nodemask_pr_args(nm), oc->order,
        current->signal->oom_score_adj);
    if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
        pr_warn("COMPACTION is disabled!!!\n");

    cpuset_print_current_mems_allowed();
    dump_stack();-------------------------------------------------------dump the stack at the OOM site.
    if (oc->memcg)
        mem_cgroup_print_oom_info(oc->memcg, p);
    else
        show_mem(SHOW_MEM_FILTER_NODES);--------------------------------dump system-wide memory usage.
    if (sysctl_oom_dump_tasks)
        dump_tasks(oc->memcg, oc->nodemask);----------------------------dump the memory usage of every process in the system.
}

static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
{
    struct task_struct *p;
    struct task_struct *task;

    pr_info("[ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name\n");
    rcu_read_lock();
    for_each_process(p) {
        if (oom_unkillable_task(p, memcg, nodemask))-------------------processes that cannot be killed are not shown.
            continue;

        task = find_lock_task_mm(p);-----------------------------------kernel threads have no mm of their own and cannot be killed, so they are not shown either.
        if (!task) {
            /*
             * This is a kthread or all of p's threads have already
             * detached their mm's.  There's no need to report
             * them; they can't be oom killed anyway.
             */
            continue;
        }

        pr_info("[%5d] %5d %5d %8lu %8lu %7ld %7ld %8lu         %5hd %s\n",
            task->pid, from_kuid(&init_user_ns, task_uid(task)),
            task->tgid, task->mm->total_vm, get_mm_rss(task->mm),
            atomic_long_read(&task->mm->nr_ptes),
            mm_nr_pmds(task->mm),
            get_mm_counter(task->mm, MM_SWAPENTS),
            task->signal->oom_score_adj, task->comm);-----------------total_vm and rss are both in pages.
        task_unlock(task);
    }
    rcu_read_unlock();
}
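Since total_vm and rss here are in pages, converting to kB is just a multiply by the page size. Assuming the usual 4 KiB pages:

```python
PAGE_SIZE_KB = 4  # assuming 4 KiB pages

def pages_to_kb(pages: int) -> int:
    """Convert a page count (as printed by dump_tasks) to kilobytes."""
    return pages * PAGE_SIZE_KB

# the 109778-page total_vm of process 161 in the example below
print(pages_to_kb(109778))  # 439112, matching "total-vm:439112kB" in its kill line
```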

 

 

5. An OOM example

 

[19174.926798] copy invoked oom-killer: gfp_mask=0x24200c8(GFP_USER|__GFP_MOVABLE), nodemask=0, order=0, oom_score_adj=0--------see dump_header(): the thread that triggered OOM, with its allocation mask and OOM parameters.
[19174.937586] CPU: 0 PID: 163 Comm: copy Not tainted 4.9.56 #1-----------------------------------------------------------------see dump_stack(): the call trace at the OOM site; here it shows that a CMA allocation triggered the OOM.
[19174.943274] 
Call Trace:
[<802f63c2>] dump_stack+0x1e/0x3c
[<80132224>] dump_header.isra.6+0x84/0x1a0
[<800f2d68>] oom_kill_process+0x23c/0x49c
[<800f32fc>] out_of_memory+0xb0/0x3a0
[<800f7834>] __alloc_pages_nodemask+0xa84/0xb5c
[<801306b8>] alloc_migrate_target+0x34/0x6c
[<8012f30c>] migrate_pages+0x108/0xbe4
[<800f8a0c>] alloc_contig_range+0x188/0x378
[<80130c54>] cma_alloc+0x100/0x220
[<80388fe2>] dma_alloc_from_contiguous+0x2e/0x48
[<8037bb30>] xxxxx_dma_alloc_coherent+0x48/0xdc
[<8037be8c>] mem_zone_ioctl+0xf0/0x198
[<80148cec>] do_vfs_ioctl+0x84/0x70c
[<80149408>] SyS_ioctl+0x94/0xb8
[<8004a246>] csky_systemcall+0x96/0xe0
[19175.001223] Mem-Info:--------------------------------------------------------------------------------------------------------see show_mem(): detailed system memory usage. Note that free=592 is very low while active_anon and shmem are very large.
[19175.003535] active_anon:99682 inactive_anon:12 isolated_anon:1
[19175.003535]  active_file:55 inactive_file:75 isolated_file:0
[19175.003535]  unevictable:0 dirty:0 writeback:0 unstable:0
[19175.003535]  slab_reclaimable:886 slab_unreclaimable:652
[19175.003535]  mapped:2 shmem:91862 pagetables:118 bounce:0
[19175.003535]  free:592 free_pcp:61 free_cma:0
[19175.035394] Node 0 active_anon:398728kB inactive_anon:48kB active_file:220kB inactive_file:300kB unevictable:0kB isolated(anon):4kB isolated(file):0kB mapped:8kB dirty:0kB writeback:0kB shmem:367448kB writeback_tmp:0kB unstable:0kB pages_scanned:2515 all_unreclaimable? yes
[19175.059602] Normal free:2368kB min:2444kB low:3052kB high:3660kB active_anon:398728kB inactive_anon:48kB active_file:220kB inactive_file:300kB unevictable:0kB writepending:0kB present:1048572kB managed:734584kB mlocked:0kB slab_reclaimable:3544kB slab_unreclaimable:2608kB kernel_stack:624kB pagetables:472kB bounce:0kB free_pcp:244kB local_pcp:244kB free_cma:0kB
[19175.091602] lowmem_reserve[]: 0 0 0
[19175.095144] Normal: 21*4kB (MHI) 14*8kB (MHI) 13*16kB (HI) 2*32kB (HI) 4*64kB (MI) 2*128kB (MH) 0*256kB 2*512kB (HI) 1*1024kB (H) 1*2048kB (I) 0*4096kB = 5076kB
91996 total pagecache pages
[19175.112370] 262143 pages RAM
[19175.115254] 0 pages HighMem/MovableOnly
[19175.119106] 78497 pages reserved
[19175.122350] 90112 pages cma reserved
[19175.125942] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name-------------------------------see dump_tasks(): memory usage of every killable process.
[19175.134514] [  135]     0   135     1042       75       4       0        0         -1000 sshd
[19175.143070] [  146]     0   146      597      141       3       0        0             0 autologin
[19175.152057] [  147]     0   147      608      152       4       0        0             0 sh
[19175.160434] [  161]     0   161   109778     7328     104       0        0             0 xxxxx
[19175.169068] Out of memory: Kill process 161 (xxxxx) score 39 or sacrifice child
[19175.176439] Killed process 161 (xxxxx) total-vm:439112kB, anon-rss:29304kB, file-rss:8kB, shmem-rss:0kB
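When triaging, the final "Killed process" line already summarizes the victim's footprint. A quick parsing sketch (the regular expression is mine, written against the log format shown above):

```python
import re

line = ("Killed process 161 (xxxxx) total-vm:439112kB, anon-rss:29304kB, "
        "file-rss:8kB, shmem-rss:0kB")

m = re.search(r"Killed process (\d+) \((\S+)\) total-vm:(\d+)kB, "
              r"anon-rss:(\d+)kB, file-rss:(\d+)kB, shmem-rss:(\d+)kB", line)
pid, comm = int(m.group(1)), m.group(2)
total_vm_kb, anon_rss_kb = int(m.group(3)), int(m.group(4))
print(pid, comm, total_vm_kb, anon_rss_kb)  # 161 xxxxx 439112 29304
```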

 

From the information above we can tell which process was killed, what the OOM scene looked like, and which kinds of memory were consumed.

The key question here is why the system's active_anon and shmem grew so large that they caused the OOM.

 

Related reading: "Introduction to the Linux OOM Mechanism", "A Detailed Analysis of the Linux Kernel OOM Mechanism", "Analysis of the Linux Kernel OOM Mechanism".

 

Reposted from: https://www.cnblogs.com/arnoldlu/p/8567559.html
