How to Configure the Linux Out-of-Memory Killer [translated]

This article covers the basic configuration options of the Linux OOM killer and how they affect different processes. By adjusting these settings you can change how the system behaves when it runs out of memory, for example rebooting the machine or changing the kill priority of specific processes. It also explains how to disable the OOM killer entirely and the risks that brings.

Configuring the OOM Killer

The OOM killer on Linux has several configuration options that allow developers some choice as to the behavior the system will exhibit when it is faced with an out-of-memory condition. These settings and choices vary depending on the environment and applications that the system has configured on it.


Note: It's suggested that testing and tuning be performed in a development environment before making changes on important production systems.


In some environments, when a system runs a single critical task, rebooting when it runs into an OOM condition might be a viable option to return the system to operational status quickly without administrator intervention. While not an optimal approach, the logic behind this is that if our application is unable to operate because it was killed by the OOM killer, then a reboot of the system will restore the application, provided it starts with the system at boot time. If the application is manually started by an administrator, this option is not beneficial.

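For the panic-and-reboot strategy to help, the critical application must come up with the system at boot. As a hedged illustration on a systemd-based distribution (the unit name myapp.service is hypothetical; substitute your application's service):

```shell
# Hypothetical unit name; substitute your own application's service.
# Enabling the unit makes it start with the system at boot time, so a
# panic-triggered reboot restores the application without an administrator.
systemctl enable myapp.service
```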

The following settings will cause the system to panic and reboot in an out-of-memory condition. The sysctl commands will set this in real time, and appending the settings to sysctl.conf will allow these settings to survive reboots. The X for kernel.panic is the number of seconds before the system should be rebooted. This setting should be adjusted to meet the needs of your environment.


sysctl vm.panic_on_oom=1
sysctl kernel.panic=X
echo "vm.panic_on_oom=1" >> /etc/sysctl.conf
echo "kernel.panic=X" >> /etc/sysctl.conf
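Both settings can be inspected without root through their /proc/sys entries; a quick sanity check after applying them (assumes a Linux /proc filesystem):

```shell
# Read back the panic-on-OOM settings (world-readable on Linux):
cat /proc/sys/vm/panic_on_oom   # 1 means panic when the system hits OOM
cat /proc/sys/kernel/panic      # seconds until reboot after a panic (0 = stay halted)
```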

We can also tune the way that the OOM killer handles OOM conditions with certain processes. Take, for example, our oracle process 2592 that was killed earlier. If we want to make our oracle process less likely to be killed by the OOM killer, we can do the following.


echo -15 > /proc/2592/oom_adj
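The effect of such a change is visible immediately in oom_score. A minimal sketch, assuming a Linux /proc filesystem and using the current shell ($$) in place of the example PID 2592 (raising the score needs no privileges, while lowering it to a negative value requires root):

```shell
cat /proc/$$/oom_score       # baseline badness score for this shell
echo 10 > /proc/$$/oom_adj   # make this process a more attractive OOM victim
cat /proc/$$/oom_score       # the score rises accordingly
```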

We can make the OOM killer more likely to kill our oracle process by doing the following.


echo 10 > /proc/2592/oom_adj

If we want to exclude our oracle process from the OOM killer, we can do the following, which will exclude it completely from the OOM killer. It is important to note that this might cause unexpected behavior depending on the resources and configuration of the system. If the kernel is unable to kill a process using a large amount of memory, it will move on to other available processes. Some of those processes might be important operating system processes, and killing them could ultimately cause the system to go down.


echo -17 > /proc/2592/oom_adj

Valid settings for oom_adj range from -16 to +15, and a setting of -17 exempts a process entirely from the OOM killer. The higher the number, the more likely our process is to be selected for termination if the system encounters an OOM condition. The contents of /proc/2592/oom_score can also be viewed to determine how likely a process is to be killed by the OOM killer. A score of 0 indicates that our process is exempt from the OOM killer. The higher the OOM score, the more likely a process is to be killed in an OOM condition.

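Note that oom_adj is the legacy interface: kernels since 2.6.36 deprecate it in favor of /proc/&lt;pid&gt;/oom_score_adj, which accepts values from -1000 to +1000, where -1000 exempts the process entirely (the equivalent of -17 in oom_adj). A hedged sketch, again using the current shell in place of PID 2592:

```shell
echo 500 > /proc/$$/oom_score_adj   # bias this shell toward being killed
cat /proc/$$/oom_score_adj          # reads back 500
# Exempting a process (echo -1000) requires root, just like oom_adj -17.
```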

The OOM killer can be completely disabled with the following command. This is not recommended for production environments, because if an out-of-memory condition does present itself, there could be unexpected behavior depending on the available system resources and configuration. This unexpected behavior could be anything from a kernel panic to a hang, depending on the resources available to the kernel at the time of the OOM condition.


sysctl vm.overcommit_memory=2
echo "vm.overcommit_memory=2" >> /etc/sysctl.conf
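With vm.overcommit_memory=2 the kernel enforces strict commit accounting: total committed memory may not exceed swap plus vm.overcommit_ratio percent of physical RAM (50 by default). The effective ceiling and the current commitment can be inspected without root:

```shell
cat /proc/sys/vm/overcommit_ratio                 # percentage of RAM counted toward the limit
grep -E 'CommitLimit|Committed_AS' /proc/meminfo  # ceiling vs. currently committed memory
```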

For some environments, these configuration options are not optimal and further tuning and adjustments might be needed. Configuring HugePages for your kernel can assist with OOM issues depending on the needs of the applications running on the system.

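As a hedged illustration (the page count 128 is arbitrary; size the reservation to your application), huge pages are reserved through the vm.nr_hugepages sysctl and the result shows up in /proc/meminfo:

```shell
grep -i hugepages /proc/meminfo   # current reservation and usage
# To reserve 128 huge pages at runtime (requires root; persist via sysctl.conf):
#   sysctl vm.nr_hugepages=128
```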

See Also

Here are some additional resources about HugePages, dtrace, sar, and OOM for NUMA architectures:

HugePages information on My Oracle Support (requires login): HugePages on Linux: What It Is... and What It Is Not... [ID 361323.1]
dtrace
sar
NUMA architecture and OOM

About the Author

Robert Chase is a member of the Oracle Linux product management team. He has been involved with Linux and open source software since 1996. He has worked with systems as small as embedded devices and with large supercomputer-class hardware.

Revision 1.0, 02/19/2013
