How to Configure the Linux Out-of-Memory Killer [translated]

This article covers the basic configuration options of the Linux OOM killer and how they affect different processes. By adjusting these settings you can change how the system behaves when it runs out of memory, for example rebooting the machine or changing the kill priority of specific processes. It also explains how to disable the OOM killer entirely and the risks that brings.

Configuring the OOM Killer

The OOM killer on Linux has several configuration options that allow developers some choice as to the behavior the system will exhibit when it is faced with an out-of-memory condition. These settings and choices vary depending on the environment and applications that the system has configured on it.


Note: It's suggested that testing and tuning be performed in a development environment before making changes on important production systems.


In some environments, when a system runs a single critical task, rebooting when it runs into an OOM condition might be a viable option to return the system to operational status quickly without administrator intervention. While not an optimal approach, the logic behind this is that if our application is unable to operate because it was killed by the OOM killer, then a reboot of the system will restore the application, provided it starts with the system at boot time. If the application is manually started by an administrator, this option is not beneficial.

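For the panic-and-reboot strategy to help, the critical application must come up with the system at boot. As a hedged illustration on a systemd-based distribution (the unit name myapp.service is hypothetical; substitute your application's service):

```shell
# Hypothetical unit name; substitute your own application's service.
# Enabling the unit makes it start with the system at boot time, so a
# panic-triggered reboot restores the application without an administrator.
systemctl enable myapp.service
```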

The following settings will cause the system to panic and reboot in an out-of-memory condition. The sysctl commands will set this in real time, and appending the settings to sysctl.conf will allow these settings to survive reboots. The X for kernel.panic is the number of seconds before the system should be rebooted. This setting should be adjusted to meet the needs of your environment.


sysctl vm.panic_on_oom=1
sysctl kernel.panic=X
echo "vm.panic_on_oom=1" >> /etc/sysctl.conf
echo "kernel.panic=X" >> /etc/sysctl.conf
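Both settings can be inspected without root through their /proc/sys entries; a quick sanity check after applying them (assumes a Linux /proc filesystem):

```shell
# Read back the panic-on-OOM settings (world-readable on Linux):
cat /proc/sys/vm/panic_on_oom   # 1 means panic when the system hits OOM
cat /proc/sys/kernel/panic      # seconds until reboot after a panic (0 = stay halted)
```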

We can also tune the way that the OOM killer handles OOM conditions with certain processes. Take, for example, our oracle process 2592 that was killed earlier. If we want to make our oracle process less likely to be killed by the OOM killer, we can do the following.


echo -15 > /proc/2592/oom_adj
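The effect of such a change is visible immediately in oom_score. A minimal sketch, assuming a Linux /proc filesystem and using the current shell ($$) in place of the example PID 2592 (raising the score needs no privileges, while lowering it to a negative value requires root):

```shell
cat /proc/$$/oom_score       # baseline badness score for this shell
echo 10 > /proc/$$/oom_adj   # make this process a more attractive OOM victim
cat /proc/$$/oom_score       # the score rises accordingly
```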

We can make the OOM killer more likely to kill our oracle process by doing the following.


echo 10 > /proc/2592/oom_adj

If we want to exclude our oracle process from the OOM killer, we can do the following, which will exclude it completely from the OOM killer. It is important to note that this might cause unexpected behavior depending on the resources and configuration of the system. If the kernel is unable to kill a process using a large amount of memory, it will move on to other available processes. Some of those processes might be important operating system processes, and killing them could ultimately cause the system to go down.


echo -17 > /proc/2592/oom_adj

Valid settings for oom_adj range from -16 to +15, and a setting of -17 exempts a process entirely from the OOM killer. The higher the number, the more likely our process is to be selected for termination if the system encounters an OOM condition. The contents of /proc/2592/oom_score can also be viewed to determine how likely a process is to be killed by the OOM killer. A score of 0 indicates that our process is exempt from the OOM killer. The higher the OOM score, the more likely a process is to be killed in an OOM condition.

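Note that oom_adj is the legacy interface: kernels since 2.6.36 deprecate it in favor of /proc/&lt;pid&gt;/oom_score_adj, which accepts values from -1000 to +1000, where -1000 exempts the process entirely (the equivalent of -17 in oom_adj). A hedged sketch, again using the current shell in place of PID 2592:

```shell
echo 500 > /proc/$$/oom_score_adj   # bias this shell toward being killed
cat /proc/$$/oom_score_adj          # reads back 500
# Exempting a process (echo -1000) requires root, just like oom_adj -17.
```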

The OOM killer can be completely disabled with the following command. This is not recommended for production environments, because if an out-of-memory condition does present itself, there could be unexpected behavior depending on the available system resources and configuration. This unexpected behavior could be anything from a kernel panic to a hang, depending on the resources available to the kernel at the time of the OOM condition.


sysctl vm.overcommit_memory=2
echo "vm.overcommit_memory=2" >> /etc/sysctl.conf
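With vm.overcommit_memory=2 the kernel enforces strict commit accounting: total committed memory may not exceed swap plus vm.overcommit_ratio percent of physical RAM (50 by default). The effective ceiling and the current commitment can be inspected without root:

```shell
cat /proc/sys/vm/overcommit_ratio                 # percentage of RAM counted toward the limit
grep -E 'CommitLimit|Committed_AS' /proc/meminfo  # ceiling vs. currently committed memory
```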

For some environments, these configuration options are not optimal and further tuning and adjustments might be needed. Configuring HugePages for your kernel can assist with OOM issues depending on the needs of the applications running on the system.

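As a hedged illustration (the page count 128 is arbitrary; size the reservation to your application), huge pages are reserved through the vm.nr_hugepages sysctl and the result shows up in /proc/meminfo:

```shell
grep -i hugepages /proc/meminfo   # current reservation and usage
# To reserve 128 huge pages at runtime (requires root; persist via sysctl.conf):
#   sysctl vm.nr_hugepages=128
```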

See Also

Here are some additional resources about HugePages, dtrace, sar, and OOM for NUMA architectures:

HugePages information on My Oracle Support (requires login): HugePages on Linux: What It Is... and What It Is Not... [ID 361323.1]
dtrace
sar
NUMA architecture and OOM

About the Author

Robert Chase is a member of the Oracle Linux product management team. He has been involved with Linux and open source software since 1996. He has worked with systems as small as embedded devices and with large supercomputer-class hardware.

Revision 1.0, 02/19/2013
