HugePages on Oracle Linux 64-bit (Doc ID 361468.1): How to Configure HugePages

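This article looks in depth at configuring HugePages on Linux and its effect on Oracle Database performance. It covers the configuration steps, the underlying principles, caveats, and how to verify the configuration afterwards, with the aim of helping system administrators and database administrators manage memory more efficiently.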

Purpose

This document aims to provide:

  • Basic configuration of HugePages on 64-bit Linux
  • Fundamental reasons to use HugePages on Linux
  • References to known problems
  • References to technical background on HugePages

Scope

Information in this document is useful for Linux system administrators and Oracle database administrators working with system administrators.

This document covers Linux HugePages on 64-bit architectures. For more general information, usage on 32-bit systems, and further references, please see Document 361323.1.

The configuration steps provided here are primarily for Oracle Linux. The same concepts and configuration should still apply to other Linux distributions.

Details

Introduction

HugePages is a feature of the Linux kernel that allows memory to be managed in larger pages as an alternative to the small 4 KB page size. For a detailed introduction, see Document 361323.1.

Why Do You Need HugePages?

HugePages is crucial for Oracle Database performance on Linux if you have a large amount of RAM and a large SGA. If the combined size of your database SGAs is large (for example more than 8 GB, although HugePages can matter for smaller sizes too), you will need HugePages configured. Note that it is the size of the SGA that matters. The advantages of HugePages are:

 

  • Larger Page Size and Fewer Pages: The default page size is 4 KB whereas the HugeTLB page size is 2048 KB. That means the system needs to handle 512 times fewer pages.
  • Reduced Page Table Walking: Since a HugePage covers a greater contiguous virtual address range than a regular-sized page, the probability of a TLB hit per TLB entry is higher with HugePages than with regular pages. This reduces the number of times page tables are walked to obtain a physical address from a virtual address.
  • Less Overhead for Memory Operations: On virtual memory systems (any modern OS) each memory operation is actually two abstract memory operations. With HugePages, since there are fewer pages to work on, the possible bottleneck on page table access is avoided.
  • Less Memory Usage: From the Oracle Database perspective, with HugePages the Linux kernel uses less memory for the page tables that maintain virtual-to-physical mappings for the SGA address range, compared to regular-sized pages. This leaves more memory available for process-private computations or PGA usage.
  • No Swapping: Swapping must be avoided on Linux altogether (see Document 1295478.1). HugePages are not swappable (whereas regular pages are), so there is no page replacement overhead. HugePages are universally regarded as pinned.
  • No 'kswapd' Operations: kswapd gets very busy if there is a very large area to be paged (e.g. 13 million page table entries for 50 GB of memory) and uses a large amount of CPU. When HugePages are used, kswapd is not involved in managing them. See also Document 361670.1.

There is a general misconception that HugePages is a feature specific to 32-bit Linux. HugePages is a generic feature available on all word sizes and architectures; there are only some specifics on 32-bit platforms. Please see Document 361323.1 for further references.
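
The page-count arithmetic behind the first and last bullets can be checked with a short shell sketch. The helper name `pages_needed` is hypothetical; the inputs are simply the figures quoted above (4 KB and 2048 KB pages, 50 GB of memory).

```shell
#!/bin/bash
# pages_needed BYTES PAGE_KB
# Number of pages of PAGE_KB kilobytes needed to map BYTES bytes,
# rounded up. (Hypothetical helper, for illustration only.)
pages_needed() {
    local bytes page_bytes
    bytes=$1
    page_bytes=$(( $2 * 1024 ))
    echo $(( (bytes + page_bytes - 1) / page_bytes ))
}

mem_bytes=$(( 50 * 1024 * 1024 * 1024 ))   # 50 GB, as in the kswapd bullet

echo "4 KB pages:    $(pages_needed "$mem_bytes" 4)"
echo "2048 KB pages: $(pages_needed "$mem_bytes" 2048)"
```

For 50 GB this gives 13107200 regular pages against 25600 HugePages, which is the factor of 512 mentioned above.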

 

How to Configure

The configuration steps below guide you through a persistent system configuration, which requires a complete reboot of the system. Please plan your operations accordingly:

Step 1: Set the memlock user limit in the /etc/security/limits.conf file. Set the value (in KB) slightly smaller than the installed RAM. For example, if you have 64 GB (67108864 KB) of RAM installed, you may set:

*   soft   memlock    60397977
*   hard   memlock    60397977

There is no harm in setting this value larger than your SGA requirements.
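
As a rough sketch of how the example value can be derived: 60397977 is 90% of 67108864 KB, rounded down. The 90% factor is an assumption inferred from that single example (the text only says "slightly smaller than installed RAM"), and the helper name `memlock_kb` is hypothetical.

```shell
#!/bin/bash
# memlock_kb RAM_KB [PERCENT]
# memlock limit in KB as a percentage of RAM. The 90% default is an
# assumption inferred from the 64 GB example, not a documented rule.
memlock_kb() {
    local ram_kb percent
    ram_kb=$1
    percent=${2:-90}
    echo $(( ram_kb * percent / 100 ))
}

installed_kb=$(( 64 * 1024 * 1024 ))    # 64 GB expressed in KB = 67108864
limit=$(memlock_kb "$installed_kb")

# Print lines in the /etc/security/limits.conf format used in Step 1
printf '*   soft   memlock    %s\n' "$limit"
printf '*   hard   memlock    %s\n' "$limit"
```

Note that MemTotal in /proc/meminfo is usually somewhat lower than the installed RAM, so feeding it to this helper would give a slightly smaller limit.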

The parameters will be set by default on:

  • Oracle Linux with oracle-validated package (See Document 437743.1) installed.
  • Oracle Exadata DB compute nodes



Step 2: Log on again to the Oracle product owner account (e.g. 'oracle') and check the memlock limit:

$ ulimit -l
60397977
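
A quick sanity check is to compare this limit with the size of the HugePages pool you plan to configure. The sketch below assumes the limit should be at least as large as the memory to be backed by HugePages; the function name `memlock_covers` is hypothetical, and the numbers are the ones used elsewhere in this document (1496 pages of 2048 KB).

```shell
#!/bin/bash
# memlock_covers LIMIT_KB NR_HUGEPAGES HUGEPAGE_KB
# Succeeds (exit status 0) when the memlock limit is at least as large
# as the HugePages pool. LIMIT_KB may be the literal string "unlimited".
memlock_covers() {
    local limit_kb pool_kb
    limit_kb=$1
    pool_kb=$(( $2 * $3 ))
    [ "$limit_kb" = "unlimited" ] && return 0
    [ "$limit_kb" -ge "$pool_kb" ]
}

# Values taken from this document; on a live system the first argument
# could be "$(ulimit -l)" instead.
if memlock_covers 60397977 1496 2048; then
    echo "memlock limit covers the HugePages pool"
else
    echo "memlock limit is too small for the HugePages pool"
fi
```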

Step 3: If you have Oracle Database 11g or later, the default database created uses the Automatic Memory Management (AMM) feature which is incompatible with HugePages. Disable AMM before proceeding. To disable, set the initialization parameters MEMORY_TARGET and MEMORY_MAX_TARGET to 0 (zero). Please see Document 749851.1 for further information in case you encounter the error below:

ORA-00845: MEMORY_TARGET not supported on this system

Step 4: Make sure that all your database instances (including ASM instances) are up, as they would be in production. Use the script hugepages_settings.sh in Document 401749.1 to calculate the recommended value for the vm.nr_hugepages kernel parameter, e.g.:

$ ./hugepages_settings.sh
...
Recommended setting: vm.nr_hugepages = 1496
$

Note: You can also calculate a proper value for the parameter yourself, but that is not advised unless you have extensive experience with HugePages and the related concepts.

Step 5: Edit the file /etc/sysctl.conf and set the vm.nr_hugepages parameter there:

...
vm.nr_hugepages = 1496
...

This ensures that the parameter is set properly at every reboot.
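
To confirm what a sysctl-style file actually sets, a small parser can help. This is a minimal sketch: the function name `nr_hugepages_in` is hypothetical, and it is demonstrated on a temporary file rather than the real /etc/sysctl.conf.

```shell
#!/bin/bash
# nr_hugepages_in FILE
# Print the last vm.nr_hugepages value set in a sysctl-style file,
# ignoring comment lines. Prints nothing if the key is not set.
nr_hugepages_in() {
    awk -F= '
        /^[[:space:]]*[#;]/ { next }                  # skip comment lines
        { key = $1; gsub(/[[:space:]]/, "", key) }    # normalise the key
        key == "vm.nr_hugepages" { val = $2; gsub(/[[:space:]]/, "", val) }
        END { if (val != "") print val }
    ' "$1"
}

# Demonstrate on a temporary file with the value from Step 5
tmp=$(mktemp)
printf '# example\nvm.nr_hugepages = 1496\n' > "$tmp"
nr_hugepages_in "$tmp"
rm -f "$tmp"
```

On a live system, `sysctl -n vm.nr_hugepages` reports the value currently in effect, which can be compared with the value in the file after the reboot in Step 6.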

Step 6: Stop all the database instances and reboot the server.

(Although the settings could have been applied dynamically, they would not be effective to the extent we require until a reboot. The best practice is to do a persistent system configuration and perform a reboot to complete the configuration, as presented in the steps above.)

What If the Database / SGA Configurations Change?

The configuration performed is based on the RAM installed and the combined SGA size of the database instances you are running. Therefore, when any of the following happens:

  • The amount of RAM installed for the Linux OS changes
  • New database instance(s) are introduced
  • The SGA size / configuration changes for one or more database instances

you should revise your HugePages configuration to suit the new memory layout. If you do not, you may experience one or more of the following problems on the system:

  • Poor database performance
  • System running out of memory or excessive swapping
  • Database instances cannot be started
  • Crucial system services failing

Check and Validate the Configuration

After the system is rebooted, make sure that your database instances (including the ASM instances) are started. Automatic startup via OS configuration or CRS, or manual startup (whichever method you use) should have been performed. Check the HugePages state from /proc/meminfo. e.g.:

 

# grep HugePages /proc/meminfo
HugePages_Total:    1496
HugePages_Free:      485
HugePages_Rsvd:      446
HugePages_Surp:        0

The values in the output will vary. To make sure that the configuration is valid, the HugePages_Free value should be smaller than HugePages_Total and there should be some HugePages_Rsvd. HugePages_Rsvd counts free pages that are reserved for use (requested for an SGA, but not touched/mapped yet).

The sum of HugePages_Free and HugePages_Rsvd may be smaller than your total combined SGA as instances allocate pages dynamically and proactively as needed.
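
The validity rule stated in this section can be turned into a small check. The sketch below only encodes that rule of thumb (HugePages_Free smaller than HugePages_Total, and a non-zero HugePages_Rsvd); the function name `check_hugepages` is hypothetical, and it reads /proc/meminfo-style text on standard input so it can be tried on sample data.

```shell
#!/bin/bash
# check_hugepages
# Read /proc/meminfo-style text on stdin and apply the rule of thumb:
# HugePages_Free < HugePages_Total and HugePages_Rsvd > 0.
# Exit status 0 means the check passed.
check_hugepages() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "HugePages_Free:"  { free  = $2 }
        $1 == "HugePages_Rsvd:"  { rsvd  = $2 }
        END {
            ok = (total > 0 && free < total && rsvd > 0)
            printf "total=%d free=%d rsvd=%d in_use=%d\n", total, free, rsvd, total - free
            exit (ok ? 0 : 1)
        }
    '
}

# Sample values from the /proc/meminfo output in this section
check_hugepages <<'EOF' && echo "HugePages appear to be in use"
HugePages_Total:    1496
HugePages_Free:      485
HugePages_Rsvd:      446
HugePages_Surp:        0
EOF
```

On a running server the same function can be fed the live data with `check_hugepages < /proc/meminfo`.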

 

 

Script

 

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com

# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please note the following:
 * ASM instances must be configured with ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * If you change the DB SGA size,
   the new SGA may not fit in the previous HugePages configuration,
   so it is better to disable HugePages entirely,
   start the DB with the new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m


Press Enter to proceed..."

read

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! The shared memory segments currently allocated are too small in total for a
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
case $KERN in
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    # 2.6 and all later kernel series (3.x, 4.x, ...) use vm.nr_hugepages
    2.6|[3-9].*|[1-9][0-9].*) echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
     *) echo "Unrecognized kernel version $KERN. Exiting."
        exit 1 ;;
esac

# End
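
The core arithmetic of the script can be isolated for experimentation without running `ipcs`. The sketch below mirrors the loop above: every segment at least one HugePage in size contributes floor(bytes / page size) + 1 pages, and smaller segments are ignored. The function name `recommend_hugepages` and the segment sizes are hypothetical.

```shell
#!/bin/bash
# recommend_hugepages HPG_KB SEG_BYTES...
# Same arithmetic as the loop in hugepages_settings.sh: each segment of
# at least one HugePage contributes floor(bytes / page_bytes) + 1 pages.
recommend_hugepages() {
    local page_bytes num_pg min_pg seg_bytes
    page_bytes=$(( $1 * 1024 ))
    shift
    num_pg=0
    for seg_bytes in "$@"; do
        min_pg=$(( seg_bytes / page_bytes ))
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# Hypothetical segment sizes: 3 GB, 4 MB, and 64 KB (the last is ignored)
recommend_hugepages 2048 3221225472 4194304 65536
```

With 2048 KB pages these three segments yield 1537 + 3 + 0 = 1540 pages. The extra page per segment is a safety margin, so the result is slightly larger than the exact size of the segments.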

 

 

 
