Modifying the drop_caches parameter on Linux triggers ORA-600 [KGHLKREM1]

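On Linux, when hugepages are configured and the drop_caches parameter is modified, the Oracle database instance may crash. This post records one such incident, the cause given in the MOS note, and the available fixes.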


Yesterday I ran the following command on all three nodes of our main site:

echo 3 > /proc/sys/vm/drop_caches

This immediately brought down the instance on one of them, node 2. The relevant alert log entries are:

Mon Jun 25 17:06:51 CST 2012
Errors in file /oracle/admin/yesmynet/bdump/yesmynet2_lmon_10048.trc:
ORA-00600: internal error code, arguments: [KGHLKREM1], [0x4BC000020], [], [], [], [], [], []
Mon Jun 25 17:06:52 CST 2012
Trace dumping is performing id=[cdmp_20120625170652]
Mon Jun 25 17:06:52 CST 2012
Errors in file /oracle/admin/yesmynet/bdump/yesmynet2_lmon_10048.trc:
ORA-00600: internal error code, arguments: [KGHLKREM1], [0x4BC000020], [], [], [], [], [], []
Mon Jun 25 17:06:52 CST 2012
LMON: terminating instance due to error 481
Mon Jun 25 17:06:52 CST 2012
Shutting down instance (abort)
License high water mark = 798
Mon Jun 25 17:06:57 CST 2012
Instance terminated by LMON, pid = 10048
Mon Jun 25 17:06:57 CST 2012
Instance terminated by USER, pid = 29345

As the log shows, at 17:06 the LMON process terminated instance 2. The related MOS (My Oracle Support) note describes the problem as follows:

ORA-600 [KGHLKREM1] On Linux Using Parameter drop_cache On hugepages Configuration [ID 1070812.1]   Modified 20-DEC-2011     Type PROBLEM     Status PUBLISHED

asm1_lmd0_8600.trc
~~~~~~~~~~~~~~~~~~
*** 2010-02-08 15:57:38.274
***** Internal heap ERROR KGHLKREM1 addr=0x6c400020 ds=0x60000058 *****
***** Dump of memory around addr 0x6c400020:
06C3FF020 00000000 00000000 00000000 00000000 [................]
Repeat 511 times





 

Changes

1. Your system is running with vm.drop_caches=1 (or 3), that is, drop_caches has been set to a value greater than zero, or you are executing

echo 3 > /proc/sys/vm/drop_caches


 

/proc/sys/vm/drop_caches (since Linux 2.6.16)
Writing to this file causes the kernel to drop clean caches, dentries and inodes from memory, causing that memory to become free.

To free pagecache:

* echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes:

* echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries and inodes:

* echo 3 > /proc/sys/vm/drop_caches

As this is a non-destructive operation, and dirty objects are not freeable, the user should run "sync" first in order to make sure all cached objects are freed.


2. You have set up hugepages

Cause

This is a Linux Kernel issue.
Using the Linux kernel "drop_caches" parameter while hugepages are configured can cause memory corruption.

Per internal Bug 9461825, executing vm.drop_caches corrupts Oracle Database SGA hugepages;
it is fixed in Linux Kernel version 2.6.18-194.0.0.0.4.EL5


Solution

1.  As a workaround, when hugepages are set, avoid any vm.drop_caches settings.

OR

2.  Upgrade to Linux Kernel version 2.6.18-194.0.0.0.4.EL5
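
The first workaround can be sketched as a small guard that refuses to write to drop_caches whenever hugepages are configured. This is only an illustration, not something taken from the MOS note: the function names hugepages_total, safe_to_drop_caches and guarded_drop_caches are hypothetical, and the check assumes that a non-zero HugePages_Total in /proc/meminfo is a good enough signal that hugepages are in use.

```shell
# Hypothetical helper (not from the MOS note): print HugePages_Total from a
# meminfo-style file, or 0 if the field is absent.
hugepages_total() {
    awk '/^HugePages_Total:/ {print $2; found = 1}
         END {if (!found) print 0}' "${1:-/proc/meminfo}"
}

# Hypothetical guard: succeeds only when no hugepages are configured.
# Optional $1 is a meminfo-style file; the default is /proc/meminfo.
safe_to_drop_caches() {
    [ "$(hugepages_total "${1:-/proc/meminfo}")" -eq 0 ]
}

# Hypothetical wrapper: must run as root; skips the write when hugepages exist.
guarded_drop_caches() {
    if safe_to_drop_caches; then
        sync
        echo 3 > /proc/sys/vm/drop_caches
    else
        echo "hugepages configured; not writing to drop_caches" >&2
        return 1
    fi
}
```

On a node like the one in this incident, guarded_drop_caches would print the warning and return 1 instead of touching drop_caches.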


References

BUG:9358381 - ASM INSTANCE IS CRASHING AS ORA-600[KGHLKREM1] WHEN HUGEPAGES ARE IN USE
https://bugzilla.redhat.com/show_bug.cgi?id=578977

Of the three nodes, only node 2 was using hugepages:

[root@rac2 ~]# grep Huge /proc/meminfo
HugePages_Total:  9885
HugePages_Free:   9836
HugePages_Rsvd:   4868
Hugepagesize:     2048 kB
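
To put these counters in perspective, the following sketch turns them into a pool size and page counts. The helper name hugepage_summary and its output format are hypothetical, and "committed" here simply means allocated pages plus HugePages_Rsvd (pages promised to a process but not yet allocated).

```shell
# Hypothetical helper: summarise hugepage usage from a meminfo-style file.
# Optional $1 is the file to read; the default is /proc/meminfo.
hugepage_summary() {
    awk '
        /^HugePages_Total:/ {total = $2}
        /^HugePages_Free:/  {free = $2}
        /^HugePages_Rsvd:/  {rsvd = $2}
        /^Hugepagesize:/    {size_kb = $2}
        END {
            # pool size in MB, pages already allocated, pages allocated or reserved
            printf "pool_mb=%d allocated_pages=%d committed_pages=%d\n",
                   total * size_kb / 1024, total - free, total - free + rsvd
        }' "${1:-/proc/meminfo}"
}
```

With the values above this gives a pool of 19770 MB (9885 pages of 2048 kB), 49 pages allocated, and 4917 pages allocated or reserved.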
 
The Linux kernel version is:
 
[root@rac2 ~]# uname -a
Linux rac2 2.6.18-128.el5

So on Linux, when hugepages are in use, avoid writing to drop_caches casually; the alternative is to upgrade the Linux kernel to

2.6.18-194.0.0.0.4.EL5
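
A rough way to tell whether a node is still on a kernel older than that release is to compare version strings. The sketch below assumes GNU sort with the -V (version sort) option is available; the names FIXED_KERNEL and kernel_older_than_fix are hypothetical. Version sorting is only a heuristic for vendor kernel strings, so treat a borderline result (for example another 2.6.18-194 build) as something to check by hand.

```shell
# Fixed kernel release named in the MOS note.
FIXED_KERNEL="2.6.18-194.0.0.0.4.EL5"

# Hypothetical check: succeeds when release string $1 sorts strictly before
# FIXED_KERNEL under GNU version sort (heuristic, see the note above).
kernel_older_than_fix() {
    [ "$1" != "$FIXED_KERNEL" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$FIXED_KERNEL" | sort -V | head -n 1)" = "$1" ]
}

# Usage sketch:
# if kernel_older_than_fix "$(uname -r)"; then
#     echo "kernel predates $FIXED_KERNEL; avoid drop_caches with hugepages" >&2
# fi
```

For the kernel on node 2, 2.6.18-128.el5, the check succeeds, meaning the node is still on a release older than the fixed one.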

Finally, I stopped all cluster-related processes on node 2 and started them again, and the node was back to normal.

Noting this down for the record.

Source: ITPUB blog, http://blog.itpub.net/25618347/viewspace-733804/. If you repost this article, please credit the source; otherwise legal action may be pursued.

