bug:8575528

本文解决了一个Oracle数据库服务器CPU使用率异常升高的问题,通过分析发现是由于特定等待事件导致的无限自旋,最终通过安装补丁并调整参数解决了问题。

文章版权所有 Jusin Hao(luckyfriends) ,支持原创,转载请注明

客户提出数据库服务器会出现CPU使用在30%到50%的情况:该问题自3月份开始出现,至5月小机CPU已高达70%,因问题49(分公司问题记录):多单据保存错误。检查排除原因时已做KILL进程处理。CPU使用正常处于5%以下。自6月初开始,oracle空会话占用CPU,每个无用的进程占6%的CPU,现已存在5个,占用系统CPU30%;

根据我7月21日看到的情况,如下,数据库服务器rac1的cpu基本上在10~20%;rac2上的cpu会在20~30%;

clip_image002

上图为rac1(其中可以看到进程893258始终占用cpu 6%左右,同样rac2上也有类似的进程)

rac1:$ps -ef|grep 438764

oracle 893258 1 78 Jul 09 - 62356:02 oraclenfsnnc1 (LOCAL=NO)

clip_image004

上图我们可以到这个oracle进程相关情况,占用内存并不高,只是cpu始终为6%

select c.addr, c.INST_ID,a.SQL_TEXT,a.SQL_ID,v.SID,v.EVENT

from gv$process c, gv$session v, gv$sqltext a

where c.ADDR=v.PADDR

and a.HASH_VALUE=v.SQL_HASH_VALUE and a.ADDRESS=v.SQL_ADDRESS

and c.spid = 1020608;

clip_image006

通过上图我们看到对应的操作系统进程及其等待事件;

SELECT oid, tablename, dsName, tableType, dropFlag, activeTime, createTime, threadName, hostId FROM qdi_temptables WHERE dropflag = 1 or ( activeTime + 36000000 < 1308628115345 ) clip_image008

在rac2上我们也可以看到类似的操作系统进程,占用cpu,如下我们可以看到对应数据库sql机器等待事件:

1)

clip_image010

2)

clip_image012

3)

clip_image014

我们可以看到这些sql的等待事件都是kksfbc child completion,我们可以看到这样等待时间会话及其操作系统进程:

clip_image016

awr报告也可以看到这样的等待事件:

clip_image018

我通过查Oracle metalink技术支持可以看到引起这样的等待事件的情况是oracle 的bug6795880,如下所示是metalinke对该bug的描述

Bug 6795880  Session spins / OERI after 'kksfbc child completion' wait - superceded

This note gives a brief overview of bug 6795880.
The content was last updated on: 05-APR-2011
Click here for details of each of the sections below.

Affects:

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions BELOW 11.2

Versions confirmed as being affected

11.1.0.6

10.2.0.4

10.2.0.3

Platforms affected

Generic (all / most platforms affected)

Note that this fix can cause / expose the problem described in Bug:8575528
Note that this fix has been superceded by the fix in Bug:8575528

Fixed:

This issue is fixed in

11.1.0.7 (Server Patch Set)

10.2.0.5 (Server Patch Set)

10.2.0.4 Patch 5 on Windows Platforms

Symptoms:

Related To:

Hang (Process Spins)

Internal Error May Occur (ORA-600)

ORA-600 [kksSearchChildList1]

ORA-600 [kksSearchChildList2]

ORA-600 [kksSearchChildList3]

ORA-600 [kkshgnc-nextchild]

Dump in or under kksSearchChildList4

Waits for "cursor: pin S"

Waits for "kksfbc child completion"

This fix is DISABLED by default

_CURSOR_FEATURES_ENABLED

Description

This fix has been superseded by bug:8575528.

A session may go into an infinite spin just after a wait

for 'kksfbc child completion'. The spin occurs with

a stack including kksSearchChildList -> kkshgnc where

kksSearchChildList loops forever.

This problem can also lead to internal error such as any of

ORA-600 [kksSearchChildList1], ORA-600 [kksSearchChildList2]

ORA-600 [kksSearchChildList3], ORA-600 [kkshgnc-nextchild]

Or Trace dumps on kksSearchChildList4 .

Note:

This fix is disabled by default in 10g.

To enable this fix you must explicitly set the following

parameter for instance startup:

"_cursor_features_enabled" = 10

HOOKS CAUSES:8575528 SUPERCEDED:8575528 PSE:A204 PSE:A203 SPIN:kksSearchChildList OERI:kksSearchChildList1 OERI:kksSearchChildList2 OERI:kksSearchChildList3 OERI:kkshgnc-nextchild PARAMETER:_CURSOR_FEATURES_ENABLED "WAITEVENT:cursor: pin S" "WAITEVENT:kksfbc child completion" DUMP:kksSearchChildList4 LIKELYAFFECTS XAFFECTS_10.2.0.1 XAFFECTS_V10020001 AFFECTS=10.2.0.1 XAFFECTS_10.2.0.2 XAFFECTS_V10020002 AFFECTS=10.2.0.2 XAFFECTS_10.2.0.3 XAFFECTS_V10020003 AFFECTS=10.2.0.3 XAFFECTS_10.2.0.4 XAFFECTS_V10020004 AFFECTS=10.2.0.4 XAFFECTS_11.1.0.6 XAFFECTS_V11010006 AFFECTS=11.1.0.6 XPRODID_5 PRODUCT_ID=5 PRODID-5 RDBMS XCOMP_RDBMS COMPONENT=RDBMS TAG_DISABLED TAG_HANG/SPIN TAG_OERI DISABLED HANG/SPIN OERI FIXED_10.2.0.5 FIXED_11.1.0.7 FIXED_WIN:A204P05

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. Always consult with Oracle Support for advice.

References

Bug:6795880 (This link will only work for PUBLISHED bugs)
Note:245840.1 Information on the sections in this article

该Bug的描述是在发生’kksfbc child completion’等待事件后会话陷入无休止的自旋(spins)中,这种自旋(spins)发生在由堆栈调用(stack call)kksSearchChildList->kkshgnc陷入对kksSearchChildList函数的无限循环中;如下我们看到更详细的stack call

SQL> oradebug setospid 3723390

SQL> oradebug unlimit;

Statement processed.

SQL> oradebug short_stack;

ksdxfstk+002c

SQL> oradebug dump processstate 10;

Statement processed.

SQL> oradebug dump systemstate 266;

Statement processed.

可以从以上trace中看到会话确实曾长时间处于’kksfbc child completion’等待中,之后陷入无限自旋(spins)中消耗了大量CPU时间。但这里实际的表现又存有差异,引发无限循环的函数是kksfbc而不是kksSearchChildList(常规的调用序列是:kksParseCursor->kkspsc0->kksfbc ->kksSearchChildList->kkshgnc);

根据metalink的方案我们安装补丁bug:8575528,之后修改参数 "_cursor_features_enabled" = 10 来修复该问题(做的时候备份好代码);

解决方法:

于7月28日晚上,安装oracle Metalink提供解决此问题的补丁p8575528_10204_AIX5L.zip(之前客户的Oracle没有Opatch,因此先安装了oracle 的打补丁工具Opatch:p6880880_102000_AIX64-5L.zip)

打补丁后,目前观察正常,没再出现之前的占用cpu的进程;

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/14710393/viewspace-1241509/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/14710393/viewspace-1241509/

评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符  | 博主筛选后可见
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值