文章版权所有 Jusin Hao(luckyfriends) ,支持原创,转载请注明
客户提出数据库服务器会出现CPU使用在30%到50%的情况:该问题自3月份开始出现,至5月小机CPU已高达70%,因问题49(分公司问题记录):多单据保存错误。检查排除原因时已做KILL进程处理。CPU使用正常处于5%以下。自6月初开始,oracle空会话占用CPU,每个无用的进程占6%的CPU,现已存在5个,占用系统CPU30%;
根据我7月21日看到的情况,如下,数据库服务器rac1的cpu基本上在10~20%;rac2上的cpu会在20~30%;
上图为rac1(其中可以看到进程893258始终占用cpu 6%左右,同样rac2上也有类似的进程)
rac1:$ps -ef|grep 438764
oracle 893258 1 78 Jul 09 - 62356:02 oraclenfsnnc1 (LOCAL=NO)
上图我们可以到这个oracle进程相关情况,占用内存并不高,只是cpu始终为6%
select c.addr, c.INST_ID,a.SQL_TEXT,a.SQL_ID,v.SID,v.EVENT
from gv$process c, gv$session v, gv$sqltext a
where c.ADDR=v.PADDR
and a.HASH_VALUE=v.SQL_HASH_VALUE and a.ADDRESS=v.SQL_ADDRESS
and c.spid = 1020608;
通过上图我们看到对应的操作系统进程及其等待事件;
SELECT oid, tablename, dsName, tableType, dropFlag, activeTime, createTime, threadName, hostId FROM qdi_temptables WHERE dropflag = 1 or ( activeTime + 36000000 < 1308628115345 )
在rac2上我们也可以看到类似的操作系统进程,占用cpu,如下我们可以看到对应数据库sql机器等待事件:
1)
2)
3)
我们可以看到这些sql的等待事件都是kksfbc child completion,我们可以看到这样等待时间会话及其操作系统进程:
awr报告也可以看到这样的等待事件:
我通过查Oracle metalink技术支持可以看到引起这样的等待事件的情况是oracle 的bug6795880,如下所示是metalinke对该bug的描述
Bug 6795880 Session spins / OERI after 'kksfbc child completion' wait - superceded
This note gives a brief overview of bug 6795880.
The content was last updated on: 05-APR-2011
Click here for details of each of the sections below.
Affects:
| Product (Component) | Oracle Server (Rdbms) |
| Range of versions believed to be affected | Versions BELOW 11.2 |
| Versions confirmed as being affected | |
| Platforms affected | Generic (all / most platforms affected) |
Note that this fix can cause / expose the problem described in Bug:8575528
Note that this fix has been superceded by the fix in Bug:8575528
Fixed:
| This issue is fixed in | ||||
| Symptoms: | Related To: | |||
| Internal Error May Occur (ORA-600) Dump in or under kksSearchChildList4 Waits for "kksfbc child completion" | This fix is DISABLED by default _CURSOR_FEATURES_ENABLED | |||
Description
This fix has been superseded by bug:8575528.
A session may go into an infinite spin just after a wait
for 'kksfbc child completion'. The spin occurs with
a stack including kksSearchChildList -> kkshgnc where
kksSearchChildList loops forever.
This problem can also lead to internal error such as any of
ORA-600 [kksSearchChildList1], ORA-600 [kksSearchChildList2]
ORA-600 [kksSearchChildList3], ORA-600 [kkshgnc-nextchild]
Or Trace dumps on kksSearchChildList4 .
Note:
This fix is disabled by default in 10g.
To enable this fix you must explicitly set the following
parameter for instance startup:
"_cursor_features_enabled" = 10
HOOKS CAUSES:8575528 SUPERCEDED:8575528 PSE:A204 PSE:A203 SPIN:kksSearchChildList OERI:kksSearchChildList1 OERI:kksSearchChildList2 OERI:kksSearchChildList3 OERI:kkshgnc-nextchild PARAMETER:_CURSOR_FEATURES_ENABLED "WAITEVENT:cursor: pin S" "WAITEVENT:kksfbc child completion" DUMP:kksSearchChildList4 LIKELYAFFECTS XAFFECTS_10.2.0.1 XAFFECTS_V10020001 AFFECTS=10.2.0.1 XAFFECTS_10.2.0.2 XAFFECTS_V10020002 AFFECTS=10.2.0.2 XAFFECTS_10.2.0.3 XAFFECTS_V10020003 AFFECTS=10.2.0.3 XAFFECTS_10.2.0.4 XAFFECTS_V10020004 AFFECTS=10.2.0.4 XAFFECTS_11.1.0.6 XAFFECTS_V11010006 AFFECTS=11.1.0.6 XPRODID_5 PRODUCT_ID=5 PRODID-5 RDBMS XCOMP_RDBMS COMPONENT=RDBMS TAG_DISABLED TAG_HANG/SPIN TAG_OERI DISABLED HANG/SPIN OERI FIXED_10.2.0.5 FIXED_11.1.0.7 FIXED_WIN:A204P05
| Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. Always consult with Oracle Support for advice. |
References
Bug:6795880 (This link will only work for PUBLISHED bugs)
Note:245840.1 Information on the sections in this article
该Bug的描述是在发生’kksfbc child completion’等待事件后会话陷入无休止的自旋(spins)中,这种自旋(spins)发生在由堆栈调用(stack call)kksSearchChildList->kkshgnc陷入对kksSearchChildList函数的无限循环中;如下我们看到更详细的stack call
SQL> oradebug setospid 3723390
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug short_stack;
ksdxfstk+002c
SQL> oradebug dump processstate 10;
Statement processed.
SQL> oradebug dump systemstate 266;
Statement processed.
可以从以上trace中看到会话确实曾长时间处于’kksfbc child completion’等待中,之后陷入无限自旋(spins)中消耗了大量CPU时间。但这里实际的表现又存有差异,引发无限循环的函数是kksfbc而不是kksSearchChildList(常规的调用序列是:kksParseCursor->kkspsc0->kksfbc ->kksSearchChildList->kkshgnc);
根据metalink的方案我们安装补丁bug:8575528,之后修改参数 "_cursor_features_enabled" = 10 来修复该问题(做的时候备份好代码);
解决方法:
于7月28日晚上,安装oracle Metalink提供解决此问题的补丁p8575528_10204_AIX5L.zip(之前客户的Oracle没有Opatch,因此先安装了oracle 的打补丁工具Opatch:p6880880_102000_AIX64-5L.zip)
打补丁后,目前观察正常,没再出现之前的占用cpu的进程;
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/14710393/viewspace-1241509/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/14710393/viewspace-1241509/
本文解决了一个Oracle数据库服务器CPU使用率异常升高的问题,通过分析发现是由于特定等待事件导致的无限自旋,最终通过安装补丁并调整参数解决了问题。









被折叠的 条评论
为什么被折叠?



