OUI-67124,一次史无前例的打补丁遭遇
一、事件起因
oracle 19c rac,客户集团侧进行单一业务场景压力测试,频繁的进行logon/logoff,产生了大量的library cache lock
EVENT COUNT(*)
----------------------------------- ----------
library cache load lock 409
library cache lock 206
此等待事件p1=handle address
p2=lock address
p3=100*mode+namespace
当时p3值为8323074
SQL> select round(bitand(8323074,to_number('FF0000','xxxxxxxxx'))/to_number('FFFF','xxxxxxxxx')) from dual;
ROUND(BITAND(8323074,TO_NUMBER('FF0000','XXXXXXXXX'))/TO_NUMBER('FFFF','XXXXXXXXX'))
------------------------------------------------------------------------------------
select distinct KGLHDNSP,KGLHDNSD from x$kglob where KGLHDNSP = 127:
KGLHDNSP KGLHDNSD
----------------- -----------------------------------------------
127 Last successful Logon Time
Last successful Logon Time会记录用户最后一次成功登陆的时间,很多场景是比较实用的,没成想在此遇到了问题
v$session中sql_id都是空的,但prev_sql_id都指向如下sql
update user$ set spare6=DECODE(to_char(:2, 'YYYY-MM-DD'), '0000-00-00', to_date(NULL), :2) where user#=:1;
到这里问题应该捋顺了,下面得使用mos帮助了
反复查找找到了具有倾向性的帖子
Bug 32164034 - Database Hang Updating USER$ When LSLT (LAST SUCCESSFUL LOGIN TIME) Is Enabled (Doc ID 32164034.8)
那就按mos的指示,打补丁吧p33121934
真正的问题来了~~
二、主菜上桌了
1)正面解决,遇到了问题
这个补丁很小,只有859K,readme中也仅是常规的路数,自感觉是毫无工作量可言,可人生就是这样,越简单越不容小觑
oracle@db1:[/oraclelog/pat/33121934] opatch apply
Oracle Interim Patch Installer version 12.2.0.1.44
Copyright (c) 2024, Oracle Corporation. All rights reserved.
Oracle Home : /oracle/app/oracle/product/19c/db_1
Central Inventory : /oracle/app/oraInventory
from : /oracle/app/oracle/product/19c/db_1/oraInst.loc
OPatch version : 12.2.0.1.44
OUI version : 12.2.0.7.0
Log file location : /oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-12-25_01-26-43AM_1.log
Verifying environment and performing prerequisite checks...
Skip patch 33121934 from list of patches to apply: This patch is not needed.
After skipping patches with missing components, there are no patches to apply.
OPatch Session completed with warnings.
Log file location: /oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-12-25_01-26-43AM_1.log
OPatch completed with warnings.
Skip patch 33121934 from list of patches to apply:This patch is not needed
这里居然提示此补丁被忽略
oracle@db1:[/home/oracle] cat /oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-12-25_01-26-43AM_1.log
[Dec 25, 2024 1:26:43 AM] [INFO] CAS Dynamic Loading :
[Dec 25, 2024 1:26:43 AM] [INFO] CUP_LOG: Trying to load HomeOperations object
[Dec 25, 2024 1:26:43 AM] [INFO] CUP_LOG: HomeOperations object created. CUP1.0 is enabled
[Dec 25, 2024 1:26:43 AM] [INFO] OPatch invoked as follows: 'apply -oh /oracle/app/oracle/product/19c/db_1 -local -invPtrLoc /oracle/app/oracle/product/19c/db_1/oraInst.loc '
[Dec 25, 2024 1:26:43 AM] [INFO] Runtime args: [-Xverify:none, -Xmx3072m, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch, -DCommonLog.LOG_SESSION_ID=, -DCommonLog.COMMAND_NAME=apply, -DOPatch.ORACLE_HOME=/oracle/app/oracle/product/19c/db_1, -DOPatch.DEBUG=false, -DOPatch.MAKE=false, -DOPatch.RUNNING_DIR=/oracle/app/oracle/product/19c/db_1/OPatch, -DOPatch.MW_HOME=, -DOPatch.WL_HOME=, -DOPatch.COMMON_COMPONENTS_HOME=, -DOPatch.OUI_LOCATION=/oracle/app/oracle/product/19c/db_1/oui, -DOPatch.FMW_COMPONENT_HOME=, -DOPatch.OPATCH_CLASSPATH=, -DOPatch.WEBLOGIC_CLASSPATH=, -DOPatch.SKIP_OUI_VERSION_CHECK=, -DOPatch.NEXTGEN_HOME_CHECK=false, -DOPatch.PARALLEL_ON_FMW_OH=]
[Dec 25, 2024 1:26:43 AM] [INFO] Heap in use : 123 MB
Total memory: 1963 MB
Free memory : 1839 MB
Max memory : 2731 MB
[Dec 25, 2024 1:26:43 AM] [INFO] Oracle Home : /oracle/app/oracle/product/19c/db_1
Central Inventory : /oracle/app/oraInventory
from : /oracle/app/oracle/product/19c/db_1/oraInst.loc
OPatch version : 12.2.0.1.44
OUI version : 12.2.0.7.0
OUI location : /oracle/app/oracle/product/19c/db_1/oui
Log file location : /oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-12-25_01-26-43AM_1.log
[Dec 25, 2024 1:26:43 AM] [INFO] Patch history file: /oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch_history.txt
[Dec 25, 2024 1:26:45 AM] [INFO] [OPSR-TIME] Loading raw inventory
[Dec 25, 2024 1:26:45 AM] [INFO] [OPSR-MEMORY] Loaded all components from inventory. Heap memory in use: 164 (MB)
[Dec 25, 2024 1:26:45 AM] [INFO] [OPSR-MEMORY] Loaded all one offs from inventory. Heap memory in use: 174 (MB)
[Dec 25, 2024 1:26:45 AM] [INFO] [OPSR-TIME] Raw inventory loaded successfully
[Dec 25, 2024 1:26:45 AM] [INFO] NApply::no CAS enabled, OPatch runs with legacy process.
[Dec 25, 2024 1:26:45 AM] [INFO] Verifying environment and performing prerequisite checks...
[Dec 25, 2024 1:26:45 AM] [INFO] [OPSR-TIME] Running prerequisite checks
[Dec 25, 2024 1:26:45 AM] [INFO] opatch-external.jar is in /oracle/app/oracle/product/19c/db_1/OPatch/jlib/opatch-external.jar
[Dec 25, 2024 1:26:46 AM] [INFO] [OPSR-TIME] Loading cooked inventory
[Dec 25, 2024 1:26:46 AM] [INFO] [OPSR-MEMORY] : Loading cooked one offs. Heap memory used 481 (MB)
[Dec 25, 2024 1:26:46 AM] [INFO] [OPSR-MEMORY] : Loaded cooked oneoffs. Heap memory used : 286 (MB)
[Dec 25, 2024 1:26:46 AM] [INFO] [OPSR-TIME] Cooked inventory loaded successfully
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 30396974 implicitly overlays poh CUP: 29517242
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 30486436 implicitly overlays poh CUP: 29517242
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 30998759 implicitly overlays poh CUP: 29517242
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 31013127 implicitly overlays poh CUP: 29517242
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 31393600 implicitly overlays poh CUP: 29517242
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 26749785 implicitly overlays poh CUP: 29517242
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 30396974 implicitly overlays poh CUP: 30869156
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 30486436 implicitly overlays poh CUP: 30869156
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 30998759 implicitly overlays poh CUP: 30869156
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 31013127 implicitly overlays poh CUP: 30869156
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 31393600 implicitly overlays poh CUP: 30869156
[Dec 25, 2024 1:26:47 AM] [INFO] CUP_LOG: Found poh PLO 26749785 implicitly overlays poh CUP: 30869156
[Dec 25, 2024 1:26:47 AM] [INFO] OPatch invoked in CLI. Run Init Prereqs
[Dec 25, 2024 1:26:47 AM] [INFO] Running Initial prerequisite checks...
[Dec 25, 2024 1:26:47 AM] [INFO] Checking if Oracle Home has components required by patches...
[Dec 25, 2024 1:26:47 AM] [INFO] Skip patch 33121934 from list of patches to apply: This patch is not needed.
[Dec 25, 2024 1:26:47 AM] [INFO] CheckMissingComps: Cached file does not exist or is invalid, re-build prereq result.
[Dec 25, 2024 1:26:47 AM] [WARNING] OUI-67124:
After skipping patches with missing components, there are no patches to apply.
[Dec 25, 2024 1:26:47 AM] [INFO] UtilSession: N-Apply done.
[Dec 25, 2024 1:26:47 AM] [INFO] --------------------------------------------------------------------------------
[Dec 25, 2024 1:26:47 AM] [INFO] The following warnings have occurred during OPatch execution:
[Dec 25, 2024 1:26:47 AM] [INFO] 1) OUI-67124:
After skipping patches with missing components, there are no patches to apply.
[Dec 25, 2024 1:26:47 AM] [INFO] --------------------------------------------------------------------------------
[Dec 25, 2024 1:26:47 AM] [SEVERE] OUI-67008:OPatch Session completed with warnings.
[Dec 25, 2024 1:26:47 AM] [INFO] Finishing UtilSession at Wed Dec 25 01:26:47 CST 2024
[Dec 25, 2024 1:26:47 AM] [INFO] Log file location: /oracle/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-12-25_01-26-43AM_1.log
oracle@db1:[/home/oracle]
神奇不?dba生涯中第一次遇见,关键是在测试库是成功的
那就对比生产系统与测试系统的差异吧
2)问题复现
这个问题我们在测试库进行反复重现、试验,分析了多日,最终发现了原因所在
生产环境比测试环境多了个小补丁p31359215,这是个针对OJVM的修复补丁,在测试库上将此OJVM补丁应用之后,再应用p33121934就会复现此问题
感觉距离真相越来越近了
将p31359215 rollback后对比$ORACLE_HOME/inventory/ContentsXML/comps.xml内容发现了差异:
--正常情况下的内容
oracle@edb1:[/oracle/app/oracle/product/19c/db_1/inventory/ContentsXML] cat comps.xml |grep 'COMP NAME="oracle.rdbms"'
<COMP NAME="oracle.rdbms" VER="19.0.0.0.0" BUILD_NUMBER="0" BUILD_TIME="20190417.011331" PATCH_LEVEL="3" REP_VER="0.0.0.0.0" RELEASE="Production" INV_LOC="Components/oracle.rdbms/19.0.0.0.0/1/" LANGS="ALL_LANGS" XML_INV_LOC="Components21/oracle.rdbms/19.0.0.0.0/" ACT_INST_VER="12.2.0.7.0" DEINST_VER="11.2.0.0.0" INSTALL_TIME="2020.Sep.08 02:32:32 CST" INST_LOC="/oracle/app/oracle/product/19c/db_1/oracle_rdbms">
--应用p31359215后的内容
oracle@edb1:[/oracle/app/oracle/product/19c/db_1/inventory/ContentsXML] cat comps.xml |grep 'COMP NAME="oracle.rdbms"'
<COMP NAME="oracle.rdbms" VER="19.0.0.0.0" BUILD_NUMBER="0" BUILD_TIME="20190417.011331" REP_VER="0.0.0.0.0" RELEASE="Production" INV_LOC="Components/oracle.rdbms/19.0.0.0.0/1/" LANGS="ALL_LANGS" XML_INV_LOC="Components21/oracle.rdbms/19.0.0.0.0/" ACT_INST_VER="12.2.0.7.0" DEINST_VER="11.2.0.0.0" INSTALL_TIME="2020.Sep.08 02:32:32 CST" INST_LOC="/oracle/app/oracle/product/19c/db_1/oracle_rdbms">
可以看到应用OJVM补丁p31359215后comps.xml文件中的参数PATCH_LEVEL="3"会被删除
3)解决办法
既然原因找到了,解决办法也就自然地有了
方法一:回退OJVM补丁p31359215,待应用完正主p33121934后再应用一次p31359215,这样太麻烦了
方法二:手工添加参数PATCH_LEVEL="3"后直接应用补丁,事实证明这步是完全可以的,简单、省事
三、总结一下
1.问题的引发还是集团压测工具频繁logon/logoff,为啥不用个连接池呢?
2.打补丁前做好备份,方便后期遇到问题时进行差异对比
3.以前应用补丁时遭遇回退也很正常,但是报错或冲突都很明显,这次没有明显提示
4.至于为什么应用完OJVM补丁后comps.xml会被篡改,这个有待研究,可能咱们也没那实力去论证
最后感谢我的好朋友weisong大佬,一起痛苦了好几天,加个鸡腿~~
这个问题你遇到过吗?