On a three-node RAC, an error occurred while running the analyze (conflict check) step before patching:
[root@db01 etc]# /oracle/app/19.0.0/grid/OPatch/opatchauto apply /oracle/soft/35642822 -analyze
OPatchauto session is initiated at Sun Apr 7 14:01:19 2024
System initialization log file is /oracle/app/19.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2024-04-07_02-01-26PM.log.
OPATCHAUTO-72050: System instance creation failed.
OPATCHAUTO-72050: Failed while retrieving system information.
OPATCHAUTO-72050: Please check log file for more details.
OPatchauto session completed at Sun Apr 7 14:01:41 2024
Time taken to complete the session 0 minute, 15 seconds
Topology creation failed.
The log, simplified, is as follows:
2024-04-07 10:18:04,876 SEVERE [1] com.oracle.glcm.patch.auto.db.integration.model.productsupport.topology.TopologyCreator - Not able to retrieve system instance details :: Unable to determine if “/oracle/app/19.0.0/grid” is a shared oracle home.
Failed:
Verifying ‘/tmp/’ …FAILED (PRVF-7546)
Verifying Shared Storage Accessibility:/oracle/app/19.0.0/grid/crs/install …PASSED
Verification of shared storage accessibility was unsuccessful.
Checks did not pass for the following nodes:
db03,db02
NODE_STATUS::db03:EFAIL
The result of cluvfy command contain EFAIL NODE_STATUS::db03:EFAIL
oracle.dbsysmodel.driver.sdk.productdriver.ProductDriverException: Unable to determine if “/oracle/app/19.0.0/grid” is a shared oracle home.
Failed:
Verifying ‘/tmp/’ …FAILED (PRVF-7546)
Verifying Shared Storage Accessibility:/oracle/app/19.0.0/grid/crs/install …PASSED
Verification of shared storage accessibility was unsuccessful.
Checks did not pass for the following nodes:
db03,db02
NODE_STATUS::db03:EFAIL
The result of cluvfy command contain EFAIL NODE_STATUS::db03:EFAIL
at oracle.dbsysmodel.driver.sdk.util.OsysUtility.isSharedPath(OsysUtility.java:190)
at com.oracle.glcm.patch.auto.db.product.driver.crs.AbstractCrsProductDriver.isShared(AbstractCrsProductDriver.java:624)
at com.oracle.glcm.patch.auto.db.product.driver.crs.AbstractCrsProductDriver.makeHome(AbstractCrsProductDriver.java:397)
at com.oracle.glcm.patch.auto.db.product.driver.crs.AbstractCrsProductDriver.makeCrsHomes(AbstractCrsProductDriver.java:477)
at com.oracle.glcm.patch.auto.db.product.driver.crs.CrsProductDriver.buildCRSSystemInstance(CrsProductDriver.java:548)
at com.oracle.glcm.patch.auto.db.product.driver.crs.CrsProductDriver.buildSystemInstance(CrsProductDriver.java:605)
at com.oracle.glcm.patch.auto.db.integration.model.productsupport.topology.TopologyCreator.createSystemInstance(TopologyCreator.java:270)
at com.oracle.glcm.patch.auto.db.integration.model.productsupport.topology.TopologyCreator.process(TopologyCreator.java:186)
at com.oracle.glcm.patch.auto.db.integration.model.productsupport.topology.TopologyCreator.main(TopologyCreator.java:125)
Caused by: java.lang.Exception: The result of cluvfy command contain EFAIL NODE_STATUS::db03:EFAIL
at oracle.dbsysmodel.driver.sdk.util.OsysUtility.isSharedPath(OsysUtility.java:151)
… 8 more
2024-04-07 10:18:04,877 SEVERE [1] com.oracle.glcm.patch.auto.db.integration.model.productsupport.topology.TopologyCreator - Failure reason::java.lang.Exception: The result of cluvfy command contain EFAIL NODE_STATUS::db03:EFAIL
2024-04-07 10:18:04,883 INFO [1] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - System Call command is: [/bin/su, root, -m, -c, chown -R grid:oinstall /oracle/app/19.0.0/grid/cfgtoollogs/opatchautodb]
2024-04-07 10:18:04,907 INFO [1] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Output message:
2024-04-07 10:18:04,907 INFO [1] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Return code: 0
2024-04-07 10:18:04,911 INFO [1] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - System Call command is: [/bin/su, root, -m, -c, chown -R grid:oinstall /oracle/app/19.0.0/grid/cfgtoollogs/opatchautodb]
2024-04-07 10:18:04,932 INFO [1] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Output message:
2024-04-07 10:18:04,933 INFO [1] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Return code: 0
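After a good deal of analysis, the problem above was finally traced to the ssh version (9) being too new. After downgrading to 7.4, the log reported the following error instead: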
Failed:
Verifying User Equivalence …FAILED (PRVG-2019, PRKN-1038)
PRVF-4009 : User equivalence is not set for nodes: db03
Verification will proceed with nodes: db02, db01
Verifying Shared Storage Accessibility:/oracle/app/19.0.0/grid/crs/install …FAILED
oracle.dbsysmodel.driver.sdk.productdriver.ProductDriverException: Unable to determine if “/oracle/app/19.0.0/grid” is a shared oracle home.
PRVF-4009 : User equivalence is not set for nodes: db03
Verification will proceed with nodes: db02, db01
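At first glance this says that user equivalence (ssh trust) to node db03 is not set up, yet testing showed that ssh trust among the three nodes was fine!
Checking with the verification tool gives: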
grid@db02:/home/grid$ cluvfy comp admprv -n all -o user_equiv -verbose
Verifying User Equivalence …
Node Name Status
db03 failed
db01 passed
Verifying Checking user equivalence for user “grid” on all cluster nodes …PASSED
From node To node Status
db01 db02 SUCCESSFUL
Verifying User Equivalence …FAILED (PRVG-2019, PRKN-1038)
Verification of administrative privileges was unsuccessful.
Checks did not pass for the following nodes:
db03
Failures were encountered during execution of CVU verification request “administrative privileges”.
Verifying User Equivalence …FAILED
db03: PRVG-2019 : Check for equivalence of user “grid” from node “db02”
to node “db03” failed
PRKN-1038 : The command “/usr/bin/ssh -o FallBackToRsh=no -o
PasswordAuthentication=no -o StrictHostKeyChecking=yes -o
NumberOfPasswordPrompts=0 db03 -n /bin/true” run on node
“db02” gave an unexpected output: "Warning: the ECDSA host key
for ‘db03’ differs from the key for the IP address
'133.193.102.47’Offending key for IP in
/home/grid/.ssh/known_hosts:16Matching host key in
/home/grid/.ssh/known_hosts:9Exiting, you have requested strict
checking.Host key verification failed."
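The two positions named in the message (known_hosts:16 and known_hosts:9) can be confirmed by listing every entry that names db03 or its IP address. A minimal sketch, assuming an unhashed known_hosts file; the function name list_host_entries is made up for illustration:

```shell
# List known_hosts entries whose host field names any of the given hosts/IPs.
# Usage: list_host_entries FILE NAME [NAME...]
# Assumption: entries are not hashed (plain "host[,host...] keytype key" lines).
list_host_entries() {
    file=$1; shift
    awk -v names="$*" '
        BEGIN { n = split(names, want, " ") }
        NF == 0 || $1 ~ /^#/ { next }          # skip blank lines and comments
        {
            m = split($1, hosts, ",")          # host field may be a comma list
            for (i = 1; i <= m; i++)
                for (j = 1; j <= n; j++)
                    if (hosts[i] == want[j])
                        # line number, matched name, key type, start of the key
                        printf "%d %s %s %.16s\n", NR, hosts[i], $2, $3
        }
    ' "$file"
}
```

For example, `list_host_entries /home/grid/.ssh/known_hosts db03 133.193.102.47` run as grid on db02. If the hostname entry and the IP entry show different key material for the same key type, that is the mismatch that ssh rejects under StrictHostKeyChecking=yes.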
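Now it is clear: ssh host key verification failed.
On the node where the check fails (the node that runs ssh, db02 in the output above), do the following: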
su - grid
cd .ssh/
vi known_hosts
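Delete every key for node db03, that is, all lines that begin with either its hostname or its IP address.
Then ssh to db03 again from db01 and from db02; the corresponding ssh host keys are recorded afresh, and the verification then succeeds.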
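The manual edit can also be scripted. The following is a minimal sketch, again assuming an unhashed known_hosts file; drop_host_entries is a hypothetical helper, not part of OpenSSH. OpenSSH's own `ssh-keygen -R <name>` removes the keys for one name at a time and also handles hashed entries, so it is usually the safer choice when available.

```shell
# Remove every known_hosts line whose host field names one of the given
# hosts/IPs. The original file is kept as FILE.bak.
# Usage: drop_host_entries FILE NAME [NAME...]
drop_host_entries() {
    file=$1; shift
    cp -p "$file" "$file.bak" || return 1
    awk -v names="$*" '
        BEGIN { n = split(names, want, " ") }
        {
            drop = 0
            m = split($1, hosts, ",")          # host field may be a comma list
            for (i = 1; i <= m; i++)
                for (j = 1; j <= n; j++)
                    if (hosts[i] == want[j]) drop = 1
            if (!drop) print                   # keep everything else as-is
        }
    ' "$file.bak" > "$file"
}
```

Run as grid on each node that connects to db03, for example `drop_host_entries ~/.ssh/known_hosts db03 133.193.102.47`, then ssh to db03 once to accept the current host key. Passing both the hostname and the IP matters here, because the failure came from the two entries disagreeing.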