-- ORA-15064 ORA-03113: test database case
Error logs 2013-09-12 16:20:00
###############
alert_prod2.log
###############
Thu Sep 12 16:19:17 2013
NOTE: ASMB terminating
Errors in file /prod/oracle/diag/rdbms/prod/prod2/trace/prod2_asmb_26138.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Process ID:
Session ID: 119 Serial number: 42229
Errors in file /prod/oracle/diag/rdbms/prod/prod2/trace/prod2_asmb_26138.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Process ID:
Session ID: 119 Serial number: 42229
ASMB (ospid: 26138): terminating the instance due to error 15064
Instance terminated by ASMB, pid = 26138
###############
alert_+ASM2.log
###############
Thu Sep 12 16:19:17 2013
NOTE: client exited [4512]
Thu Sep 12 16:19:17 2013
NOTE: ASMB process exiting, either shutdown is in progress
NOTE: or foreground connected to ASMB was killed.
Thu Sep 12 16:19:18 2013
Received an instance abort message from instance 1
Please check instance 1 alert and LMON trace files for detail.
Thu Sep 12 16:19:18 2013
Received an instance abort message from instance 1
Please check instance 1 alert and LMON trace files for detail.
LMS0 (ospid: 4477): terminating the instance due to error 481
Instance terminated by LMS0, pid = 4477
Thu Sep 12 16:20:06 2013
MEMORY_TARGET defaulting to 285212672.
* instance_number obtained from CSS = 2, checking for the existence of node 0...
* node 0 does not exist. instance_number = 2
Starting ORACLE instance (normal)
#######################
grid alertq1ebsdb02.log
#######################
2013-09-12 16:19:00.781
[cssd(4138)]CRS-1612:Network communication with node q1ebsdb01 (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.300 seconds
2013-09-12 16:19:07.801
[cssd(4138)]CRS-1611:Network communication with node q1ebsdb01 (1) missing for 75% of timeout interval. Removal of this node from cluster in 7.280 seconds
2013-09-12 16:19:12.811
[cssd(4138)]CRS-1610:Network communication with node q1ebsdb01 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.270 seconds
2013-09-12 16:19:15.088
[cssd(4138)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/cssd/ocssd.log.
2013-09-12 16:19:15.088
[cssd(4138)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/cssd/ocssd.log
2013-09-12 16:19:15.225
[cssd(4138)]CRS-1652:Starting clean up of CRSD resources.
2013-09-12 16:19:15.520
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(4659)]CRS-5016:Process "/prod/grid/product/11.2.0/crs_1/bin/lsnrctl" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin" for
action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oragrid/oraagent_oragrid.log"
2013-09-12 16:19:15.519
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(4822)]CRS-5016:Process "/prod/oracle/product/11.2.0/db_1/bin/lsnrctl" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin" for
action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oraprod/oraagent_oraprod.log"
2013-09-12 16:19:16.088
[cssd(4138)]CRS-1608:This node was evicted by node 1, q1ebsdb01; details at (:CSSNM00005:) in /prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/cssd/ocssd.log.
2013-09-12 16:19:16.512
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(4659)]CRS-5016:Process "/prod/grid/product/11.2.0/crs_1/opmn/bin/onsctli" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin"
for action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oragrid/oraagent_oragrid.log"
2013-09-12 16:19:17.128
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(4659)]CRS-5016:Process "/prod/grid/product/11.2.0/crs_1/bin/lsnrctl" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin" for
action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oragrid/oraagent_oragrid.log"
2013-09-12 16:19:17.141
[cssd(4138)]CRS-1654:Clean up of CRSD resources finished successfully.
2013-09-12 16:19:17.142
[cssd(4138)]CRS-1655:CSSD on node q1ebsdb02 detected a problem and started to shutdown.
Error logs 2013-09-17 18:24:00
###############
alert_prod2.log
###############
Tue Sep 17 18:24:56 2013
Errors in file /prod/oracle/diag/rdbms/prod/prod2/trace/prod2_asmb_22443.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Errors in file /prod/oracle/diag/rdbms/prod/prod2/trace/prod2_asmb_22443.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
###############
alert_+ASM2.log
###############
Tue Sep 17 18:24:56 2013
NOTE: client exited [3309]
Tue Sep 17 18:24:56 2013
NOTE: ASMB process exiting, either shutdown is in progress
NOTE: or foreground connected to ASMB was killed.
opidcl aborting process unknown ospid (3320) as a result of ORA-29709
Tue Sep 17 18:24:57 2013
PMON (ospid: 3248): terminating the instance due to error 481
Instance terminated by PMON, pid = 3248
Tue Sep 17 18:25:30 2013
MEMORY_TARGET defaulting to 285212672.
* instance_number obtained from CSS = 2, checking for the existence of node 0...
* node 0 does not exist. instance_number = 2
#######################
grid alertq1ebsdb02.log
#######################
[cssd(3003)]CRS-1612:Network communication with node q1ebsdb01 (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.210 seconds
2013-09-17 18:24:42.621
[cssd(3003)]CRS-1611:Network communication with node q1ebsdb01 (1) missing for 75% of timeout interval. Removal of this node from cluster in 7.180 seconds
2013-09-17 18:24:47.631
[cssd(3003)]CRS-1610:Network communication with node q1ebsdb01 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.170 seconds
2013-09-17 18:24:49.808
[cssd(3003)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/cssd/ocssd.log.
2013-09-17 18:24:49.808
[cssd(3003)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/cssd/ocssd.log
2013-09-17 18:24:49.951
[cssd(3003)]CRS-1652:Starting clean up of CRSD resources.
2013-09-17 18:24:50.955
[cssd(3003)]CRS-1608:This node was evicted by node 1, q1ebsdb01; details at (:CSSNM00005:) in /prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/cssd/ocssd.log.
2013-09-17 18:24:51.345
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(3428)]CRS-5016:Process "/prod/grid/product/11.2.0/crs_1/opmn/bin/onsctli" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin"
for action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oragrid/oraagent_oragrid.log"
2013-09-17 18:24:51.999
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(3428)]CRS-5016:Process "/prod/grid/product/11.2.0/crs_1/bin/lsnrctl" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin" for
action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oragrid/oraagent_oragrid.log"
2013-09-17 18:24:52.002
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(3428)]CRS-5016:Process "/prod/grid/product/11.2.0/crs_1/bin/lsnrctl" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin" for
action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oragrid/oraagent_oragrid.log"
2013-09-17 18:24:56.280
[/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin(3612)]CRS-5016:Process "/prod/oracle/product/11.2.0/db_1/bin/lsnrctl" spawned by agent "/prod/grid/product/11.2.0/crs_1/bin/oraagent.bin" for
action "check" failed: details at "(:CLSN00010:)" in "/prod/grid/product/11.2.0/crs_1/log/q1ebsdb02/agent/crsd/oraagent_oraprod/oraagent_oraprod.log"
2013-09-17 18:24:56.289
[cssd(3003)]CRS-1654:Clean up of CRSD resources finished successfully.
2013-09-17 18:24:56.290
[cssd(3003)]CRS-1655:CSSD on node q1ebsdb02 detected a problem and started to shutdown.
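Root cause analysis: 1. The grid alert log already shows the real cause. Node 2 lost contact with node 1 over the heartbeat (interconnect) network for more than 30 seconds, the timeout interval counted down in the CRS-1612/1611/1610 messages. This triggered RAC fencing: node 2 was forcibly evicted, and RAC restarted all resources on node 2, including the ASM and DB instances.
2. On the evening of the 17th there was network debugging work, which caused a brief network outage.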
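
The countdown in the grid alert log can be cross-checked with a few lines of Python. The sketch below is only an illustration, not an Oracle tool: the names `projected_deadlines`, `estimated_last_heartbeat`, and `SAMPLE` are made up for this note, and the 30-second timeout is taken from the analysis above as an assumption rather than read from the cluster configuration. It adds the "Removal of this node from cluster in N seconds" value to each warning's timestamp. If the warnings belong to one continuous heartbeat loss, they should all project the same eviction deadline.

```python
import re
from datetime import datetime, timedelta

# Matches a CRS-1612/1611/1610 warning; captures percent elapsed and seconds remaining.
WARN_RE = re.compile(
    r"CRS-161[012]:.*missing for (\d+)% of timeout interval\.\s+"
    r"Removal of this node from cluster in ([\d.]+) seconds"
)

# Timestamp/message pairs copied from the 2013-09-12 grid alert log above.
SAMPLE = [
    ("2013-09-12 16:19:00.781",
     "[cssd(4138)]CRS-1612:Network communication with node q1ebsdb01 (1) "
     "missing for 50% of timeout interval. "
     "Removal of this node from cluster in 14.300 seconds"),
    ("2013-09-12 16:19:07.801",
     "[cssd(4138)]CRS-1611:Network communication with node q1ebsdb01 (1) "
     "missing for 75% of timeout interval. "
     "Removal of this node from cluster in 7.280 seconds"),
    ("2013-09-12 16:19:12.811",
     "[cssd(4138)]CRS-1610:Network communication with node q1ebsdb01 (1) "
     "missing for 90% of timeout interval. "
     "Removal of this node from cluster in 2.270 seconds"),
]


def projected_deadlines(entries):
    """Return (percent, projected removal time) for each countdown warning."""
    results = []
    for stamp, message in entries:
        match = WARN_RE.search(message)
        if match is None:
            continue  # not a countdown warning
        logged_at = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        # Work in whole milliseconds to avoid float rounding noise.
        remaining = timedelta(milliseconds=round(float(match.group(2)) * 1000))
        results.append((int(match.group(1)), logged_at + remaining))
    return results


def estimated_last_heartbeat(deadline, timeout_seconds=30):
    """Estimate when the last heartbeat arrived.

    timeout_seconds=30 is an assumption based on the analysis above,
    not a value read from the cluster.
    """
    return deadline - timedelta(seconds=timeout_seconds)


if __name__ == "__main__":
    for percent, deadline in projected_deadlines(SAMPLE):
        print(percent, deadline.isoformat(sep=" ", timespec="milliseconds"))
```

For the 2013-09-12 incident, all three warnings project a removal time of 16:19:15.081, a few milliseconds before CRS-1609 was logged at 16:19:15.088. Under the assumed 30-second timeout, the last heartbeat from q1ebsdb01 would have arrived at about 16:18:45.081.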