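On node 2, ASM dismounted the REDO disk group, which caused redo write errors (ORA-00340, ORA-00345). Analysis of the ASM and operating system logs confirmed that a multipath fault caused the I/O errors.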
| 2022-01-24T23:44:39.966602+08:00 |
| WARNING: group 4 is being dismounted. |
| WARNING: ASMB force dismounting group 4 (REDO) due to ASM server dismount |
| SUCCESS: diskgroup REDO was dismounted |
| 2022-01-24T23:44:41.103783+08:00 |
| Errors in file /u01/app/oracle/diag/rdbms/xff/XFF2/trace/XFF2_lgwr_228507.trc: |
| ORA-00345: redo log write error block 11961764 count 6 |
| ORA-00312: online log 10 thread 2: '+REDO/XFF/ONLINELOG/group_10.261.1074690685' |
| 2022-01-24T23:44:41.156809+08:00 |
| Errors in file /u01/app/oracle/diag/rdbms/xff/XFF2/trace/XFF2_lgwr_228507.trc: |
| ORA-00340: IO error processing online log 10 of thread 2 |
| ORA-00345: redo log write error block 11961764 count 6 |
| ORA-00312: online log 10 thread 2: '+REDO/XFF/ONLINELOG/group_10.261.1074690685' |
| Errors in file /u01/app/oracle/diag/rdbms/xff/XFF2/trace/XFF2_lgwr_228507.trc (incident=1341402): |
| ORA-340 [] [] [] [] [] [] [] [] [] [] [] [] |
| Incident details in: /u01/app/oracle/diag/rdbms/xff/XFF2/incident/incdir_1341402/XFF2_lgwr_228507_i1341402.trc |
| 2022-01-24T23:44:41.505251+08:00 |
| USER (ospid: 133928): terminating the instance due to error 340 |
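When correlating an alert log like this with ASM and system logs, it can help to split the text into timestamped entries and pull out the ORA codes in each one. The following is only a minimal sketch, not a tool used in the original analysis: the helper names `split_entries` and `ora_codes` are hypothetical, and the sample string is the first two entries of the excerpt above.

```python
import re

# Timestamp format seen in this alert log, e.g. 2022-01-24T23:44:39.966602+08:00
TS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}")
# Five-digit form only; the short "ORA-340 [] []" incident line is not matched.
ORA = re.compile(r"ORA-\d{5}")


def split_entries(text):
    """Split alert log text into (timestamp, message) pairs (hypothetical helper)."""
    marks = list(TS.finditer(text))
    entries = []
    for i, m in enumerate(marks):
        # Each message runs until the next timestamp, or to the end of the text.
        end = marks[i + 1].start() if i + 1 < len(marks) else len(text)
        entries.append((m.group(0), text[m.end():end].strip()))
    return entries


def ora_codes(message):
    """Return distinct ORA-nnnnn codes in order of first appearance (hypothetical helper)."""
    seen = []
    for code in ORA.findall(message):
        if code not in seen:
            seen.append(code)
    return seen


sample = (
    "2022-01-24T23:44:39.966602+08:00 WARNING: group 4 is being dismounted. "
    "WARNING: ASMB force dismounting group 4 (REDO) due to ASM server dismount "
    "SUCCESS: diskgroup REDO was dismounted "
    "2022-01-24T23:44:41.103783+08:00 Errors in file "
    "/u01/app/oracle/diag/rdbms/xff/XFF2/trace/XFF2_lgwr_228507.trc: "
    "ORA-00345: redo log write error block 11961764 count 6 "
    "ORA-00312: online log 10 thread 2: "
    "'+REDO/XFF/ONLINELOG/group_10.261.1074690685'"
)

for ts, msg in split_entries(sample):
    print(ts, ora_codes(msg))
```

Sorting the resulting entries together with timestamps from the ASM alert log and the OS message log makes the ordering visible: the disk group dismount comes first, and the LGWR write errors follow about a second later.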
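Because node 2 crashed abruptly, node 1 attempted instance recovery, and that recovery failed. Node 2's redo had suffered lost writes, so the database on node 1 crashed after the instance recovery, which in turn brought down all of the related database nodes in the cluster.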
| 2022-01-24T23:46:08.440519+08:00 |
| Slave encountered ORA-10388 exception during crash recovery |
| 2022-01-24T23:46:08.442854+08:00 |
| Slave encountered ORA-10388 exception during crash recovery |
| Abort recovery for domain 0, flags 4 |
| 2022-01-24T23:46:08.444531+08:00 |
| Aborting crash recovery due to error 742 |
| 2022-01-
|