Analysis of an Abnormal Node Restart in Oracle 11g RAC

I. Background

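During a routine inspection over the National Day holiday, I found messages in the database alert log indicating an abnormal instance restart, and analyzed the error right away.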

II. Troubleshooting Process

(1) Database alert log analysis

node1 alert:
Sat Oct 05 13:05:14 2024
Thread 1 advanced to log sequence 6981 (LGWR switch)
  Current log# 11 seq# 6981 mem# 0: +DATA/ybqddb/onlinelog/group_11.302.1144593261
Sat Oct 05 13:05:15 2024
Archived Log entry 12130 added for thread 1 sequence 6980 ID 0x8d497377 dest 1:
Sat Oct 05 14:50:48 2024
Reconfiguration started (old inc 27, new inc 29)
List of instances:
 1 (myinst: 1) 
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Sat Oct 05 14:50:48 2024
Sat Oct 05 14:50:48 2024
 LMS 3: 1 GCS shadows cancelled, 1 closed, 0 Xw survived
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Oct 05 14:50:48 2024
 LMS 1: 1 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Oct 05 14:50:48 2024
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Sat Oct 05 14:50:48 2024
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 76 KB redo, 16 data blocks need recovery
Started redo application at
 Thread 2: logseq 5168, block 86069
Sat Oct 05 14:50:53 2024
Setting Resource Manager plan SCHEDULER[0x32DE]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Recovery of Online Redo Log: Thread 2 Group 10 Seq 5168 Reading mem 0
  Mem# 0: +DATA/ybqddb/onlinelog/group_10.300.1144593255
Completed redo application of 0.01MB
Completed instance recovery at
 Thread 2: logseq 5168, block 86222, scn 245107035
 16 data blocks read, 16 data blocks written, 76 redo k-bytes read
Sat Oct 05 14:50:53 2024
minact-scn: master found reconf/inst-rec before recscn scan old-inc#:29 new-inc#:29
Thread 2 advanced to log sequence 5169 (thread recovery)
Redo thread 2 internally disabled at seq 5169 (SMON)
Sat Oct 05 14:50:54 2024
Archived Log entry 12131 added for thread 2 sequence 5168 ID 0x8d497377 dest 1:
Sat Oct 05 14:50:54 2024
ARC2: Archiving disabled thread 2 sequence 5169
Archived Log entry 12132 added for thread 2 sequence 5169 ID 0x8d497377 dest 1:
minact-scn: master continuing after IR
minact-scn: Master considers inst:2 dead
Sat Oct 05 14:51:49 2024
Decreasing number of real time LMS from 4 to 0
Sat Oct 05 14:52:11 2024
Reconfiguration started (old inc 29, new inc 31)
List of instances:
 1 2 (myinst: 1) 
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Sat Oct 05 14:52:11 2024
 LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Oct 05 14:52:11 2024
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Oct 05 14:52:11 2024
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Oct 05 14:52:11 2024
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
Sat Oct 05 14:52:11 2024
minact-scn: Master returning as live inst:2 has inc# mismatch instinc:0 cur:31 errcnt:0
 Submitted all GCS remote-cache requests
 Fix write in gcs resources
Reconfiguration complete
Sat Oct 05 14:53:26 2024
Increasing number of real time LMS from 0 to 4
Sat Oct 05 17:05:20 2024
ALTER SYSTEM ARCHIVE LOG
Sat Oct 05 17:05:21 2024
Thread 1 advanced to log sequence 6982 (LGWR switch)
  Current log# 5 seq# 6982 mem# 0: +DATA/ybqddb/onlinelog/group_5.290.1144593225
Sat Oct 05 17:05:21 2024
Archived Log entry 12134 added for thread 1 sequence 6981 ID 0x8d497377 dest 1:
Sat Oct 05 21:05:22 2024
ALTER SYSTEM ARCHIVE LOG
Sat Oct 05 21:05:22 2024
Thread 1 advanced to log sequence 6983 (LGWR switch)
  Current log# 7 seq# 6983 mem# 0: +DATA/ybqddb/onlinelog/group_7.294.1144593235
Sat Oct 05 21:05:23 2024
Archived Log entry 12135 added for thread 1 sequence 6982 ID 0x8d497377 dest 1:
Sun Oct 06 01:08:47 2024
ALTER SYSTEM ARCHIVE LOG
Sun Oct 06 01:08:49 2024
Thread 1 advanced to log sequence 6984 (LGWR switch)
  Current log# 9 seq# 6984 mem# 0: +DATA/ybqddb/onlinelog/group_9.298.1144593249
Sun Oct 06 01:08:49 2024
Archived Log entry 12138 added for thread 1 sequence 6983 ID 0x8d497377 dest 1:
Sun Oct 06 05:05:18 2024
ALTER SYSTEM ARCHIVE LOG
Sun Oct 06 05:05:18 2024
Thread 1 advanced to log sequence 6985 (LGWR switch)
  Current log# 11 seq# 6985 mem# 0: +DATA/ybqddb/onlinelog/group_11.302.1144593261
Archived Log entry 12139 added for thread 2 sequence 5173 ID 0x8d497377 dest 1:
Sun Oct 06 05:05:19 2024
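
In a long alert log such as the node1 excerpt above, the routine log switches tend to bury the few lines that matter. The following is a minimal sketch, not part of the original analysis, of one way to filter them out. It assumes the 11g layout in which each timestamp sits on its own line ahead of the messages it covers. The keyword list and the function name `extract_events` are illustrative choices made here and should be adjusted to the incident at hand.

```python
import re
from datetime import datetime

# 11g alert logs put each timestamp on its own line, e.g. "Sat Oct 05 14:50:48 2024".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}$")

# Assumed keyword list (not from the original article); extend as needed.
KEYWORDS = (
    "Reconfiguration started",
    "dead instance detected",
    "Instance recovery",
    "terminating the instance",
    "Instance terminated",
    "Starting ORACLE instance",
    "ORA-",
)


def extract_events(lines):
    """Return (timestamp, message) pairs for lines containing any keyword."""
    events = []
    current_ts = None  # timestamp of the most recent timestamp line seen
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            # Parsing English day/month names assumes a C/English locale.
            current_ts = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            continue
        if any(keyword in line for keyword in KEYWORDS):
            events.append((current_ts, line))
    return events


# A few lines copied from the node1 excerpt above.
SAMPLE = """\
Sat Oct 05 13:05:15 2024
Archived Log entry 12130 added for thread 1 sequence 6980 ID 0x8d497377 dest 1:
Sat Oct 05 14:50:48 2024
Reconfiguration started (old inc 27, new inc 29)
List of instances:
 1 (myinst: 1)
 * dead instance detected - domain 0 invalid = TRUE
Sat Oct 05 14:50:48 2024
Instance recovery: looking for dead threads
"""

if __name__ == "__main__":
    for ts, msg in extract_events(SAMPLE.splitlines()):
        print(ts, msg)
```

Running the same filter over the alert logs of both nodes and sorting the combined result by timestamp gives a single timeline, which makes it easier to see which node reported a failure first.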

node2 alert:

Sat Oct 05 13:05:14 2024
Archived Log entry 12129 added for thread 2 sequence 5167 ID 0x8d497377 dest 1:
Sat Oct 05 14:50:47 2024
NOTE: ASMB terminating
Errors in file /u01/app/oracle/diag/rdbms/ybqddb/ybqddb2/trace/ybqddb2_asmb_15097.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Process ID: 
Session ID: 2109 Serial number: 3
Errors in file /u01/app/oracle/diag/rdbms/ybqddb/ybqddb2/trace/ybqddb2_asmb_15097.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Process ID: 
Session ID: 2109 Serial number: 3
ASMB (ospid: 15097): terminating the instance due to error 15064
Instance terminated by ASMB, pid = 15097
Sat Oct 05 14:51:59 2024
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = UNLIMITED
 
Total Shared Global Region in Large Pages = 0 KB (0%)
 
Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide =
