Handling an ORA-600 krhpfh_03-1210 failure --- 惜分飞

All nodes of a RAC database were open and queries ran normally, but application data loads occasionally failed with errors such as ORA-01187: cannot read from file because it failed verification tests.

The fault originated from a failed disk. After the disk was replaced, the database instances on two nodes crashed:

Mon Aug 19 21:16:47 2024

Read of datafile '+DATA/xifenfei99.dbf' (fno 1399) header failed with ORA-01207

Rereading datafile 1399 header failed with ORA-01207

Errors in file /u01/app/oracle/diag/rdbms/xff/xff5/trace/xff5_ckpt_75779.trc:

ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

ORA-01122: database file 1399 failed verification check

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

ORA-01207: file is more recent than control file - old control file

Errors in file /u01/app/oracle/diag/rdbms/xff/xff5/trace/xff5_ckpt_75779.trc:

ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

ORA-01122: database file 1399 failed verification check

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

ORA-01207: file is more recent than control file - old control file

CKPT (ospid: 75779): terminating the instance due to error 1242

Mon Aug 19 21:16:47 2024

System state dump requested by (instance=5, osid=75779 (CKPT)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff5/trace/xff5_diag_75725.trc

Mon Aug 19 21:16:52 2024

ORA-1092 : opitsk aborting process

Mon Aug 19 21:16:53 2024

ORA-1092 : opitsk aborting process

Mon Aug 19 21:16:53 2024

License high water mark = 131

Termination issued to instance processes. Waiting for the processes to exit

Mon Aug 19 21:17:02 2024

Instance termination failed to kill one or more processes

Instance terminated by CKPT, pid = 75779

Mon Aug 19 21:17:03 2024

USER (ospid: 33495): terminating the instance

Termination issued to instance processes. Waiting for the processes to exit

Mon Aug 19 21:17:13 2024

Instance termination failed to kill one or more processes

Instance terminated by USER, pid = 33495

However, the instances started successfully when restarted manually, and a query showed all datafiles in ONLINE status.


Even so, some data-loading processes were very slow, with many sessions waiting on enq: HW - contention.

The alert logs on all database nodes occasionally reported errors such as ORA-01186: file 1399 failed verification tests:

Tue Aug 20 21:30:02 2024

Read of datafile '+DATA/xifenfei99.dbf' (fno 1399) header failed with ORA-01207

Rereading datafile 1399 header failed with ORA-01207

Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_dbw0_43828.trc:

ORA-01186: file 1399 failed verification tests

ORA-01122: database file 1399 failed verification check

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

ORA-01207: file is more recent than control file - old control file

File 1399 not verified due to error ORA-01122

Read of datafile '+DATA/xifenfei99.dbf' (fno 1399) header failed with ORA-01207

Rereading datafile 1399 header failed with ORA-01207

Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_dbw0_43828.trc:

ORA-01186: file 1399 failed verification tests

ORA-01122: database file 1399 failed verification check

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

ORA-01207: file is more recent than control file - old control file

File 1399 not verified due to error ORA-01122
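
When the same error stack repeats across several nodes, it can help to tally which ORA- codes and file numbers actually appear. The following is a minimal sketch, not part of the original recovery: the sample lines are copied from the excerpt above, and the names `ALERT_LINES`, `ORA_CODE`, `FILE_NO` and `summarize` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Sample lines copied from the alert log excerpt above (one occurrence of the stack).
ALERT_LINES = [
    "Read of datafile '+DATA/xifenfei99.dbf' (fno 1399) header failed with ORA-01207",
    "Rereading datafile 1399 header failed with ORA-01207",
    "ORA-01186: file 1399 failed verification tests",
    "ORA-01122: database file 1399 failed verification check",
    "ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'",
    "ORA-01207: file is more recent than control file - old control file",
    "File 1399 not verified due to error ORA-01122",
]

# Five-digit Oracle error codes such as ORA-01207.
ORA_CODE = re.compile(r"ORA-(\d{5})")
# A file number that directly follows "fno", "file" or "datafile".
FILE_NO = re.compile(r"(?:fno|file|datafile)\s+(\d+)", re.IGNORECASE)


def summarize(lines):
    """Count how often each ORA- code and each datafile number is mentioned."""
    codes = Counter()
    files = Counter()
    for line in lines:
        codes.update(ORA_CODE.findall(line))
        files.update(FILE_NO.findall(line))
    return codes, files


codes, files = summarize(ALERT_LINES)
print(codes.most_common())
print(files.most_common())
```

For this excerpt every file reference points at datafile 1399, and ORA-01207 is the most frequent code, which matches the reading that a single datafile header is out of step with the control file.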

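Given this situation, my initial assessment was:
1. Because the cluster has multiple nodes (six), as long as at least one node remains open, the other nodes can be shut down and restarted normally. However, no data can be written to the datafile that reports ORA-01207 (it can still be read).
2. If all nodes are shut down, the database will fail to start and will report ORA-01207: file is more recent than control file.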

Based on past experience, ORA-01207: file is more recent than control file can be resolved by recreating the control file. I first shut down all nodes and then tried to start a single node:

SQL> alter database open;

alter database open

*

ERROR at line 1:

ORA-01122: database file 1399 failed verification check

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

ORA-01207: file is more recent than control file - old control file

alter database open

Wed Aug 21 14:14:22 2024

SUCCESS: diskgroup REDO was mounted

Wed Aug 21 14:14:22 2024

NOTE: dependency between database xff and diskgroup resource ora.REDO.dg is established

Wed Aug 21 14:14:27 2024

Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_47884.trc:

ORA-01122: database file 1399 failed verification check

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

ORA-01207: file is more recent than control file - old control file

ORA-1122 signalled during: alter database open...

As expected, the open failed. I then recreated the control file, after which recovery reported ORA-00600 [krhpfh_03-1210]:

SQL> shutdown immediate;

ORA-01109: database not open

Database dismounted.

ORACLE instance shut down.

SQL> startup nomount pfile='/tmp/xff/pfile';

ORACLE instance started.

Total System Global Area 1.3255E+11 bytes

Fixed Size          2244832 bytes

Variable Size        9.7442E+10 bytes

Database Buffers     3.4897E+10 bytes

Redo Buffers          208654336 bytes

SQL> @rectl

Control file created.

SQL>

SQL>

SQL>

SQL> recover database;

ORA-00283: recovery session canceled due to errors

ORA-01610: recovery using the BACKUP CONTROLFILE option must be done

SQL> recover database using backup controlfile;

ORA-00283: recovery session canceled due to errors

ORA-00600: internal error code, arguments: [krhpfh_03-1210], [fno =], [1399],

[fhcpc =], [274968], [fhccc =], [274983], [], [], [], [], []

ORA-01110: data file 1399: '+DATA/xifenfei99.dbf'

The message indicates that the fhcpc and fhccc values are wrong, so I checked them with bbed:

BBED> set file 1399

    FILE#           1399

BBED> p kcvfhccc

ub4 kcvfhccc                                @148      0x00043227 ===>274983 (decimal)

BBED> p kcvfhcpc

ub4 kcvfhcpc                                @140      0x00043218 ===>274968 (decimal)
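
To confirm that the header fields printed by bbed are the same numbers the ORA-00600 arguments carry, the error text can be parsed and compared with the hex values. This is only an illustrative sketch: `parse_labeled_args`, `ORA600_TEXT` and `BBED_HEADER` are hypothetical names, and the pairing of a `[name =]` argument with the number after it is an assumption based on how this particular message is laid out.

```python
import re

# The ORA-00600 message from the recovery attempt, joined onto one line.
ORA600_TEXT = (
    "ORA-00600: internal error code, arguments: [krhpfh_03-1210], [fno =], [1399], "
    "[fhcpc =], [274968], [fhccc =], [274983], [], [], [], [], []"
)

# Header values as printed by bbed (hex literals copied from the output above).
BBED_HEADER = {"kcvfhcpc": 0x00043218, "kcvfhccc": 0x00043227}


def parse_labeled_args(text):
    """Pair each '[name =]' argument with the numeric argument that follows it."""
    pairs = re.findall(r"\[(\w+) =\], \[(\d+)\]", text)
    return {name: int(value) for name, value in pairs}


args = parse_labeled_args(ORA600_TEXT)
print(args)

# The error arguments and the on-disk header fields should agree.
print(args["fhcpc"] == BBED_HEADER["kcvfhcpc"])
print(args["fhccc"] == BBED_HEADER["kcvfhccc"])
```

Both comparisons hold for the values shown here, so the error is reporting exactly what is stored at offsets 140 and 148 of the file header.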

The error is fairly explicit, so I modified these two values with bbed:

BBED> m /x 2a390400 offset 148

Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y

 File: /tmp/xff/1399.dbf.header (1399)

 Block: 1                Offsets:  148 to  659           Dba:0x5dc00001

------------------------------------------------------------------------

 2a390400 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 0c000000 0f004441

 5441315f 5442535f 45515f30 31000000 00000000 00000000 00000000 78010000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 cfebdd33 01000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 419333df 81001c0a 6ab13046 06000000

 c1520400 02000000 10000000 7e000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 0d000d00 0d000100 00000000 00000000

 <32 bytes per line>

BBED> m /x 2b390400 offset 140

 File: /tmp/xff/1399.dbf.header (1399)

 Block: 1                Offsets:  140 to  651           Dba:0x5dc00001

------------------------------------------------------------------------

 2b390400 e6ef524d 2a390400 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 0c000000 0f004441 5441315f 5442535f 45515f30 31000000 00000000 00000000

 00000000 78010000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 cfebdd33 01000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 419333df 81001c0a

 6ab13046 06000000 c1520400 02000000 10000000 7e000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 00000000 00000000 00000000 00000000 00000000 00000000 0d000d00 0d000100

 <32 bytes per line>
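
The byte strings passed to `m /x` look different from the hex values that `p` printed because a ub4 is stored least-significant byte first on a little-endian platform. The sketch below converts between the two forms; it assumes a little-endian host (the source does not state the platform), and `le_hex_to_int` and `int_to_le_hex` are hypothetical helper names.

```python
def le_hex_to_int(hex_bytes):
    """Interpret a hex byte string, as typed after 'm /x', as a little-endian integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def int_to_le_hex(value, width=4):
    """Render an integer as the little-endian hex byte string for a field of `width` bytes."""
    return value.to_bytes(width, "little").hex()


# Values written at offsets 148 (kcvfhccc) and 140 (kcvfhcpc) in the session above.
new_fhccc = le_hex_to_int("2a390400")
new_fhcpc = le_hex_to_int("2b390400")
print(new_fhccc, new_fhcpc)

# The original values, rendered the way they would appear in a raw dump.
print(int_to_le_hex(274983))  # original kcvfhccc
print(int_to_le_hex(274968))  # original kcvfhcpc
```

Under that assumption the new values decode to 276778 for kcvfhccc and 276779 for kcvfhcpc, while the original values would appear in a dump as `27320400` and `18320400`.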

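After these values were modified, recover database and opening the database both succeeded. The data dictionary check was clean and application reads and writes were normal, which completed this recovery task.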

SQL> @hcheck

HCheck Version 07MAY18 on 21-AUG-2024 15:13:02

----------------------------------------------

Catalog Version 11.2.0.3.0 (1102000300)

db_name: XFF

                                   Catalog       Fixed

Procedure Name                     Version    Vs Release    Timestamp      Result

------------------------------ ... ---------- -- ---------- -------------- ------

.- LobNotInObj             ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- MissingOIDOnObjCol          ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- SourceNotInObj          ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- OversizedFiles          ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- PoorDefaultStorage          ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- PoorStorage             ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- TabPartCountMismatch        ... 1102000300 <=  *All Rel* 08/21 15:13:02 PASS

.- OrphanedTabComPart          ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- MissingSum$             ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- MissingDir$             ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- DuplicateDataobj        ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- ObjSynMissing           ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- ObjSeqMissing           ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- OrphanedUndo            ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- OrphanedIndex           ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- OrphanedIndexPartition      ... 1102000300 <=  *All Rel* 08/21 15:13:03 PASS

.- OrphanedIndexSubPartition   ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- OrphanedTable           ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- OrphanedTablePartition      ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- OrphanedTableSubPartition   ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- MissingPartCol          ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- OrphanedSeg$            ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- OrphanedIndPartObj#         ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- DuplicateBlockUse           ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- FetUet              ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- Uet0Check               ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- SeglessUET              ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadInd$             ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadTab$             ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadIcolDepCnt           ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- ObjIndDobj              ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- TrgAfterUpgrade         ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- ObjType0            ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadOwner            ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- StmtAuditOnCommit           ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadPublicObjects        ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadSegFreelist          ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- BadDepends              ... 1102000300 <=  *All Rel* 08/21 15:13:04 PASS

.- CheckDual               ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- ObjectNames             ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- BadCboHiLo              ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- ChkIotTs            ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- NoSegmentIndex          ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- BadNextObject           ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- DroppedROTS             ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- FilBlkZero              ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- DbmsSchemaCopy          ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- OrphanedObjError        ... 1102000300 >  1102000000 08/21 15:13:05 PASS

.- ObjNotLob               ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- MaxControlfSeq          ... 1102000300 <=  *All Rel* 08/21 15:13:05 PASS

.- SegNotInDeferredStg         ... 1102000300 >  1102000000 08/21 15:13:06 PASS

.- SystemNotRfile1         ... 1102000300 >   902000000 08/21 15:13:06 PASS

.- DictOwnNonDefaultSYSTEM     ... 1102000300 <=  *All Rel* 08/21 15:13:07 PASS

.- OrphanTrigger           ... 1102000300 <=  *All Rel* 08/21 15:13:07 PASS

.- ObjNotTrigger           ... 1102000300 <=  *All Rel* 08/21 15:13:07 PASS

---------------------------------------

21-AUG-2024 15:13:07  Elapsed: 5 secs

---------------------------------------

Found 0 potential problem(s) and 0 warning(s)

PL/SQL procedure successfully completed.

Statement processed.

Complete output is in trace file:

/u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_70961_HCHECK.trc
