WCDMA测试库故障处理过程
1.问题定位
初步观察,数据库进程全部关闭,ipc资源也被释放,但alert里面没有关闭和报错信息。
2.尝试启动数据库
SQL>sqlplus / as sysdba
SQL>startup
Errors in file /opt/oracle/admin/GDWDB1/udump/gdwdb1_ora_13740.trc:
ORA-00320: cannot read file header from log 1 of thread 1
ORA-00312: online log 1 thread 1: '/data/GDWDB1/GDWDB1/redo01.log'
ORA-27091: unable to queue I/O
ORA-27069: attempt to do I/O beyond the range of the file
Additional information: 1
Additional information: 1
Thu Oct 8 16:53:48 2009
Aborting crash recovery due to error 320
Thu Oct 8 16:53:48 2009
Errors in file /opt/oracle/admin/GDWDB1/udump/gdwdb1_ora_13740.trc:
ORA-00320: cannot read file header from log 1 of thread 1
ORA-00312: online log 1 thread 1: '/data/GDWDB1/GDWDB1/redo01.log'
ORA-27091: unable to queue I/O
ORA-27069: attempt to do I/O beyond the range of the file
Additional information: 1
Additional information: 1
ORA-320 signalled during: ALTER DATABASE OPEN...
报的是redo文件错误,无法读取文件头。
查看了一下redo文件,发现三个redo文件的大小都是0。
再查看跟踪文件/opt/oracle/admin/GDWDB1/udump/gdwdb1_ora_13740.trc,也没有看到什么线索。
Ioctl ASYNC_CONFIG error, errno = 1
/opt/oracle/admin/GDWDB1/udump/gdwdb1_ora_17405.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /opt/oracle/OraHome_1
System name: HP-UX
Node name: GDWDB
Release: B.11.31
Version: U
Machine: ia64
Instance name: GDWDB1
Redo thread mounted by this instance: 1
Oracle process number: 0
Unix process pid: 17405, image: oracle@GDWDB
数据库没有归档也没有备份。没办法,XX的东西,见怪不怪了。
3.既然redo文件都没有了,就无法保证数据的一致性了,先清理一下redo,故障依然,但具体错误也忘记了。
SQL>alter database clear unarchived logfile group 1;
SQL>alter database clear unarchived logfile group 2;
SQL>alter database clear unarchived logfile group 3;
4.使用resetlogs打开数据库
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 2.1475E+10 bytes
Fixed Size 2241992 bytes
Variable Size 5192837688 bytes
Database Buffers 1.6274E+10 bytes
Redo Buffers 5857280 bytes
Database mounted.
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 62463922 generated at 10/08/2009 18:52:37 needed for thread 1
ORA-00289: suggestion : /opt/oracle/OraHome_1/dbs/arch1_1_699734980.dbf
ORA-00280: change 62463922 for thread 1 is in sequence #1
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/data/GDWDB1/GDWDB1/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
5.使用隐含参数_allow_resetlogs_corruption 以resetlogs方式打开数据库
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 2.1475E+10 bytes
Fixed Size 2241992 bytes
Variable Size 5192837688 bytes
Database Buffers 1.6274E+10 bytes
Redo Buffers 5857280 bytes
Database mounted.
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 62463922 generated at 10/08/2009 18:52:37 needed for thread 1
ORA-00289: suggestion : /opt/oracle/OraHome_1/dbs/arch1_1_699734980.dbf
ORA-00280: change 62463922 for thread 1 is in sequence #1
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/data/GDWDB1/GDWDB1/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
查看alert日志,发现如下错误
Errors in file /opt/oracle/admin/GDWDB1/udump/gdwdb1_ora_1370.trc:
ORA-00600: internal error code, arguments: [2662], [0], [62464000], [0], [62512366], [8388617], [], []
Thu Oct 8 20:12:00 2009
Errors in file /opt/oracle/admin/GDWDB1/udump/gdwdb1_ora_1370.trc:
ORA-00600: internal error code, arguments: [2662], [0], [62464000], [0], [62512366], [8388617], [], []
Thu Oct 8 20:12:00 2009
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 1370
ORA-1092 signalled during: ALTER DATABASE OPEN...
看看关于这个错误的资料,原因是scn不一致引起的
ORA-00600: internal error code, arguments: [2662], [0], [62464000], [0], [62512366], [8388617], [], []
就是说当前scn:62464000 < 文件scn:62512366
6.强行推进scn
SQL>alter system set "_ALLOW_RESETLOGS_CORRUPTION"=false scope=spfile;
SQL>shutdown immediate
SQL>startup mount
SQL>alter session set events 'IMMEDIATE trace name ADJUST_SCN level 2';--将scn推进2百万。
SQL>alter database open;
新的错误又出来了,
ksedmp: internal or fatal errorORA-00600: internal error code, arguments: [4193], [1171], [1187], [], [], [], [], []
Current SQL statement for this session:
UPDATE SMON_SCN_TIME SET SCN_WRP=:1, SCN_BAS=:2, TIME_MP=:3, TIME_DP=:4
WHERE TIME_MP = :5 AND THREAD = :6 AND ROWNUM <= 1
还有部分错误报undo的错误。
再看看资料,说这错误是上一错误ORA-00600: internal error code, arguments: [2662]引起的副作用,需要重建一下undo表空间。
7.重建undo表空间
SQL>create undo tablespace undotbs02 as datafile '/data/GDWDB1/undotbs02.dbf' size 500M;
SQL>alter system set undo_tablespace = undotbs02 scope=both;
SQL>shutdown immediate
SQL>startup
数据库成功打开。
8.使用expdp将数据备份。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/13605188/viewspace-616196/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/13605188/viewspace-616196/