ora_00600 [4042] [ktssdrp1]
2012-06-01 10:12
2012-06-01 10:12
| 1 OS及版本Red Hat Enterprise Linux AS release 4 (Nahant Update 7) Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production 2 现象 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j000_18901.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 16:15:10 2012 Thread 1 advanced to log sequence 49321 Current log# 22 seq# 49321 mem# 0: /dev2/oradata/log22.ora Thu May 31 16:15:10 2012 ARC0: Evaluating archive log 21 thread 1 sequence 49320 ARC0: Beginning to archive log 21 thread 1 sequence 49320 Creating archive destination LOG_ARCHIVE_DEST_1: '/dev2/arch/1_49320.dbf' ARC0: Completed archiving log 21 thread 1 sequence 49320 Thu May 31 16:25:19 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j042_17320.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 16:29:22 2012 Thread 1 advanced to log sequence 49322 Current log# 23 seq# 49322 mem# 0: /dev2/oradata/log23.ora Thu May 31 16:29:22 2012 ARC1: Evaluating archive log 22 thread 1 sequence 49321 ARC1: Beginning to archive log 22 thread 1 sequence 49321 Creating archive destination LOG_ARCHIVE_DEST_1: '/dev2/arch/1_49321.dbf' ARC1: Completed archiving log 22 thread 1 sequence 49321 Thu May 31 16:29:27 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j004_10918.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 16:36:41 2012 3 处理过程3.1 查找资料3.1.1 MOS
In this Document Goal Solution References Applies to: Oracle Server - Enterprise Edition - Version: 8.1.7.0 to 11.2.0.2.0 - Release: 8.1.7 to 11.2 Information in this document applies to any platform. Goal What steps should be followed if an ORA-600 [4042] is encountered in the alert.log or client application? Solution It is important to upload to Global Software Support:
where (ts#, file#, block#) not in (select ts#, file#, block# from seg$ where type# in (1,10)) and status$ > 1; One possible cause of this error is undo segments which do not have corresponding entries in SEG$. The above query will identify if this is the case. If the query returns rows, please report this to Global Software Support. 从mos上看,只提供了一个有用的SQL,其他没多大意义,让你提交给oracle技术支持! 3.1.2 网上其他资料大致的做法: (1) 先新建新的回滚表空间,然后再把切换过去,结合设置隐含参数:_offline_rollback_segments、_corrupted_rollback_segments,然后删除掉这个表空间! (2) 直接改数据字典 具体参考: http://www.killdb.com/2011/07/30/%E6%89%8B%E5%B7%A5%E6%B8%85%E9%99%A4%E5%9B%9E%E6%BB%9A%E6%AE%B5%E7%9A%84%E5%87%A0%E7%A7%8D%E6%96%B9%E6%B3%95.html 3.1.3 查看回滚段的信息根据MOS提供的SQL先查询 SQL> select us#, name, ts#, file#, block#,status$ from undo$ 2 where (ts#, file#, block#) not in 3 (select ts#, file#, block# from seg$ where type# in (1,10)) and status$ > 1; US# NAME TS# FILE# BLOCK# STATUS$ ---------- ------------------------------ ---------- ---------- ---------- ---------- 42 _SYSSMU42$ 11 151 30385 2 3.1.4 创建新的回滚表空间,并切换 /* OracleOEM */ CREATE UNDO TABLESPACE "UNDO06" DATAFILE '/data/oracle9/oradata/JDB/UNDO06.dbf' SIZE 106M AUTOEXTEND ON NEXT 16M MAXSIZE UNLIMITED Thu May 31 16:36:42 2012 Created Undo Segment _SYSSMU86$ …… Completed: /* OracleOEM */ CREATE UNDO TABLESPACE "UNDO06" DA Thu May 31 16:37:14 2012 3.1.5 切换回滚表空间 ALTER SYSTEM SET undo_tablespace='UNDO06' SCOPE=BOTH; Thu May 31 16:37:34 2012 Created Undo Segment _SYSSMU96$ Undo Segment 96 Onlined Thu May 31 16:37:34 2012 Created Undo Segment _SYSSMU97$ Thu May 31 16:37:34 2012 Undo Segment 97 Onlined Thu May 31 16:37:34 2012 …… Undo Segment 117 Onlined Thu May 31 16:38:15 2012 3.1.6 查看原回滚表空间是否有在线的回滚段SELECT a.owner || '.' || a.segment_name roll_name , a.tablespace_name tablespace , TO_CHAR(a.initial_extent) || ' / ' || TO_CHAR(a.next_extent) in_extents , TO_CHAR(a.min_extents) || ' / ' || TO_CHAR(a.max_extents) m_extents , a.status status , b.bytes/1024/1024 "bytes(M)" , b.extents extents , d.shrinks shrinks , d.wraps wraps , d.optsize opt FROM dba_rollback_segs a , dba_segments b , v$rollname c , v$rollstat d WHERE a.segment_name = b.segment_name AND a.segment_name = c.name (+) AND c.usn = d.usn (+) ORDERBY a.tablespace_name,B.segment_name 发现原回滚表空间有两个回滚段在线 3.1.7 看那些应用导致两个回滚段在线 SELECT r.name roll_name , s.osuser || '/' || s.username userID , s.sid || '/' || s.serial# usercode , s.program program , s.status status , s.machine machine FROM v$lock l , v$rollname r , v$session s WHERE s.sid = l.sid AND TRUNC (l.id1(+)/65536) = r.usn AND l.type(+) = 'TX' AND l.lmode(+) = 6 ORDERBY r.name / 结果发现是两个在执行查询的连个会话,可以kill掉 SELECT 'kill -9 '||spid,v1.* FROM v$session v1,v$process v2 WHERE v1.PADDR=v2.addr 3.1.8 再次观察是原回滚表空间是否有在线回滚段 运行3.1.6发现原回滚表空间没有在线回滚段。 结合日志看: Undo Tablespace 98 moved to Pending Switch-Out state. Thu May 31 16:43:22 2012 Undo Tablespace 98 moved to Pending Switch-Out state. Thu May 31 16:43:43 2012 3.1.9 删除原回滚表空间 发现删不了,并继续报错! /* OracleOEM */ DROP TABLESPACE "UNDOTBS2" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS Thu May 31 18:21:48 2012 Errors in file /data/oracle9/admin/JDB/udump/msdb_ora_13977.trc: ORA-00600: internal error code, arguments: [ktssdrp1], [11], [151], [30385], [], [], [], [] ORA-600 signalled during: /* OracleOEM */ DROP TABLESPACE "UNDOTBS2" INCLUDI... Thu May 31 18:24:35 2012 /* OracleOEM */ DROP TABLESPACE "UNDOTBS2" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS Thu May 31 18:24:35 2012 Errors in file /data/oracle9/admin/JDB/udump/msdb_ora_20126.trc: ORA-00600: internal error code, arguments: [ktssdrp1], [11], [151], [30385], [], [], [], [] Thu May 31 18:24:35 2012 ORA-600 signalled during: /* OracleOEM */ DROP TABLESPACE "UNDOTBS2" INCLUDI... Thu May 31 18:31:29 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j000_18901.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 18:40:32 2012 3.1.10 发通知准备晚上维护 为使应用正常过渡,目前虽然报错,但暂时不应用应用,准备在晚上9点左右加隐含参数,重启后删除原回滚表空间。 Thread 1 advanced to log sequence 49330 Current log# 27 seq# 49330 mem# 0: /dev2/oradata/log27.ora Thu May 31 18:40:32 2012 ARC1: Evaluating archive log 26 thread 1 sequence 49329 ARC1: Beginning to archive log 26 thread 1 sequence 49329 Creating archive destination LOG_ARCHIVE_DEST_1: '/dev2/arch/1_49329.dbf' Thu May 31 18:40:33 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j007_32665.trc: 3.1.11 计划赶不上变化 8点不到就出现数据库异常 ü 客户端:连不上数据库 ü 服务端:执行命令报“-bash: start_pipeline: pgrp pipe: Too many open files in system”,执行任何命令都报类似错误,命令已经不可运行。 ü 后台日志:连续出现大量的ORA-00600错误 Errors in file /data/oracle9/admin/JDB/udump/msdb_ora_7905.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 19:52:16 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j000_18901.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 19:53:44 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j001_18903.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 19:54:19 2012 Errors in file /data/oracle9/admin/JDB/bdump/msdb_j000_18901.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 19:55:00 2012 3.1.12 关闭数据库,准备重启数据库 PS:init.ora中的隐含参数_offline_rollback_segments、_corrupted_rollback_segments在2小时前准备好,所以这里直接重启就好了 由于在oracle账号下无法执行SQLPLUS命令,所以准备用root账号登录再执行! 3.1.12.1 关闭数据库,发现有长时间的等待提示 errors in file /data/oracle9/admin/JDB/bdump/msdb_j004_10918.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 19:55:41 2012 Shutting down instance: further logons disabled Thu May 31 19:55:59 2012 Shutting down instance (immediate) License high water mark = 295 Thu May 31 20:01:06 2012 Active call for process 18879 user 'oracle' program 'oracle@FDB02 (S000)' Active call for process 1421 user 'oracle' program 'oracle@FDB02 (S014)' Active call for process 31909 user 'oracle' program 'oracle@FDB02 (S008)' ...... Active call for process 3943 user 'oracle' program 'oracle@FDB02 (S019)' SHUTDOWN: waiting for active calls to complete. 3.1.12.2 Kill掉非后台进程后 Archive process shutdown avoided: 0 active ARCH: Archiving is disabled Shutting down archive processes Archiving is disabled Archive process shutdown avoided: 0 active Thu May 31 20:12:01 2012 Starting ORACLE instance (normal) 3.1.13 重新启动数据库 仍然报ORA-00600 [4042],看来隐含参数不起作用,类似问题处理过不少,这次是第一次出现隐含参数无效的情况!而且删除原回滚表空间时仍然是原来的提示,删不掉! Successfully onlined Undo Tablespace 99. Errors in file /data/oracle9/admin/JDB/bdump/msdb_j000_19610.trc: ORA-00600: internal error code, arguments: [4042], [18785], [], [], [], [], [], [] Thu May 31 20:12:36 2012 Thu May 31 20:13:33 2012 …… Errors in file /data/oracle9/admin/JDB/udump/msdb_ora_20138.trc: ORA-00600: internal error code, arguments: [ktssdrp1], [11], [151], [30385], [], [], [], [] 试了几次仍然是报上述错误! 3.1.14 死马当活马医 实现已经对undo$中us#=42的记录做备份 US# NAME TS# FILE# BLOCK# STATUS$ ---------- ------------------------------ ---------- ---------- ---------- ---------- 42 _SYSSMU42$ 11 151 30385 2 delete from undo$ where us#=42; commit; 直接删除undo$中这条记录发现正常 | |||||||||
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/751051/viewspace-731720/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/751051/viewspace-731720/
本文记录了一次处理 Oracle 数据库 ORA-00600 错误的过程,包括创建新的回滚表空间、尝试删除故障表空间以及最终采取非常规手段解决问题的经历。
4402

被折叠的 条评论
为什么被折叠?



