
只读数据文件损坏恢复实验记录
只读表空间的数据文件指为只读数据文件(read only),当一个表空间从read write更改为read only时,该数据文件里已经产生的脏块儿都会由DBWn写到磁盘,完成一次不完整的完全检查点。从这一刻起,该数据文件数据块和文件头信息都不再更新,包括检查点。在以后open数据库时实例也忽略该只读数据文件文件头的检查点SCN与其他数据文件或联机REDO日志的同步。
我们可以通过以下命令在read only和read write之间进行更改
read write –> read only:
alter tablespace [tablespace_name] read only;
read only –> read write:
alter tablespace [tablespace_name] read write;
查看系统里的只读表空间及其只读数据文件:
sys@MAA> col TABLESPACE_NAME for a20
sys@MAA> col FILE_NAME for a65
sys@MAA> select t.tablespace_name, d.file_id, d.file_name from dba_tablespaces t, dba_data_files d
2 where t.tablespace_name=d.tablespace_name
3 and t.status='READ ONLY'
4 /
TABLESPACE_NAME FILE_ID FILE_NAME
-------------------- ---------- -----------------------------------------------------------------
LTB 12 /u01/app/oracle/oradata/ltb01.dbf
下面我们针对以下几种场景进行恢复:
1)控制文件无损,数据库运行时丢失了只读数据文件,访问其内数据报错
2)控制文件无损,只读数据文件丢失,数据库无法OPEN
3)控制文件无损,只读数据文件头部损坏
4)控制文件无损,有读写状态时候的备份,后来更改为只读状态,只读状态时数据文件损坏
5)控制文件无损,有只读状态时候的备份,后来更改为读写状态,读写状态时数据文件损坏
6)控制文件损坏,只读数据文件损坏
7)控制文件损坏,有只读数据文件状态时候的控制文件备份,后来数据文件改为读写,读写状态时数据文件损坏
8)控制文件损坏,在线日志文件没有损坏,归档日志丢失,有旧的跟踪控制文件trace,数据库里有只读数据文件
前提条件,有相应的备份。
操作环境:
• OS : Oracle Enterprise Linux 5.5 64Bit
• DB Type : Oracle Restart
• DB Version : 11.2.0.3
准备一个实验对象只读表空间
sys@MAA> create tablespace ltb datafile '/u01/app/oracle/oradata/ltb01.dbf' size 50M;
sys@MAA> create user l identified by oracle default tablespace ltb;
sys@MAA> grant resource, connect to l;
sys@MAA> create table l.ning(domain varchar2(100));
sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');
sys@MAA> commit;
sys@MAA> alter tablespace ltb read only;
场景1:控制文件无损,数据库运行时只读数据文件损坏,访问其内数据报错
备份只读表空间ltb
RMAN> backup tablespace ltb;
Starting backup at 23-DEC-2012 18:29:30
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=44 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: starting piece 1 at 23-DEC-2012 18:29:31
channel ORA_DISK_1: finished piece 1 at 23-DEC-2012 18:29:32
piece handle=/u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T182931_8ffq8cbz_.bkp tag=TAG20121223T182931 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 23-DEC-2012 18:29:32
数据库运行状态下,破坏只读数据文件以模拟损坏
[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=10M count=5
5+0 records in
5+0 records out
52428800 bytes (52 MB) copied, 0.04918 seconds, 1.1 GB/s
查看v$datafile内容时,返回正常
sys@MAA> col NAME for a65
sys@MAA> select checkpoint_change# from v$datafile where file#=12;
CHECKPOINT_CHANGE#
------------------
3567837
但日志里将报错:
Hex dump of (file 12, block 1) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_m000_5343.trc
Corrupt block relative dba: 0x03000001 (file 12, block 1)
Completely zero block found during kcvxfh v8
Reading datafile '/u01/app/oracle/oradata/ltb01.dbf' for corruption at rdba: 0x03000001 (file 12, block 1)
Reread (file 12, block 1) found different corrupt data (no logical check)
Hex dump of (file 12, block 1) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_m000_5343.trc
Corrupt block relative dba: 0x03000001 (file 12, block 1)
Completely zero block found during reread
访问该数据文件里的数据:
sys@MAA> select * from l.ning;
select * from l.ning
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 12, block # 138)
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
告警日志也会抛出相应错误信息:
Hex dump of (file 12, block 2) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc
Corrupt block relative dba: 0x03000002 (file 12, block 2)
Completely zero block found during buffer read
Reading datafile '/u01/app/oracle/oradata/ltb01.dbf' for corruption at rdba: 0x03000002 (file 12, block 2)
Reread (file 12, block 2) found same corrupt data (no logical check)
Sun Dec 23 19:56:00 2012
Corrupt Block Found
TSN = 7, TSNAME = LTB
RFN = 12, BLK = 2, RDBA = 50331650
OBJN = -1, OBJD = -1, OBJECT = , SUBOBJECT =
SEGMENT OWNER = , SEGMENT TYPE =
Errors in file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc (incident=31391):
ORA-01578: ORACLE data block corrupted (file # 12, block # 2)
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
Hex dump of (file 12, block 1) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc
Corrupt block relative dba: 0x00000001 (file 12, block 1)
Completely zero block found during validating datafile for block range
Reread of blocknum=1, file=/u01/app/oracle/oradata/ltb01.dbf. found same corrupt data
Reread of blocknum=1, file=/u01/app/oracle/oradata/ltb01.dbf. found same corrupt data
Reread of blocknum=1, file=/u01/app/oracle/oradata/ltb01.dbf. found same corrupt data
Reread of blocknum=1, file=/u01/app/oracle/oradata/ltb01.dbf. found same corrupt data
Reread of blocknum=1, file=/u01/app/oracle/oradata/ltb01.dbf. found same corrupt data
Errors in file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc:
ORA-19563: datafile header validation failed for file /u01/app/oracle/oradata/ltb01.dbf
ORA-01251: Unknown File Header Version read for file number 12
ORA-01578: ORACLE data block corrupted (file # 12, block # 2)
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
Sun Dec 23 19:56:01 2012
Sweep [inc][31391]: completed
但此时只读数据文件状态还是ONLINE
sys@MAA> col FILE_NAME for a65
sys@MAA> select file_id, file_name, online_status from dba_data_files where tablespace_name='LTB';
FILE_ID FILE_NAME ONLINE_STATUS
---------- ----------------------------------------------------------------- --------------
12 /u01/app/oracle/oradata/ltb01.dbf ONLINE
这里恢复需要注意,如果只是有数据讹误块儿导致数据访问报错,那可以简单通过RMAN的块恢复命令进行块级别恢复。
但这里我是将整个数据文件破坏掉了,因此需要RMAN来还原数据文件,恢复可以在数据文件OPEN模式进行
这时候我们直接RESTORE DATAFILE是不可行的,我们需要先将只读数据文件脱机处理,还原之后再ONLINE
RMAN> restore tablespace ltb;
Starting restore at 23-DEC-2012 19:46:43
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 12/23/2012 19:46:44
ORA-19870: error while restoring backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp
ORA-19573: cannot obtain exclusive enqueue for datafile 12
正确的做法
RMAN> run{
2> sql 'alter database datafile 12 offline';
3> restore tablespace ltb;
4> sql 'alter database datafile 12 online';
5> }
sql statement: alter database datafile 12 offline
Starting restore at 23-DEC-2012 19:50:41
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp
channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp tag=TAG20121223T184851
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 23-DEC-2012 19:50:44
sql statement: alter database datafile 12 online
恢复后数据访问正常:
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
当然我们要是有镜像备份文件,也可以SWITCH
RMAN> backup as copy tablespace ltb;
Starting backup at 23-DEC-2012 19:54:34
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf
output file name=/u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf tag=TAG20121223T195434 RECID=23 STAMP=802814075
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 23-DEC-2012 19:54:35
在发现只读数据文件损坏的情况下
RMAN> run{
2> sql 'alter database datafile 12 offline';
3> switch datafile 12 to datafilecopy '/u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf';
4> sql 'alter database datafile 12 online';
5> }
sql statement: alter database datafile 12 offline
datafile 12 switched to datafile copy
input datafile copy RECID=23 STAMP=802814075 file name=/u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf
sql statement: alter database datafile 12 online
已经成功SWITCH,可查看v$datafile进行确认
sys@MAA> select checkpoint_change#, name from v$datafile where file#=12;
CHECKPOINT_CHANGE# NAME
------------------ -----------------------------------------------------------------
3567837 /u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf
查看数据也正常
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
最后别忘了将损坏了SWITCH之前的数据文件进行删除
[oracle@maa3 ~]$ rm -rf /u01/app/oracle/oradata/ltb01.dbf
那有人可能会奇怪,如果直接删除掉只读数据文件,那不也是一样的效果吗?但其实不然
删除只读数据文件以模拟丢失
[oracle@maa3 ~]$ rm -rf /u01/app/oracle/oradata/ltb01.dbf
sys@MAA> alter system checkpoint;
System altered.
sys@MAA> alter system flush buffer_cache;
System altered.
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
可见我们数据依然正常访问,这里需要了解文件描述符的知识
EYGLE的《数据安全警示录》里有通过文件描述符进行恢复的案例,其中有下面这么一段话:
在UNIX、Linux系统中,误删除数据文件后,虽然该文件已从操作系统中删除,但是其文件句柄仍由数据库进程打开持有,所以在数据库层面仍然不会释放其链表信息,因而也就能够从进程的地址信息中,通过复制将其直接恢复。但是请注意,这要求数据库不能中途关闭,如果关闭了数据库,则所有文件句柄被释放,文件就真的难以回归了。
下面我从文件描述符中进行数据文件的恢复:
[oracle@maa3 ~]$ ps -ef | grep dbw | grep -v grep
grid 4817 1 0 Dec19 ? 00:00:00 asm_dbw0_+ASM
oracle 6578 1 0 20:39 ? 00:00:00 ora_dbw0_maa
[root@maa3 ~]# ls /proc/6578/fd
0 1 10 11 12 13 14 15 16 17 2 20 21 22 23 24 25 256 257 26 27 28 3 4 5 6 7 8 9
[root@maa3 ~]# ls -l /proc/6578/fd/12
lr-x------ 1 root root 64 Dec 23 21:05 /proc/6578/fd/12 -> /dev/null
[root@maa3 ~]# ls -l /proc/6578/fd/ | grep oracle
lrwx------ 1 root root 64 Dec 23 21:05 11 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_maa.dat
lr-x------ 1 root root 64 Dec 23 21:05 13 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lrwx------ 1 root root 64 Dec 23 21:05 15 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_maa.dat
lrwx------ 1 root root 64 Dec 23 21:05 16 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkMAA
lrwx------ 1 root root 64 Dec 23 21:05 17 -> /dev/oracleasm/iid/000000000000000F
lrwx------ 1 root root 64 Dec 23 21:05 256 -> /u01/app/oracle/oradata/luocs01.dbf
lrwx------ 1 root root 64 Dec 23 21:05 257 -> /u01/app/oracle/oradata/ltb01.dbf (deleted)
lr-x------ 1 root root 64 Dec 23 21:05 27 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lrwx------ 1 root root 64 Dec 23 21:05 5 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_maa.dat
-- 可以发现/u01/app/oracle/oradata/ltb01.dbf (deleted)已删除标记
我从文件描述符中将文件拷贝到原位置:
[root@maa3 ~]# cp /proc/6578/fd/257 /u01/app/oracle/oradata/ltb01.dbf -p
[oracle@maa3 ~]$ ls /u01/app/oracle/oradata/ltb01.dbf
-rw-r----- 1 oracle asmadmin 52436992 Dec 23 20:50 /u01/app/oracle/oradata/ltb01.dbf
重启数据库,正常
sys@MAA> startup force
ORACLE instance started.
Total System Global Area 1252663296 bytes
Fixed Size 2227944 bytes
Variable Size 855638296 bytes
Database Buffers 385875968 bytes
Redo Buffers 8921088 bytes
Database mounted.
Database opened.
sys@MAA> select checkpoint_change#, name from v$datafile where file#=12;
CHECKPOINT_CHANGE# NAME
------------------ -----------------------------------------------------------------
3654318 /u01/app/oracle/oradata/ltb01.dbf
数据访问正常
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
这里之所以不需要RECOVER数据库,是因为我们操作的对象为只读数据文件。
如果是读写状态下的数据文件,将文件拷贝过来之后需要先offline datafile后recover datafile,然后再online datafile。
另外,只读数据文件删除之后如果重启数据库,依然会报ORA-01110错误,这个在场景2中说明。
场景2:控制文件无损,只读数据文件丢失,数据库无法OPEN
删除只读数据文件以模拟丢失
[oracle@maa3 ~]$ rm -rf /u01/app/oracle/oradata/ltb01.dbf
数据库启动时报错
sys@MAA> startup force
ORACLE instance started.
Total System Global Area 1252663296 bytes
Fixed Size 2227944 bytes
Variable Size 855638296 bytes
Database Buffers 385875968 bytes
Redo Buffers 8921088 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 12 - see DBWR trace file
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
只读数据文件恢复时我们不需要recover过程,数据库启动时已经到MOUNT状态,通过RMAN还原该数据文件
RMAN> restore tablespace ltb;
Starting restore at 23-DEC-2012 18:36:54
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=22 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T182931_8ffq8cbz_.bkp
channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T182931_8ffq8cbz_.bkp tag=TAG20121223T182931
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 23-DEC-2012 18:36:57
打开数据文件
RMAN> alter database open;
database opened
sys@MAA> select * from ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
场景3:控制文件无损,只读数据文件头部损坏
我使用BBED工具破坏头部信息
[oracle@maa3 ~]$ cat bbed.par
12 /u01/app/oracle/oradata/ltb01.dbf
[oracle@maa3 ~]$ cd $ORACLE_HOME/rdbms/lib
[oracle@maa3 lib]$ bbed listfile=/home/oracle/bbed.par blocksize=8192 mode=edit
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sun Dec 23 21:42:22 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> info
File# Name Size(blks)
----- ---- ----------
12 /u01/app/oracle/oradata/ltb01.dbf 0
BBED> m /c Create your own social network with the best community website builder - NING
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) Y
File: /u01/app/oracle/oradata/ltb01.dbf (12)
Block: 1 Offsets: 0 to 511 Dba:0x03000001
------------------------------------------------------------------------
7777772e 6c756f63 732e636f 6d000104 ea850000 00000000 0000200b b90df827
4d414100 00000000 fd140000 00190000 00200000 0c000300 00000000 00000000
54414732 30313231 32323354 32313234 33320000 00000000 00000000 00000000
00000000 ba163800 00000000 470dda2f 791c352f 01000000 00000000 00000000
00000000 00000000 00000000 03000000 00000000 02000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 07000000 03004c54 42000000 00000000 00000000
00000000 00000000 00000000 00000000 0c000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 f5173800 00000000 780dda2f 01000020 5e000000 83010000 1000a600
<32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0x2a9e, required = 0x2a9e
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/ltb01.dbf
BLOCK = 1
Block 1 is corrupt
Corrupt block relative dba: 0x63400001 (file 0, block 1)
Bad header found during verification
Data in bad block:
type: 119 format: 7 rdba: 0x636f756c
last change scn: 0x006d.6f632e73 seq: 0x1 flg: 0x04
spare1: 0x77 spare2: 0x2e spare3: 0x0
consistency value in tail: 0x00000b01
check value in block header: 0x2a9e
computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 1
Total Blocks Influx : 0
Message 531 not found; product=RDBMS; facility=BBED
访问v$datafile里查看检查点
sys@MAA> select checkpoint_change# from v$datafile where file#=12;
CHECKPOINT_CHANGE#
------------------
3676149
-- 能够正常返回值
告警日志里去报错:
Corrupt block relative dba: 0x03000001 (file 12, block 1)
Bad header found during kcvxfh v10
这时候数据库启动,会收到头部损坏错误
sys@MAA> startup force
ORACLE instance started.
Total System Global Area 1252663296 bytes
Fixed Size 2227944 bytes
Variable Size 855638296 bytes
Database Buffers 385875968 bytes
Redo Buffers 8921088 bytes
Database mounted.
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
ORA-01210: data file header is media corrupt
恢复方法非常简单:
run{
startup force mount;
restore tablespace ltb;
alter database open;
}
场景4:控制文件无损,有读写状态时候的备份,后来更改为只读状态,只读状态时数据文件损坏
首先我重新构造这样一个环境
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
将表空间恢复到读写状态
sys@MAA> alter tablespace ltb read write;
Tablespace altered.
我们有当前读写状态时候的备份
RMAN> backup tablespace ltb;
Starting backup at 24-DEC-2012 03:06:12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=52 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: starting piece 1 at 24-DEC-2012 03:06:13
channel ORA_DISK_1: finished piece 1 at 24-DEC-2012 03:06:14
piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T030613_8fgok5v7_.bkp tag=TAG20121224T030613 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 24-DEC-2012 03:06:14
我们又产生一些数据:
sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');
1 row created.
sys@MAA> commit;
Commit complete.
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
然后将表空间更改为只读状态:
sys@MAA> alter tablespace ltb read only;
Tablespace altered.
现在只读数据文件损坏
[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.002543 seconds, 412 MB/s
sys@MAA> alter system flush buffer_cache;
System altered.
现在无法读取数据
sys@MAA> select * from l.ning;
select * from l.ning
*
ERROR at line 1:
ORA-01115: IO error reading block from file (block # )
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
ORA-27072: File I/O error
Additional information: 4
Additional information: 130
但该只读数据文件依然是在线的
sys@MAA> select file_id, file_name, online_status from dba_data_files where tablespace_name='LTB';
FILE_ID FILE_NAME ONLINE_STATUS
---------- ----------------------------------------------------------------- --------------
12 /u01/app/oracle/oradata/ltb01.dbf ONLINE
这时候恢复过程如下:
RMAN> run{
2> sql 'alter database datafile 12 offline';
3> restore datafile 12 force;
4> recover datafile 12;
5> sql 'alter database datafile 12 online';
6> }
using target database control file instead of recovery catalog
sql statement: alter database datafile 12 offline
Starting restore at 24-DEC-2012 03:15:07
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=27 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T030613_8fgok5v7_.bkp
channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T030613_8fgok5v7_.bkp tag=TAG20121224T030613
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 24-DEC-2012 03:15:08
Starting recover at 24-DEC-2012 03:15:09
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:00
Finished recover at 24-DEC-2012 03:15:09
sql statement: alter database datafile 12 online
说明:
在第三步restore datafile 12 force(注:这里加force关键字是为应付特殊环境的,比如虽文件受损却物理上依然存在的情况等)还原之后,12号数据文件头部具有读写文件特征,但控制文件和数据字典SYS.TS$上描述12号数据文件是只读的,所以需要增加第四步recover过程,对12号数据文件应用重做日志,一直到文件头部更改为只读状态。
场景5:控制文件无损,有只读状态时候的备份,后来更改为读写状态,读写状态时数据文件损坏
场景5和场景4相比,发生环境有所改变,但其恢复过程却一样,如下:
sys@MAA> select tablespace_name, status from dba_tablespaces where tablespace_name='LTB';
TABLESPACE_NAME STATUS
--------------- ------------------
LTB READ ONLY
我有只读状态时候的备份
RMAN> backup tablespace ltb;
Starting backup at 24-DEC-2012 03:34:09
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: starting piece 1 at 24-DEC-2012 03:34:09
channel ORA_DISK_1: finished piece 1 at 24-DEC-2012 03:34:10
piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T033409_8fgq5kl7_.bkp tag=TAG20121224T033409 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 24-DEC-2012 03:34:10
之后将数据文件更改为读写状态,并产生一些数据
sys@MAA> alter tablespace ltb read write;
Tablespace altered.
sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');
1 row created.
sys@MAA> commit;
Commit complete.
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
这时候后数据文件损坏
[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=50
50+0 records in
50+0 records out
52428800 bytes (52 MB) copied, 0.148137 seconds, 354 MB/s
sys@MAA> alter system flush buffer_cache;
System altered.
sys@MAA> select * from l.ning;
select * from l.ning
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 12, block # 130)
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
恢复过程如场景4一样:
run{
sql 'alter database datafile 12 offline';
restore datafile 12 force;
recover datafile 12;
sql 'alter database datafile 12 online';
}
说明:
在第三步restore datafile还原之后,12号数据文件头部标识着文件只读状态,但控制文件和数据字典SYS.TS$上却描述该数据文件是读写的,所以需要增加第四步recover过程,对12号数据文件应用重做日志,使其正常恢复。
场景6:控制文件损坏,只读数据文件损坏
查看当前LTB表空间状态,发现是读写状态
sys@MAA> select tablespace_name, status from dba_tablespaces where tablespace_name='LTB';
TABLESPACE_NAME STATUS
--------------- ------------------
LTB ONLINE
将其改为只读状态
sys@MAA> alter tablespace ltb read only;
Tablespace altered.
我有当前状态的全备份:
RMAN> backup database;
...部分内容略 ...
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 24-DEC-2012 03:50:15
channel ORA_DISK_1: finished piece 1 at 24-DEC-2012 03:50:16
piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T034924_8fgr3q96_.bkp tag=TAG20121224T034924 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 24-DEC-2012 03:50:16
这时候我的控制文件和只读数据文件都损坏
sys@MAA> shutdown abort
ORACLE instance shut down.
控制文件丢失
[grid@maa3 ~]$ asmcmd -p
ASMCMD [+] > find --type controlfile . *
+MSDATA/MAA/CONTROLFILE/Current.256.792009855
ASMCMD [+] > rm +MSDATA/MAA/CONTROLFILE/Current.256.792009855
数据文件损坏
[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=50
50+0 records in
50+0 records out
52428800 bytes (52 MB) copied, 0.283675 seconds, 185 MB/s
启动数据库,报ORA-00205错误
sys@MAA> startup
ORACLE instance started.
Total System Global Area 1252663296 bytes
Fixed Size 2227944 bytes
Variable Size 855638296 bytes
Database Buffers 385875968 bytes
Redo Buffers 8921088 bytes
ORA-00205: error in identifying control file, check alert log for more info
这时候恢复过程如下:
1)将数据库启动到NOMOUNT状态(也只能如此)
2)从备份里还原控制文件
3)将数据库启动到MOUNT状态
4)还原只读数据文件
5)recover恢复整个数据库
6)resetlogs打开数据库
数据库在启动的时候到NOMOUNT状态
RMAN> run{
2> restore controlfile from '/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T034924_8fgr3q96_.bkp';
3> mount database;
4> restore datafile 12 force;
5> recover database;
6> alter database open resetlogs;
7> }
Starting restore at 24-DEC-2012 04:02:19
using channel ORA_DISK_1
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=+MSDATA/maa/controlfile/current.256.802843341
Finished restore at 24-DEC-2012 04:02:21
database mounted
released channel: ORA_DISK_1
Starting restore at 24-DEC-2012 04:02:26
Starting implicit crosscheck backup at 24-DEC-2012 04:02:26
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=22 device type=DISK
Crosschecked 7 objects
Finished implicit crosscheck backup at 24-DEC-2012 04:02:27
Starting implicit crosscheck copy at 24-DEC-2012 04:02:27
using channel ORA_DISK_1
Crosschecked 1 objects
Finished implicit crosscheck copy at 24-DEC-2012 04:02:27
searching for all files in the recovery area
cataloging files...
cataloging done
List of Cataloged Files
=======================
File Name: /u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T034924_8fgr3q96_.bkp
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf
channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T034924_8fgr3clo_.bkp
channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T034924_8fgr3clo_.bkp tag=TAG20121224T034924
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 24-DEC-2012 04:02:29
Starting recover at 24-DEC-2012 04:02:29
using channel ORA_DISK_1
datafile 12 not processed because file is read-only
starting media recovery
archived log for thread 1 with sequence 96 is already on disk as file +MSDATA/maa/onlinelog/group_3.259.792009857
archived log file name=+MSDATA/maa/onlinelog/group_3.259.792009857 thread=1 sequence=96
media recovery complete, elapsed time: 00:00:01
Finished recover at 24-DEC-2012 04:02:31
database opened
恢复完成,数据也访问正常
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
场景7:控制文件损坏,有只读数据文件状态时候的控制文件备份,后来数据文件改为读写,读写状态时数据文件损坏
这次比场景6稍微复杂点的情况:
sys@MAA> select tablespace_name, status from dba_tablespaces where tablespace_name='LTB';
TABLESPACE_NAME STATUS
--------------- ------------------
LTB READ ONLY
-- 当前表空间状态为只读
有当前状态的所有备份
RMAN> backup database;
输出信息略
之后只读数据文件改为读写状态,并产生一些数据:
sys@MAA> alter tablespace ltb read write;
Tablespace altered.
sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');
sys@MAA> commit;
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
这时候控制文件、数据文件都损坏
sys@MAA> shutdown abort
ORACLE instance shut down.
ASMCMD [+] > rm -rf +MSDATA/MAA/CONTROLFILE/current.256.802843341
[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=50
50+0 records in
50+0 records out
52428800 bytes (52 MB) copied, 0.122551 seconds, 428 MB/s
数据库启动,报ORA-00205
sys@MAA> startup
ORACLE instance started.
Total System Global Area 1252663296 bytes
Fixed Size 2227944 bytes
Variable Size 855638296 bytes
Database Buffers 385875968 bytes
Redo Buffers 8921088 bytes
ORA-00205: error in identifying control file, check alert log for more info
这时候我们先尝试按照场景6的恢复方法进行恢复:
RMAN> run{
2> restore controlfile from '/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T041651_8fgsq3cv_.bkp';
3> mount database;
4> restore datafile 12 force;
5> recover database;
6> alter database open resetlogs;
7> }
… 部分信息略 ...
Finished restore at 24-DEC-2012 04:22:17
Starting recover at 24-DEC-2012 04:22:17
using channel ORA_DISK_1
datafile 12 not processed because file is read-only
starting media recovery
archived log for thread 1 with sequence 1 is already on disk as file +MSDATA/maa/onlinelog/group_1.257.792009855
archived log file name=+MSDATA/maa/onlinelog/group_1.257.792009855 thread=1 sequence=1
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01190: control file or data file 12 is from before the last RESETLOGS
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
media recovery complete, elapsed time: 00:00:00
Finished recover at 24-DEC-2012 04:22:19
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 12/24/2012 04:22:19
ORA-01190: control file or data file 12 is from before the last RESETLOGS
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
在上面输出信息中我们可以看到datafile 12 not processed because file is read-only一句,这是因为在还原的控制文件内12号数据文件被描述为只读文件,因此在recover database的时候被忽略。但在应用重做日志的时候,发现LTB表空间更改为读写状态的记录,因此也需要进行恢复,但此时为时已晚,因此最后报错。那这时候到什么阶段了?控制文件和数据字典上对12号数据文件的描述已改变,也就是已将其认为是读写状态了。
解决方法是,再重复一下recover database过程
RMAN> run{
2> restore controlfile from '/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T041651_8fgsq3cv_.bkp';
3> mount database;
4> restore datafile 12 force;
5> recover database;
6> recover database;
7> alter database open resetlogs;
8> }
... 部分内容略 ...
Starting recover at 24-DEC-2012 04:41:33
using channel ORA_DISK_1
datafile 12 not processed because file is read-only
starting media recovery
archived log for thread 1 with sequence 1 is already on disk as file +MSDATA/maa/onlinelog/group_1.257.792009855
archived log file name=+MSDATA/maa/onlinelog/group_1.257.792009855 thread=1 sequence=1
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01190: control file or data file 12 is from before the last RESETLOGS
ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'
media recovery complete, elapsed time: 00:00:00
Finished recover at 24-DEC-2012 04:41:35
Starting recover at 24-DEC-2012 04:41:35
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 1 is already on disk as file +MSDATA/maa/onlinelog/group_1.257.792009855
archived log file name=+MSDATA/maa/onlinelog/group_1.257.792009855 thread=1 sequence=1
media recovery complete, elapsed time: 00:00:01
Finished recover at 24-DEC-2012 04:41:37
database opened
OK,顺利恢复完成,数据也访问正常
sys@MAA> select * from l.ning;
DOMAIN
----------------------------------------------------------------------------------------------------
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
Create your own social network with the best community website builder - NING
场景8:控制文件损坏,在线日志文件没有损坏,归档日志丢失,有旧的跟踪控制文件trace,数据库里有只读数据文件
这个实验在做控制文件损坏恢复时已做过,请阅读:
《丢失控制文件恢复实验记录 – 4 : 在线日志文件没有损坏,归档日志丢失,直接重建控制文件(跟踪控制文件trace是旧的情况) 》
《丢失控制文件恢复实验记录 – 6 : 实验4的基础上,如果luocs表空间为只读表空间的情况》