oracle为表空间增加数据文件_只读数据文件损坏恢复实验记录-优快云博客

只读数据文件损坏恢复实验记录

只读表空间的数据文件指为只读数据文件(read only)，当一个表空间从read write更改为read only时，该数据文件里已经产生的脏块儿都会由DBWn写到磁盘，完成一次不完整的完全检查点。从这一刻起，该数据文件数据块和文件头信息都不再更新，包括检查点。在以后open数据库时实例也忽略该只读数据文件文件头的检查点SCN与其他数据文件或联机REDO日志的同步。

我们可以通过以下命令在read only和read write之间进行更改

read write –> read only:

alter tablespace [tablespace_name] read only;

read only –> read write:

alter tablespace [tablespace_name] read write;

查看系统里的只读表空间及其只读数据文件：

sys@MAA> col TABLESPACE_NAME for a20

sys@MAA> col FILE_NAME for a65

sys@MAA> select t.tablespace_name, d.file_id, d.file_name from dba_tablespaces t, dba_data_files d

2 where t.tablespace_name=d.tablespace_name

3 and t.status='READ ONLY'

4 /

TABLESPACE_NAME FILE_ID FILE_NAME

-------------------- ---------- -----------------------------------------------------------------

LTB 12 /u01/app/oracle/oradata/ltb01.dbf

下面我们针对以下几种场景进行恢复：

1）控制文件无损，数据库运行时丢失了只读数据文件，访问其内数据报错

2）控制文件无损，只读数据文件丢失，数据库无法OPEN

3）控制文件无损，只读数据文件头部损坏

4）控制文件无损，有读写状态时候的备份，后来更改为只读状态，只读状态时数据文件损坏

5）控制文件无损，有只读状态时候的备份，后来更改为读写状态，读写状态时数据文件损坏

6）控制文件损坏，只读数据文件损坏

7）控制文件损坏，有只读数据文件状态时候的控制文件备份，后来数据文件改为读写，读写状态时数据文件损坏

8）控制文件损坏，在线日志文件没有损坏，归档日志丢失，有旧的跟踪控制文件trace，数据库里有只读数据文件

前提条件，有相应的备份。

操作环境：

• OS : Oracle Enterprise Linux 5.5 64Bit

• DB Type : Oracle Restart

• DB Version : 11.2.0.3

准备一个实验对象只读表空间

sys@MAA> create tablespace ltb datafile '/u01/app/oracle/oradata/ltb01.dbf' size 50M;

sys@MAA> create user l identified by oracle default tablespace ltb;

sys@MAA> grant resource, connect to l;

sys@MAA> create table l.ning(domain varchar2(100));

sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');

sys@MAA> commit;

sys@MAA> alter tablespace ltb read only;

场景1：控制文件无损，数据库运行时只读数据文件损坏，访问其内数据报错

备份只读表空间ltb

RMAN> backup tablespace ltb;

Starting backup at 23-DEC-2012 18:29:30

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=44 device type=DISK

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: starting piece 1 at 23-DEC-2012 18:29:31

channel ORA_DISK_1: finished piece 1 at 23-DEC-2012 18:29:32

piece handle=/u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T182931_8ffq8cbz_.bkp tag=TAG20121223T182931 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 23-DEC-2012 18:29:32

数据库运行状态下，破坏只读数据文件以模拟损坏

[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=10M count=5

5+0 records in

5+0 records out

52428800 bytes (52 MB) copied, 0.04918 seconds, 1.1 GB/s

查看v$datafile内容时，返回正常

sys@MAA> col NAME for a65

sys@MAA> select checkpoint_change# from v$datafile where file#=12;

CHECKPOINT_CHANGE#

------------------

3567837

但日志里将报错：

Hex dump of (file 12, block 1) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_m000_5343.trc

Corrupt block relative dba: 0x03000001 (file 12, block 1)

Completely zero block found during kcvxfh v8

Reading datafile '/u01/app/oracle/oradata/ltb01.dbf' for corruption at rdba: 0x03000001 (file 12, block 1)

Reread (file 12, block 1) found different corrupt data (no logical check)

Hex dump of (file 12, block 1) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_m000_5343.trc

Corrupt block relative dba: 0x03000001 (file 12, block 1)

Completely zero block found during reread

访问该数据文件里的数据：

sys@MAA> select * from l.ning;

select * from l.ning

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 12, block # 138)

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

告警日志也会抛出相应错误信息：

Hex dump of (file 12, block 2) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc

Corrupt block relative dba: 0x03000002 (file 12, block 2)

Completely zero block found during buffer read

Reading datafile '/u01/app/oracle/oradata/ltb01.dbf' for corruption at rdba: 0x03000002 (file 12, block 2)

Reread (file 12, block 2) found same corrupt data (no logical check)

Sun Dec 23 19:56:00 2012

Corrupt Block Found

TSN = 7, TSNAME = LTB

RFN = 12, BLK = 2, RDBA = 50331650

OBJN = -1, OBJD = -1, OBJECT = , SUBOBJECT =

SEGMENT OWNER = , SEGMENT TYPE =

Errors in file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc (incident=31391):

ORA-01578: ORACLE data block corrupted (file # 12, block # 2)

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

Hex dump of (file 12, block 1) in trace file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc

Corrupt block relative dba: 0x00000001 (file 12, block 1)

Completely zero block found during validating datafile for block range

Reread of blocknum=1, file=/u01/app/oracle/oradata/ltb01.dbf. found same corrupt data

Errors in file /u01/app/oracle/diag/rdbms/maa/maa/trace/maa_ora_5220.trc:

ORA-19563: datafile header validation failed for file /u01/app/oracle/oradata/ltb01.dbf

ORA-01251: Unknown File Header Version read for file number 12

ORA-01578: ORACLE data block corrupted (file # 12, block # 2)

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

Sun Dec 23 19:56:01 2012

Sweep [inc][31391]: completed

但此时只读数据文件状态还是ONLINE

sys@MAA> col FILE_NAME for a65

sys@MAA> select file_id, file_name, online_status from dba_data_files where tablespace_name='LTB';

FILE_ID FILE_NAME ONLINE_STATUS

---------- ----------------------------------------------------------------- --------------

12 /u01/app/oracle/oradata/ltb01.dbf ONLINE

这里恢复需要注意，如果只是有数据讹误块儿导致数据访问报错，那可以简单通过RMAN的块恢复命令进行块级别恢复。

但这里我是将整个数据文件破坏掉了，因此需要RMAN来还原数据文件，恢复可以在数据文件OPEN模式进行

这时候我们直接RESTORE DATAFILE是不可行的，我们需要先将只读数据文件脱机处理，还原之后再ONLINE

RMAN> restore tablespace ltb;

Starting restore at 23-DEC-2012 19:46:43

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of restore command at 12/23/2012 19:46:44

ORA-19870: error while restoring backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp

ORA-19573: cannot obtain exclusive enqueue for datafile 12

正确的做法

RMAN> run{

2> sql 'alter database datafile 12 offline';

3> restore tablespace ltb;

4> sql 'alter database datafile 12 online';

5> }

sql statement: alter database datafile 12 offline

Starting restore at 23-DEC-2012 19:50:41

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp

channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T184851_8ffrdmhg_.bkp tag=TAG20121223T184851

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:03

Finished restore at 23-DEC-2012 19:50:44

sql statement: alter database datafile 12 online

恢复后数据访问正常：

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

当然我们要是有镜像备份文件，也可以SWITCH

RMAN> backup as copy tablespace ltb;

Starting backup at 23-DEC-2012 19:54:34

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile copy

input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf

output file name=/u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf tag=TAG20121223T195434 RECID=23 STAMP=802814075

channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01

Finished backup at 23-DEC-2012 19:54:35

在发现只读数据文件损坏的情况下

RMAN> run{

2> sql 'alter database datafile 12 offline';

3> switch datafile 12 to datafilecopy '/u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf';

4> sql 'alter database datafile 12 online';

5> }

sql statement: alter database datafile 12 offline

datafile 12 switched to datafile copy

input datafile copy RECID=23 STAMP=802814075 file name=/u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf

sql statement: alter database datafile 12 online

已经成功SWITCH，可查看v$datafile进行确认

sys@MAA> select checkpoint_change#, name from v$datafile where file#=12;

CHECKPOINT_CHANGE# NAME

------------------ -----------------------------------------------------------------

3567837 /u01/recovery/MAA/datafile/o1_mf_ltb_8ffw7tk0_.dbf

查看数据也正常

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

最后别忘了将损坏了SWITCH之前的数据文件进行删除

[oracle@maa3 ~]$ rm -rf /u01/app/oracle/oradata/ltb01.dbf

那有人可能会奇怪，如果直接删除掉只读数据文件，那不也是一样的效果吗？但其实不然

删除只读数据文件以模拟丢失

[oracle@maa3 ~]$ rm -rf /u01/app/oracle/oradata/ltb01.dbf

sys@MAA> alter system checkpoint;

System altered.

sys@MAA> alter system flush buffer_cache;

System altered.

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

可见我们数据依然正常访问，这里需要了解文件描述符的知识

EYGLE的《数据安全警示录》里有通过文件描述符进行恢复的案例，其中有下面这么一段话：

在UNIX、Linux系统中，误删除数据文件后，虽然该文件已从操作系统中删除，但是其文件句柄仍由数据库进程打开持有，所以在数据库层面仍然不会释放其链表信息，因而也就能够从进程的地址信息中，通过复制将其直接恢复。但是请注意，这要求数据库不能中途关闭，如果关闭了数据库，则所有文件句柄被释放，文件就真的难以回归了。

下面我从文件描述符中进行数据文件的恢复：

[oracle@maa3 ~]$ ps -ef | grep dbw | grep -v grep

grid 4817 1 0 Dec19 ? 00:00:00 asm_dbw0_+ASM

oracle 6578 1 0 20:39 ? 00:00:00 ora_dbw0_maa

[root@maa3 ~]# ls /proc/6578/fd

0 1 10 11 12 13 14 15 16 17 2 20 21 22 23 24 25 256 257 26 27 28 3 4 5 6 7 8 9

[root@maa3 ~]# ls -l /proc/6578/fd/12

lr-x------ 1 root root 64 Dec 23 21:05 /proc/6578/fd/12 -> /dev/null

[root@maa3 ~]# ls -l /proc/6578/fd/ | grep oracle

lrwx------ 1 root root 64 Dec 23 21:05 11 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_maa.dat

lr-x------ 1 root root 64 Dec 23 21:05 13 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb

lrwx------ 1 root root 64 Dec 23 21:05 15 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_maa.dat

lrwx------ 1 root root 64 Dec 23 21:05 16 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkMAA

lrwx------ 1 root root 64 Dec 23 21:05 17 -> /dev/oracleasm/iid/000000000000000F

lrwx------ 1 root root 64 Dec 23 21:05 256 -> /u01/app/oracle/oradata/luocs01.dbf

lrwx------ 1 root root 64 Dec 23 21:05 257 -> /u01/app/oracle/oradata/ltb01.dbf (deleted)

lr-x------ 1 root root 64 Dec 23 21:05 27 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb

lrwx------ 1 root root 64 Dec 23 21:05 5 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_maa.dat

-- 可以发现/u01/app/oracle/oradata/ltb01.dbf (deleted)已删除标记

我从文件描述符中将文件拷贝到原位置：

[root@maa3 ~]# cp /proc/6578/fd/257 /u01/app/oracle/oradata/ltb01.dbf -p

[oracle@maa3 ~]$ ls /u01/app/oracle/oradata/ltb01.dbf

-rw-r----- 1 oracle asmadmin 52436992 Dec 23 20:50 /u01/app/oracle/oradata/ltb01.dbf

重启数据库，正常

sys@MAA> startup force

ORACLE instance started.

Total System Global Area 1252663296 bytes

Fixed Size 2227944 bytes

Variable Size 855638296 bytes

Database Buffers 385875968 bytes

Redo Buffers 8921088 bytes

Database mounted.

Database opened.

sys@MAA> select checkpoint_change#, name from v$datafile where file#=12;

CHECKPOINT_CHANGE# NAME

------------------ -----------------------------------------------------------------

3654318 /u01/app/oracle/oradata/ltb01.dbf

数据访问正常

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

这里之所以不需要RECOVER数据库，是因为我们操作的对象为只读数据文件。

如果是读写状态下的数据文件，将文件拷贝过来之后需要先offline datafile后recover datafile，然后再online datafile。

另外，只读数据文件删除之后如果重启数据库，依然会报ORA-01110错误，这个在场景2中说明。

场景2：控制文件无损，只读数据文件丢失，数据库无法OPEN

删除只读数据文件以模拟丢失

[oracle@maa3 ~]$ rm -rf /u01/app/oracle/oradata/ltb01.dbf

数据库启动时报错

sys@MAA> startup force

ORACLE instance started.

Total System Global Area 1252663296 bytes

Fixed Size 2227944 bytes

Variable Size 855638296 bytes

Database Buffers 385875968 bytes

Redo Buffers 8921088 bytes

Database mounted.

ORA-01157: cannot identify/lock data file 12 - see DBWR trace file

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

只读数据文件恢复时我们不需要recover过程，数据库启动时已经到MOUNT状态，通过RMAN还原该数据文件

RMAN> restore tablespace ltb;

Starting restore at 23-DEC-2012 18:36:54

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=22 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T182931_8ffq8cbz_.bkp

channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_23/o1_mf_nnndf_TAG20121223T182931_8ffq8cbz_.bkp tag=TAG20121223T182931

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 23-DEC-2012 18:36:57

打开数据文件

RMAN> alter database open;

database opened

sys@MAA> select * from ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

场景3：控制文件无损，只读数据文件头部损坏

我使用BBED工具破坏头部信息

[oracle@maa3 ~]$ cat bbed.par

12 /u01/app/oracle/oradata/ltb01.dbf

[oracle@maa3 ~]$ cd $ORACLE_HOME/rdbms/lib

[oracle@maa3 lib]$ bbed listfile=/home/oracle/bbed.par blocksize=8192 mode=edit

Password:

BBED: Release 2.0.0.0.0 - Limited Production on Sun Dec 23 21:42:22 2012

************* !!! For Oracle Internal Use only !!! ***************

BBED> info

File# Name Size(blks)

----- ---- ----------

12 /u01/app/oracle/oradata/ltb01.dbf 0

BBED> m /c Create your own social network with the best community website builder - NING

Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) Y

File: /u01/app/oracle/oradata/ltb01.dbf (12)

Block: 1 Offsets: 0 to 511 Dba:0x03000001

------------------------------------------------------------------------

7777772e 6c756f63 732e636f 6d000104 ea850000 00000000 0000200b b90df827

4d414100 00000000 fd140000 00190000 00200000 0c000300 00000000 00000000

54414732 30313231 32323354 32313234 33320000 00000000 00000000 00000000

00000000 ba163800 00000000 470dda2f 791c352f 01000000 00000000 00000000

00000000 00000000 00000000 03000000 00000000 02000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 07000000 03004c54 42000000 00000000 00000000

00000000 00000000 00000000 00000000 0c000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 f5173800 00000000 780dda2f 01000020 5e000000 83010000 1000a600

<32 bytes per line>

BBED> sum apply

Check value for File 12, Block 1:

current = 0x2a9e, required = 0x2a9e

BBED> verify

DBVERIFY - Verification starting

FILE = /u01/app/oracle/oradata/ltb01.dbf

BLOCK = 1

Block 1 is corrupt

Corrupt block relative dba: 0x63400001 (file 0, block 1)

Bad header found during verification

Data in bad block:

type: 119 format: 7 rdba: 0x636f756c

last change scn: 0x006d.6f632e73 seq: 0x1 flg: 0x04

spare1: 0x77 spare2: 0x2e spare3: 0x0

consistency value in tail: 0x00000b01

check value in block header: 0x2a9e

computed block checksum: 0x0

DBVERIFY - Verification complete

Total Blocks Examined : 1

Total Blocks Processed (Data) : 0

Total Blocks Failing (Data) : 0

Total Blocks Processed (Index): 0

Total Blocks Failing (Index): 0

Total Blocks Empty : 0

Total Blocks Marked Corrupt : 1

Total Blocks Influx : 0

Message 531 not found; product=RDBMS; facility=BBED

访问v$datafile里查看检查点

sys@MAA> select checkpoint_change# from v$datafile where file#=12;

CHECKPOINT_CHANGE#

------------------

3676149

-- 能够正常返回值

告警日志里去报错：

Corrupt block relative dba: 0x03000001 (file 12, block 1)

Bad header found during kcvxfh v10

这时候数据库启动，会收到头部损坏错误

sys@MAA> startup force

ORACLE instance started.

Total System Global Area 1252663296 bytes

Fixed Size 2227944 bytes

Variable Size 855638296 bytes

Database Buffers 385875968 bytes

Redo Buffers 8921088 bytes

Database mounted.

ORA-01122: database file 12 failed verification check

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

ORA-01210: data file header is media corrupt

恢复方法非常简单:

run{

startup force mount;

restore tablespace ltb;

alter database open;

}

场景4：控制文件无损，有读写状态时候的备份，后来更改为只读状态，只读状态时数据文件损坏

首先我重新构造这样一个环境

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

将表空间恢复到读写状态

sys@MAA> alter tablespace ltb read write;

Tablespace altered.

我们有当前读写状态时候的备份

RMAN> backup tablespace ltb;

Starting backup at 24-DEC-2012 03:06:12

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=52 device type=DISK

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: starting piece 1 at 24-DEC-2012 03:06:13

channel ORA_DISK_1: finished piece 1 at 24-DEC-2012 03:06:14

piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T030613_8fgok5v7_.bkp tag=TAG20121224T030613 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 24-DEC-2012 03:06:14

我们又产生一些数据：

sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');

1 row created.

sys@MAA> commit;

Commit complete.

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

然后将表空间更改为只读状态：

sys@MAA> alter tablespace ltb read only;

Tablespace altered.

现在只读数据文件损坏

[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=1

1+0 records in

1+0 records out

1048576 bytes (1.0 MB) copied, 0.002543 seconds, 412 MB/s

sys@MAA> alter system flush buffer_cache;

System altered.

现在无法读取数据

sys@MAA> select * from l.ning;

select * from l.ning

ERROR at line 1:

ORA-01115: IO error reading block from file (block # )

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

ORA-27072: File I/O error

Additional information: 4

Additional information: 130

但该只读数据文件依然是在线的

sys@MAA> select file_id, file_name, online_status from dba_data_files where tablespace_name='LTB';

FILE_ID FILE_NAME ONLINE_STATUS

---------- ----------------------------------------------------------------- --------------

12 /u01/app/oracle/oradata/ltb01.dbf ONLINE

这时候恢复过程如下：

RMAN> run{

2> sql 'alter database datafile 12 offline';

3> restore datafile 12 force;

4> recover datafile 12;

5> sql 'alter database datafile 12 online';

6> }

using target database control file instead of recovery catalog

sql statement: alter database datafile 12 offline

Starting restore at 24-DEC-2012 03:15:07

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=27 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T030613_8fgok5v7_.bkp

channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T030613_8fgok5v7_.bkp tag=TAG20121224T030613

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 24-DEC-2012 03:15:08

Starting recover at 24-DEC-2012 03:15:09

using channel ORA_DISK_1

starting media recovery

media recovery complete, elapsed time: 00:00:00

Finished recover at 24-DEC-2012 03:15:09

sql statement: alter database datafile 12 online

说明：

在第三步restore datafile 12 force(注:这里加force关键字是为应付特殊环境的，比如虽文件受损却物理上依然存在的情况等)还原之后，12号数据文件头部具有读写文件特征，但控制文件和数据字典SYS.TS$上描述12号数据文件是只读的，所以需要增加第四步recover过程，对12号数据文件应用重做日志，一直到文件头部更改为只读状态。

场景5：控制文件无损，有只读状态时候的备份，后来更改为读写状态，读写状态时数据文件损坏

场景5和场景4相比，发生环境有所改变，但其恢复过程却一样，如下：

sys@MAA> select tablespace_name, status from dba_tablespaces where tablespace_name='LTB';

TABLESPACE_NAME STATUS

--------------- ------------------

LTB READ ONLY

我有只读状态时候的备份

RMAN> backup tablespace ltb;

Starting backup at 24-DEC-2012 03:34:09

using channel ORA_DISK_1

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

input datafile file number=00012 name=/u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: starting piece 1 at 24-DEC-2012 03:34:09

channel ORA_DISK_1: finished piece 1 at 24-DEC-2012 03:34:10

piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T033409_8fgq5kl7_.bkp tag=TAG20121224T033409 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 24-DEC-2012 03:34:10

之后将数据文件更改为读写状态，并产生一些数据

sys@MAA> alter tablespace ltb read write;

Tablespace altered.

sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');

1 row created.

sys@MAA> commit;

Commit complete.

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

这时候后数据文件损坏

[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=50

50+0 records in

50+0 records out

52428800 bytes (52 MB) copied, 0.148137 seconds, 354 MB/s

sys@MAA> alter system flush buffer_cache;

System altered.

sys@MAA> select * from l.ning;

select * from l.ning

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 12, block # 130)

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

恢复过程如场景4一样：

run{

sql 'alter database datafile 12 offline';

restore datafile 12 force;

recover datafile 12;

sql 'alter database datafile 12 online';

}

说明：

在第三步restore datafile还原之后，12号数据文件头部标识着文件只读状态，但控制文件和数据字典SYS.TS$上却描述该数据文件是读写的，所以需要增加第四步recover过程，对12号数据文件应用重做日志，使其正常恢复。

场景6：控制文件损坏，只读数据文件损坏

查看当前LTB表空间状态，发现是读写状态

sys@MAA> select tablespace_name, status from dba_tablespaces where tablespace_name='LTB';

TABLESPACE_NAME STATUS

--------------- ------------------

LTB ONLINE

将其改为只读状态

sys@MAA> alter tablespace ltb read only;

Tablespace altered.

我有当前状态的全备份：

RMAN> backup database;

...部分内容略 ...

including current control file in backup set

including current SPFILE in backup set

channel ORA_DISK_1: starting piece 1 at 24-DEC-2012 03:50:15

channel ORA_DISK_1: finished piece 1 at 24-DEC-2012 03:50:16

piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T034924_8fgr3q96_.bkp tag=TAG20121224T034924 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 24-DEC-2012 03:50:16

这时候我的控制文件和只读数据文件都损坏

sys@MAA> shutdown abort

ORACLE instance shut down.

控制文件丢失

[grid@maa3 ~]$ asmcmd -p

ASMCMD [+] > find --type controlfile . *

+MSDATA/MAA/CONTROLFILE/Current.256.792009855

ASMCMD [+] > rm +MSDATA/MAA/CONTROLFILE/Current.256.792009855

数据文件损坏

[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=50

50+0 records in

50+0 records out

52428800 bytes (52 MB) copied, 0.283675 seconds, 185 MB/s

启动数据库，报ORA-00205错误

sys@MAA> startup

ORACLE instance started.

Total System Global Area 1252663296 bytes

Fixed Size 2227944 bytes

Variable Size 855638296 bytes

Database Buffers 385875968 bytes

Redo Buffers 8921088 bytes

ORA-00205: error in identifying control file, check alert log for more info

这时候恢复过程如下：

1）将数据库启动到NOMOUNT状态(也只能如此)

2）从备份里还原控制文件

3）将数据库启动到MOUNT状态

4）还原只读数据文件

5）recover恢复整个数据库

6）resetlogs打开数据库

数据库在启动的时候到NOMOUNT状态

RMAN> run{

2> restore controlfile from '/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T034924_8fgr3q96_.bkp';

3> mount database;

4> restore datafile 12 force;

5> recover database;

6> alter database open resetlogs;

7> }

Starting restore at 24-DEC-2012 04:02:19

using channel ORA_DISK_1

channel ORA_DISK_1: restoring control file

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

output file name=+MSDATA/maa/controlfile/current.256.802843341

Finished restore at 24-DEC-2012 04:02:21

database mounted

released channel: ORA_DISK_1

Starting restore at 24-DEC-2012 04:02:26

Starting implicit crosscheck backup at 24-DEC-2012 04:02:26

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=22 device type=DISK

Crosschecked 7 objects

Finished implicit crosscheck backup at 24-DEC-2012 04:02:27

Starting implicit crosscheck copy at 24-DEC-2012 04:02:27

using channel ORA_DISK_1

Crosschecked 1 objects

Finished implicit crosscheck copy at 24-DEC-2012 04:02:27

searching for all files in the recovery area

cataloging files...

cataloging done

List of Cataloged Files

=======================

File Name: /u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T034924_8fgr3q96_.bkp

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00012 to /u01/app/oracle/oradata/ltb01.dbf

channel ORA_DISK_1: reading from backup piece /u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T034924_8fgr3clo_.bkp

channel ORA_DISK_1: piece handle=/u01/recovery/MAA/backupset/2012_12_24/o1_mf_nnndf_TAG20121224T034924_8fgr3clo_.bkp tag=TAG20121224T034924

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 24-DEC-2012 04:02:29

Starting recover at 24-DEC-2012 04:02:29

using channel ORA_DISK_1

datafile 12 not processed because file is read-only

starting media recovery

archived log for thread 1 with sequence 96 is already on disk as file +MSDATA/maa/onlinelog/group_3.259.792009857

archived log file name=+MSDATA/maa/onlinelog/group_3.259.792009857 thread=1 sequence=96

media recovery complete, elapsed time: 00:00:01

Finished recover at 24-DEC-2012 04:02:31

database opened

恢复完成，数据也访问正常

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

场景7：控制文件损坏，有只读数据文件状态时候的控制文件备份，后来数据文件改为读写，读写状态时数据文件损坏

这次比场景6稍微复杂点的情况：

sys@MAA> select tablespace_name, status from dba_tablespaces where tablespace_name='LTB';

TABLESPACE_NAME STATUS

--------------- ------------------

LTB READ ONLY

-- 当前表空间状态为只读

有当前状态的所有备份

RMAN> backup database;

输出信息略

之后只读数据文件改为读写状态，并产生一些数据：

sys@MAA> alter tablespace ltb read write;

Tablespace altered.

sys@MAA> insert into l.ning values('Create your own social network with the best community website builder - NING');

sys@MAA> commit;

sys@MAA> select * from l.ning;

DOMAIN

----------------------------------------------------------------------------------------------------

Create your own social network with the best community website builder - NING

这时候控制文件、数据文件都损坏

sys@MAA> shutdown abort

ORACLE instance shut down.

ASMCMD [+] > rm -rf +MSDATA/MAA/CONTROLFILE/current.256.802843341

[oracle@maa3 ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/ltb01.dbf bs=1M count=50

50+0 records in

50+0 records out

52428800 bytes (52 MB) copied, 0.122551 seconds, 428 MB/s

数据库启动，报ORA-00205

sys@MAA> startup

ORACLE instance started.

Total System Global Area 1252663296 bytes

Fixed Size 2227944 bytes

Variable Size 855638296 bytes

Database Buffers 385875968 bytes

Redo Buffers 8921088 bytes

ORA-00205: error in identifying control file, check alert log for more info

这时候我们先尝试按照场景6的恢复方法进行恢复：

RMAN> run{

2> restore controlfile from '/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T041651_8fgsq3cv_.bkp';

3> mount database;

4> restore datafile 12 force;

5> recover database;

6> alter database open resetlogs;

7> }

… 部分信息略 ...

Finished restore at 24-DEC-2012 04:22:17

Starting recover at 24-DEC-2012 04:22:17

using channel ORA_DISK_1

datafile 12 not processed because file is read-only

starting media recovery

archived log for thread 1 with sequence 1 is already on disk as file +MSDATA/maa/onlinelog/group_1.257.792009855

archived log file name=+MSDATA/maa/onlinelog/group_1.257.792009855 thread=1 sequence=1

Oracle Error:

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01190: control file or data file 12 is from before the last RESETLOGS

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

media recovery complete, elapsed time: 00:00:00

Finished recover at 24-DEC-2012 04:22:19

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of alter db command at 12/24/2012 04:22:19

ORA-01190: control file or data file 12 is from before the last RESETLOGS

ORA-01110: data file 12: '/u01/app/oracle/oradata/ltb01.dbf'

在上面输出信息中我们可以看到datafile 12 not processed because file is read-only一句，这是因为在还原的控制文件内12号数据文件被描述为只读文件，因此在recover database的时候被忽略。但在应用重做日志的时候，发现LTB表空间更改为读写状态的记录，因此也需要进行恢复，但此时为时已晚，因此最后报错。那这时候到什么阶段了？控制文件和数据字典上对12号数据文件的描述已改变，也就是已将其认为是读写状态了。

解决方法是，再重复一下recover database过程

RMAN> run{

2> restore controlfile from '/u01/recovery/MAA/backupset/2012_12_24/o1_mf_ncsnf_TAG20121224T041651_8fgsq3cv_.bkp';

3> mount database;

4> restore datafile 12 force;

5> recover database;

6> recover database;

7> alter database open resetlogs;

8> }

... 部分内容略 ...

Starting recover at 24-DEC-2012 04:41:33

using channel ORA_DISK_1