If a media failure has affected the online redo logs of a database, then the appropriate recovery procedure depends on the following considerations:
- The configuration of the online redo log: mirrored or non-mirrored
- The type of media failure: temporary or permanent
- The types of online redo log files affected by the media failure: current, active, unarchived, or inactive
Use V$LOG to check the status of the redo logs:
SQL> select group#,sequence#, status, archived from v$log;
    GROUP#  SEQUENCE# STATUS           ARCHIVED
---------- ---------- ---------------- --------
         1        102 INACTIVE         YES
         2        105 ACTIVE           YES
         3        101 INACTIVE         YES
         4        104 INACTIVE         YES
         5        106 CURRENT          NO
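A STATUS of ACTIVE means the changes covered by that log have not yet been written to disk; INACTIVE means they have been written. Whether the log has been archived is shown separately by the ARCHIVED column. In other words, instance recovery needs the log groups from ACTIVE through CURRENT.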
Table 31-3 STATUS Column of V$LOG
Status | Description |
UNUSED | The online redo log has never been written to. |
CURRENT | The online redo log is active, that is, needed for instance recovery, and it is the log to which the database is currently writing. The redo log can be open or closed. |
ACTIVE | The online redo log is active, that is, needed for instance recovery, but is not the log to which the database is currently writing. It may be in use for block recovery, and may or may not be archived. |
CLEARING | The log is being re-created as an empty log after an ALTER DATABASE CLEAR LOGFILE statement. After the log is cleared, then the status changes to UNUSED. |
CLEARING_CURRENT | The current log is being cleared of a closed thread. The log can stay in this status if there is some failure in the switch such as an I/O error writing the new log header. |
INACTIVE | The log is no longer needed for instance recovery. It may be in use for media recovery, and may or may not be archived. |
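As a minimal sketch of how Table 31-3 can be read, the following Python snippet picks out the groups that the table describes as needed for instance recovery. The function name `groups_needed_for_instance_recovery` and the row layout are illustrative assumptions; the sample rows mirror the V$LOG output shown earlier, and any status the table does not describe as "needed for instance recovery" is treated here as not needed.

```python
# Statuses that Table 31-3 describes as "needed for instance recovery".
NEEDED_FOR_INSTANCE_RECOVERY = {"CURRENT", "ACTIVE"}


def groups_needed_for_instance_recovery(rows):
    """Return the group numbers whose STATUS marks them as needed.

    rows: iterable of (group_number, status) pairs, as selected from V$LOG.
    """
    return sorted(
        group for group, status in rows
        if status.upper() in NEEDED_FOR_INSTANCE_RECOVERY
    )


# Sample rows copied from the V$LOG output above (GROUP#, STATUS).
sample_rows = [
    (1, "INACTIVE"),
    (2, "ACTIVE"),
    (3, "INACTIVE"),
    (4, "INACTIVE"),
    (5, "CURRENT"),
]

print(groups_needed_for_instance_recovery(sample_rows))  # [2, 5]
```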
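(I) Recovering After Losing a Member of a Multiplexed Online Redo Log Group
As long as each log group still has at least one usable member, the database keeps running normally, but errors are reported in alert.log.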
If the online redo log of a database is multiplexed, and if at least one member of each online redo log group is not affected by the media failure, then the database continues functioning as usual, but error messages are written to the log writer trace file and the alert_SID.log of the database.
You can resolve the problem of a missing member of a multiplexed online redo log group by taking one of the following actions:
- If the hardware problem is temporary, then correct it. The log writer process accesses the previously unavailable online redo log files as if the problem never existed.
- If the hardware problem is permanent, then drop the damaged member and add a new member by using the following procedure:
Note: The newly added member provides no redundancy until the log group is reused.
- Locate the file name of the damaged member in V$LOGFILE. The status is INVALID if the file is inaccessible:
SELECT GROUP#, STATUS, MEMBER
FROM V$LOGFILE
WHERE STATUS='INVALID';
GROUP#  STATUS      MEMBER
------- ----------- ---------------------
0002    INVALID     /disk1/oradata/trgt/redo02.log
- Drop the damaged member:
ALTER DATABASE DROP LOGFILE MEMBER '/disk1/oradata/trgt/redo02.log';
- Add a new member to the group:
ALTER DATABASE ADD LOGFILE MEMBER '/disk1/oradata/trgt/redo02b.log'
TO GROUP 2;
If the file to add already exists, then it must be the same size as the other group members, and you must specify the REUSE option. For example:
ALTER DATABASE ADD LOGFILE MEMBER '/disk1/oradata/trgt/redo02b.log'
REUSE TO GROUP 2;
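The drop-and-add steps above can be sketched as a small Python helper that only builds the statement text; it does not connect to a database. The function name `replace_member_statements` and its parameters are hypothetical, and the statement syntax is copied from the examples above.

```python
def replace_member_statements(group, damaged_member, new_member, reuse=False):
    """Build the DROP and ADD statements for one damaged log member.

    group: redo log group number (for example 2).
    damaged_member: file name reported as INVALID in V$LOGFILE.
    new_member: file name of the replacement member.
    reuse: set to True when new_member already exists on disk.
    """
    for path in (damaged_member, new_member):
        # Keep the sketch simple: refuse names that would break the quoting.
        if "'" in path:
            raise ValueError(f"file name must not contain a quote: {path!r}")
    reuse_clause = " REUSE" if reuse else ""
    return [
        f"ALTER DATABASE DROP LOGFILE MEMBER '{damaged_member}';",
        f"ALTER DATABASE ADD LOGFILE MEMBER '{new_member}'"
        f"{reuse_clause} TO GROUP {int(group)};",
    ]


for statement in replace_member_statements(
    2, "/disk1/oradata/trgt/redo02.log", "/disk1/oradata/trgt/redo02b.log"
):
    print(statement)
```

Generating the text first makes it easy to review the statements before running them in SQL*Plus.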
(II) Recovering After Losing All Members of an Online Redo Log Group
If a media failure damages all members of an online redo log group, then different scenarios can occur depending on the type of online redo log group affected by the failure and the archiving mode of the database.
Table 31-4 Recovering After the Loss of an Online Redo Log Group
If the Group Is... | Then... | And You Can... |
Inactive | It is not needed for crash recovery | Clear the archived or unarchived group. |
Active | It is needed for crash recovery | Attempt to issue a checkpoint and clear the log; if impossible, then you must either use Flashback Database or restore a backup and perform incomplete recovery up to the most recent available redo log. |
Current | It is the redo log that the database is currently writing to | Attempt to clear the log; if impossible, then you must either use Flashback Database or restore a backup and perform incomplete recovery up to the most recent available redo log. |
To determine whether the damaged group is active or inactive, first locate the damaged members in V$LOGFILE and note their group number:
SELECT GROUP#, STATUS, MEMBER FROM V$LOGFILE;
GROUP#  STATUS      MEMBER
------- ----------- ---------------------
0001                /oracle/dbs/log1a.f
0001                /oracle/dbs/log1b.f
0002    INVALID     /oracle/dbs/log2a.f
0002    INVALID     /oracle/dbs/log2b.f
0003                /oracle/dbs/log3a.f
0003                /oracle/dbs/log3b.f
Determine which groups are active:
SELECT GROUP#, MEMBERS, STATUS, ARCHIVED
FROM V$LOG;
GROUP# MEMBERS STATUS    ARCHIVED
------ ------- --------- -----------
0001   2       INACTIVE  YES
0002   2       ACTIVE    NO
0003   2       CURRENT   NO
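The two queries above can be combined into a rough lookup against Table 31-4. The Python sketch below is illustrative only: the names `fully_damaged_groups`, `ACTIONS`, and `recovery_plan` are assumptions, the sample rows are copied from the outputs above, and the action text is paraphrased from the table.

```python
# Paraphrased from the "And You Can..." column of Table 31-4.
ACTIONS = {
    "INACTIVE": "clear the archived or unarchived group",
    "ACTIVE": ("issue a checkpoint and clear the log; if impossible, use "
               "Flashback Database or restore a backup and perform "
               "incomplete recovery"),
    "CURRENT": ("attempt to clear the log; if impossible, use Flashback "
                "Database or restore a backup and perform incomplete recovery"),
}


def fully_damaged_groups(logfile_rows):
    """Return groups in which every member is INVALID in V$LOGFILE."""
    statuses_by_group = {}
    for group, status, _member in logfile_rows:
        statuses_by_group.setdefault(group, []).append(status)
    return sorted(
        group for group, statuses in statuses_by_group.items()
        if all(status == "INVALID" for status in statuses)
    )


def recovery_plan(logfile_rows, log_rows):
    """Map each fully damaged group to the action Table 31-4 suggests."""
    status_by_group = {group: status for group, status in log_rows}
    return {
        group: ACTIONS[status_by_group[group]]
        for group in fully_damaged_groups(logfile_rows)
    }


# (GROUP#, STATUS, MEMBER) from V$LOGFILE; an empty status means no error.
logfile_rows = [
    (1, "", "/oracle/dbs/log1a.f"), (1, "", "/oracle/dbs/log1b.f"),
    (2, "INVALID", "/oracle/dbs/log2a.f"), (2, "INVALID", "/oracle/dbs/log2b.f"),
    (3, "", "/oracle/dbs/log3a.f"), (3, "", "/oracle/dbs/log3b.f"),
]
# (GROUP#, STATUS) from V$LOG.
log_rows = [(1, "INACTIVE"), (2, "ACTIVE"), (3, "CURRENT")]

print(recovery_plan(logfile_rows, log_rows))
```

With the sample data, only group 2 has lost all of its members, and because it is ACTIVE the suggested first step is a checkpoint.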
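If all members of an INACTIVE redo log group are lost, recovery depends on whether the media problem can be fixed. If it cannot be fixed, the database halts the next time it needs to use this log group.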
If the failure is temporary, then fix the problem. The log writer can reuse the redo log group when required. If the failure is permanent, then the damaged inactive online redo log group eventually halts normal database operation.
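Note: ALTER DATABASE CLEAR [UNARCHIVED] LOGFILE only updates the control file and re-creates the corresponding redo log, so it can be used to re-create log files that have been lost. It does not flush the corresponding dirty blocks to disk and does not archive the log. After the clear, the log status is UNUSED. The statement cannot be used on the CURRENT log group.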
1.1 Clearing Inactive, Archived Redo
This is the best case: no recovery is needed at all, the archived log sequence has no gap, and earlier backups remain usable.
You can clear an inactive redo log group when the database is open or closed. The procedure depends on whether the damaged group has been archived.
To clear an inactive, online redo log group that has been archived:
- If the database is shut down, then start a new instance and mount the database:
STARTUP MOUNT
- Reinitialize the damaged log group:
ALTER DATABASE CLEAR LOGFILE GROUP 2;
1.2 Clearing Inactive, Unarchived Redo
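Again no recovery is needed, but the archived log sequence now has a gap, so earlier backups can no longer be recovered to the current state.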
Clearing a not-yet-archived redo log allows it to be reused without archiving it.
This action makes backups unusable if they were started before the last change in the log, unless the file was taken offline before the first change in the log.
To clear an inactive, online redo log group that has not been archived:
- If the database is shut down, then start a new instance and mount the database:
SQL> STARTUP MOUNT
- Clear the log using the UNARCHIVED keyword.
SQL> ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 2;
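If an offline data file (a data file that was taken offline must be recovered before it can come back online) needs this unarchived log to be brought online again, you must use the UNRECOVERABLE DATAFILE option. The offline data file can then no longer be used and can only be dropped.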
If there is an offline data file that requires the cleared log to bring it online, then the keywords UNRECOVERABLE DATAFILE are required. The data file must be dropped because the redo logs necessary to bring the data file online are being cleared, and there is no copy of it:
SQL> ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 2 UNRECOVERABLE DATAFILE;
- Immediately back up all data files in the database with an operating system utility, so that you have a backup you can use for complete recovery without relying on the cleared log group:
% cp /disk1/oracle/dbs/*.dbf /disk2/backup
Back up the database's control file with the ALTER DATABASE statement. For example, enter:
SQL> ALTER DATABASE BACKUP CONTROLFILE TO '/oracle/dbs/cf_backup.f';
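The choice between the clear statements in the two procedures above can be sketched as a small Python helper that only builds the statement text. The function name `clear_logfile_statement` and its parameters are hypothetical; the three statement forms are the ones shown above, and pairing UNRECOVERABLE DATAFILE only with UNARCHIVED is a restriction of this sketch.

```python
def clear_logfile_statement(group, archived, unrecoverable_datafile=False):
    """Build the CLEAR LOGFILE statement for an inactive log group.

    group: redo log group number.
    archived: value of the ARCHIVED column in V$LOG (True for YES).
    unrecoverable_datafile: True when an offline data file needs the
        cleared log to come back online and will be dropped instead.
    """
    group = int(group)
    if archived:
        if unrecoverable_datafile:
            # The text above only shows this keyword with UNARCHIVED.
            raise ValueError(
                "this sketch pairs UNRECOVERABLE DATAFILE only with UNARCHIVED"
            )
        return f"ALTER DATABASE CLEAR LOGFILE GROUP {group};"
    suffix = " UNRECOVERABLE DATAFILE" if unrecoverable_datafile else ""
    return f"ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP {group}{suffix};"


print(clear_logfile_statement(2, archived=True))
print(clear_logfile_statement(2, archived=False))
print(clear_logfile_statement(2, archived=False, unrecoverable_datafile=True))
```

Whenever the UNARCHIVED form is used, the data file and control file backups described above still have to follow.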
1.3 Failure of CLEAR LOGFILE Operation
The ALTER DATABASE CLEAR LOGFILE statement can fail with an I/O error due to media failure when it is not possible to:
- Relocate the redo log file onto alternative media by re-creating it under the currently configured redo log file name
- Reuse the currently configured log file name to re-create the redo log file because the name itself is invalid or unusable (for example, due to media failure)
In these cases, the ALTER DATABASE CLEAR LOGFILE statement (before receiving the I/O error) successfully informs the control file that the log is being cleared and does not require archiving. The I/O error occurs at the step in which the CLEAR LOGFILE statement attempts to create the new redo log file and write zeros to it. This fact is reflected in the STATUS column of V$LOG, which shows CLEARING_CURRENT.
When recovering an ACTIVE log group, first run ALTER SYSTEM CHECKPOINT. If it succeeds, the log group becomes INACTIVE.
If the database is still running and the lost active redo log is not the current log, then issue the ALTER SYSTEM CHECKPOINT statement. If the operation is successful, then the active redo log becomes inactive, and you can follow the procedure in "Losing an Inactive Online Redo Log Group".
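If ALTER SYSTEM CHECKPOINT fails, only incomplete recovery is possible: the current data files are already ahead of the last available log, and changes already written to the data files cannot be rolled back. In addition, damage to the CURRENT log group leads directly to an instance crash.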
The current log is the one LGWR is currently writing to. If a LGWR I/O operation fails, then LGWR terminates and the instance fails. In this case, you must restore a backup, perform incomplete recovery, and open the database with the RESETLOGS option.
2.1 Recovering from the Loss of Active Logs in NOARCHIVELOG Mode
In this scenario, the database archiving mode is NOARCHIVELOG.
To recover from the loss of an active online log group in NOARCHIVELOG mode:
- If the media failure is temporary, then correct the problem so that the database can reuse the group when required.
- Otherwise, restore the database from a consistent, whole database backup (data files and control files):
% cp /disk2/backup/*.dbf $ORACLE_HOME/oradata/trgt/
- Mount the database:
STARTUP MOUNT
- To allow the database to reset the online redo logs, you must first mimic incomplete recovery:
RECOVER DATABASE UNTIL CANCEL
CANCEL
- Open the database using the RESETLOGS option:
ALTER DATABASE OPEN RESETLOGS;
- Shut down the database consistently. For example, enter:
SHUTDOWN IMMEDIATE
- Make a whole database backup.
2.2 Recovering from Loss of Active Logs in ARCHIVELOG Mode
In this scenario, the database archiving mode is ARCHIVELOG.
To recover from loss of an active online redo log group in ARCHIVELOG mode:
- Begin incomplete media recovery, recovering up through the log before the damaged log.
- Ensure that the current name of the lost redo log can be used for a newly created file. If not, then rename the members of the damaged online redo log group to a new location. For example, enter:
ALTER DATABASE RENAME FILE '/disk1/oradata/trgt/redo01.log' TO '/tmp/redo01.log';
ALTER DATABASE RENAME FILE '/disk1/oradata/trgt/redo02.log' TO '/tmp/redo02.log';
- Open the database using the RESETLOGS option:
ALTER DATABASE OPEN RESETLOGS;
If several log groups are damaged, recover the database using the method for the worst-affected log group.
If you have lost multiple groups of the online redo log, then use the recovery method for the most difficult log to recover.
The order of difficulty, from most difficult to least difficult, is as follows:
- The current online redo log
- An active online redo log
- An unarchived online redo log
- An inactive online redo log
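The ordering above can be applied mechanically when several groups are lost at once. The Python sketch below is illustrative; the names `DIFFICULTY_ORDER` and `hardest_lost_log` and the lowercase labels are assumptions.

```python
# Order copied from the list above: most difficult first.
DIFFICULTY_ORDER = ["current", "active", "unarchived", "inactive"]


def hardest_lost_log(lost_kinds):
    """Return the kind of lost log whose recovery method should be used.

    lost_kinds: list of labels from DIFFICULTY_ORDER, one per lost group.
    """
    if not lost_kinds:
        raise ValueError("no lost log groups given")
    # The smallest index in DIFFICULTY_ORDER is the most difficult case.
    return min(lost_kinds, key=DIFFICULTY_ORDER.index)


print(hardest_lost_log(["inactive", "active", "unarchived"]))  # active
```

An unknown label raises ValueError from the list lookup, which keeps a typo from silently selecting the wrong method.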