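I came across an example of recovering data with LogMiner on the blog of Xiaohe (小荷), a DBA at IBM. I know a little about LogMiner but had never actually used it for recovery, so I decided to test it myself. The original post is here: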
http://www.oracleblog.org/working-case/dba-always-bad-luck-with-careless-customer/
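1. A little theory before the test
1.1 Supplemental logging
By default Oracle records no supplemental log data, and the following LogMiner features depend on it:
(1) index clusters, chained rows, and migrated rows;
(2) direct-path inserts;
(3) extracting the LogMiner dictionary into the redo logs;
(4) DDL tracking;
(5) generating SQL_REDO and SQL_UNDO with key-column information;
(6) LONG and LOB data types.
The two that matter here are DDL tracking and the generation of SQL_REDO and SQL_UNDO.
Oracle's online redo records every operation on the database, both DDL and DML. Supplemental logging is what makes DDL tracking possible. In other words, DML can be mined directly, but to mine DDL you must first enable supplemental logging so that Oracle records the additional DDL-related information.
Because the database can be recovered from archived and online redo, Oracle itself can certainly interpret the DDL in the logs even without supplemental logging; we simply cannot mine it. Once supplemental logging is enabled, we can.
Supplemental logging is off by default, because recording the extra information increases the redo write load.
SQL_REDO is the SQL that was executed (DDL or DML), and SQL_UNDO is the SQL that reverses it. Our recovery relies on SQL_UNDO.
Supplemental logging has to be enabled before the DML and DDL you want to mine are executed. Otherwise the generated SQL_REDO and SQL_UNDO are not translated through the data dictionary and show only Oracle's internal IDs, which makes them unreadable.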
Enable supplemental logging:
SQL> alter database add supplemental log data;
Disable supplemental logging:
SQL> alter database drop supplemental log data;
Check whether supplemental logging is enabled:
SQL> select supplemental_log_data_min from v$database;
1.2 The three LogMiner modes
I described these in detail in an earlier post:
Oracle Logminer说明 (Notes on Oracle LogMiner)
http://blog.youkuaiyun.com/xujinyang/article/details/6972909
LogMiner dictionary:
The LogMiner dictionary allows LogMiner to provide table and column names, instead of internal object IDs, when it presents the redo log data that you request.
LogMiner uses the dictionary to translate internal object identifiers and datatypes to object names and external data formats. Without a dictionary, LogMiner returns internal object IDs and presents data as binary data.
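In other words, the LogMiner dictionary translates internal object IDs and data types into object names and external data formats. When analyzing redo and archived logs with LogMiner you should supply a dictionary; otherwise the results cannot be read. Take this statement as an example:
INSERT INTO HR.JOBS(JOB_ID, JOB_TITLE, MIN_SALARY, MAX_SALARY) VALUES('IT_WT','Technical Writer', 4000, 11000);
Without a dictionary to translate it, the mined result is:
insert into "UNKNOWN"."OBJ# 45522"("COL 1","COL 2","COL 3","COL 4") values (HEXTORAW('45465f4748'),HEXTORAW('546563686e6963616c20577269746572'),HEXTORAW('c229'),HEXTORAW('c3020b'));
That is hardly readable. Now let's look at the three modes.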
1.2.1 Online Catalog
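The database's own data dictionary is used online for the translation. The database must be open, and only DML can be mined. The online catalog reflects only the current version of each table, so it works only if the table has had no DDL since the redo was generated: you can mine changes made after the table's last structural modification, but not earlier ones.
This is the most efficient mode, but the drawbacks are plain. The dictionary tables are critical system tables, and using them this way puts extra load on the database.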
1.2.2 Extracting a LogMiner Dictionary to the Redo Log Files
The process of extracting the dictionary to the redo log files does consume database resources, but if you limit the extraction to off-peak hours, then this should not be a problem, and it is faster than extracting to a flat file. Depending on the size of the dictionary, it may be contained in multiple redo log files. If the relevant redo log files have been archived, then you can find out which redo log files contain the start and end of an extracted dictionary.
To do so, query the V$ARCHIVED_LOG view, as follows:
SQL> SELECT NAME FROM V$ARCHIVED_LOG WHERE DICTIONARY_BEGIN='YES';
SQL> SELECT NAME FROM V$ARCHIVED_LOG WHERE DICTIONARY_END='YES';
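This mode requires supplemental logging. The database dictionary is extracted into the online redo logs, which reduces the load on the database during mining. If the dictionary is large, a log switch and archiving may occur while it is being written; the two queries above then show which archived logs hold the beginning and the end of the dictionary. Because the dictionary lives in those log files, they must be included in the mining session, otherwise ORA-01371 is raised.
To extract a LogMiner dictionary to the redo log files, the database must be open and in ARCHIVELOG mode and archiving must be enabled. While the dictionary is being extracted to the redo log stream, no DDL statements can be executed. Therefore, the dictionary extracted to the redo log files is guaranteed to be consistent (whereas the dictionary extracted to a flat file is not).
There is also a serious drawback here: while the extract is running, all DDL is blocked and can only proceed once the extract has finished. If the extract takes a long time, DDL hangs for just as long.
DDL is rare on a production database, but extracting the dictionary to the redo logs still carries some risk, so the most practical option is the third method.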
1.2.3 Extracting the LogMiner Dictionary to a Flat File
When the LogMiner dictionary is in a flat file, fewer system resources are used than when it is contained in the redo log files. Oracle recommends that you regularly back up the dictionary extract to ensure correct analysis of older redo log files.
Be sure that no DDL operations occur while the dictionary is being built.
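This mode also requires supplemental logging. When extracting to the online redo logs, Oracle blocks DDL until the dictionary extract completes; when extracting to a flat file, it is up to the user to guarantee that consistency. The dictionary also has to be extracted again before every mining run to stay consistent.
The fatal weakness of this method is that it needs the UTL_FILE_DIR parameter, and that parameter only takes effect after a database restart.
This post uses extraction to a flat file to demonstrate the recovery.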
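As a rough summary of the three modes, the sketch below maps each one to the dictionary-build call (if any) and the START_LOGMNR call it would typically use. The Python wrapper and the names `MODES` and `plan` are purely illustrative and not part of the original walkthrough; the PL/SQL strings use the documented DBMS_LOGMNR_D.BUILD and DBMS_LOGMNR.START_LOGMNR options, and the file name and directory are the ones used later in this post.

```python
# Illustrative only: maps each LogMiner dictionary mode to the PL/SQL calls
# it typically needs. MODES and plan() are hypothetical helper names.
MODES = {
    "online_catalog": {
        "build": None,  # nothing to extract; the live dictionary is used
        "start": "EXECUTE DBMS_LOGMNR.START_LOGMNR("
                 "OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);",
    },
    "redo_logs": {
        "build": "EXECUTE DBMS_LOGMNR_D.BUILD("
                 "OPTIONS => DBMS_LOGMNR_D.STORE_IN_REDO_LOGS);",
        "start": "EXECUTE DBMS_LOGMNR.START_LOGMNR("
                 "OPTIONS => DBMS_LOGMNR.DICT_FROM_REDO_LOGS);",
    },
    "flat_file": {
        "build": "EXECUTE DBMS_LOGMNR_D.BUILD('dict.ora', '/u01/backup', "
                 "DBMS_LOGMNR_D.STORE_IN_FLAT_FILE);",
        "start": "EXECUTE DBMS_LOGMNR.START_LOGMNR("
                 "DICTFILENAME => '/u01/backup/dict.ora');",
    },
}


def plan(mode):
    """Return the statements to run, in order, for the given mode."""
    steps = MODES[mode]
    return [s for s in (steps["build"], steps["start"]) if s is not None]


if __name__ == "__main__":
    for name in MODES:
        print(name)
        for statement in plan(name):
            print("  " + statement)
```

Only the online catalog mode skips the build step, which is why it is the cheapest to start but cannot describe older versions of a table.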
3. A recovery example with LogMiner
Production systems usually do not have supplemental logging enabled, so recovering data with LogMiner is not a common approach. It is reasonably feasible for DML, but not for DDL.
Moreover, without supplemental logging the mined SQL_REDO and SQL_UNDO are not translated through the data dictionary and are very hard to read. For example:
delete from "UNKNOWN"."OBJ# 54173" where "COL 1" = HEXTORAW('53434f5454') and "COL 2" = HEXTORAW('504b5f44455054') and "COL 3" IS NULL and "COL 4" = HEXTORAW('c3060c30') and "COL 5" = HEXTORAW('c3060c30') and "COL 6" = HEXTORAW('494e444558') and "COL 7" = HEXTORAW('7869061e14303a') and "COL 8" = HEXTORAW('7869061e14303a') and "COL 9" = HEXTORAW('323030352d30362d33303a31393a34373a3537') and "COL 10" = HEXTORAW('56414c4944') and "COL 11" = HEXTORAW('4e') and "COL 12" = HEXTORAW('4e') and "COL 13" = HEXTORAW('4e') and ROWID = 'AAANOdAABAAATEmAAc';
insert into "UNKNOWN"."OBJ# 54173"("COL 1","COL 2","COL 3","COL 4","COL 5","COL 6","COL 7","COL 8","COL 9","COL 10","COL 11","COL 12","COL 13") values (HEXTORAW('53434f5454'),HEXTORAW('504b5f44455054'),NULL,HEXTORAW('c3060c30'),HEXTORAW('c3060c30'),HEXTORAW('494e444558'),HEXTORAW('7869061e14303a'),HEXTORAW('7869061e14303a'),HEXTORAW('323030352d30362d33303a31393a34373a3537'),HEXTORAW('56414c4944'),HEXTORAW('4e'),HEXTORAW('4e'),HEXTORAW('4e'));
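The raw values in the untranslated output above are not random; they are Oracle's internal column encodings. The following is a minimal Python sketch, not part of the original test, that decodes a few of them by hand. It assumes single-byte ASCII character data and handles only non-negative integers in the NUMBER decoder; the function names are made up for illustration.

```python
from datetime import datetime


def decode_varchar(hex_str):
    """Character columns are stored as plain bytes (ASCII assumed here)."""
    return bytes.fromhex(hex_str).decode("ascii")


def decode_number(hex_str):
    """Decode a non-negative integer from Oracle's internal NUMBER format."""
    raw = bytes.fromhex(hex_str)
    if raw == b"\x80":
        return 0
    if raw[0] < 0x80:
        raise ValueError("negative numbers are not handled in this sketch")
    exponent = raw[0] - 193  # power of 100 carried by the first digit
    value = 0
    for i, byte in enumerate(raw[1:]):
        power = exponent - i
        if power < 0:
            raise ValueError("fractional digits are not handled in this sketch")
        value += (byte - 1) * 100 ** power  # each byte is a base-100 digit + 1
    return value


def decode_date(hex_str):
    """Decode the 7-byte internal DATE format."""
    century, year, month, day, hour, minute, second = bytes.fromhex(hex_str)
    return datetime((century - 100) * 100 + (year - 100), month, day,
                    hour - 1, minute - 1, second - 1)


if __name__ == "__main__":
    print(decode_varchar("53434f5454"))      # COL 1
    print(decode_varchar("504b5f44455054"))  # COL 2
    print(decode_number("c3060c30"))         # COL 4
    print(decode_date("7869061e14303a"))     # COL 7
```

Decoded this way, COL 1, COL 2, COL 4 and COL 7 come out as SCOTT, PK_DEPT, 51147 and 2005-06-30 19:47:57, which are the values the dictionary-translated output shows later in this post for the same row.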
LogMiner can serve as a complement to Flashback. For Flashback itself, see my post:
Oracle Flashback技术总结 (A summary of Oracle Flashback technology)
http://blog.youkuaiyun.com/xujinyang/article/details/6830438
3.1 Enable supplemental logging
SYS@anqing2(rac2)> alter database add supplemental log data;
Database altered.
SYS@anqing2(rac2)> select supplemental_log_data_min from v$database;
SUPPLEME
--------
YES
This setting can be changed dynamically; no database restart is needed.
3.2 Create a test table
SYS@anqing2(rac2)> create table huaining as select * from dba_objects;
Table created.
SYS@anqing2(rac2)> select count(*) from huaining;
COUNT(*)
----------
50253
-- check the current time
SYS@anqing2(rac2)> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';
Session altered.
SYS@anqing2(rac2)> select sysdate from dual;
SYSDATE
-------------------
2011-06-19 13:23:29
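Everything before this timestamp is our original data. Next we run some DML. Afterwards we could roll it back with Flashback, but here we will mine the DML with LogMiner and roll it back with the resulting SQL statements.
3.3 Some DML operations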
SYS@anqing2(rac2)> select distinct owner from huaining;
OWNER
------------------------------
MDSYS
TSMSYS
DMSYS
PUBLIC
OUTLN
CTXSYS
OLAPSYS
SYSTEM
EXFSYS
SCOTT
ORACLE_OCM
DBSNMP
ORDSYS
ORDPLUGINS
SYSMAN
XDB
SYS
WMSYS
SI_INFORMTN_SCHEMA
19 rows selected.
SYS@anqing2(rac2)> delete from huaining where owner='SCOTT';
6 rows deleted.
SYS@anqing2(rac2)> commit;
Commit complete.
SYS@anqing2(rac2)> update huaining set owner='DAVE' where object_id<20;
18 rows updated.
SYS@anqing2(rac2)> commit;
Commit complete.
Suppose a long time has passed, longer than the UNDO retention, so Flashback is no longer an option. This is where LogMiner comes in: mine the DML from that period, run the SQL_UNDO statements, and the data is restored.
3.4 Set the dictionary directory
SYS@anqing2(rac2)> show parameter utl
NAME                          TYPE                    VALUE
----------------------------- ----------------------- ----------
create_stored_outlines        string
utl_file_dir                  string
The parameter is currently empty.
SYS@anqing2(rac2)> alter system set utl_file_dir='/u01/backup' scope=both;
alter system set utl_file_dir='/u01/backup' scope=both
*
ERROR at line 1:
ORA-02095: specified initialization parameter cannot be modified
-- the parameter is static: set it in the spfile and restart for it to take effect
SYS@anqing2(rac2)> alter system set utl_file_dir='/u01/backup' scope=spfile;
System altered.
3.5 Restart the instance
[oracle@rac1 u01]$ sh crs_stat.sh
Name                           Target     State     Host
------------------------------ ---------- --------- -------
ora.anqing.anqing1.inst        ONLINE     ONLINE    rac1
ora.anqing.anqing2.inst        ONLINE     ONLINE    rac2
ora.anqing.db                  ONLINE     ONLINE    rac1
ora.rac1.ASM1.asm              ONLINE     ONLINE    rac1
ora.rac1.LISTENER_RAC1.lsnr    ONLINE     ONLINE    rac1
ora.rac1.gsd                   ONLINE     ONLINE    rac1
ora.rac1.ons                   ONLINE     ONLINE    rac1
ora.rac1.vip                   ONLINE     ONLINE    rac1
ora.rac2.ASM2.asm              ONLINE     ONLINE    rac2
ora.rac2.LISTENER_RAC2.lsnr    ONLINE     ONLINE    rac2
ora.rac2.gsd                   ONLINE     ONLINE    rac2
ora.rac2.ons                   ONLINE     ONLINE    rac2
ora.rac2.vip                   ONLINE     ONLINE    rac2
[oracle@rac1 u01]$ srvctl stop database -d anqing
[oracle@rac1 u01]$ sh crs_stat.sh
Name                           Target     State     Host
------------------------------ ---------- --------- -------
ora.anqing.anqing1.inst        ONLINE     OFFLINE
ora.anqing.anqing2.inst        ONLINE     OFFLINE
ora.anqing.db                  OFFLINE    OFFLINE
ora.rac1.ASM1.asm              ONLINE     ONLINE    rac1
ora.rac1.LISTENER_RAC1.lsnr    ONLINE     ONLINE    rac1
ora.rac1.gsd                   ONLINE     ONLINE    rac1
ora.rac1.ons                   ONLINE     ONLINE    rac1
ora.rac1.vip                   ONLINE     ONLINE    rac1
ora.rac2.ASM2.asm              ONLINE     ONLINE    rac2
ora.rac2.LISTENER_RAC2.lsnr    ONLINE     ONLINE    rac2
ora.rac2.gsd                   ONLINE     ONLINE    rac2
ora.rac2.ons                   ONLINE     ONLINE    rac2
ora.rac2.vip                   ONLINE     ONLINE    rac2
-- then disaster struck: the startup failed
[oracle@rac1 u01]$ srvctl start database -d anqing
PRKP-1001 : Error starting instance anqing1 on node rac1
CRS-0215: Could not start resource 'ora.anqing.anqing1.inst'.
PRKP-1001 : Error starting instance anqing2 on node rac2
CRS-0215: Could not start resource 'ora.anqing.anqing2.inst'.
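The logs showed nothing useful. I then restarted CRS as well, and this time even ASM would not start. On a whim I connected with sqlplus, and to my surprise both ASM and the database came up. I don't want to dig into this today; it is enough that everything is running.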
[oracle@rac1 u01]$ sh crs_stat.sh
Name                           Target     State     Host
------------------------------ ---------- --------- -------
ora.anqing.anqing1.inst        ONLINE     ONLINE    rac1
ora.anqing.anqing2.inst        ONLINE     ONLINE    rac2
ora.anqing.db                  ONLINE     ONLINE    rac2
ora.rac1.ASM1.asm              ONLINE     ONLINE    rac1
ora.rac1.LISTENER_RAC1.lsnr    ONLINE     ONLINE    rac1
ora.rac1.gsd                   ONLINE     ONLINE    rac1
ora.rac1.ons                   ONLINE     ONLINE    rac1
ora.rac1.vip                   ONLINE     ONLINE    rac1
ora.rac2.ASM2.asm              ONLINE     ONLINE    rac2
ora.rac2.LISTENER_RAC2.lsnr    ONLINE     ONLINE    rac2
ora.rac2.gsd                   ONLINE     ONLINE    rac2
ora.rac2.ons                   ONLINE     ONLINE    rac2
ora.rac2.vip                   ONLINE     ONLINE    rac2
SYS@anqing2(rac2)> show parameter utl;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
create_stored_outlines               string
utl_file_dir                         string      /u01/backup
3.6 Build the data dictionary
SYS@anqing2(rac2)> execute dbms_logmnr_d.build ('dict.ora','/u01/backup',dbms_logmnr_d.store_in_flat_file);
PL/SQL procedure successfully completed.
The resulting dictionary file is plain text and can be inspected with cat.
3.7 Add the log files
Use the following queries to find the relevant archived logs and the current online redo log:
SQL> select * from v$archived_log order by stamp desc;
SQL> select MEMBER from v$logfile where group# in (select group# from v$log where status ='CURRENT');
SYS@anqing2(rac2)> exec dbms_logmnr.add_logfile(LogFileName=>'+DATA/anqing/onlinelog/redo03.log',Options=>dbms_logmnr.new);
PL/SQL procedure successfully completed.
SYS@anqing2(rac2)> exec dbms_logmnr.add_logfile(LogFileName=>'+FRA/anqing/archivelog/2_75_751552735.arc',Options=>dbms_logmnr.addfile);
PL/SQL procedure successfully completed.
SYS@anqing2(rac2)> exec dbms_logmnr.add_logfile(LogFileName=>'+FRA/anqing/archivelog/2_72_751552735.arc',Options=>dbms_logmnr.addfile);
PL/SQL procedure successfully completed.
SYS@anqing2(rac2)> exec dbms_logmnr.add_logfile(LogFileName=>'+FRA/anqing/archivelog/2_73_751552735.arc',Options=>dbms_logmnr.addfile);
PL/SQL procedure successfully completed.
SYS@anqing2(rac2)> exec dbms_logmnr.add_logfile(LogFileName=>'+FRA/anqing/archivelog/2_74_751552735.arc',Options=>dbms_logmnr.addfile);
PL/SQL procedure successfully completed.
SYS@anqing2(rac2)> exec dbms_logmnr.add_logfile(LogFileName=>'+FRA/anqing/archivelog/2_76_751552735.arc',Options=>dbms_logmnr.addfile);
PL/SQL procedure successfully completed.
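Typing one add_logfile call per file gets tedious when many archived logs are involved. The sketch below, which is not part of the original session, generates the same calls from a list of file names: the first file is registered with dbms_logmnr.new (which starts a fresh list) and every later one with dbms_logmnr.addfile. The helper name `add_logfile_calls` is hypothetical; the file names in the usage example are the ones from this test.

```python
def add_logfile_calls(log_files):
    """Build one dbms_logmnr.add_logfile call per file.

    The first file uses dbms_logmnr.new, the rest dbms_logmnr.addfile.
    """
    calls = []
    for position, name in enumerate(log_files):
        option = "dbms_logmnr.new" if position == 0 else "dbms_logmnr.addfile"
        calls.append(
            "exec dbms_logmnr.add_logfile("
            f"LogFileName=>'{name}',Options=>{option});"
        )
    return calls


if __name__ == "__main__":
    files = [
        "+DATA/anqing/onlinelog/redo03.log",
        "+FRA/anqing/archivelog/2_72_751552735.arc",
        "+FRA/anqing/archivelog/2_73_751552735.arc",
    ]
    # Paste the printed lines into SQL*Plus, or spool them to a script.
    print("\n".join(add_logfile_calls(files)))
```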
3.8 Start LogMiner
SYS@anqing2(rac2)> execute dbms_logmnr.start_logmnr(dictfilename=>'/u01/backup/dict.ora',options=>dbms_logmnr.ddl_dict_tracking);
PL/SQL procedure successfully completed.
3.9 View the results
The mined data is exposed through the V$LOGMNR_CONTENTS view, but its contents are visible only to the current session, so we create a table to keep the data.
-- set the date format
SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';
SYS@anqing2(rac2)> create table hn_logmnr nologging as select * from v$logmnr_contents where 1=2;
Table created.
SYS@anqing2(rac2)> insert /*+ append */ into hn_logmnr select * from v$logmnr_contents;
250078 rows created.
SYS@anqing2(rac2)> commit;
Commit complete.
/* Formatted on 2011/6/19 14:12:25 (QP5 v5.163.1008.3004) */
SELECT SCN,
       timestamp,
       session#,
       sql_redo,
       sql_undo
  FROM hn_logmnr
 WHERE sql_redo LIKE 'delete from%HUAINING%';
The output is long, so only one row is shown here:
SQL_REDO:
/* Formatted on 2011/6/19 14:14:19 (QP5 v5.163.1008.3004) */
DELETE FROM "SYS"."HUAINING"
      WHERE "OWNER" = 'SCOTT'
        AND "OBJECT_NAME" = 'PK_DEPT'
        AND "SUBOBJECT_NAME" IS NULL
        AND "OBJECT_ID" = '51147'
        AND "DATA_OBJECT_ID" = '51147'
        AND "OBJECT_TYPE" = 'INDEX'
        AND "CREATED" =
               TO_DATE ('2005-06-30 19:47:57', 'yyyy-mm-dd hh24:mi:ss')
        AND "LAST_DDL_TIME" =
               TO_DATE ('2005-06-30 19:47:57', 'yyyy-mm-dd hh24:mi:ss')
        AND "TIMESTAMP" = '2005-06-30:19:47:57'
        AND "STATUS" = 'VALID'
        AND "TEMPORARY" = 'N'
        AND "GENERATED" = 'N'
        AND "SECONDARY" = 'N'
        AND ROWID = 'AAANSeAABAAA+qmAAc';
SQL_UNDO:
/* Formatted on 2011/6/19 14:14:37 (QP5 v5.163.1008.3004) */
INSERT INTO "SYS"."HUAINING" ("OWNER",
                              "OBJECT_NAME",
                              "SUBOBJECT_NAME",
                              "OBJECT_ID",
                              "DATA_OBJECT_ID",
                              "OBJECT_TYPE",
                              "CREATED",
                              "LAST_DDL_TIME",
                              "TIMESTAMP",
                              "STATUS",
                              "TEMPORARY",
                              "GENERATED",
                              "SECONDARY")
     VALUES ('SCOTT',
             'PK_DEPT',
             NULL,
             '51147',
             '51147',
             'INDEX',
             TO_DATE ('2005-06-30 19:47:57', 'yyyy-mm-dd hh24:mi:ss'),
             TO_DATE ('2005-06-30 19:47:57', 'yyyy-mm-dd hh24:mi:ss'),
             '2005-06-30:19:47:57',
             'VALID',
             'N',
             'N',
             'N');
We spool these SQL_UNDO statements into a SQL script and run it, and the corresponding DML is undone.
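When several changes touch the same rows, the undo statements have to be replayed newest first. The sketch below is an assumption-laden illustration rather than the procedure used in this test: it takes (scn, sql_undo) pairs, as they might be fetched from hn_logmnr, and returns a script with the statements in descending SCN order. The helper name `build_undo_script` and the SCN values in the usage example are made up.

```python
def build_undo_script(rows):
    """Return the SQL_UNDO statements as one script, newest change first.

    rows: iterable of (scn, sql_undo) pairs; rows without SQL_UNDO are skipped.
    """
    # Remember the original position so that rows sharing an SCN are
    # reversed too, instead of keeping their forward order.
    indexed = [
        (scn, position, sql)
        for position, (scn, sql) in enumerate(rows)
        if sql
    ]
    indexed.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return "".join(sql + "\n" for _, _, sql in indexed)


if __name__ == "__main__":
    mined = [
        (1001, "insert into \"SYS\".\"HUAINING\"(\"OWNER\") values ('SCOTT');"),
        (1002, None),  # e.g. a commit record, which carries no SQL_UNDO
        (1003, "update \"SYS\".\"HUAINING\" set \"OWNER\" = 'SYS' "
               "where \"OWNER\" = 'DAVE';"),
    ]
    print(build_undo_script(mined), end="")
```

Review the generated script before running it, since it will change live data.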
Now look at an UPDATE:
/* Formatted on 2011/6/19 14:12:25 (QP5 v5.163.1008.3004) */
SELECT SCN,
       timestamp,
       session#,
       sql_redo,
       sql_undo
  FROM hn_logmnr
 WHERE sql_redo LIKE 'update %HUAINING%';
SQL_REDO:
/* Formatted on 2011/6/19 14:18:55 (QP5 v5.163.1008.3004) */
UPDATE "SYS"."HUAINING"
   SET "OWNER" = 'DAVE'
 WHERE "OWNER" = 'SYS' AND ROWID = 'AAANSeAABAAATIqAAD';
SQL_UNDO:
/* Formatted on 2011/6/19 14:18:58 (QP5 v5.163.1008.3004) */
UPDATE "SYS"."HUAINING"
   SET "OWNER" = 'SYS'
 WHERE "OWNER" = 'DAVE' AND ROWID = 'AAANSeAABAAATIqAAD';
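We can recover with SQL_UNDO, but note the ROWID in the predicate. A ROWID normally does not change, but operations such as move or shrink do change it. In that case the ROWID has to be stripped from SQL_UNDO before the statement is run. Xiaohe's post does this with a regular expression; in his case the table had been re-created, so the ROWIDs were certainly different.
The regular expression: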
spool hn.sql
select regexp_replace(SQL_UNDO,'and ROWID.+;',';')
from HN_logmnr
WHERE
table_name='HUAINING'
order by to_char(TIMESTAMP,'yyyy-mm-dd hh24:mi:ss') desc;
spool off
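The filtered statements can then be run directly. Keep in mind that without the ROWID a statement matches every row that satisfies the remaining predicates:
/* Formatted on 2011/6/19 14:42:55 (QP5 v5.163.1008.3004) */
UPDATE "SYS"."HUAINING"
   SET "OWNER" = 'SYS'
 WHERE "OWNER" = 'DAVE';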
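The same ROWID filtering can also be done outside the database, for example when post-processing a spooled file. This is a minimal Python sketch of the idea, assuming each SQL_UNDO statement is a single string that ends with `and ROWID = '...';` as in the mined output; `strip_rowid` is a hypothetical helper, not something from the original post.

```python
import re

# Matches a trailing "and ROWID = '...';" predicate at the end of a statement.
_ROWID_TAIL = re.compile(r"\s+and\s+ROWID\s*=\s*'[^']*'\s*;\s*$", re.IGNORECASE)


def strip_rowid(sql_undo):
    """Drop the trailing ROWID predicate; other statements pass through unchanged."""
    return _ROWID_TAIL.sub(";", sql_undo)


if __name__ == "__main__":
    undo = ("update \"SYS\".\"HUAINING\" set \"OWNER\" = 'SYS' "
            "where \"OWNER\" = 'DAVE' and ROWID = 'AAANSeAABAAATIqAAD';")
    print(strip_rowid(undo))
```

Anchoring the pattern to the end of the statement and to a quoted literal keeps it from swallowing other predicates, which a greedy `.+` could do if the text contained more than one statement.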
3.10 End the LogMiner session
SQL> execute dbms_logmnr.end_logmnr;