【案例描述】
<wbr><wbr> 2011 年 12 月 30 日,某运营商客户,在遭受数据损失之后请求我们协助进行数据恢复。整个数据灾难的过程如下。</wbr></wbr>1.<wbr><wbr><wbr><wbr>凌晨,数据库归档日志写满磁盘,因而无法继续归档,数据库服务中断。</wbr></wbr></wbr></wbr>
2.<wbr><wbr><wbr><wbr>在进行空间释放时,删除了一个认为不再需要的目录。</wbr></wbr></wbr></wbr>
3.<wbr><wbr><wbr><wbr>删除目录之后,发现数据库服务受到影响。</wbr></wbr></wbr></wbr>
4.<wbr><wbr><wbr><wbr>经确认,该目录包含 7 个 16GB 大小的在用数据文件。</wbr></wbr></wbr></wbr>
5.<wbr><wbr><wbr><wbr>在试图通过备份来恢复时,发现之前的备份是不成功的。</wbr></wbr></wbr></wbr>
6.<wbr><wbr><wbr><wbr>灾难形成。</wbr></wbr></wbr></wbr>
<wbr><wbr>这是一个由删除引发的严重故障,如果数据不可恢复,灾难影响将是非常严重的。</wbr></wbr>【数据库环境描述】
OS:HP-UX
数据库版本:Oracle 11.2.0.1
【技术回放】
从告警日志看最初数据库出现的问题是归档日志无法写出,出现归档错误,归档停滞会导致数据库的所有DML事务无法进行,在线业务受到影响。
Fri Dec 30 03:34:33 2011
ARC0: Encountered disk I/O error 19502
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_1:
<wbr><wbr><wbr><wbr>'/archlog/1_6578_765575017.dbf' (error 19502)(com1)</wbr></wbr></wbr></wbr>
ARC0: I/O error 19502 archiving log 5 to'/archlog/1_6578_765575017.dbf'
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance com1 - Archival Error
ORA-16038: log 5 sequence# 6578 cannot be archived
ORA-19502: write error on file "", block number<wbr><wbr>(block size=512)</wbr></wbr>
ORA-00312: online log 5 thread 1: '/oradata2/redolog5_1.dbf'
ORA-00312: online log 5 thread 1: '/oradata3/redolog5_2.dbf'
ARCH: Archival stopped, error occurred. Will continue retrying
经过处理,在凌晨5点22分左右,归档得以继续,归档进程从失败中被释放出来,但是这位凌晨被吵醒的工程师草率的做出了错误的判断,删除了认为没用的目录,oradata4整个目录在操作系统上被删除,其中的所有数据文件荡然无存,以下是丢失的文件列表和状态,从v$datafile中可以查询得到这些信息
<wbr><wbr><wbr><wbr> TS#<wbr><wbr><wbr><wbr>FILE# NAME<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>BYTESSTATUS</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
---------- ---------- -------------------------------------------------- -------
<wbr><wbr><wbr><wbr><wbr><wbr>5<wbr><wbr><wbr><wbr><wbr><wbr>229 /oradata4/YDDY_DATA55.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr> 0 ONLINE</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
<wbr><wbr><wbr><wbr><wbr><wbr>19<wbr><wbr><wbr><wbr><wbr><wbr>230 /oradata4/EFB01.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>0 ONLINE</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
<wbr><wbr><wbr><wbr><wbr><wbr>18<wbr><wbr><wbr><wbr><wbr><wbr>231 /oradata4/undotbs203.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>0 RECOVER</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
<wbr><wbr><wbr><wbr><wbr><wbr>9<wbr><wbr><wbr><wbr><wbr><wbr>232 /oradata4/SMS_DATA47.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>0 ONLINE</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
<wbr><wbr><wbr><wbr><wbr><wbr>9<wbr><wbr><wbr><wbr><wbr><wbr>233 /oradata4/SMS_DATA48.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>0 ONLINE</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
<wbr><wbr><wbr><wbr><wbr><wbr>9<wbr><wbr><wbr><wbr><wbr><wbr>234 /oradata4/SMS_DATA49.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>0 ONLINE</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
<wbr><wbr><wbr><wbr><wbr><wbr>5<wbr><wbr><wbr><wbr><wbr><wbr>235 /oradata4/YDDY_DATA56.dbf<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr> 0 ONLINE</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
而当用户尝试的从备份开始恢复时,发现备份无法还原出来,此前的几次备份都失败了,备份不可用,现在问题就相当严重了。
【恢复重现】
首先找到一个后台进程(如DBWR进程),通过其进程地址找到文件句柄(/proc//fd, fd - File Descriptors),以下就是数据库文件的句柄显示信息,通过拷贝这些文件就可以恢复那些被删除但是尚未消失的数据文件:
bash-2.05$ ps -ef|grep dbw|grep -v grep
oracle<wbr><wbr>762<wbr><wbr><wbr><wbr>1 0<wbr><wbr>Jun 10 ?<wbr><wbr><wbr><wbr> 217:30 ora_dbw0_sxnms</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
bash-2.05$ ls<wbr><wbr>/proc/762/fd</wbr></wbr>
0<wbr><wbr> 12<wbr><wbr>256 260<wbr><wbr>264<wbr><wbr>268 272<wbr><wbr>276<wbr><wbr>280 284<wbr><wbr>288<wbr><wbr>292 296<wbr><wbr>3<wbr><wbr> 303 307<wbr><wbr>311<wbr><wbr>315 319<wbr><wbr>323<wbr><wbr>327 331<wbr><wbr>335<wbr><wbr>339 343<wbr><wbr>347<wbr><wbr>61<wbr><wbr>13<wbr><wbr>257<wbr><wbr>261 265<wbr><wbr>269<wbr><wbr>273 277<wbr><wbr>281<wbr><wbr>285 289<wbr><wbr>293<wbr><wbr>297 300<wbr><wbr>304<wbr><wbr>308 312<wbr><wbr>316<wbr><wbr>320 324<wbr><wbr>328<wbr><wbr>332 336<wbr><wbr>340<wbr><wbr>344 348<wbr><wbr>710<wbr><wbr>14<wbr><wbr>258<wbr><wbr>262<wbr><wbr>266 270<wbr><wbr>274<wbr><wbr>278 282<wbr><wbr>286<wbr><wbr>290 294<wbr><wbr>298<wbr><wbr>301 305<wbr><wbr>309<wbr><wbr>313 317<wbr><wbr>321<wbr><wbr>325 329<wbr><wbr>333<wbr><wbr>337 341<wbr><wbr>345<wbr><wbr>4<wbr><wbr>811<wbr><wbr>2<wbr><wbr> 259 263<wbr><wbr>267<wbr><wbr>271 275<wbr><wbr>279<wbr><wbr>283 287<wbr><wbr>291<wbr><wbr>295 299<wbr><wbr>302<wbr><wbr>306 310<wbr><wbr>314<wbr><wbr>318 322<wbr><wbr>326<wbr><wbr>330 334<wbr><wbr>338<wbr><wbr>342 346<wbr><wbr>5<wbr><wbr>9</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
db2% ls -l |grep oracle
-r--r--r--<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>657920 Apr 26<wbr><wbr>2002 11</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr><wbr><wbr>923 Jun 10<wbr><wbr>2011 12</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-rw----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr><wbr><wbr> 24 Jun 10<wbr><wbr>2011 13</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1859584 Jan<wbr><wbr>5 16:42 256</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1859584 Jan<wbr><wbr>5 16:42 257</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1859584 Jan<wbr><wbr>5 16:42 258</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>414195712 Jan<wbr><wbr>5 16:41 259</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>6291464192 Jan<wbr><wbr>5 16:42 260</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>161488896 Jan<wbr><wbr>5 16:18 261</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>20979712 Jan<wbr><wbr>5 16:18 262</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>26222592 Jan<wbr><wbr>5 16:18 263</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>10493952 Jan<wbr><wbr>5 16:18 264</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>473178112 Jan<wbr><wbr>5 16:18 265</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1048584192 Jan<wbr><wbr>5 16:40 266</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1048584192 Jan<wbr><wbr>5 16:18 267</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1048584192 Jan<wbr><wbr>5 16:40 268</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1048584192 Jan<wbr><wbr>5 16:40 269</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1048584192 Jan<wbr><wbr>5 16:41 270</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1048584192 Jan<wbr><wbr>5 16:42 271</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
-rw-r-----<wbr><wbr>1 oracle<wbr><wbr>dba<wbr><wbr><wbr><wbr>1384652800 Jan<wbr><wbr>5 16:18 272</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
在这个案例中,我们拷贝文件恢复之后,创建了一个新的目录(保留原来的目录结构未变动),随后通过offline,rename,recover,online四个步骤恢复这些文件,加载到数据库中。以下是简要的步骤记录。
首先复制文件到新分配的目录空间:
cp /proc/762/fd/266 /new_u02/oradata/cinms_user01.dbf
cp /proc/762/fd/267 /new_u02/oradata/cinms_user02.dbf
cp /proc/762/fd/268 /new_u02/oradata/cinms_user03.dbf
cp /proc/762/fd/269 /new_u02/oradata/cinms_user04.dbf
cp /proc/762/fd/285 /new_u02/oradata/cinms_user07.dbf
cp /proc/762/fd/286 /new_u02/oradata/cinms_user08.dbf
将相应的文件离线:
alter database datafile 8 offline;
alter database datafile 9 offline;
alter database datafile 10 offline;
alter database datafile 11 offline;
alter database datafile 27 offline;
alter database datafile 28 offline;
通过更名(RENAME)的方式对文件进行重定向:
alter database rename file
‘/u02/oradata/cinms_user01.dbf’to ‘/new_u02/oradata/cinms_user01.dbf’;
alter database rename file
‘/u02/oradata/cinms_user02.dbf’ to‘/new_u02/oradata/cinms_user02.dbf’;
alter database rename file
‘/u02/oradata/cinms_user03.dbf’ to‘/new_u02/oradata/cinms_user03.dbf’;
alter database rename file
‘/u02/oradata/cinms_user04.dbf’ to ‘/new_u02/oradata/cinms_user04.dbf’;
alter database rename file
‘/u02/oradata/cinms_user07.dbf’ to‘/new_u02/oradata/cinms_user07.dbf’;
alter database rename file
‘/u02/oradata/cinms_user08.dbf’ to‘/new_u02/oradata/cinms_user08.dbf’;
然后执行恢复:
recover datafile 8;
recover datafile 9;
recover datafile 10;
recover datafile 11;
recover datafile 27;
recover datafile 28;
最后将文件Online加载:
alter database datafile 8 online;
alter database datafile 9 online;
alter database datafile 10 online;
alter database datafile 11 online;
alter database datafile 27 online;
alter database datafile 28 online;
以下是日志中记录的操作日志信息:
Mon Dec 19 18:17:38 2011
alter database datafile 8 offline
Mon Dec 19 18:17:39 2011
Completed: alter database datafile 8 offline
Mon Dec 19 18:18:04 2011
alter database rename file '/u02/oradata/sxnms_user01.dbf'
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>to'/new_u02/oradata/sxnms_user01.dbf'</wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr></wbr>
Mon Dec 19 18:18:20 2011
ALTER DATABASE RECOVER datafile 8<wbr><wbr></wbr></wbr>
Media Recovery Datafile: 8
Media Recovery Start
Starting datafile 8 recovery in thread 1 sequence 14295
Datafile 8: '/new_u02/oradata/sxnms_user01.dbf'
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 14295 Readingmem 0
<wbr><wbr>Mem# 0 errs 0:/u01/oradata/sxnms/redo01.log</wbr></wbr>
Media Recovery Complete
Completed: ALTER DATABASE RECOVER<wbr><wbr>datafile 8</wbr></wbr>
<wbr></wbr>
Mon Dec 19 18:19:00 2011
alter database datafile 8 online
Completed: alter database datafile 8 online
<wbr></wbr>
【案例警示】
<wbr><wbr>分析整个灾难的形成过程,我们总结这次数据灾难给我们的警示是:</wbr></wbr>
1.数据库需要全面的系统规划和监控
2.数据库的破坏性操作需要谨慎
3.数据环境运维必须遵守一定的安全守则
4.数据备份应进行必要的检查和确认
5.避免在疲劳或不清醒状态独自做出重要判断
6.在对故障做出清晰判断之前不要采取贸然措施
本文记录了一次因误删导致的数据恢复实战经历。面对严重数据丢失情况,通过查找后台进程中的文件句柄成功恢复了数据文件,并详细展示了恢复流程,包括文件复制、离线重命名、恢复及在线加载等步骤。
284

被折叠的 条评论
为什么被折叠?



