A while ago I ran into a TSM backup failure caused by insufficient space, and today I hit another problem. The failing channel reported:
channel full_njdb2: starting piece 1 at 2009-09-17 02:00:33
RMAN-03009: failure of backup command on full_njdb2 channel at 09/17/2009 02:21:09
ORA-19513: failed to identify sequential file
ORA-27206: requested file not found in media management catalog
ORA-19502: write error on file "njdb_full_bak_2634_697773633_1", blockno 17056257 (blocksize=512)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: Error received from media manager layer, error text:
ANS1017E (RC-50) Session rejected: TCP/IP connection failure
1. Problem description
This morning I found that the database's RMAN backup log was abnormal. The log ended with the following:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of release command at 09/17/2009 02:34:22
RMAN-06012: channel: full_njdb2 not allocated
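2. Cause of the problem
Although RMAN raised an error at the end, the database backup itself succeeded. The tape library has two drives, so the backup allocates two channels and runs in parallel. When one channel fails, the other channel takes over its work and backs it up again (channel full_njdb2 disabled, job failed on it will be run on another channel). The final RMAN-06012 comes from the explicit RELEASE CHANNEL issued against the channel that RMAN had already disabled. Note that the log ends right after that error, with no output from the archivelog steps (lines 13 to 20 of the script), so it is worth checking whether the archived logs were actually backed up by this run. The full log follows; in it the same channel appears as full_testdb2.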
Recovery Manager: Release 10.2.0.4.0 - Production on Thu Sep 17 02:00:00 2009
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: ORCL (DBID=1194585828)
connected to recovery catalog database
RMAN> run {
2> allocate channel 'full_testdb1' type 'sbt_tape' connect *parms 'ENV=(tdpo_optfile=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
3> allocate channel 'full_testdb2' type 'sbt_tape' connect *parms 'ENV=(tdpo_optfile=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
4> backup format 'testdb_full_bak_%s_%t_%p'
5> tag 'database full backup'
6> database;
7> backup format 'testdb_ctl_bak_%s_%t_%p'
8> tag 'database controlfile backup'
9> current controlfile;
10> RELEASE CHANNEL full_testdb1;
11> RELEASE CHANNEL full_testdb2;
12>
13> allocate channel 'arch_testdb1' type 'sbt_tape' connect *parms 'ENV=(tdpo_optfile=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
14> allocate channel 'arch_testdb2' type 'sbt_tape' connect *parms 'ENV=(tdpo_optfile=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
15> sql 'alter system archive log current';
16> backup format 'testdb_arch_bak_%s_%t_%p'
17> tag 'database archivelog backup'
18> archivelog all delete input;
19> RELEASE CHANNEL 'arch_testdb1' ;
20> RELEASE CHANNEL 'arch_testdb2' ;
21> }
22>
allocated channel: full_testdb1
channel full_testdb1: sid=256 instance=orcl1 devtype=SBT_TAPE
channel full_testdb1: Data Protection for Oracle: version 5.4.1.0
allocated channel: full_testdb2
channel full_testdb2: sid=193 instance=orcl2 devtype=SBT_TAPE
channel full_testdb2: Data Protection for Oracle: version 5.4.1.0
Starting backup at 2009-09-17 02:00:32
channel full_testdb1: starting full datafile backupset
channel full_testdb1: specifying datafile(s) in backupset
input datafile fno=00014 name=+DATADG1/data01.dbf
input datafile fno=00006 name=+DATADG1/nnc_data01.dbf
input datafile fno=00010 name=+DATADG1/bos_d1.dbf
input datafile fno=00012 name=+DATADG1/bos_d2.dbf
input datafile fno=00016 name=+DATADG1/nnc_data02.dbf
input datafile fno=00007 name=+DATADG1/nnc_index01.dbf
input datafile fno=00005 name=+DATADG1/orcl/datafile/undotbs2.264.666805151
input datafile fno=00004 name=+DATADG1/orcl/datafile/users.259.666805017
input datafile fno=00018 name=+DATADG1/eprk.dbf
channel full_testdb1: starting piece 1 at 2009-09-17 02:00:33
channel full_testdb2: starting full datafile backupset
channel full_testdb2: specifying datafile(s) in backupset
input datafile fno=00015 name=+DATADG1/data02.dbf
input datafile fno=00008 name=+DATADG1/indexts.dbf
input datafile fno=00001 name=+DATADG1/orcl/datafile/system.256.666805015
input datafile fno=00003 name=+DATADG1/orcl/datafile/sysaux.257.666805017
input datafile fno=00009 name=+DATADG1/bos_idx1.dbf
input datafile fno=00011 name=+DATADG1/lobts.dbf
input datafile fno=00013 name=+DATADG1/bos_d3.dbf
input datafile fno=00017 name=+DATADG1/njwz01.dbf
input datafile fno=00002 name=+DATADG1/orcl/datafile/undotbs1.258.666805017
channel full_testdb2: starting piece 1 at 2009-09-17 02:00:33
RMAN-03009: failure of backup command on full_testdb2 channel at 09/17/2009 02:21:09
ORA-19513: failed to identify sequential file
ORA-27206: requested file not found in media management catalog
ORA-19502: write error on file "testdb_full_bak_2634_697773633_1", blockno 17056257 (blocksize=512)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: Error received from media manager layer, error text:
ANS1017E (RC-50) Session rejected: TCP/IP connection failure
channel full_testdb2 disabled, job failed on it will be run on another channel
channel full_testdb1: finished piece 1 at 2009-09-17 02:23:04
piece handle=testdb_full_bak_2633_697773633_1 tag=DATABASE FULL BACKUP comment=API Version 2.0,MMS Version 5.4.1.0
channel full_testdb1: backup set complete, elapsed time: 00:22:31
channel full_testdb1: starting full datafile backupset
channel full_testdb1: specifying datafile(s) in backupset
input datafile fno=00015 name=+DATADG1/data02.dbf
input datafile fno=00008 name=+DATADG1/indexts.dbf
input datafile fno=00001 name=+DATADG1/orcl/datafile/system.256.666805015
input datafile fno=00003 name=+DATADG1/orcl/datafile/sysaux.257.666805017
input datafile fno=00009 name=+DATADG1/bos_idx1.dbf
input datafile fno=00011 name=+DATADG1/lobts.dbf
input datafile fno=00013 name=+DATADG1/bos_d3.dbf
input datafile fno=00017 name=+DATADG1/njwz01.dbf
input datafile fno=00002 name=+DATADG1/orcl/datafile/undotbs1.258.666805017
channel full_testdb1: starting piece 1 at 2009-09-17 02:23:05
channel full_testdb1: finished piece 1 at 2009-09-17 02:33:20
piece handle=testdb_full_bak_2635_697774985_1 tag=DATABASE FULL BACKUP comment=API Version 2.0,MMS Version 5.4.1.0
channel full_testdb1: backup set complete, elapsed time: 00:10:15
channel full_testdb1: starting full datafile backupset
channel full_testdb1: specifying datafile(s) in backupset
including current control file in backupset
channel full_testdb1: starting piece 1 at 2009-09-17 02:33:22
channel full_testdb1: finished piece 1 at 2009-09-17 02:33:37
piece handle=testdb_full_bak_2636_697775600_1 tag=DATABASE FULL BACKUP comment=API Version 2.0,MMS Version 5.4.1.0
channel full_testdb1: backup set complete, elapsed time: 00:00:17
channel full_testdb1: starting full datafile backupset
channel full_testdb1: specifying datafile(s) in backupset
including current SPFILE in backupset
channel full_testdb1: starting piece 1 at 2009-09-17 02:33:39
channel full_testdb1: finished piece 1 at 2009-09-17 02:33:54
piece handle=testdb_full_bak_2637_697775618_1 tag=DATABASE FULL BACKUP comment=API Version 2.0,MMS Version 5.4.1.0
channel full_testdb1: backup set complete, elapsed time: 00:00:16
Finished backup at 2009-09-17 02:33:54
Starting backup at 2009-09-17 02:34:03
channel full_testdb1: starting full datafile backupset
channel full_testdb1: specifying datafile(s) in backupset
including current control file in backupset
channel full_testdb1: starting piece 1 at 2009-09-17 02:34:05
channel full_testdb1: finished piece 1 at 2009-09-17 02:34:20
piece handle=testdb_ctl_bak_2638_697775643_1 tag=DATABASE CONTROLFILE BACKUP comment=API Version 2.0,MMS Version 5.4.1.0
channel full_testdb1: backup set complete, elapsed time: 00:00:17
Finished backup at 2009-09-17 02:34:20
released channel: full_testdb1
released channel: full_testdb2
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of release command at 09/17/2009 02:34:22
RMAN-06012: channel: full_testdb2 not allocated
Recovery Manager complete.
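To confirm from a log like the one above that every datafile of the failed backup set really was picked up by another channel, something along these lines can help. This is only a minimal sketch: the function name check_failover and the sample log are hypothetical, and the regular expressions assume the 10.2 message wording shown in this post ("specifying datafile(s) in backupset", "backup set complete", "disabled, job failed on it"), which may differ in other releases.

```python
import re

def check_failover(log_text):
    """Return (failed_channels, uncovered_datafiles) for an RMAN log.

    Assumes the message wording seen in the log in this post.
    """
    pending = {}       # channel -> datafiles of the backup set it is writing
    current = None     # channel whose datafile list is being read
    completed = set()  # datafiles in a backup set reported complete
    attempted = set()  # every datafile RMAN tried to back up
    failed = []        # channels that RMAN disabled

    for raw in log_text.splitlines():
        line = raw.strip()
        m = re.match(r"channel (\S+): specifying datafile\(s\) in backupset", line)
        if m:
            current = m.group(1)
            pending[current] = set()
            continue
        m = re.match(r"input datafile fno=(\d+) name=(\S+)", line)
        if m and current is not None:
            pending[current].add(m.group(2))
            attempted.add(m.group(2))
            continue
        m = re.match(r"channel (\S+): backup set complete", line)
        if m:
            completed |= pending.pop(m.group(1), set())
            continue
        m = re.match(r"channel (\S+) disabled, job failed on it", line)
        if m:
            failed.append(m.group(1))
            pending.pop(m.group(1), None)  # this set was never completed
    return failed, attempted - completed

# Hypothetical, shortened log in the same shape as the real one.
sample = """\
channel c1: specifying datafile(s) in backupset
input datafile fno=00014 name=+DATADG1/data01.dbf
channel c2: specifying datafile(s) in backupset
input datafile fno=00015 name=+DATADG1/data02.dbf
channel c2 disabled, job failed on it will be run on another channel
channel c1: backup set complete, elapsed time: 00:22:31
channel c1: specifying datafile(s) in backupset
input datafile fno=00015 name=+DATADG1/data02.dbf
channel c1: backup set complete, elapsed time: 00:10:15
"""

failed, uncovered = check_failover(sample)
print("disabled channels:", failed)
print("datafiles never completed:", sorted(uncovered))
```

An empty second result means every datafile that was attempted ended up in a completed backup set. The sketch only looks at datafiles; control file, SPFILE and archivelog backups would need their own checks.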
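3. Cause analysis
After talking with the tape library engineer and Metalink support, there are two possible causes:
a. Tape I/O contention
Occasional problems on the tape library are normal. Contention for tape I/O can cause one channel to fail, in which case Oracle handles it automatically and another channel takes over the failed channel's backup work.
b. There is a timeout setting for your media manager.
For example,
TSM: COMMTIMEOUT was found to be set to just 60 secs.
Some setting on the tape library side may have a TIMEOUT, and changing the relevant parameter might resolve the problem. Since this was a one-off in more than a year of backups and the later backups all succeeded, I did not change any parameters on the TSM server. If the problem shows up again, I will change them then.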
References:
Subject: RMAN Incremental Level n Backup Often Fails With a Timeout from the Media Manager
Doc ID: 360831.1, Type: PROBLEM
Modified Date: 24-MAR-2006, Status: PUBLISHED
From "ITPUB Blog", link: http://blog.itpub.net/9252210/viewspace-614910/. If you repost this, please credit the source; otherwise legal action will be pursued.