11.2 RAC: Recovering a Damaged or Lost OCR and Votedisk

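This article walks through recovering a RAC environment after its OCR and votedisk were damaged or lost. The key steps are checking the backups, restarting the HAS service, recreating the diskgroup, and restoring the OCR and the votedisk.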


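A two-node RAC environment had been installed earlier. The disks backing the diskgroup were then formatted by accident, and RAC could no longer start after a restart.
The recovery steps were as follows:
1. Check the votedisk and OCR backups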
[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a853d6204bbc4feabfd8c73d4c3b3001 (/dev/asm-diskh) [SYSTEMDG]
 2. ONLINE   a5b37704c3574f0fbf21d1d9f58c4a6b (/dev/asm-diskg) [SYSTEMDG]
 3. ONLINE   36e5c51ff0294fc3bf2a042266650331 (/dev/asm-diski) [SYSTEMDG]
 4. ONLINE   af337d1512824fe4bf6ad45283517aaa (/dev/asm-diskj) [SYSTEMDG]
 5. ONLINE   3c4a349e2e304ff6bf64b2b1c9d9cf5d (/dev/asm-diskk) [SYSTEMDG]
Located 5 voting disk(s).
su - grid
[grid@rac1 ~]$ ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
rac1     2014/08/09 01:59:56     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup00.ocr
rac1     2014/08/08 21:59:56     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup01.ocr
rac1     2014/08/08 17:59:55     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup02.ocr
rac1     2014/08/08 05:59:54     /g01/11.2.0/grid/cdata/vrh-cluster/day.ocr
rac1     2014/08/08 05:59:54     /g01/11.2.0/grid/cdata/vrh-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
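The backup listing above can also be scanned by a script to pick the restore candidate. The following is only a minimal sketch: the function name `latest_ocr_backup` is hypothetical, and it assumes each backup line has the form `<node> <YYYY/MM/DD> <HH:MM:SS> <path>` as in the output shown. It does not check that the file still exists on the local node, which should be verified by hand before restoring.

```shell
# Hypothetical helper: reads `ocrconfig -showbackup` output on stdin and
# prints the path of the backup with the newest timestamp.
# Assumes backup lines look like: <node> <YYYY/MM/DD> <HH:MM:SS> <path>
latest_ocr_backup() {
  # Keep only lines whose second field is a date, drop PROT-xx messages.
  awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ && NF >= 4 { print $2, $3, $4 }' \
    | LC_ALL=C sort -r \
    | head -n 1 \
    | awk '{ print $3 }'
}

# Possible usage on a cluster node (not executed here):
#   ocrconfig -showbackup | latest_ocr_backup
```

Sorting the `date time path` triples as plain text works because the timestamp format is zero-padded and ordered from year down to second.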
2. Completely shut down the clusterware (OHASD) on all nodes
crsctl stop has -f
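Because the stop command has to run on every node, it can help to generate the per-node commands first and review them before running anything. This is a sketch under assumptions: the function name `stop_has_commands` is hypothetical, the node name `rac2` in the usage comment is assumed for the second node, and it presumes root SSH access with `crsctl` on the remote PATH.

```shell
# Hypothetical helper: prints (does not run) one stop command per node.
# Assumes root SSH access and that crsctl is on root's PATH on each node.
stop_has_commands() {
  for node in "$@"; do
    printf 'ssh root@%s crsctl stop has -f\n' "$node"
  done
}

# Possible usage: review the commands, then pipe them to sh to execute.
#   stop_has_commands rac1 rac2
#   stop_has_commands rac1 rac2 | sh
```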
3. Try to restart HAS on one node
[root@rac1 ~]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.
However, because the diskgroup that held the OCR and the votedisk is lost, CSS cannot start properly, as the following logs show:
alertrac1.log 
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-08-09 03:35:41.207
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-08-09 03:35:56.240
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-08-09 03:36:11.284
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-08-09 03:36:26.305
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-08-09 03:36:41.328
ocssd.log
2014-08-09 03:40:26.662: [    CSSD][1078700352]clssnmReadDiscoveryProfile: voting file discovery string(/dev/asm*)
2014-08-09 03:40:26.662: [    CSSD][1078700352]clssnmvDDiscThread: using discovery string /dev/asm* for initial discovery
2014-08-09 03:40:26.662: [   SKGFD][1078700352]Discovery with str:/dev/asm*:
2014-08-09 03:40:26.662: [   SKGFD][1078700352]UFS discovery with :/dev/asm*:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskf:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskb
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskj:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskh:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskc:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskd:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diske
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskg:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diski:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskk:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]OSS discovery with :/dev/asm*:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Handle 0xdf22a0 from lib :UFS:: for disk :/dev/asm-diskf:
2014-08-09 03:40:26.665: [   SKGFD][1078700352]Handle 0xf412a0 from lib :UFS:: for disk :/dev/asm-diskb:
2014-08-09 03:40:26.666: [   SKGFD][1078700352]Handle 0xf3a680 from lib :UFS:: for disk :/dev/asm-diskj:
2014-08-09 03:40:26.666: [   SKGFD][1078700352]Handle 0xf93da0 from lib :UFS:: for disk :/dev/asm-diskh:
2014-08-09 03:40:26.667: [    CSSD][1078700352]clssnmvDiskVerify: Successful discovery of 0 disks
2014-08-09 03:40:26.667: [    CSSD][1078700352]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2014-08-09 03:40:26.667: [    CSSD][1078700352]clssnmvFindInitialConfigs: No voting files found
2014-08-09 03:40:26.667: [    CSSD][1078700352](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retryin
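A quick way to confirm that CSS is stuck in this retry loop is to count the CRS-1714 messages in the clusterware alert log. The sketch below is illustrative only: the function name is hypothetical, and the alert log path in the usage comment is an assumption derived from the ocssd.log location shown in the messages above.

```shell
# Hypothetical helper: prints how many lines of the text read from stdin
# contain the CRS-1714 "Unable to discover any voting files" message.
count_voting_discovery_failures() {
  # grep -c exits non-zero when the count is 0; keep the exit status clean.
  grep -c 'CRS-1714' || true
}

# Possible usage (path is an assumption based on the log location above):
#   count_voting_discovery_failures < /g01/11.2.0/grid/log/rac1/alertrac1.log
```

A count that keeps growing every 15 seconds matches the behaviour seen in alertrac1.log.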
Recovery:
1. Start the cluster with -excl -nocrs. This starts the ASM instance but not CRS.
 [root@rac1 rac1]# crsctl start crs -excl -nocrs 
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
 
2. Recreate the diskgroup that originally held the OCR and the votedisk. Note that compatible.asm must be at least 11.2:
[root@rac1 rac1]# su - grid
[grid@rac1 ~]$ sqlplus  / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Thu Aug 9 04:16:58 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup systemdg high redundancy disk '/dev/asm-diskh','/dev/asm-diskg','/dev/asm-diski','/dev/asm-diskj','/dev/asm-diskk'
 ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
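When the disk list is long, the statement can be assembled from the device paths rather than typed by hand. The following is a sketch, not a definitive tool: `build_create_diskgroup_sql` is a hypothetical name, and it hard-codes the same high redundancy and 11.2 compatibility attributes used in the statement above. It only prints the SQL, so the result can be reviewed before it is pasted into SQL*Plus.

```shell
# Hypothetical helper: prints a CREATE DISKGROUP statement for the given
# diskgroup name and disk paths, using the same attributes as above.
# The body runs in a subshell so its variables do not leak to the caller.
build_create_diskgroup_sql() (
  dg_name=$1
  shift
  disk_list=""
  for disk in "$@"; do
    # Separate quoted disk paths with commas.
    if [ -n "$disk_list" ]; then
      disk_list="$disk_list,"
    fi
    disk_list="$disk_list'$disk'"
  done
  printf "create diskgroup %s high redundancy disk %s\n" "$dg_name" "$disk_list"
  printf " attribute 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';\n"
)

# Possible usage:
#   build_create_diskgroup_sql systemdg /dev/asm-diskh /dev/asm-diskg /dev/asm-diski /dev/asm-diskj /dev/asm-diskk
```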
3. Restore the OCR from an OCR backup and verify it with ocrcheck:
[root@rac1 ~]# ocrconfig -restore /g01/11.2.0/grid/cdata/vrh-cluster/backup00.ocr
[root@rac1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3180
         Available space (kbytes) :     258940
         ID                       : 1238458014
         Device/File Name         :  +systemdg
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded       
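The ocrcheck report above can be checked by a script instead of by eye. This is a minimal sketch with a hypothetical function name, and it assumes the two success messages appear exactly as in the output shown.

```shell
# Hypothetical helper: reads ocrcheck output on stdin and succeeds (exit 0)
# only if both the device and the registry integrity checks passed.
ocr_is_healthy() {
  report=$(cat)
  printf '%s\n' "$report" | grep -q 'Device/File integrity check succeeded' &&
    printf '%s\n' "$report" | grep -q 'Cluster registry integrity check succeeded'
}

# Possible usage as root on the node:
#   ocrcheck | ocr_is_healthy && echo "OCR restore looks good"
```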
      
4. Restore the votedisk. You may run into the following error:
[grid@rac1 ~]$ crsctl replace votedisk  +SYSTEMDG
CRS-4602: Failed 27 to add voting file 2e4e0fe285924f86bf5473d00dcc0388.
CRS-4602: Failed 27 to add voting file 4fa54bb0cc5c4fafbf1a9be5479bf389.
CRS-4602: Failed 27 to add voting file a109ead9ea4e4f28bfe233188623616a.
CRS-4602: Failed 27 to add voting file 042c9fbd71b54f5abfcd3ab3408f3cf3.
CRS-4602: Failed 27 to add voting file 7b5a8cd24f954fafbf835ad78615763f.
Failed to replace voting disk group with +SYSTEMDG.
CRS-4000: Command Replace failed, or completed with errors.
 
The ASM parameters need to be reconfigured and ASM restarted:
SQL> alter system set asm_diskstring='/dev/asm*';
System altered.
SQL> create spfile from memory;
File created.
SQL>  startup force mount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started
Total System Global Area  283930624 bytes
Fixed Size                  2227664 bytes
Variable Size             256537136 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /g01/11.2.0/grid/dbs/spfile+ASM1.ora
[grid@rac1 trace]$  crsctl replace votedisk  +SYSTEMDG
CRS-4256: Updating the profile
Successful addition of voting disk 85edc0e82d274f78bfc58cdc73b8c68a.
Successful addition of voting disk 201ffffc8ba44faabfe2efec2aa75840.
Successful addition of voting disk 6f2a25c589964faabf6980f7c5f621ce.
Successful addition of voting disk 93eb315648454f25bf3717df1a2c73d5.
Successful addition of voting disk 3737240678964f88bfbfbd31d8b3829f.
Successfully replaced voting disk group with +SYSTEMDG.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

5. Restart the HAS service and verify that the cluster is healthy:
[root@rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   85edc0e82d274f78bfc58cdc73b8c68a (/dev/asm-diskh) [SYSTEMDG]
 2. ONLINE   201ffffc8ba44faabfe2efec2aa75840 (/dev/asm-diskg) [SYSTEMDG]
 3. ONLINE   6f2a25c589964faabf6980f7c5f621ce (/dev/asm-diski) [SYSTEMDG]
 4. ONLINE   93eb315648454f25bf3717df1a2c73d5 (/dev/asm-diskj) [SYSTEMDG]
 5. ONLINE   3737240678964f88bfbfbd31d8b3829f (/dev/asm-diskk) [SYSTEMDG]
Located 5 voting disk(s).
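To confirm that all five voting files came back, the listing above can be counted by a script. The sketch below uses a hypothetical function name and assumes the column layout shown, where the first field is the row number followed by a dot and the second field is the state.

```shell
# Hypothetical helper: reads `crsctl query css votedisk` output on stdin
# and prints the number of voting files whose STATE column is ONLINE.
count_online_voting_files() {
  awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Possible usage:
#   [ "$(crsctl query css votedisk | count_online_voting_files)" -eq 5 ] && echo "all voting files online"
```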

Check the resource status:
[root@rac1 ~]# crsctl stat res -t

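This recovery was carried out a long time ago, so the steps above are only the general outline. Many errors came up along the way, but all of them were resolved. Handle any errors you hit according to your actual situation.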



Basic day-to-day Oracle RAC maintenance commands

Status of all instances and services
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

Status of a single instance
$ srvctl status instance -d orcl -i orcl2
Instance orcl2 is running on node linux2

Status of a named global service of the database
$ srvctl status service -d orcl -s orcltest
Service orcltest is running on instance(s) orcl2, orcl1

Status of the node applications on a specific node
$ srvctl status nodeapps -n linux1
VIP is running on node: linux1
GSD is running on node: linux1
Listener is running on node: linux1
ONS daemon is running on node: linux1

Status of the ASM instance
$ srvctl status asm -n linux1
ASM instance +ASM1 is running on node linux1.

List all configured databases
$ srvctl config database
orcl

Show the configuration of the RAC database
$ srvctl config database -d orcl
linux1 orcl1 /u01/app/oracle/product/10.2.0/db_1
linux2 orcl2 /u01/app/oracle/product/10.2.0/db_1

Show all services of the specified cluster database
$ srvctl config service -d orcl
orcltest PREF: orcl2 orcl1 AVAIL:

Show the configuration of the node applications (VIP, GSD, ONS, listener)
$ srvctl config nodeapps -n linux1 -a -g -s -l
VIP exists.: /linux1-vip/192.168.1.200/255.255.255.0/eth0:eth1
GSD exists.
ONS daemon exists.
Listener exists.

Show the configuration of the ASM instance
$ srvctl config asm -n linux1
+ASM1 /u01/app/oracle/product/10.2.0/db_1

All running instances in the cluster
SELECT inst_id
     , instance_number inst_no
     , instance_name inst_name
     , parallel
     , status
     , database_status db_status
     , active_state state
     , host_name host
FROM gv$instance
ORDER BY inst_id;

 INST_ID  INST_NO INST_NAME  PAR STATUS  DB_STATUS    STATE     HOST
-------- -------- ---------- --- ------- ------------ --------- -------
       1        1 orcl1      YES OPEN    ACTIVE       NORMAL    rac1
       2        2 orcl2      YES OPEN    ACTIVE       NORMAL    rac2

All database files located in the diskgroups
select name from v$datafile
union
select member from v$logfile
union
select name from v$controlfile
union
select name from v$tempfile;

NAME
-------------------------------------------
+FLASH_RECOVERY_AREA/orcl/controlfile/current.258.570913191
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_1.257.570913201
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_2.256.570913211
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_3.259.570918285
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_4.260.570918295
+ORCL_DATA1/orcl/controlfile/current.259.570913189
+ORCL_DATA1/orcl/datafile/example.257.570913311
+ORCL_DATA1/orcl/datafile/indx.270.570920045
+ORCL_DATA1/orcl/datafile/sysaux.260.570913287
+ORCL_DATA1/orcl/datafile/system.262.570913215
+ORCL_DATA1/orcl/datafile/undotbs1.261.570913263
+ORCL_DATA1/orcl/datafile/undotbs1.271.570920865
+ORCL_DATA1/orcl/datafile/undotbs2.265.570913331
+ORCL_DATA1/orcl/datafile/undotbs2.272.570921065
+ORCL_DATA1/orcl/datafile/users.264.570913355
+ORCL_DATA1/orcl/datafile/users.269.570919829
+ORCL_DATA1/orcl/onlinelog/group_1.256.570913195
+ORCL_DATA1/orcl/onlinelog/group_2.263.570913205
+ORCL_DATA1/orcl/onlinelog/group_3.266.570918279
+ORCL_DATA1/orcl/onlinelog/group_4.267.570918289
+ORCL_DATA1/orcl/tempfile/temp.258.570913303
21 rows selected.

All ASM disks that belong to the "ORCL_DATA1" diskgroup
SELECT path
FROM v$asm_disk
WHERE group_number IN (select group_number from v$asm_diskgroup where name = 'ORCL_DATA1');

PATH
----------------------------------
ORCL:VOL1
ORCL:VOL2

II. Starting and stopping the RAC cluster

Make sure you are logged in as the oracle UNIX user. All commands are run from the rac1 node:
# su - oracle
$ hostname
rac1

Stopping the Oracle RAC 10g environment
The first step is to stop the Oracle instance. Once the instance (and its related services) is down, shut down the ASM instance. Finally, shut down the node applications (virtual IP, GSD, TNS listener, and ONS).
$ export ORACLE_SID=orcl1
$ emctl stop dbconsole
$ srvctl stop instance -d orcl -i orcl1
$ srvctl stop asm -n rac1
$ srvctl stop nodeapps -n rac1

Starting the Oracle RAC 10g environment
The first step is to start the node applications (virtual IP, GSD, TNS listener, and ONS). Once the node applications have started successfully, start the ASM instance. Finally, start the Oracle instance (and its related services) and the Enterprise Manager database console.
$ export ORACLE_SID=orcl1
$ srvctl start nodeapps -n rac1
$ srvctl start asm -n rac1
$ srvctl start instance -d orcl -i orcl1
$ emctl start dbconsole

Starting and stopping all instances with SRVCTL
These commands start or stop all instances together with their enabled services. I included this step simply because I found it interesting as a way to shut down all instances!
$ srvctl start database -d orcl
$ srvctl stop database -d orcl