During RAC maintenance, data-center relocation, network re-planning, or storage changes often make it necessary to adjust the cluster's Public IPs and VIPs, or to change the CRS OCR and Voting Disk. The following notes draw on hands-on maintenance experience, verified by testing, and are intended as a guide for anyone facing similar operations.
Environment used in this article:
Operating system: Oracle Enterprise Linux 5
Oracle version: 10.2.0.4
Oracle architecture: RAC + ASM
I. Changing the Public, Private, and VIP addresses in a RAC environment
1) Target of the change
The hosts configuration before the change:
127.0.0.1 localhost
138.30.0.101 rac1
138.30.0.102 rac2
138.30.0.201 rac1-vip
138.30.0.202 rac2-vip
100.100.1.1 rac1-priv
100.100.1.2 rac2-priv
The target configuration:
192.168.0.128 rac1
192.168.0.129 rac2
192.168.0.201 rac1-vip
192.168.0.202 rac2-vip
10.10.1.1 rac1-priv
10.10.1.2 rac2-priv
2) Stop the CRS resources
Stop all resources managed by CRS, including the instances (database and ASM), services, and applications.
The current CRS status:
[oracle@RAC1 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora.racdb.db application ONLINE ONLINE rac2
ora....b1.inst application ONLINE ONLINE rac1
ora....b2.inst application ONLINE ONLINE rac2
The quickest way to stop all CRS resources is the crs_stop command:
[oracle@RAC1 ~]$ crs_stop -all
Attempting to stop `ora.rac1.gsd` on member `rac1`
Attempting to stop `ora.rac1.ons` on member `rac1`
Attempting to stop `ora.rac2.gsd` on member `rac2`
Attempting to stop `ora.rac2.ons` on member `rac2`
Attempting to stop `ora.racdb.db` on member `rac2`
Stop of `ora.rac1.gsd` on member `rac1` succeeded.
Stop of `ora.rac1.ons` on member `rac1` succeeded.
Stop of `ora.rac2.gsd` on member `rac2` succeeded.
Stop of `ora.rac2.ons` on member `rac2` succeeded.
Stop of `ora.racdb.db` on member `rac2` succeeded.
`ora.racdb.racdb1.inst` is already OFFLINE.
`ora.racdb.racdb2.inst` is already OFFLINE.
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
Attempting to stop `ora.rac2.ASM2.asm` on member `rac2`
Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Attempting to stop `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2`
Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
Attempting to stop `ora.rac1.vip` on member `rac1`
Stop of `ora.rac1.vip` on member `rac1` succeeded.
Stop of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Stop of `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2` succeeded.
Attempting to stop `ora.rac2.vip` on member `rac2`
Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Stop of `ora.rac2.vip` on member `rac2` succeeded.
The resources can also be stopped one by one with the srvctl command set:
-- Stop the database
srvctl stop database -d racdb
-- Stop the ASM instances
srvctl stop asm -n rac1
srvctl stop asm -n rac2
-- Stop the node applications
srvctl stop nodeapps -n rac1
srvctl stop nodeapps -n rac2
Check with crs_stat -t that every resource is now OFFLINE.
Then, as root, stop CRS on both nodes:
[root@RAC1 bin]# ./crsctl stop crs
[root@RAC2 bin]# ./crsctl stop crs
3) Change the IP configuration at the operating-system level
Update /etc/hosts; the file must be kept identical on both nodes.
Change the IP addresses on both nodes; this article uses the NICs eth0 and eth1.
The addresses can also be changed by editing the network configuration files under /etc/sysconfig/network-scripts/:
-rw-r--r-- 4 root root 205 Mar 5 13:41 ifcfg-eth1
-rw-r--r-- 4 root root 229 Mar 5 13:41 ifcfg-eth0
After editing, re-activate eth0 and eth1 on each node.
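As an illustration (not part of the original procedure), the old-to-new substitutions from the tables above can be scripted with sed. The sketch below runs against a sample file; on a real node the input would be /etc/hosts, with the diff reviewed before the staged copy is moved into place:

```shell
#!/bin/sh
# Sketch: apply the old -> new address mapping from the tables above
# to a hosts-style file. update_hosts reads $1 and writes $2.
update_hosts() {
  sed -e 's/^138\.30\.0\.101\([[:space:]]\)/192.168.0.128\1/' \
      -e 's/^138\.30\.0\.102\([[:space:]]\)/192.168.0.129\1/' \
      -e 's/^138\.30\.0\.201\([[:space:]]\)/192.168.0.201\1/' \
      -e 's/^138\.30\.0\.202\([[:space:]]\)/192.168.0.202\1/' \
      -e 's/^100\.100\.1\.1\([[:space:]]\)/10.10.1.1\1/' \
      -e 's/^100\.100\.1\.2\([[:space:]]\)/10.10.1.2\1/' "$1" > "$2"
}

# Demo on a sample file; a real run would use /etc/hosts as input and
# replace it only after the diff has been checked.
printf '127.0.0.1 localhost\n138.30.0.101 rac1\n100.100.1.2 rac2-priv\n' > /tmp/hosts.sample
update_hosts /tmp/hosts.sample /tmp/hosts.new
cat /tmp/hosts.new
```

Running the same script on both nodes keeps the files identical, as the step above requires.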
4) Start CRS
Start CRS with crsctl. Because the registered resources normally start together with CRS, they must have been stopped beforehand as in step 2), leaving only the CRS daemons running; the IP changes that follow require CRS itself to be up.
Alternatively, before starting CRS, srvctl can be used to configure the services, instances, and applications not to start automatically with CRS:
-- Disable automatic startup of the database
srvctl disable database -d racdb
-- Disable automatic startup of an individual instance
srvctl disable instance -d racdb -i racdb1
srvctl disable instance -d racdb -i racdb2
-- Disable a service on an instance
srvctl disable service -d racdb -s service_name -i racdb1
-- Show the configuration details
srvctl config database -d racdb -a
The -a flag lists the details, for example:
[root@RAC1 bin]# ./srvctl config database -d racdb
rac1 racdb1 /home/oracle/10gR2/db
rac2 racdb2 /home/oracle/10gR2/db
[root@RAC1 bin]# ./srvctl config database -d racdb -a
rac1 racdb1 /home/oracle/10gR2/db
rac2 racdb2 /home/oracle/10gR2/db
DB_NAME: null
ORACLE_HOME: /home/oracle/10gR2/db
SPFILE: null
DOMAIN: null
DB_ROLE: null
START_OPTIONS: null
POLICY: AUTOMATIC
ENABLE FLAG: DB ENABLED
After disabling automatic startup with srvctl, the configuration differs as follows:
[root@RAC1 bin]# ./srvctl disable database -d racdb
[root@RAC1 bin]# ./srvctl config database -d racdb -a
rac1 racdb1 /home/oracle/10gR2/db
rac2 racdb2 /home/oracle/10gR2/db
DB_NAME: null
ORACLE_HOME: /home/oracle/10gR2/db
SPFILE: null
DOMAIN: null
DB_ROLE: null
START_OPTIONS: null
POLICY: MANUAL
ENABLE FLAG: DB DISABLED, INST DISABLED ON racdb1 racdb2
Verify the result:
crs_stat -t should list every resource as OFFLINE, and crsctl check crs should report:
CSS appears healthy
CRS appears healthy
EVM appears healthy
5) Change the IP configuration inside CRS
This step uses the oifcfg command set, which defines and modifies the attributes of the network interfaces the Oracle cluster uses: the subnet address, the netmask, and the interface type.
The oifcfg interface specification has the form:
interface_name/subnet:interface_type
Interfaces can be configured in two ways, global and node-specific. A global entry means the configuration is identical (symmetric) on every cluster node; a node-specific entry means that node's configuration differs from the others (asymmetric).
The subcommands are:
iflist: list the available network interfaces
getif: show a configured interface
setif: configure an interface
delif: delete an interface configuration
-- Show interfaces of type public
[root@rac1 bin]# ./oifcfg getif -type public
-- Delete the interface configuration
[root@rac1 bin]# ./oifcfg delif -global
-- Add the interface configuration
[root@rac1 bin]# ./oifcfg setif -global eth0/192.168.0.0:public
[root@rac1 bin]# ./oifcfg setif -global eth1/10.10.1.0:cluster_interconnect
Note: the last octet of the address is 0 because it denotes a subnet, not a single host.
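That subnet value can be derived from any host address by ANDing it with the netmask; a minimal shell sketch (illustrative, assuming the 255.255.255.0 masks used in this environment):

```shell
#!/bin/sh
# Compute the network address oifcfg expects: host IP AND netmask,
# octet by octet.
netaddr() {   # usage: netaddr <ip> <mask>
  OLDIFS=$IFS; IFS=.
  set -- $1 $2          # $1..$4 = IP octets, $5..$8 = mask octets
  IFS=$OLDIFS
  echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

netaddr 192.168.0.128 255.255.255.0   # public NIC  -> 192.168.0.0
netaddr 10.10.1.1 255.255.255.0       # private NIC -> 10.10.1.0
```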
The concrete steps for changing the IP configuration in CRS are as follows:
-- Show the current configuration
./oifcfg getif -global
eth0 138.30.0.0 global public
eth1 100.100.1.0 global cluster_interconnect
-- Delete the current configuration
./oifcfg delif -global eth0
./oifcfg delif -global eth1
-- Add the new public and private subnets
./oifcfg setif -global eth0/192.168.0.0:public eth1/10.10.1.0:cluster_interconnect
-- List the configuration after the change
./oifcfg iflist
eth0 192.168.0.0
eth1 10.10.1.0
-- Set the new VIP addresses
./srvctl modify nodeapps -n rac1 -A 192.168.0.201/255.255.255.0/eth0
./srvctl modify nodeapps -n rac2 -A 192.168.0.202/255.255.255.0/eth0
The steps above need to be executed on one node only.
6) Update the listener configuration
Update the IP-related entries in each node's listener configuration file.
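A minimal sketch of what the affected listener.ora entry on node 1 might look like after the change; the listener name follows this environment's conventions, and the exact content depends on how the listener was originally generated. Entries that reference /etc/hosts names (such as rac1-vip) pick up the new addresses automatically once hosts has been updated, while literal IPs must be edited by hand:

```
LISTENER_RAC1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.128)(PORT = 1521)(IP = FIRST))
    )
  )
```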
7) Restart CRS and verify the result
Restart CRS with crsctl, then confirm with crs_stat -t that every resource is ONLINE:
[root@RAC1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora.racdb.db application ONLINE ONLINE rac1
ora....b1.inst application ONLINE ONLINE rac1
ora....b2.inst application ONLINE ONLINE rac2
If the output matches the above, the change is complete.
II. Repairing a damaged CRS OCR and Voting Disk in RAC
Oracle Clusterware consists of two components: the Voting Disk and the OCR. The Voting Disk records node-membership information, i.e. which nodes are members of the RAC database, and is updated whenever nodes are added or removed. It must reside on shared storage, usually on raw devices. To keep it safe, multiple Voting Disks should be configured. The Voting Disk works on a majority-vote algorithm: when there are several Voting Disks, more than half of them must be usable for Clusterware to operate; if fewer than half survive, the cluster goes down immediately. Oracle therefore recommends an odd number of Voting Disks (1, 3, 5, ...), each around 20 MB in size.
The OCR records the configuration of the cluster members' resources: the database, ASM, instances, listeners, VIPs, and the other CRS resources. The CRS daemons take their management information from the OCR, which stores the configuration as a directory tree of key-value pairs. It is around 100 MB in size.
The information in the Voting Disk and the OCR is critical: if either is lost or corrupted, Clusterware cannot start, and with it the whole RAC stays down. Both must therefore be backed up thoroughly. This article describes recovering from OCR and Voting Disk corruption when no backup is available.
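For reference, both can be backed up while the cluster is healthy: the OCR logically with ocrconfig -export, and the voting disk with a plain dd copy. The sketch below demonstrates the dd technique on a scratch file; on a real system the input would be the raw device (/dev/raw/raw2 in this environment), and the backup paths are illustrative:

```shell
#!/bin/sh
# Sketch: voting-disk backup with dd, demonstrated on a scratch file.
VOTE_DEV=/tmp/votedisk.demo    # stand-in for /dev/raw/raw2
BACKUP=/tmp/votedisk.bak       # illustrative backup location

# Create a scratch "device" for the demo (skip this on a real system).
dd if=/dev/zero of="$VOTE_DEV" bs=1024 count=64 2>/dev/null

# The actual backup: a raw block-for-block copy.
dd if="$VOTE_DEV" of="$BACKUP" bs=1024 2>/dev/null
cmp "$VOTE_DEV" "$BACKUP" && echo "backup verified"

# OCR logical backup (run as root on a live cluster):
#   ocrconfig -export /backup/ocr.exp
```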
1) Preparing the test environment
Clear the contents of the CRS OCR and Voting Disk and remove the related CRS configuration.
With the CRS resources and CRS itself stopped, run the install/rootdelete.sh and install/rootdeinstall.sh scripts under CRS_HOME on both nodes, then wipe the raw devices behind the OCR and Voting Disk with dd:
dd if=/dev/zero of=/dev/raw/raw2 bs=1M count=100
dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=100
Note: never run these commands against a production system.
The raw devices behind the OCR and Voting Disk can be identified as follows:
[root@RAC1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 524024
Used space (kbytes) : 3948
Available space (kbytes) : 520076
ID : 566773499
Device/File Name : /dev/raw/raw1
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
[root@RAC1 bin]# ./crsctl query css votedisk
0. 0 /dev/raw/raw2
located 1 votedisk(s).
2) Rebuilding the OCR and Voting Disk
Run root.sh under CRS_HOME on node 1 and then on node 2. The output:
Node 1:
[root@RAC1 crs]# /home/oracle/10gR2/crs/root.sh
WARNING: directory '/home/oracle/10gR2' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/home/oracle/10gR2' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
CSS is inactive on these nodes.
rac2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
Node 2:
[root@RAC2 ~]# /home/oracle/10gR2/crs/root.sh
WARNING: directory '/home/oracle/10gR2' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/home/oracle/10gR2' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
rac2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Creating VIP application resource on (2) nodes...
Creating GSD application resource on (2) nodes...
Creating ONS application resource on (2) nodes...
Starting VIP application resource on (2) nodes...
Starting GSD application resource on (2) nodes...
Starting ONS application resource on (2) nodes...
Done.
When root.sh completes, CRS starts automatically; crs_stat -t then shows:
[root@RAC2 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
Only the gsd, ons, and vip application resources have been created so far.
3) Register the remaining resources with the cluster
Register the remaining resources with srvctl:
-- Register the database and its instances with CRS
srvctl add database -d racdb -o /home/oracle/10gR2/db
srvctl add instance -d racdb -i racdb1 -n rac1
srvctl add instance -d racdb -i racdb2 -n rac2
-- Register the ASM instances with CRS
srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/10gR2/db
srvctl add asm -n rac2 -i +ASM2 -o /home/oracle/10gR2/db
-- Set the dependencies between the database instances and the ASM instances
[oracle@RAC1 ~]$ srvctl modify instance -d racdb -i racdb1 -s +ASM1
[oracle@RAC1 ~]$ srvctl modify instance -d racdb -i racdb2 -s +ASM2
-- Create the service in CRS
srvctl add service -d racdb -s service_name -r racdb1 -a racdb2 -P BASIC
Parameter reference:
-s: service name
-r: preferred instance(s)
-a: available (backup) instance(s)
-P: TAF policy; valid values are NONE (the default), BASIC, and PRECONNECT
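For completeness, a service created with TAF policy BASIC is typically matched on the client side by a tnsnames.ora entry along these lines; the alias, RETRIES, and DELAY values below are illustrative, and service_name is the same placeholder used in the command above:

```
RACDB_SRV =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVICE_NAME = service_name)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 5))
    )
  )
```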
4) Register the node listeners with the cluster
On each of the two nodes, create a listener cap file in the crs/public directory under $ORACLE_HOME:
vi $ORACLE_HOME/crs/public/ora.rac1.LISTENER_RAC1.lsnr.cap
with the following content (node 1 shown as an example):
NAME=ora.rac1.LISTENER_RAC1.lsnr
TYPE=application
ACTION_SCRIPT=/home/oracle/10gR2/db/bin/racgwrap
CHECK_INTERVAL=600
ACTIVE_PLACEMENT=1
DESCRIPTION=CRS application for listener on node
HOSTING_MEMBERS=rac1
PLACEMENT=favored
REQUIRED_RESOURCES=ora.rac1.vip
Then register the listener with the cluster via crs_register:
crs_register ora.rac1.LISTENER_RAC1.lsnr
Finally, start the resources:
[oracle@RAC2 ~]$ crs_start -all
5) Verify the result
Check with crs_stat -t: once all of the registered resources show ONLINE, the OCR and Voting Disk have been restored.
[root@RAC1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora.racdb.db application ONLINE ONLINE rac1
ora....b1.inst application ONLINE ONLINE rac1
ora....b2.inst application ONLINE ONLINE rac2
Source: ITPUB blog, http://blog.itpub.net/24145320/viewspace-701552/. Please credit the source when reposting.