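Background
The RMAN cleanup script failed because one environment variable was set incorrectly, and the disk filled up.
Disk space was freed and the RMAN cleanup was run.
Afterwards, the manager restarted the database instance, but the ASM instance was not shut down.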
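The write-up does not say which variable was wrong, so the names used below (ORACLE_HOME, ORACLE_SID) are assumptions, and require_env is a hypothetical helper. As a minimal sketch, a cleanup script could refuse to run when a required variable is unset or empty:

```shell
#!/bin/sh
# Hypothetical guard for the top of a cleanup script.
# Returns 0 only if every named variable is set and non-empty.
require_env() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in $name (POSIX sh).
        eval "val=\${$name:-}"
        if [ -z "$val" ]; then
            echo "missing environment variable: $name" >&2
            return 1
        fi
    done
    return 0
}
```

A script would call it before doing any work, for example `require_env ORACLE_HOME ORACLE_SID || exit 1` (variable names assumed), so that a bad environment stops the job instead of letting backups pile up.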
Environment
OS: AIX
Database: Oracle 10g
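Troubleshooting
In the afternoon, the customer reported that they could not log in on the web page.
The first step was to check the cluster.
1. Check the cluster status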
#crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
2. Try to start CRS (as root)
#crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
According to the message, Oracle High Availability Services is already active.
3. Check the CRS services
#crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
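To pull only the failing components out of output like the above, a small filter can help. This is a sketch: the function name crs_failed_codes is hypothetical, and it only recognises the two failure wordings that appear in this output.

```shell
#!/bin/sh
# Hypothetical filter: reads `crsctl check crs` output on stdin and
# prints the message code of every line that reports a failure.
crs_failed_codes() {
    grep -E 'Cannot communicate|Communications failure' | sed 's/:.*//'
}
```

It would be used as `crsctl check crs | crs_failed_codes`. For the output shown above it prints CRS-4535, CRS-4530 and CRS-4534, one per line, which makes it clear that only the top-level HA service is up while CRS, CSS and EVM are unreachable.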
4. Oracle High Availability Services itself looks online, so quickly check whether the VIP is healthy
Check the local hosts file
#cat /etc/hosts
10.205.128.21 ORA1
10.205.128.22 ORA2
100.100.100.21 ORA1-priv
100.100.100.22 ORA2-priv
10.205.128.25 ORA1-vip
10.205.128.26 ORA2-vip
10.205.128.27 ORA-scan
#ifconfig -a
en0: flags=1e084863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
inet 10.205.128.22 netmask 0xffffffc0 broadcast 10.205.128.63
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
The VIP is not on this node; only the physical IP is left.
Conclusion: the VIP has failed over to another node.
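The manual comparison of /etc/hosts against ifconfig can be sketched as two small helpers. The function names host_ip and vip_present are hypothetical; both read their input on stdin so they make no assumption about where the data comes from.

```shell
#!/bin/sh
# Hypothetical helper: reads hosts-file content on stdin and prints
# the address whose second field equals the given host name.
host_ip() {
    awk -v h="$1" '$2 == h { print $1 }'
}

# Hypothetical helper: reads `ifconfig -a` output on stdin and succeeds
# if the given address is configured. The trailing space in the pattern
# prevents 10.205.128.2 from matching 10.205.128.22.
vip_present() {
    grep -F -q -- "inet $1 "
}
```

On this node (ORA2) the check would be `vip=$(host_ip ORA2-vip < /etc/hosts); ifconfig -a | vip_present "$vip" || echo "VIP $vip is not on this node"`. With the output shown above, 10.205.128.26 is absent, so the message is printed.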
Solution
1. Shut down the database instance
#su - oracle
$sqlplus / as sysdba
SQL>shu immediate
2. Shut down ASM
#su - grid
$sqlplus / as sysasm
SQL>shu immediate
3. Try to restart CRS
#crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.
#crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
Trying to start it again also fails.
#crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
#crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
4. Kill the CRS processes directly
Finally, I found a workaround on MOS (My Oracle Support): kill the CRS processes manually.
#ps -fea | grep ohasd.bin | grep -v grep
root 30541310 1 1 16:19:17 - 3:33 /u01/grid/11.2.0/bin/ohasd.bin reboot
#ps -fea | grep gipcd.bin | grep -v grep
grid 2098638 1 1 16:19:40 - 2:29 /u01/grid/11.2.0/bin/gipcd.bin
#ps -fea | grep mdnsd.bin | grep -v grep
grid 31916408 1 0 16:19:36 - 0:01 /u01/grid/11.2.0/bin/mdnsd.bin
#ps -fea | grep gpnpd.bin | grep -v grep
grid 5177660 1 0 16:19:38 - 0:27 /u01/grid/11.2.0/bin/gpnpd.bin
#ps -fea | grep evmd.bin | grep -v grep
grid 3342370 1 0 16:33:17 - 1:28 /u01/grid/11.2.0/bin/evmd.bin
#ps -fea | grep crsd.bin | grep -v grep
root 3408064 1