IOT/Abort trap Failure

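On AIX 5.3, node 2 of an Oracle 10.2.0.4 RAC could not accept database connections and reported an IOT/Abort trap error. Analysis pointed to either disk permissions or file system space. The /software mount point turned out to be 100% full, with the Oracle software and the transaction logs sharing the same directory. After 30 GB of logs were cleaned up and the ONS and GSD services were started, business returned to normal. GSD coordinates with the cluster manager to carry out administrative tasks, while ONS provides a publish and subscribe service for FAN events.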
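  • Symptoms:

New connections to RAC node 2 cannot connect to the database, and existing connections report that space cannot be extended.

crs_stat -t -v fails with IOT/Abort trap.

Running crs_stat -t -v on node 1 shows the ONS and GSD services of node 2 as OFFLINE.

$ errpt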

 

C69F5C9B   0719173614 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
C69F5C9B   0719173514 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
F7FA22C9   0719173514 I O SYSJ2          UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
$ errpt -a | more
PCSS/SPI2 FLDS/crs_stat. SIG/6 FLDS/__dftdt__ VALU/248
---------------------------------------------------------------------------
LABEL:          CORE_DUMP
IDENTIFIER:     C69F5C9B


Date/Time:       Sun Jul 20 01:37:40 BEIST 2014
Sequence Number: 15668
Machine Id:      00C5B7E44C00
Node Id:         rac02
Class:           S
Type:            PERM
Resource Name:   SYSPROC         


Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED


Probable Causes
SOFTWARE PROGRAM


User Causes
USER GENERATED SIGNAL


        Recommended Actions
        CORRECT THEN RETRY


Failure Causes
SOFTWARE PROGRAM


        Recommended Actions
        RERUN THE APPLICATION PROGRAM
        IF PROBLEM PERSISTS THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE


Detail Data
SIGNAL NUMBER
           6
USER'S PROCESS ID:
               4296744
FILE SYSTEM SERIAL NUMBER
           1
INODE NUMBER
           2
CORE FILE NAME
//core
PROGRAM NAME
crs_stat.bin
STACK EXECUTION DISABLED
           0
COME FROM ADDRESS REGISTER
_ptrgl 8


PROCESSOR ID
  hw_fru_id: N/A
  hw_cpu_id: N/A


ADDITIONAL INFORMATION
pthread_k 88
??
_p_raise 6C
raise 38
abort B8
myabort__ 10
-03--80- 74
terminate 10
??
__dftdt__ 248
asic_ostr 64C
-03-?! 12C
 7F8
procArgs_ 364
 1B0
__start 94


Symptom Data
REPORTABLE
1
INTERNAL ERROR
0
SYMPTOM CODE
 

 

  • Environment: AIX 5.3, Oracle 10.2.0.4 RAC
  • Analysis

The errors above point to one of two problems: disk permissions or file system space.
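When the error log is long, a per-identifier tally can help decide where to look first. The following is a minimal sketch, assuming the usual errpt summary layout (identifier, timestamp, type, class, resource name, description); the function name `errpt_counts` is hypothetical and not part of AIX.

```shell
#!/bin/sh
# Count errpt summary entries per identifier and description.
# Reads `errpt` summary output on stdin so it can also be run on a saved log.
errpt_counts() {
    awk '$1 != "IDENTIFIER" && NF >= 6 {
        # Assumed layout: the description starts at field 6 and runs to end of line
        desc = $6
        for (i = 7; i <= NF; i++) desc = desc " " $i
        count[$1 " " desc]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}

# Typical use on a live AIX system:
#   errpt | errpt_counts
# Detailed entries for a single identifier can then be read with:
#   errpt -a -j F7FA22C9
```

In this incident the tally would surface the SYSJ2 "UNABLE TO ALLOCATE SPACE IN FILE SYSTEM" entry next to the core dumps, which is the hint toward file system space.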

The SA checked the disk permissions first and found no problem.

A check of disk space then showed that the /software mount point was 100% full.
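A check like this can be scripted so that a full mount point stands out immediately. This is a minimal sketch, assuming POSIX `df -P` output; the function name `full_filesystems` and the threshold values are illustrative and not taken from the original incident.

```shell
#!/bin/sh
# Print mount points whose usage is at or above a threshold (percent).
# Reads `df -P` style output on stdin so it can be tested with canned data.
full_filesystems() {
    threshold="${1:-90}"
    # POSIX df -P columns: Filesystem, blocks, Used, Available, Capacity, Mounted on
    awk -v t="$threshold" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= t + 0) print $6, $5
    }'
}

# Typical use on a live system:
#   df -P | full_filesystems 95
```

Reading from stdin keeps the parsing separate from the `df` call; mount points containing spaces are not handled by this sketch.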

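Both Oracle Clusterware and the database are installed under that mount point. After talking with the users, we found that transaction logs were also stored there; cleaning up the relevant logs freed 30 GB.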
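Before deleting anything, it helps to see which directories or files under the mount point actually hold the space. A minimal sketch, assuming standard `du`, `sort`, and `head`; the function name `largest_entries` is hypothetical, and the original post does not say how the logs were located.

```shell
#!/bin/sh
# List the N largest entries directly under a directory, sizes in KB.
largest_entries() {
    dir="$1"
    count="${2:-10}"
    # du -sk reports each entry's total size in kilobytes;
    # sort numerically in reverse so the largest entry comes first
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Typical use:
#   largest_entries /software 10
```

The output is only a starting point: whether a given log can be removed or must be archived is a decision for the application owner, as it was in this case.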

The ONS and GSD services were then started:

srvctl start nodeapps -n rac2

Business returned to normal.
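To confirm the recovery, the resource status can be checked again for anything still OFFLINE. The sketch below assumes the tabular `crs_stat -t` layout in which each row carries a Target and a State column; the function name `offline_resources` is hypothetical, and the parsing rule (the last ONLINE/OFFLINE/UNKNOWN token on a row is the state) is an assumption, not documented Oracle behavior.

```shell
#!/bin/sh
# Print the names of resources whose State column is OFFLINE.
# Reads `crs_stat -t` (or `crs_stat -t -v`) output on stdin.
offline_resources() {
    awk '{
        state = ""
        # Assumed layout: Name Type [...] Target State [Host];
        # the last status token on the row is treated as the State column
        for (i = 2; i <= NF; i++)
            if ($i == "ONLINE" || $i == "OFFLINE" || $i == "UNKNOWN") state = $i
        if (state == "OFFLINE") print $1
    }'
}

# Typical use on a cluster node:
#   crs_stat -t | offline_resources
# Node applications can also be queried directly:
#   srvctl status nodeapps -n rac2
```

An empty result suggests that every registered resource reports ONLINE; it does not replace testing a real client connection.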

 

References:

  • GSD

The Global Services Daemon (GSD) runs on each node with one GSD process per node. The GSD coordinates with the cluster manager to receive requests from clients such as the DBCA, EM, and the SRVCTL utility to execute administrative job tasks such as instance startup or shutdown. The GSD is not an Oracle instance background process and is therefore not started with the Oracle instance.

  • ONS

A publish and subscribe service for communicating information about all FAN events.
