11.2.0.3 RAC I/O Problem

http://t.askmaclean.com/thread-897-1-1.html

Environment:

1. OS:

[root@11grac1 ~]# cat /etc/issue
Oracle Linux Server release 6.3
Kernel \r on an \m

2. Oracle database version:

[oracle@11grac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Fri Feb 15 17:39:25 2013

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> desc v$version
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 BANNER                                             VARCHAR2(80)

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

3. Memory on each RAC node:

[root@11grac1 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          3016       2385        631          0        134        879
-/+ buffers/cache:       1372       1644
Swap:         3071        137       2934

4. The problem and its symptoms

[root@11grac1 ~]# iostat 2
Linux 2.6.39-300.26.1.el6uek.x86_64 (11grac1) 02/15/2013 _x86_64_ (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.29    0.00   10.39   18.32    0.00   58.00

Device:             tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              134.07       144.57     18631.06    3950612  509135788
sdb                1.36         1.20         3.59      32673      98077
sdc                1.38         1.03         3.81      28147     104055
sdd                1.39         1.16         3.82      31581     104383
sde                3.83        89.54        20.85    2446835     569831
sdf                2.22        38.06        20.85    1040170     569831
sdg                0.85         1.64        15.23      44734     416167

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.28    0.00    7.27    8.27    0.00   74.19

Device:             tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              168.50         0.00     22432.00          0      44864
sdb                1.50         0.00         5.00          0         10
sdc                1.50         0.00         5.00          0         10
sdd                1.50         0.00         5.00          0         10
sde                3.50        80.00        20.00        160         40
sdf                4.00        96.00        20.00        192         40
sdg                1.00         0.00        20.00          0         40

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          18.95    0.00   14.96    6.98    0.00   59.10

Device:             tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              209.50         0.00     28736.00          0      57472
sdb                1.50         0.00         5.00          0         10
sdc                1.50         0.00         5.00          0         10
sdd                1.50         0.00         5.00          0         10
sde                2.00        48.00         4.00         96          8
sdf                0.50         0.00         4.00          0          8
sdg                0.50         0.00         4.00          0          8

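iostat shows sda taking sustained heavy writes, roughly 18,600 to 28,700 blocks per second. The cause was a process named ologgerd, which kept generating a large amount of disk I/O for a long time.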
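The post does not show how ologgerd was identified as the writer. One possible way, sketched here under assumptions, is to rank processes by the cumulative write_bytes counter in /proc/<pid>/io. This assumes a kernel built with per-task I/O accounting and enough privilege (usually root) to read other users' entries. The function name top_writers and its optional root-directory argument are hypothetical, introduced only for illustration.

```shell
# top_writers [proc_root] [count]
# Print "<write_bytes> <pid> <command>" for each readable process,
# largest cumulative writers first. The counter is a total since the
# process started, not a rate, so compare two runs to see current activity.
top_writers() {
  proc_root=${1:-/proc}
  for d in "$proc_root"/[0-9]*; do
    # Skip processes whose io file we are not allowed to read.
    [ -r "$d/io" ] || continue
    wb=$(awk '/^write_bytes:/ {print $2}' "$d/io" 2>/dev/null)
    [ -n "$wb" ] || continue
    comm=$(cat "$d/comm" 2>/dev/null)
    printf '%s %s %s\n' "$wb" "${d##*/}" "$comm"
  done | sort -rn | head -n "${2:-5}"
}
```

Running `top_writers` as root a few seconds apart and comparing the numbers gives a rough picture of which process is writing. Tools such as `pidstat -d` (from sysstat) or `iotop` report per-process rates directly where they are installed.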

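So what is the ologgerd process? It is the cluster logger daemon of Oracle's Cluster Health Monitor (CHM). References: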

An article by someone else: http://blog.youkuaiyun.com/jjwspj/article/details/7857106

Oracle official documentation: http://docs.oracle.com/cd/E11882_01/rac.112/e16794/troubleshoot.htm#autoId0

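Cluster Health Monitor (CHM) is a tool provided by Oracle that automatically collects operating system resource usage (CPU, memory, swap, processes, I/O, network, and so on). CHM collects data once per second.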

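These system resource metrics are very helpful when diagnosing node reboots, hangs, instance evictions, and performance problems in a cluster. Users can also rely on CHM to spot problems such as high system load or abnormal memory usage early, before they turn into something more serious.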
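The data CHM collects can be inspected with the oclumon utility that ships in the Grid Infrastructure home. The following is a minimal sketch, not taken from the original post: the wrapper name chm_snapshot is hypothetical, and it assumes oclumon is reachable through PATH for the current user.

```shell
# chm_snapshot: show where CHM keeps its repository and dump recent samples.
chm_snapshot() {
  if ! command -v oclumon >/dev/null 2>&1; then
    echo "oclumon not found in PATH (is the Grid home bin directory in PATH?)" >&2
    return 1
  fi
  # Repository master node, configured size, and on-disk path.
  oclumon manage -get master repsize reppath
  # Node view for all cluster nodes over the last five minutes.
  oclumon dumpnodeview -allnodes -last "00:05:00"
}
```

The repository path reported here is where ologgerd writes, which helps confirm whether it lives on the disk that iostat shows as busy.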

CHM is installed automatically with the following software:
Oracle Grid Infrastructure 11.2.0.2 and later for Linux (excluding Linux Itanium) and Solaris (SPARC 64 and x86-64)
Oracle Grid Infrastructure 11.2.0.3 and later for AIX and Windows (excluding Windows Itanium).
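To check whether a given Linux cluster falls in the range above, the active Clusterware version can be compared against 11.2.0.2. The sketch below parses the text printed by `crsctl query crs activeversion`. The function name chm_bundled_on_linux is hypothetical, and it relies on GNU `sort -V` for the version comparison.

```shell
# chm_bundled_on_linux "<output of: crsctl query crs activeversion>"
# Returns 0 if the bracketed version is 11.2.0.2 or later,
# 1 if it is older, 2 if no version could be parsed.
chm_bundled_on_linux() {
  # Extract the version between square brackets, e.g. [11.2.0.3.0].
  ver=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9.]*\)\].*/\1/p')
  [ -n "$ver" ] || return 2
  # The smaller of the two versions sorts first under sort -V.
  lowest=$(printf '%s\n%s\n' "11.2.0.2" "$ver" | sort -V | head -n 1)
  [ "$lowest" = "11.2.0.2" ]
}
```

Typical use would be `chm_bundled_on_linux "$(crsctl query crs activeversion)" && echo "CHM is expected on this cluster"`.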

Solution:

The fix is to install the 11.2.0.3.1 PSU: p13348650_112030_Linux-x86-64.zip

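However, I have no Metalink account and so cannot download and apply the patch.

The only remaining option was to stop the CHM service on every node: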

[grid@11grac2 ~]$ crsctl status res -t -init | grep ora.crf
ora.crf

[grid@11grac2 ~]$ crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on '11grac2'
CRS-2677: Stop of 'ora.crf' on '11grac2' succeeded
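The transcript above covers only 11grac2, while the text says CHM has to be stopped on all nodes. A minimal sketch of repeating the same command over SSH follows. The helper name stop_chm_on_nodes and the DRY_RUN switch are hypothetical, and it assumes passwordless SSH as the grid user and that crsctl is in PATH for non-interactive sessions. By default it only prints the commands it would run.

```shell
# stop_chm_on_nodes node1 [node2 ...]
# With DRY_RUN=1 (the default) only print the commands; set DRY_RUN=0 to run them.
stop_chm_on_nodes() {
  for node in "$@"; do
    cmd="crsctl stop res ora.crf -init"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh grid@$node $cmd"
    else
      ssh "grid@$node" "$cmd"
    fi
  done
}
```

For this cluster the call would be `DRY_RUN=0 stop_chm_on_nodes 11grac1 11grac2`, followed by `crsctl status res ora.crf -init` on each node to confirm the resource is offline. A stopped ora.crf may start again when the Clusterware stack restarts. Write-ups of this workaround commonly add `crsctl modify res ora.crf -attr ENABLED=0 -init` to keep it disabled; that step is not part of the original post and should be verified against Oracle's documentation before use.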