Oracle RAC CSS Timeout Computation and the misscount and disktimeout Parameters

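1. Overview

In an earlier article:

Some Concepts and Principles of RAC

http://blog.youkuaiyun.com/tianlesoftware/article/details/5331067

I mentioned that OCSSD is the most critical Clusterware process: if it fails, the system reboots. This process provides the CSS (Cluster Synchronization Service) service. CSS monitors the cluster state in real time through several heartbeat mechanisms and provides basic cluster services such as split-brain protection.

CSS has two heartbeat mechanisms: the network heartbeat over the private network, and the disk heartbeat through the voting disk.

Each heartbeat has a maximum allowed delay. For the disk heartbeat this delay is called IOT (I/O Timeout); for the network heartbeat it is called MC (Misscount). Both parameters are in seconds, and by default IOT is greater than MC. By default Oracle determines both values automatically, and changing them is not recommended.

The parameter values can be viewed with the following commands:

$ crsctl get css disktimeout

$ crsctl get css misscount

For example: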

[oracle@rac1 ~]$ crsctl get css disktimeout

200

[oracle@rac1 ~]$ crsctl get css misscount

60

These are the default values of the two parameters.

2. Related Articles on MOS

How to start/stop the 10g CRS ClusterWare [ID 309542.1]

10g RAC: Steps To Increase CSS Misscount, Reboottime and Disktimeout [ID 284752.1]

CSS Timeout Computation in Oracle Clusterware [ID 294430.1]

RAC Assurance Support Team: RAC and Oracle Clusterware Starter Kit and Best Practices (Generic) [ID 810394.1]

2.1 Steps to modify CSS misscount:

1) Shut down CRS on all but one node. For exact steps use Note 309542.1

2) Execute crsctl as root to modify the misscount:

$ORA_CRS_HOME/bin/crsctl set css misscount <n>

where <n> is the maximum I/O latency to the voting disk + 1 second

3) Reboot the node where the adjustment was made

4) Start all other nodes shut down in step 1

With Patch:4896338 for 10.2.0.1 there are two additional settings that can be tuned. This change is incorporated into the 10.2.0.2 and 10.1.0.6 patchsets.

The following are only relevant on 10.2.0.1 with Patch:4896338. In addition to misscount, CSS now has two more parameters:

1) reboottime (default 3 seconds) - the amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted (i.e. how long it takes for the machine to completely shut down when you do a reboot).

2) disktimeout (default 200 seconds) - the maximum amount of time allowed for a voting file I/O to complete; if this time is exceeded the voting disk will be marked as offline. Note that this is also the amount of time that will be required for initial cluster formation, i.e. when no nodes have previously been up and in a cluster.

$CRS_HOME/bin/crsctl set css reboottime <r> [-force] (<r> is seconds)

$CRS_HOME/bin/crsctl set css disktimeout <d> [-force] (<d> is seconds)

Confirm the new css misscount setting via ocrdump

2.2 CSS Timeout Computation in Oracle Clusterware

2.2.1 MISSCOUNT DEFINITION AND DEFAULT VALUES
The CSS misscount parameter represents the maximum time, in seconds, that a network heartbeat can be missed before entering into a cluster reconfiguration to evict the node. The following are the default values for the misscount parameter and their respective versions when using Oracle Clusterware* in seconds:

OS         10g (R1 & R2)   11g
Linux      60              30
Unix       30              30
VMS        30              30
Windows    30              30

* CSS misscount default value when using vendor (non-Oracle) clusterware is 600 seconds. This is to allow the vendor clusterware ample time to resolve any possible split brain scenarios.

On AIX platforms with HACMP starting with 10.2.0.3 BP#1, the misscount is 30. This is documented in Note 551658.1
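As a quick reference, the defaults in the table above can be written as a small lookup. This is only an illustrative sketch: the function name `default_misscount` and the `vendor_clusterware` flag are invented here and are not part of any Oracle tool, and the AIX/HACMP exception is deliberately not modeled.

```python
# Default CSS misscount values in seconds, copied from the table above.
DEFAULT_MISSCOUNT = {
    ("Linux", "10g"): 60,
    ("Linux", "11g"): 30,
    ("Unix", "10g"): 30,
    ("Unix", "11g"): 30,
    ("VMS", "10g"): 30,
    ("VMS", "11g"): 30,
    ("Windows", "10g"): 30,
    ("Windows", "11g"): 30,
}

# Default when vendor (non-Oracle) clusterware is in use, per the footnote.
VENDOR_CLUSTERWARE_MISSCOUNT = 600


def default_misscount(os_name, release, vendor_clusterware=False):
    """Return the documented default misscount in seconds.

    Hypothetical helper for illustration only; the AIX + HACMP case
    (30 seconds from 10.2.0.3 BP#1) is not handled here.
    """
    if vendor_clusterware:
        return VENDOR_CLUSTERWARE_MISSCOUNT
    return DEFAULT_MISSCOUNT[(os_name, release)]


if __name__ == "__main__":
    print(default_misscount("Linux", "10g"))
    print(default_misscount("Linux", "10g", vendor_clusterware=True))
```

On a real cluster the authoritative value is always the one reported by `crsctl get css misscount`.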

2.2.2 CSS HEARTBEAT MECHANISMS AND THEIR INTERRELATIONSHIP
The synchronization services component (CSS) of the Oracle Clusterware maintains two heartbeat mechanisms, 1.) the disk heartbeat to the voting device and 2.) the network heartbeat across the interconnect, which establish and confirm valid node membership in the cluster.

Both of these heartbeat mechanisms have an associated timeout value. The disk heartbeat has an internal I/O timeout interval (DTO, Disk TimeOut), in seconds, within which an I/O to the voting disk must complete. The misscount parameter (MC), as stated above, is the maximum time, in seconds, that a network heartbeat can be missed. The disk heartbeat I/O timeout interval is directly related to the misscount parameter setting. There has been some variation in this relationship between versions as described below:

Version                                        Disk heartbeat I/O timeout
9.x.x.x                                        NOTE, MISSCOUNT WAS A DIFFERENT ENTITY IN THIS RELEASE
10.1.0.2                                       No one should be on this version
10.1.0.3                                       DTO = MC - 15 seconds
10.1.0.4                                       DTO = MC - 15 seconds
10.1.0.4 + Unpublished Bug 3306964             DTO = MC - 3 seconds
10.1.0.4 with CRS II Merge patch               DTO = Disktimeout (defaults to 200 seconds) normally, OR Misscount seconds only during initial cluster formation or slightly before reconfiguration
10.1.0.5                                       IOT = MC - 3 seconds
10.2.0.1 + Fix for unpublished Bug 4896338     IOT = Disktimeout (defaults to 200 seconds) normally, OR Misscount seconds only during initial cluster formation or slightly before reconfiguration
10.2.0.2                                       Same as above (10.2.0.1 with Patch Bug:4896338)
10.1 - 11.1                                    During node join and leave (reconfiguration) in a cluster we need to reconfigure; in that particular case we use the Short Disk TimeOut (SDTO), which in all versions is SDTO = MC - reboottime (usually 3 seconds)

Misscount drives cluster membership reconfigurations and directly affects the availability of the cluster. In most cases, the default settings for MC should be acceptable. Modifying the default value of misscount not only influences the timeout interval for the I/O to the voting disk, but also influences the tolerance for missed network heartbeats across the interconnect.
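The version-dependent relationships listed above can be sketched as a small calculation. This is a rough illustration, not Oracle's implementation: the version labels used as dictionary keys, the function names and the `forming_or_reconfiguring` flag are all assumptions made for this example.

```python
# Versions where the disk I/O timeout is misscount minus a fixed offset (seconds).
# The key strings are labels chosen for this sketch, not official identifiers.
FIXED_OFFSET = {
    "10.1.0.3": 15,
    "10.1.0.4": 15,
    "10.1.0.4+bug3306964": 3,
    "10.1.0.5": 3,
}

# Versions where disktimeout applies in steady state.
DISKTIMEOUT_VERSIONS = {
    "10.1.0.4+merge",
    "10.2.0.1+bug4896338",
    "10.2.0.2",
}


def voting_io_timeout(version, misscount, disktimeout=200,
                      forming_or_reconfiguring=False):
    """Return the voting disk I/O timeout in seconds for a given version."""
    if version in FIXED_OFFSET:
        return misscount - FIXED_OFFSET[version]
    if version in DISKTIMEOUT_VERSIONS:
        # disktimeout normally; misscount only during initial cluster
        # formation or slightly before a reconfiguration.
        return misscount if forming_or_reconfiguring else disktimeout
    raise ValueError("version not covered by the table: %s" % version)


def short_disk_timeout(misscount, reboottime=3):
    """SDTO = MC - reboottime, used during node join/leave (10.1 - 11.1)."""
    return misscount - reboottime


if __name__ == "__main__":
    print(voting_io_timeout("10.1.0.3", 60))   # MC - 15
    print(voting_io_timeout("10.2.0.2", 60))   # disktimeout in steady state
    print(short_disk_timeout(60))              # MC - reboottime
```

With the 10g Linux default of misscount = 60, an unpatched 10.1.0.3 cluster allows only 45 seconds for a voting disk I/O, while a 10.2.0.2 cluster allows the full 200 seconds of disktimeout in steady state.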

2.2.3 LONG LATENCIES TO THE VOTING DISKS
If I/O latencies to the voting disk are greater than the default DTO calculations noted above, the cluster may experience CSS node evictions depending on (a) the Oracle Clusterware (CRS) version, (b) whether the merge patch has been applied and (c) the state of the cluster. More details on this are covered in the section "Change in Behavior with CRS Merge PATCH (4896338 on 10.2.0.1)".

These latencies can be attributed to any number of problems in the I/O subsystem or problems with any component in the I/O path. The following is a non-exhaustive list of reported problems which resulted in CSS node eviction due to latencies to the voting disk longer than the default Oracle Clusterware I/O timeout value (DTO):

1. QLogic HBA cards with a Link Down Timeout greater than the default misscount.

2. Bad cables to the SAN/storage array that affect I/O latencies

3. SAN switch (like Brocade) failover latency greater than the default misscount

4. EMC Clariion Array when trespassing the SP to the backup SP takes longer than the default misscount

5. EMC PowerPath path error detection and I/O repost and redirect greater than the default misscount

6. NetApp Cluster (CFO) failover latency greater than the default misscount

7. Sustained high CPU load which affects the CSSD disk ping monitoring thread

8. Poor SAN network configuration that creates latencies in the I/O path.

The most common problems relate to multi-path I/O software drivers, and the reconfiguration times resulting from a failure in the I/O path. Hardware and (re)configuration issues that introduce these latencies should be corrected. Incompatible failover times with underlying OS, network or storage hardware or software may be addressed given a complete understanding of the considerations listed below.

Misscount should NOT be modified to work around the above-mentioned issues. Oracle Support recommends that you apply the latest patchset, which changes the CSS behaviour. More details are covered in the next section.

2.2.4 Change in Behavior with Bug:4896338 applied on top of 10.2.0.1
Starting with 10.2.0.1 + Bug:4896338, CSS will not evict the node from the cluster due to (DTO) I/O to the voting disk taking more than misscount seconds unless it is during the initial cluster formation or slightly before reconfiguration.
So if we have N nodes in a cluster and one of the nodes takes more than misscount seconds to access the voting disk, the node will not be evicted as long as the access to the voting disk is completed within disktimeout seconds. Consequently, with this patch there is no need to increase the misscount at all.

Additionally, this merge patch introduces Disktimeout, which is the amount of time that a lack of disk ping to the voting disk(s) will be tolerated.

Note: applying the patch will not change your value for Misscount.

The table below explains the conditions under which the eviction will occur

Network Ping                          Disk Ping                                                              Reboot
Completes within Misscount seconds    Completes within Misscount seconds                                     N
Completes within Misscount seconds    Takes more than Misscount seconds but less than Disktimeout seconds    N
Completes within Misscount seconds    Takes more than Disktimeout seconds                                    Y
Takes more than Misscount seconds     Completes within Misscount seconds                                     Y

* By default Misscount is less than Disktimeout seconds
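The eviction table can be read as a simple decision rule, sketched below. This is a minimal illustration under stated assumptions: it covers only the steady state after the merge patch (not initial cluster formation or reconfiguration), the function name `node_reboots` is made up for this example, and treating any network ping longer than misscount as a reboot regardless of disk ping time extends the table's last row to cases the table does not list.

```python
def node_reboots(network_ping_s, disk_ping_s, misscount=60, disktimeout=200):
    """Return True if the node would be rebooted (evicted), per the table.

    Steady-state behaviour only. 'Within' a timeout is taken to mean
    less than or equal to it, which is an assumption of this sketch.
    """
    if network_ping_s > misscount:
        # Missed network heartbeat beyond misscount: eviction.
        return True
    # Network heartbeat is fine; only a disk ping beyond disktimeout evicts.
    return disk_ping_s > disktimeout


if __name__ == "__main__":
    rows = [(10, 10), (10, 120), (10, 250), (90, 10)]
    for net, disk in rows:
        verdict = "Y" if node_reboots(net, disk) else "N"
        print("network=%ss disk=%ss reboot=%s" % (net, disk, verdict))
```

The second row is the case the patch was written for: a voting disk I/O that takes longer than misscount but completes within disktimeout no longer causes an eviction.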

2.2.5 CONSIDERATIONS WHEN CHANGING MISSCOUNT FROM THE DEFAULT VALUE

1. Customers drive SLA and cluster availability. The customer ultimately defines service levels and availability for the cluster. Before recommending any change to misscount, the full impact of that change should be described and the impact to cluster availability measured.

2. Customers may have timeout and retry logic in their applications. The impact of delaying reconfiguration may cause 'artificial' timeouts of the application, reconnect failures and subsequent logon storms.

3. Misscount timeout values are version dependent and are subject to change. As we have seen, misscount calculations are variable between releases and between versions within a release. Creating a false dependency on misscount calculation in one version may not be appropriate for later versions.

4. Internal I/O timeout interval (DTO) algorithms may change in later releases. As stated above, there exists a direct relationship between the internal I/O timeout interval and misscount. This relationship is subject to change in later releases.

5. An increase in misscount to compensate for I/O latencies directly affects reconfiguration times for network failures. The network heartbeat is the primary indicator of connectivity within the cluster. Misscount is the tolerance level of missed 'check ins' that trigger cluster reconfiguration. Increasing misscount will prolong the time to take corrective action in the event of network failure or other anomalies affecting the availability of a node in the cluster. This directly affects cluster availability.

6. Changing misscount to work around voting disk latencies is temporary: once the underlying disk latency is corrected, misscount needs to be set back to the default. The customer needs to document the change and set the parameter back to the default when the underlying storage I/O latency is resolved.

7. Do not change default misscount values if you are running vendor clusterware along with Oracle Clusterware. The default values for misscount should not be changed when using vendor clusterware. Modifying misscount in this environment may cause clusterwide outages and potential corruptions.

8. Changing the misscount parameter incurs a clusterwide outage. As noted below, the customer will need to schedule a clusterwide outage to make this change.

9. Changing misscount should not be used to compensate for poor configurations or faulty hardware

10. Cluster and RDBMS availability are directly affected by high misscount settings.

11. In the case of stretched clusters and stretched storage systems, and a site failure where we lose one storage and N nodes, we go into a reconfiguration state and then revert to the ShortDiskTimeOut value as the internal I/O timeout for the voting disks. Several cases are known with stretched clusters where, when a site failure happens, the storage failover cannot complete within SDTO. If the I/O to the voting disks is blocked for more than SDTO, the result is node evictions on the surviving side.


To change MISSCOUNT back to default, please refer to Note:284752.1
THIS IS THE ONLY SUPPORTED METHOD. NOT FOLLOWING THIS METHOD RISKS EVICTIONS AND/OR CORRUPTING THE OCR

10g Release 2 MIRRORED VOTING DISKS AND VENDOR MULTIPATHING SOLUTIONS
Oracle RAC 10g Release 2 allows for multiple voting disks so that the customer does not have to rely on a multipathing solution from a storage vendor. You can have n voting disks (up to 31) where n = m*2+1 and m is the number of disk failures you want to survive. Oracle recommends each voting disk to be on a separate physical disk.
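The n = m*2+1 rule is easy to work through with a few lines of code. The sketch below is only an illustration; the function names are invented here, and the inverse calculation assumes the cluster needs a strict majority of voting disks to stay up, which is how the formula is usually read but is not spelled out in the note.

```python
MAX_VOTING_DISKS = 31  # upper limit quoted in the note above


def voting_disks_needed(failures_to_survive):
    """Return n = 2*m + 1 voting disks needed to survive m disk failures."""
    if failures_to_survive < 0:
        raise ValueError("failures_to_survive must be non-negative")
    n = 2 * failures_to_survive + 1
    if n > MAX_VOTING_DISKS:
        raise ValueError("would require more than %d voting disks" % MAX_VOTING_DISKS)
    return n


def failures_tolerated(voting_disks):
    """Inverse of the rule above, assuming a strict majority must survive."""
    if not 1 <= voting_disks <= MAX_VOTING_DISKS:
        raise ValueError("voting_disks must be between 1 and %d" % MAX_VOTING_DISKS)
    return (voting_disks - 1) // 2


if __name__ == "__main__":
    for m in (0, 1, 2):
        print("survive %d failure(s): %d voting disk(s)" % (m, voting_disks_needed(m)))
```

For example, surviving one disk failure takes three voting disks, and adding a fourth disk does not raise the number of failures that can be tolerated.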

-------------------------------------------------------------------------------------------------------

Blog:http://blog.youkuaiyun.com/tianlesoftware

Weibo: http://weibo.com/tianlesoftware

Email: dvd.dba@gmail.com

