Heartbeat of RAC

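This article describes the three heartbeat mechanisms used in an Oracle RAC cluster: the network heartbeat (NHB), the disk heartbeat (DHB), and the local heartbeat (LHB), and examines the role each plays in keeping the cluster healthy. It also raises the question of how node status is determined in a split-brain situation.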

Heartbeat is a polling mechanism in clustered platforms used to verify that the other servers participating in the cluster are alive. Oracle also uses the heartbeat mechanism to verify the health of the other nodes in the cluster. In a RAC cluster, every node polls the other nodes. This lets each server understand the health of the others and take appropriate action should polling fail. In RAC, CSS performs polling in three different ways:

1) Network Heartbeat (NHB)

2) Disk Heartbeat (DHB)

3) Local Heartbeat (LHB)

1) Network Heartbeat (NHB)
The NHB is sent over the private interconnect. CSS sends an NHB every second from one node to all the other nodes in the cluster, and receives an NHB from each remote node every second in the same way. The NHB contains timestamp information from the local node, which the remote node uses to determine when the NHB was sent. If no NHB is received from another node in the cluster within 30 seconds (the misscount value), CSS requests a cluster reconfiguration. A reconfiguration is not always required: CSS verifies the health and state of the node through other methods before deciding to reconfigure.
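
The misscount check described above can be sketched in a few lines of Python. This is only an illustration of the timing logic, not how CSS is implemented: the `NhbTracker` class, its method names, and the node names are all hypothetical, and the 30-second default is simply the value cited in the text.

```python
from dataclasses import dataclass, field

MISSCOUNT_SECONDS = 30.0  # misscount value cited in the text


@dataclass
class NhbTracker:
    """Hypothetical tracker of the last NHB seen from each remote node."""
    misscount: float = MISSCOUNT_SECONDS
    last_seen: dict = field(default_factory=dict)

    def record(self, node: str, timestamp: float) -> None:
        # Called whenever an NHB arrives from `node` (normally once per second).
        self.last_seen[node] = timestamp

    def suspects(self, now: float) -> list:
        # Nodes whose last NHB is older than misscount are only *suspects*:
        # further checks are still needed before any reconfiguration.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.misscount
        )


tracker = NhbTracker()
tracker.record("node1", 100.0)
tracker.record("node2", 100.0)
tracker.record("node1", 131.0)  # node1 keeps sending, node2 has gone silent
print(tracker.suspects(now=131.0))
```

A node on the suspect list has only missed its network heartbeats; whether it is actually dead is a separate question.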

2) Disk Heartbeat (DHB)
Apart from the NHB, CSS uses the DHB, which is required for split-brain resolution. With the DHB, each server in the cluster writes a timestamp to the voting disk every second. The DHB contains the local time in Unix epoch seconds as well as a millisecond timer, and it is the definitive mechanism for deciding whether a node is still alive. When the NHB fails, CSS checks the voting disk to see whether the node in question wrote any timestamp during the period in which its NHBs were missed, and uses that to decide whether a cluster reconfiguration is required.
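
As a rough sketch of how the two heartbeats combine, the function below classifies a node from its last NHB and last DHB timestamps. The function name, the three status labels, and the choice to use the misscount interval as the DHB look-back window are assumptions made for illustration; the source does not specify the exact window CSS uses.

```python
def classify_node(last_nhb: float, last_dhb: float, now: float,
                  misscount: float = 30.0) -> str:
    """Classify a remote node from its heartbeat timestamps (illustrative only)."""
    # Network heartbeats still arriving within misscount: nothing to investigate.
    if now - last_nhb <= misscount:
        return "healthy"
    # NHB missed: consult the voting disk. A DHB written inside the missed
    # window means the node is alive but cut off from the interconnect.
    if last_dhb >= now - misscount:
        return "alive-but-unreachable"
    # Neither heartbeat seen during the window.
    return "dead"


print(classify_node(last_nhb=100.0, last_dhb=100.0, now=110.0))
print(classify_node(last_nhb=100.0, last_dhb=140.0, now=141.0))
print(classify_node(last_nhb=100.0, last_dhb=100.0, now=141.0))
```

The middle case, where the node still writes to the voting disk but cannot be reached over the interconnect, is the one that leads to split-brain resolution.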

3) Local Heartbeat (LHB)
The LHB is an internal heartbeat in which a message is sent to the cssdmonitor and the cssdagent to keep them informed about the health of CSS. LHB notifications are also sent every second and share the same thread with the NHB and DHB.

 

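In addition, there are two further points to consider:

1) When split-brain occurs, how is it decided which nodes survive? Who performs the calculation and makes the decision, and on what basis? Once these questions are understood, most of the picture is clear.

2) The heartbeats described above operate at the cluster level. Oracle also has its own heartbeat at the database level, implemented through the CKPT process and the control files; it is not covered in detail here.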
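
The article leaves the first question open. One commonly described rule for Oracle Clusterware 11gR2 is that the sub-cluster with the most nodes survives and, on a tie, the sub-cluster containing the lowest node number survives; later releases are reported to weigh other factors as well. The sketch below assumes that rule only and is not taken from the source; the function name and the use of plain integers as node numbers are hypothetical.

```python
def surviving_subcluster(subclusters: list) -> list:
    """Pick the surviving sub-cluster under the assumed 11gR2-style rule.

    Each sub-cluster is a non-empty list of node numbers. Larger sub-clusters
    win; a tie goes to the sub-cluster holding the lowest node number.
    """
    return max(subclusters, key=lambda nodes: (len(nodes), -min(nodes)))


print(surviving_subcluster([[1], [2, 3]]))     # larger sub-cluster wins
print(surviving_subcluster([[1, 4], [2, 3]]))  # tie: lowest node number wins
```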

 

Reposted from: https://www.cnblogs.com/lhdz_bj/p/9134280.html

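When the heartbeat package is used to provide high availability, a virtual IP (VIP) is usually configured so that the standby node can take over and keep serving when the primary node fails. The steps for configuring a heartbeat virtual IP are as follows:

1. Install heartbeat on both the primary and the standby node. You can use the system package manager, or download the source from the official site and build it.

2. Configure the main heartbeat configuration file, /etc/ha.d/ha.cf. The following parameters need to be set:

```
# heartbeat interval, in seconds
keepalive 2
# time after which a node is declared dead, in seconds
deadtime 10
# time after which a late-heartbeat warning is issued, in seconds
warntime 5
# dead time applied during initial startup, in seconds
initdead 120
# interface used to broadcast heartbeats; eth0 is assumed here
bcast eth0
# move resources back to the primary node automatically once it recovers
auto_failback on
# node names of the primary and the standby node
node node1 node2
```

3. Configure the heartbeat resource script. In the directory /etc/ha.d/resource.d, create a script that starts and stops the service and configures the virtual IP:

```
#!/bin/bash
case "$1" in
  start)
    # start the service
    /etc/init.d/my_service start
    # bring up the virtual IP
    /sbin/ifconfig eth0:0 10.0.0.100 netmask 255.255.255.0 broadcast 10.0.0.255
    ;;
  stop)
    # stop the service
    /etc/init.d/my_service stop
    # remove the virtual IP
    /sbin/ifconfig eth0:0 down
    ;;
  status)
    # report the service status
    /etc/init.d/my_service status
    ;;
  *)
    # any other argument
    echo "Usage: $0 {start|stop|status}"
    exit 1
    ;;
esac
exit 0
```

4. Configure the heartbeat authentication file, /etc/ha.d/authkeys. This file sets the authentication key that secures communication between the nodes. Choose a key appropriate to your environment; for example:

```
auth 1
1 sha1 my_secret_key
```

5. Start the heartbeat service. Run /etc/init.d/heartbeat start on the primary node, then run the same command on the standby node.

6. Check that the virtual IP has been configured. Run ifconfig on the primary and the standby node to confirm that the virtual IP is active.

Note that these steps are for reference only. The exact configuration may differ depending on the system version and environment, so adjust it to your situation.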