What are the different methods used for communication with RHEL High Availability?

This article describes the four types of IP communication used by Red Hat Enterprise Linux (RHEL) High Availability clusters: broadcast, multicast, unicast, and streams. From RHEL 4 to RHEL 7 the default transport evolved from broadcast to UDPU. The article also discusses how the choice of transport affects cluster performance, including its interaction with DLM, the GFS/GFS2 file systems, LVM, the cmirror service, and cluster file systems in general.


https://access.redhat.com/articles/146163


Updated June 1, 2016 01:28


There are four types of IP communication used by Red Hat Enterprise Linux (RHEL) High Availability.

  • Broadcast: A packet is sent to the subnet's broadcast address and all nodes in the subnet will see that message. Each of those cluster nodes then has to decide whether to take any notice of that packet or not. Broadcast packets are not routed; they are simply transmitted all over the local network.
  • Multicast: A packet is sent to a nominated multicast address. Interested cluster nodes subscribe to multicast addresses, so nodes that do not subscribe to the cluster multicast address do not see the packets. Multicast packets can be routed (though many switches do this badly by default).
  • Unicast: A packet is sent to a specific IP address, and only that particular cluster node will see it. These messages can be routed.
  • Stream: The message types above are UDP packets; streams are TCP and always unicast. Pairs of nodes establish a connection between themselves and exchange a stream of bytes in either or both directions.
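One rough way to see the difference on the wire is with tcpdump capture filters. The port numbers below are common defaults (5405 for corosync/cman UDP traffic, 21064 for DLM TCP streams) and the host name is a placeholder; they are assumptions for illustration, not values taken from this article.

    # Broadcast cluster traffic (e.g. RHEL 4 cman default, or the broadcast transport)
    tcpdump -i eth0 ip broadcast and udp port 5405

    # Multicast cluster traffic (openais/corosync multicast transport)
    tcpdump -i eth0 ip multicast and udp port 5405

    # Unicast UDP cluster traffic (UDPU transport)
    tcpdump -i eth0 udp port 5405 and host node1.example.com

    # TCP streams used by the DLM between pairs of nodes
    tcpdump -i eth0 tcp port 21064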

NOTE: The iba transport for corosync is not supported: Is the iba transport supported in a RHEL 6 or 7 High Availability cluster?

Red Hat Cluster Suite 4+

The cman service in RHEL 4 used broadcast packets by default, mainly because broadcast was the easiest to code and to set up. The protocol was also very simple and, due to the way it was implemented, was not even capable of generating much traffic, so broadcast was fine for this sort of application. RHEL 4 does have a multicast option that can be enabled in /etc/cluster/cluster.conf.

Some packets in RHEL 4 were UDP unicast packets. When cman knew exactly which cluster node to send data to (e.g. ACKs), it sent that data directly to the cluster node using a unicast packet.
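As a minimal sketch of what enabling multicast in RHEL 4 looked like (the cluster name, node name, multicast address, and interface below are illustrative placeholders, not values from this article), the multicast address is declared in /etc/cluster/cluster.conf both on the cman element and per node:

    <cluster name="example" config_version="1">
      <cman>
        <!-- Use this multicast address instead of the default broadcast -->
        <multicast addr="239.192.0.1"/>
      </cman>
      <clusternodes>
        <clusternode name="node1" votes="1">
          <!-- Each node also specifies the address and the interface to use -->
          <multicast addr="239.192.0.1" interface="eth0"/>
        </clusternode>
      </clusternodes>
    </cluster>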

Red Hat Enterprise Linux Server 5 (with the High Availability Add-On) using openais

Clustering in RHEL 5 was rewritten from scratch. Red Hat integrated openais as the major part of the cluster stack, which automatically forced the product to use multicast, as that was the only type of communication openais could use at the time. openais also has a much more complex protocol than RHEL 4 cman, so isolating the cluster traffic from all other nodes in the subnet is more important. Without proper isolation using IGMP, cluster traffic could be sent to all nodes, even those not part of the cluster, causing traffic overload.

As in RHEL 4, not all openais packets are multicast. Quite a lot are actually unicast, as openais is a ring protocol and packets are sent round the ring. The join part of the protocol does use multicast, though, so a failure of multicast routing can be seen quite quickly in an openais cluster as cluster nodes fail to join.

The broadcast transport was added to openais and corosync to enable deployments where multicast was not functional, and to help alleviate multicast configuration and functional errors with switches. Broadcast was originally not supported because it transmits traffic to all nodes in the network, resulting in overload, and because broadcast is incompatible with IPv6.
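As a rough sketch (the multicast address is an illustrative placeholder, and the exact attributes should be checked against the cluster.conf documentation for the release in use), RHEL 5 allowed the multicast address to be set explicitly, or broadcast to be enabled, on the cman element in /etc/cluster/cluster.conf:

    <!-- Explicitly choose the multicast address used by openais/cman -->
    <cman>
      <multicast addr="239.192.0.1"/>
    </cman>

    <!-- Alternatively, fall back to broadcast where multicast is not functional -->
    <cman broadcast="yes"/>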

Red Hat Enterprise Linux Server 6 (with the High Availability Add-On) using corosync

By default, RHEL 6 uses multicast as the transport.

In RHEL 6.2, the UDPU option was added to corosync as a fully supported transport, alongside the existing broadcast and multicast transports. UDPU helps to resolve the multicast and broadcast issues customers were facing because it uses only unicast UDP packets.

With UDPU, all packets are unicast UDP packets sent directly to the targeted cluster nodes. Packets that would have been broadcast or multicast to all cluster nodes now have to be sent N-1 times (where N is the number of cluster nodes in the cluster), so there is a traffic increase compared to the other transports; in a 4-node cluster, for example, a message that would have been a single multicast packet becomes 3 unicast packets. This is often a decent tradeoff, particularly for smaller clusters. One restriction on UDPU clusters is that all cluster nodes need to be specified in cluster.conf (or, if using only corosync and not cman, in corosync.conf) so that the cluster node addresses are known in advance. This means there is an administrative as well as a traffic overhead for <cman transport="udpu"/> in /etc/cluster/cluster.conf.
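As a minimal sketch (cluster and node names are placeholders), a RHEL 6 UDPU configuration sets the transport on the cman element and lists every member explicitly:

    <cluster name="example" config_version="1">
      <!-- Switch corosync (driven by cman) from multicast to unicast UDP -->
      <cman transport="udpu"/>
      <clusternodes>
        <!-- Every node must be listed so its address is known in advance -->
        <clusternode name="node1.example.com" nodeid="1"/>
        <clusternode name="node2.example.com" nodeid="2"/>
        <clusternode name="node3.example.com" nodeid="3"/>
      </clusternodes>
    </cluster>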

Red Hat Enterprise Linux Server 7 (with the High Availability Add-On) using corosync

By default, RHEL 7 uses UDPU as the transport.
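For illustration (node names are placeholders, and the file is normally generated by pcs cluster setup rather than edited by hand), the corresponding settings in a RHEL 7 /etc/corosync/corosync.conf look roughly like this:

    totem {
        version: 2
        cluster_name: example
        # Unicast UDP: corosync sends packets directly to each node in the nodelist
        transport: udpu
    }

    nodelist {
        node {
            ring0_addr: node1.example.com
            nodeid: 1
        }
        node {
            ring0_addr: node2.example.com
            nodeid: 2
        }
    }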

What factors can affect cluster communication?

There are various other factors that can affect the different traffic protocols, such as which other subsystems are also in use. Below is a description of various services and protocols that can be directly or indirectly involved in cluster communication performance:

  • DLM has always used TCP streams for communication, so it is not directly involved in this. It is indirectly involved when plocks are used. The plocks used by the GFS and GFS2 file systems use openais checkpoints (since RHEL 5). These checkpoints exchange information over the openais or corosync TOTEM protocol and so are affected by the multicast, broadcast, or UDPU issues. For a cluster with high plock usage on a GFS or GFS2 file system, there can be communication issues because of the extra traffic generated with broadcast or UDPU, which makes those transports unsuitable for such configurations; plock traffic can also be throttled, as shown in the sketch after this list. For small-scale clusters, or clusters not using plocks on a GFS or GFS2 file system, broadcast or UDPU can be used. In RHEL 4, plocks were implemented on top of the DLM; plocks are range locks with odd semantics, and the implementation over DLM was not an efficient use of the DLM API, but it did mean that plock traffic was sent over the TCP streams of the DLM and was not subject to the issues discussed above.
  • Another factor that can affect communication is when network switches put a cap on the number of multicast packets. This can cause communication issues (and possible fencing of a cluster node) when that cap is exceeded.
  • LVM is the other notable user of the cluster protocol. The clvmd service is actually a very minimal user: it just sends locking notification messages when metadata is altered, so there is no serious impact on performance and clvmd should work quite happily in a multicast, broadcast, or UDPU installation.
  • The cmirror service is a different thing altogether. cmirror uses the openais or corosync CPG protocol to share information, and on a busy system with many mirrors this can be quite a lot of traffic, especially when cluster nodes reboot or resyncs are necessary. For this reason multicast is recommended for cmirror installations.
  • The GFS and GFS2 file systems do not have any communication mechanism of their own. GFS and GFS2 use the DLM, which uses TCP streams.
  • The rgmanager service will normally use the DLM. However, in multi-homed clusters it will instead use its own simplified lock manager called cpglock, which implements basic locking over the corosync CPG protocol, so this traffic is subject to the multicast, broadcast, or UDPU effects mentioned above. rgmanager does not generate a lot of lock traffic, though, so it should be fine with any of the networking systems.
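As a hedged sketch of the plock throttling mentioned in the first item above (the value is illustrative, and the option name and placement should be checked against the dlm_controld and gfs_controld documentation for the release in use), the rate of plock operations sent over the cluster protocol can be capped in /etc/cluster/cluster.conf:

    <!-- RHEL 6: limit plock operations per second sent by dlm_controld;
         on RHEL 5 the equivalent option lives on the gfs_controld element -->
    <dlm plock_rate_limit="100"/>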
