Single IP Address Cluster

We are investigating a new network protocol stack, called SAPS (Single Address Protocol Stack), for single system image clusters. The main objective of SAPS is to make a cluster of computers appear as a single system from the point of view of an IP network: the cluster has a single IP address, and client computers access it through that address as if it were a single computer. Single system image clusters are used for large-scale web servers or for PC clusters serving as grid nodes, because the single system image makes the cluster easier to manage and easier to build applications for.

 

Figure 1. Single IP Address Cluster

 

Design

To realize a single IP address across physically distributed computers, a dedicated computer that handles all network I/O is introduced, called the I/O server. The other nodes, called application nodes, provide the socket interface to applications. The overall design of SAPS is shown in Figure 1.

The I/O server receives packets from the Internet and performs all TCP protocol handling. The received data is then forwarded to the application node on which the process responsible for that data is running.

Outgoing messages travel in the opposite direction: an application node forwards data to the I/O server, where TCP packets are constructed and sent to the Internet. Control messages are also exchanged between the I/O server and the application nodes; they include notifications that a new connection has arrived and requests to listen on a given port, as sketched below.
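
As a rough illustration of this control path, the following sketch defines hypothetical message types exchanged over the cluster network. The type and field names (saps_msg_type, saps_msg_hdr, and so on) are our own illustrative assumptions, not the actual SAPS wire format.

    /* Hypothetical layout of SAPS cluster-network messages.
     * All names and fields are illustrative assumptions. */
    #include <stdint.h>

    enum saps_msg_type {
        SAPS_DATA,      /* payload forwarded between I/O server and node */
        SAPS_NEW_CONN,  /* I/O server -> node: a new connection arrived  */
        SAPS_LISTEN     /* node -> I/O server: start listening on a port */
    };

    struct saps_msg_hdr {
        uint8_t  type;     /* one of enum saps_msg_type          */
        uint16_t port;     /* TCP port the message refers to     */
        uint32_t conn_id;  /* cluster-wide connection identifier */
        uint32_t len;      /* length of the payload that follows */
    };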

For the cluster network that carries the data and control messages, a reliable, high-performance interconnect such as Myrinet or InfiniBand is assumed.

Socket Migration

In a traditional cluster, a process with open IP sockets cannot be moved from one node to another, because the two nodes have different IP addresses. With SAPS, IP sockets can be migrated between application nodes because all nodes work under the same IP address.

While a process is moving between nodes, the I/O server stops forwarding received data and buffers it locally instead. After the migration is complete, the buffered data is sent to the destination node. The I/O server continues TCP protocol handling throughout the migration, so TCP connections need not be closed in order to migrate a process. A sketch of this buffer-and-flush logic follows.
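
The sketch below is a minimal illustration of how the I/O server could suspend forwarding and flush the backlog once migration completes; the structures and helper functions are assumptions for illustration, not the actual kernel module code.

    /* Conceptual sketch of the I/O server's behavior during socket
     * migration; all names are illustrative assumptions. */
    #include <stdbool.h>
    #include <stddef.h>

    struct buf;                                    /* opaque data buffer */
    void buffer_append(struct buf **q, struct buf *d);
    void forward_to_node(int node, struct buf *d);

    struct conn_state {
        bool        migrating;  /* forwarding suspended while true    */
        int         dst_node;   /* application node owning the socket */
        struct buf *pending;    /* data buffered during the migration */
    };

    /* Called for each segment received from the Internet. TCP handling
     * (ACKs, retransmission) continues on the I/O server throughout,
     * so the connection never has to be closed. */
    void deliver(struct conn_state *c, struct buf *data)
    {
        if (c->migrating)
            buffer_append(&c->pending, data);      /* hold until done */
        else
            forward_to_node(c->dst_node, data);
    }

    /* Called once the process has finished moving to new_node. */
    void migration_done(struct conn_state *c, int new_node)
    {
        c->dst_node  = new_node;
        c->migrating = false;
        forward_to_node(c->dst_node, c->pending);  /* flush backlog */
        c->pending   = NULL;
    }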

 

Figure 3. Socket Migration

 

Implementation

SAPS is currently implemented as a kernel module for Linux kernel 2.6.13. As the cluster network, a Myrinet network and the PM/Myrinet [2] communication library are used. On the I/O server, the existing TCP/IP protocol stack is extended so that communication is initiated by the arrival of data from the cluster network rather than by system calls. The I/O server converts between TCP/IP and the protocol of the cluster network by rewriting the header of each packet, so no data copying happens on the I/O server (this idea is sketched below). On the application nodes, we provide a new implementation of the socket interface that is fully compatible with the standard one, so existing applications require no modification. No kernel patch is required on either the I/O server or the application nodes.
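
The zero-copy conversion can be pictured as rewriting headers in place: the TCP/IP headers of a received packet are overwritten with a cluster-network header while the payload stays where it is. The sketch below is a userspace illustration of this idea under assumed names, not the kernel module's code.

    /* Illustration of in-place protocol conversion: overwrite the tail
     * of the TCP/IP headers with a (hypothetical) cluster header so the
     * payload never moves. All names are illustrative assumptions. */
    #include <stdint.h>
    #include <string.h>

    struct cluster_hdr {    /* assumed cluster-network header */
        uint32_t conn_id;
        uint32_t len;
    };

    /* pkt points at the IP header of a received packet and hdr_len is
     * the combined IP+TCP header length. Returns a pointer to the
     * converted packet, which now starts just before the payload. */
    void *to_cluster_packet(uint8_t *pkt, size_t hdr_len,
                            uint32_t conn_id, uint32_t payload_len)
    {
        uint8_t *p = pkt + hdr_len - sizeof(struct cluster_hdr);
        struct cluster_hdr h = { conn_id, payload_len };

        memcpy(p, &h, sizeof h);  /* rewrite headers, payload untouched */
        return p;
    }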

 

Performance

All measurements were done on computers with Opteron or Xeon processors and a PCI-X bus. In order to compare the performance of SAPS with that of the Linux Virtual Server [3], the performance of a network using Network Address Translation (NAT) is also measured.

The user-level point-to-point performance is shown in Table 1 and Figure 4. These results show that SAPS performs better than NAT.

 

Table 1. Round trip time (µs)

         Min.   Max.   Avg.
  SAPS   165    243    217.6
  NAT    210    334    255.6
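
The round-trip times above correspond to a user-level ping-pong measurement. A minimal sketch of such a test, assuming an echo peer reachable through the cluster's IP address, is shown below; it is our illustration, not the authors' benchmark code.

    /* Minimal TCP ping-pong round-trip timer (illustrative, not the
     * authors' benchmark): send one byte on a connected socket fd and
     * wait for the echo, returning the elapsed time in microseconds. */
    #include <stddef.h>
    #include <sys/socket.h>
    #include <sys/time.h>

    double rtt_us(int fd)
    {
        char b = 'x';
        struct timeval t0, t1;

        gettimeofday(&t0, NULL);
        send(fd, &b, 1, 0);       /* ping */
        recv(fd, &b, 1, 0);       /* pong */
        gettimeofday(&t1, NULL);

        return (t1.tv_sec - t0.tv_sec) * 1e6
             + (t1.tv_usec - t0.tv_usec);
    }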

 

Figure 4. Point-to-Point Bandwidth

Using a cluster of three application nodes (plus one I/O server) and three clients, the performance of SAPS is measured with multiple streams. In order to emulate the Internet, a 5 ms delay is added on the route, as shown in Figure 6.

 

Figure 6. Multiple Stream

In the multiple-stream evaluation, the three application nodes send data to their respective clients at the same time, and the throughput of each stream is measured at the delay router. As shown in Figure 5, the three streams share the available bandwidth equally under SAPS, whereas unfair sharing is observed under NAT.

 

Figure 5. Multiple Stream Performance

Related Work

Kerrighed [4] is a research project to develop a single system image OS for clusters. Its Dynamic Streams mechanism supports the migration of sockets [1], but it is not designed to migrate processes that are communicating with processes running outside the cluster.

The Linux Virtual Server [3] provides a single system image from the point of view of computers outside the cluster. It does not provide a single system image inside the cluster, and it does not enable the migration of communicating processes.

Work in Progress

We have realized a single IP address cluster and socket migration with good performance. By adding features such as single authentication and inter-node IPC, we intend to develop a full single system image OS.

 

References

1. Pascal Gallard and Christine Morin. Dynamic Streams for Efficient Communications between Migrating Processes in a Cluster. Parallel Processing Letters, 13(4).

2. Toshiyuki Takahashi, Shinji Sumimoto, Atsushi Hori, Hiroshi Harada, and Yutaka Ishikawa. PM2: High Performance Communication Middleware for Heterogeneous Network Environments. In Proceedings of the IEEE/ACM SC2000 Conference, 2000.

3. Wensong Zhang. Linux Virtual Server for Scalable Network Services. Linux Symposium, 2000.

4. Kerrighed. http://www.kerrighed.org
