Difference Between SO_REUSEADDR and SO_REUSEPORT
Last Updated : 05 Feb, 2023
Processes use sockets as endpoints of a two-way channel to transfer data. To perform socket operations, such as connecting to a socket address or listening for new connections, processes use a variety of socket-layer functions. The socket options SO_REUSEADDR and SO_REUSEPORT are documented differently in the man pages and programmer documentation of different operating systems, which can be very confusing. SO_REUSEPORT is not even available on some operating systems.
Understanding Socket Implementation
Pipes and sockets are comparable. To the programs that use them, both appear to be files, and both help processes communicate. Pipes connect programs on the same machine, while sockets can also reach a remote program. Sockets also provide bidirectional communication, much as a pair of properly connected pipes could. A connected socket is identified by a five-tuple: the protocol, source IP address, source port, destination IP address, and destination port. To keep connections distinct, no two connections can share all five values. A socket is initially created with a call to the socket() function, which returns the socket’s identifier, the socket descriptor. The bind() function gives the socket a source IP address and a source port. The connect() function sets the destination IP address and destination port. Finally, programs on a single machine commonly communicate using standard network protocols such as TCP; in that case it would be wasteful to go all the way to the network hardware (if any) and compute checksums.
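The roles of socket(), bind() and connect() in filling in the five-tuple can be sketched on the loopback interface. This is a minimal illustration, assuming a POSIX system with IPv4; the struct `endpoints` and the helper names `loopback_listener`, `loopback_connect` and `get_endpoints` are hypothetical and exist only for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Source and destination halves of a connected IPv4 socket's five-tuple.
 * (Illustrative type, not part of any library.) */
struct endpoints {
    struct sockaddr_in local;   /* source IP address and port */
    struct sockaddr_in remote;  /* destination IP address and port */
};

/* Creates a TCP socket, binds it to 127.0.0.1 with port 0 so that the kernel
 * picks a free port, and starts listening. Stores the chosen port in *port.
 * Returns the socket descriptor, or -1 on error. */
int loopback_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Connects to 127.0.0.1:port. connect() sets the destination; because
 * bind() was not called first, the kernel also chooses the source address
 * and port. Returns the socket descriptor, or -1 on error. */
int loopback_connect(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Reads both endpoints of a connected socket. Returns 0 on success. */
int get_endpoints(int fd, struct endpoints *ep)
{
    socklen_t len = sizeof ep->local;

    if (getsockname(fd, (struct sockaddr *)&ep->local, &len) < 0)
        return -1;
    len = sizeof ep->remote;
    if (getpeername(fd, (struct sockaddr *)&ep->remote, &len) < 0)
        return -1;
    return 0;
}
```

Calling `get_endpoints()` on the descriptor returned by `loopback_connect()` should show the listener's port as the destination port and a kernel-chosen port as the source port.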
Since no two connections can share all five values, we can verify the five-tuple for ourselves by connecting to www.geeksforgeeks.org in a web browser and inspecting the resulting socket. First, look up the site’s addresses:
nslookup www.geeksforgeeks.org

This returns a set of IPv4 and IPv6 addresses. We can then look for any of these IP addresses in the output of the ss command, which lets us inspect the socket:
ss -t

We can see that the socket has an established connection. In this instance the source IP address is 180.149.59.201 and the source port is 52984. The destination IP address (180.149.59.203) and port (443 for HTTPS) are those of the remote web server.
What Are Socket Options?
A network socket is identified by a socket file descriptor. Keep in mind that while every socket is referred to by a file descriptor, not every file descriptor refers to a socket, because file descriptors also identify pipes and regular files. Processes need a way to manage their sockets: for instance, a process might need to enable broadcast messages or the recording of debugging information, which means changing the values of SO_BROADCAST and SO_DEBUG. A process does this with the setsockopt() function. The option name, such as SO_BROADCAST, identifies the property to set. Options exist at several protocol levels, which is why the protocol level is a required parameter; for options at the socket level, the level to use is SOL_SOCKET. The prefix of an option name tells the levels apart: the first two letters of SO_DEBUG show that it is a socket-level option, whereas IP_DONTFRAG operates at the IP protocol level and TCP_NODELAY at the TCP protocol level. The setsockopt() function takes five arguments: the socket file descriptor, the protocol level, the option name, the option value, and the length of the value.
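As a minimal sketch of the five setsockopt() arguments in use, the helpers below set and read back a boolean socket-level option. The function names `set_bool_sockopt` and `get_bool_sockopt` are hypothetical wrappers introduced for this example; only setsockopt() and getsockopt() are real API calls.

```c
#include <sys/socket.h>

/* Turns a boolean SOL_SOCKET option (for example SO_BROADCAST) on or off.
 * Arguments to setsockopt(): descriptor, level, option name, pointer to the
 * value, and length of the value. Returns 0 on success, -1 on error. */
int set_bool_sockopt(int fd, int optname, int enabled)
{
    int value = enabled ? 1 : 0;

    return setsockopt(fd, SOL_SOCKET, optname, &value, sizeof value);
}

/* Reads a boolean SOL_SOCKET option back.
 * Returns 1 if it is on, 0 if it is off, -1 on error. */
int get_bool_sockopt(int fd, int optname)
{
    int value = 0;
    socklen_t len = sizeof value;

    if (getsockopt(fd, SOL_SOCKET, optname, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

For a UDP socket `fd`, `set_bool_sockopt(fd, SO_BROADCAST, 1)` would enable broadcasting. Options at other levels, such as TCP_NODELAY, would need IPPROTO_TCP as the level instead of SOL_SOCKET.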
What Is SO_REUSEADDR?
The SO_REUSEADDR socket option allows local addresses and ports to be reused. With SO_REUSEADDR, your server can bind to an address that still has sockets in the TIME-WAIT state. It does not, by itself, let multiple servers bind to exactly the same address and port. The flag does carry a security risk: another server can bind to the same port by binding to a specific address rather than INADDR_ANY, and so intercept traffic intended for the original server. SO_REUSEADDR is available in Linux kernel version 2.4 and later. Different operating systems implement this socket option in different ways.
A server normally binds to the same address/port combination every time it stops and starts again. For the restart to succeed, the process must explicitly request address reuse by enabling the SO_REUSEADDR socket option with setsockopt(), and it must do so before calling bind(). If SO_REUSEADDR is not enabled, the bind() in the restarted process can fail. SO_REUSEADDR is also required when more than one socket needs to bind to the same UDP port, as in multicast, so that each socket bound to the port receives the messages sent to it.
SO_REUSEADDR also changes the way wildcard addresses are handled. Without SO_REUSEADDR, if socket A is bound to 0.0.0.0:40, an attempt by socket B to bind to 10.1.0.1:40 fails. Because 0.0.0.0 stands for all possible local addresses, it also covers 10.1.0.1, and the kernel treats the two sockets as sharing the same local address and port. With SO_REUSEADDR enabled, there is no conflict between a socket bound to 0.0.0.0:40 and a socket bound to 10.1.0.1:40, because 0.0.0.0 is treated as a wildcard address that is not the same as the exact local address 10.1.0.1.
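The restart case described above, where SO_REUSEADDR must be set before bind(), can be sketched as follows. This is a minimal illustration rather than a complete server; the helper name `listen_reuseaddr` is hypothetical, and it binds to INADDR_ANY only as an example.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP listener on INADDR_ANY:port with SO_REUSEADDR enabled.
 * The option is set BEFORE bind(), since bind() is the call that would
 * otherwise fail with EADDRINUSE while old connections sit in TIME-WAIT.
 * Passing port 0 lets the kernel pick a free port.
 * Returns the listening descriptor, or -1 on error. */
int listen_reuseaddr(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Setting the option after bind() would have no effect on that bind, which is why the order of the two calls matters.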
What Is SO_REUSEPORT?
When SO_REUSEPORT is enabled, multiple sockets can bind to the same address and port combination. The rule is that every socket binding to that address and port must have SO_REUSEPORT enabled before it binds. For instance, if socket A is bound to a local IP and port combination without SO_REUSEPORT, no other socket can bind to the same combination, whether or not that other socket sets the option.
As mentioned earlier, a closed socket can linger in the TIME-WAIT state. Another socket cannot use the IP address and port combination of a socket in TIME-WAIT unless both sockets have SO_REUSEPORT set. For multicast, SO_REUSEPORT behaves much like SO_REUSEADDR. What distinguishes SO_REUSEPORT is a restriction on the user: all sockets that share the same IP address and port must belong to the same effective user ID. This holds for both TCP and UDP.
Difference Between SO_REUSEADDR and SO_REUSEPORT
SO_REUSEADDR | SO_REUSEPORT |
---|---|
Local addresses and ports may be reused with the SO_REUSEADDR socket option. | Multiple sockets may bind to the same address and port combination with SO_REUSEPORT enabled. |
Available starting with Linux kernel version 2.4. | Implemented in Linux only from kernel version 3.9, which makes it relatively recent. |
Different operating systems implement this socket option in different ways. | Implemented in broadly the same way across the operating systems that support it. |
Used for multicast, where packets are sent to a group of destination IPs at once. | Behaves similarly to SO_REUSEADDR for multicast. |
Set with setsockopt() before calling bind(); allows binding to an address that has sockets in the TIME-WAIT state. | A socket cannot reuse the address and port of a socket in the TIME-WAIT state unless both sockets have SO_REUSEPORT set. |
Binding several sockets to the same address and port applies only to UDP. | Binding several sockets to the same address and port applies to both TCP and UDP. |
For example, SO_REUSEADDR does not check whether the other sockets bound to the IP/port combination have a particular socket option set. | For example, if socket A does not have SO_REUSEPORT enabled, no other socket can bind to the same local IP and port combination. |
via: Difference Between SO_REUSEADDR and SO_REUSEPORT - GeeksforGeeks
https://www.geeksforgeeks.org/difference-between-so_reuseaddr-and-so_reuseport/
Linux TCP SO_REUSEPORT — Usage and implementation
Krishna Kumar
Aug 19, 2019
Improve your server performance using a relatively new feature of the Linux networking stack: the SO_REUSEPORT socket option.
Figure 1: The server on top uses parallel listeners to avoid bottlenecks, while the server at the bottom uses a single listener to accept incoming connections.
Summary
HAProxy and NGINX are among the few applications that use the TCP SO_REUSEPORT socket option [1] of the Linux networking stack. This option, initially introduced in 4.4 BSD, is used to implement high-performance servers that make better use of today’s large multicore systems. The first few sections of this article explain some essential concepts of TCP/IP sockets, and the remaining sections use that knowledge to describe the rationale, usage, and implementation of the SO_REUSEPORT socket option.
Problem statement
The conventional method that a high-performance server employs on a multiprocessor system is to have a single listener process that accepts incoming connections and passes them to worker processes for processing. Under heavy connection load, however, the listening process becomes a bottleneck. The other method servers often use is to open a single listening socket and fork multiple processes, each of which calls accept() on that socket to pick up incoming connections and then does the work itself. The problem with this approach is that the process that starts picking up connections tends to receive a disproportionate share of them. This article discusses a third approach: opening multiple listen sockets with SO_REUSEPORT to process incoming connections, which solves both the single-process bottleneck and the connection skew between processes.
TCP connection basics
A TCP connection is defined by a unique 5-tuple [2]:
[ Protocol, Source IP address, Source Port, Destination IP address, Destination Port ]
Individual tuple elements are specified in different ways by clients and servers. Let’s understand how each tuple element is initialized by applications.
Client application
- Protocol: This field is initialized when the socket is created, based on parameters provided by the application. For the purposes of this article, the protocol is always TCP. For example:
socket(AF_INET, SOCK_STREAM, 0); /* create a TCP socket */
- Source IP address and Port: These are usually set by the kernel when the application calls connect() without a prior call to bind(). The kernel picks a suitable IP address for communicating with the destination server, and a source port from the ephemeral port range (sysctl net.ipv4.ip_local_port_range).
- Destination IP address and Port: These are set by the application when it calls connect(). For example:
server.sin_family = AF_INET;
server.sin_port = htons(SERVER_PORT);
bcopy(server_ent->h_addr, &server.sin_addr.s_addr, server_ent->h_length);
/* Connect to server, and set the socket's destination IP address and port#
* based on above parameters. Also, request the kernel to automatically set
* the Source IP and port# if the application did not call bind() prior to connect().
*/
connect(fd, (struct sockaddr *)&server, sizeof server);
Server application
- Protocol: Initialized in the same way as described for a client application.
- Source IP address and Port: Set by the application when it invokes bind(), for example:
srv_addr.sin_family = AF_INET;
srv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
srv_addr.sin_port = htons(SERVER_PORT);
bind(fd, (struct sockaddr *)&srv_addr, sizeof srv_addr);
- Destination IP address and Port: A client connects to a server by completing the TCP 3-way handshake [3]. The server’s TCP/IP stack creates a new socket to track the client connection, and sets its Source IP:Port and Destination IP:Port from the incoming client connection parameters. The new socket moves to the ESTABLISHED state, while the server’s LISTEN socket is left unmodified. At this point, the server application’s call to accept() on the LISTEN socket returns with a reference to the newly ESTABLISHED socket. See the source code listing at the end of this article for an example implementation of client and server applications.
TIME-WAIT sockets
A TIME-WAIT [4] socket is created when an application closes its end of a TCP connection first. This initiates the TCP 4-way handshake, during which the socket state changes from ESTABLISHED to FIN-WAIT1 to FIN-WAIT2 to TIME-WAIT before the socket is finally closed. The TIME-WAIT state is a lingering state that exists for protocol reasons. An application can instruct the TCP/IP stack not to linger by having it send a TCP RST packet instead. The connection is then terminated instantly, without going through the TCP 4-way handshake. The following code fragment resets a connection by specifying a socket linger time of zero seconds:
const struct linger opt = {
.l_onoff = 1, .l_linger = 0 };
setsockopt(fd, SOL_SOCKET, SO_LINGER, &opt, sizeof opt);
close(fd);
Understanding different states of a server socket
A server typically executes the following system calls at start up:
1. Create a socket:
server_fd = socket(...);
2. Bind to a well known IP address and port#:
ret = bind(server_fd, ...);
3. Mark the socket as passive by changing its state to LISTEN:
ret = listen(server_fd, ...);
4. Wait for a client to connect, and get a reference file descriptor:
client_fd = accept(server_fd, ...);
Any new socket, whether created via the socket() or the accept() system call, is tracked in the kernel using a “struct sock” structure [5]. In the code fragment above, a socket is created in step #1 and given a well-known address in step #2. This socket moves to the LISTEN state in step #3. Step #4 calls accept(), which blocks until a client connects to this IP:port. After the client completes the TCP 3-way handshake, the kernel creates a second socket and returns a reference to it. The state of the new socket is set to ESTABLISHED, while the server_fd socket remains in the LISTEN state.
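The four steps above, and the claim that the listening socket stays in LISTEN while accept() hands back a separate socket, can be sketched on the loopback interface. This is an illustration under stated assumptions, not the article's own listing: the helper names `start_listener`, `accept_local_client` and `is_listening` are hypothetical, and the listening state is read through the standard SO_ACCEPTCONN option.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Steps 1 to 3: create a socket, bind it to 127.0.0.1 with a kernel-chosen
 * port, and mark it passive with listen(). The bound address is stored in
 * *addr. Returns the listening descriptor, or -1 on error. */
int start_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        listen(fd, SOMAXCONN) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Step 4: connect a local client to the listener, then accept() it.
 * The client descriptor is stored in *client_fd. Returns the descriptor of
 * the newly accepted socket, or -1 on error. */
int accept_local_client(int server_fd, const struct sockaddr_in *addr,
                        int *client_fd)
{
    *client_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (*client_fd < 0)
        return -1;
    if (connect(*client_fd, (const struct sockaddr *)addr, sizeof *addr) < 0) {
        close(*client_fd);
        *client_fd = -1;
        return -1;
    }
    return accept(server_fd, NULL, NULL);
}

/* Returns 1 if the socket is in the LISTEN state, 0 if not, -1 on error. */
int is_listening(int fd)
{
    int value = 0;
    socklen_t len = sizeof value;

    if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

After `accept_local_client()` returns, `is_listening()` should report 1 for the server descriptor and 0 for the accepted one, matching the description of two separate sockets.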
SO_REUSEADDR socket option
The SO_REUSEADDR option for TCP sockets can be better understood from the following two use cases:
Use case #1. A server application restarts in two steps — an exit followed by start up. During exit, the server’s LISTEN socket is closed immediately. Let’s take a look at two situations that can arise due to the presence of existing connections to the server.
- All established connections that were being handled by this dying server process are closed, and those sockets transition to the TIME-WAIT state.
- All established connections that were handed off to a child process remain in the ESTABLISHED state.
When the server is subsequently started up, its attempt to bind to its LISTEN port fails with EADDRINUSE because some sockets on the system are already bound to this IP:port combination (for example, a socket in either the TIME-WAIT or the ESTABLISHED state). A demonstration of this problem is shown below:
当服务器随后启动时,它尝试绑定到其 LISTEN 端口失败,并使用 EADDRINUSE 失败,