Select system call limitation in Linux

最新推荐文章于 2024-12-12 15:24:01 发布

leocuka

最新推荐文章于 2024-12-12 15:24:01 发布

阅读量703

点赞数

分类专栏：网络通信文章标签： system descriptor file network scalability documentation

网络通信专栏收录该内容

3 篇文章

订阅专栏

Try finding a network sample code of how to accept TCP connections and most likely you will find the select() system call in such code. The reason being that select() is the most popular (but not the only one as we will see) system call to wait for I/O in a list of file descriptors.

I am here to warn you, select() has some important limitations to be aware of. I must confess I used select for a long time without realizing its limitations, until, of course, I hit the limits.

About a year ago I started porting a heavily threaded networking real-time voice application for Windows to Linux. When the code was compiling and running apparentely without issues, we started doing scalability and stress tests. We used 32 telephony E1 cards (pretty much like a network card) where each E1 port can handle up to 30 calls. So we’re talking about 32 * 30 = 960 calls in a single server. Knowing in advance that I would need lots of file descriptors (Linux typically defaults to 1024 per process), we used the setrlimit() system call to increase the limit up to 10,000 which should be more than enough because a telephony call in this system requires about 4 file descriptors (for the network VoIP connection and the E1 side devices etc).

At some point during the stress test, calls stopped working and some threads were going crazy eating up 99% CPU. After finding out using “ps” and “pstack” which threads were the ones going crazy, I found out that were the ones waiting for I/O in some file descriptors using select(), like the embedded HTTP server or the Network-related code.

Reading carefully the select documentation you will find the answer by yourself. “man select” says:

“An fd_set is a fixed size buffer. Executing FD_CLR or FD_SET with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior.”

So, big deal you may say, you can split across threads the load to not have more than 1024 file descriptors in your select() call, right? WRONG! read it twice, the problem is with the highest file descriptor value provided in a fd_set structure, not the number of file descriptors in the fd set.

These 2 numbers are related but are in no way the same. Let’s say you have program that opens 2000 text files (either with open() or fopen()) to read from them and scan for a list of words, at the same time each time you hit a word you must connect and send a network message to some TCP server and read data from the TCP server too. Probably you would launch some threads for reading on the files and another thread to handle the network connection related data. Event though only 1 thread is using select() and that thread is providing select() with just 1 file descriptor (the TCP server connection), you cannot guarantee which file descriptor number will be assigned to that network connection. You could try to ensure that you always start the TCP connection before opening any files so you will get a lower-number file descriptor, but that is a very shaky design, if you later want to do other stuff that requires the use of file descriptors (use pipes, unix sockets, etc) you may hit the problem again.

The limitation comes from the way select() works, most concretely the data type used to represent the list of file descriptors, fd_set. Let’s take a look at fd_set in /usr/include/sys/select.h

You will see a definition pretty much like:

typedef struct  {
    long int fds_bits[32];
} fd_set;

I removed a bunch of macros to make the code more clear. As you see you have a static array of 32 long ints. This is just a bit map, where each bit represents a file descriptor number. 32 * sizeof(long int), for 32 bit platforms is 1024. So, if you do fd_set(&fd, 10), to add file descriptor 10 to an fd_set, it will just set to 1 the 10th bit in the bit map, what happens then if you do fd_set(&fd, 2000) ?, you guessed right, unpredictable. May be some sort of array overflow (fd_set is, at least on my system, implemented using the assembly instruction btsl, bit test and set).

Be aware also, all of this is on Linux, I am not sure about how select is implemented in other operating systems, like Windows. Given that we did not notice this problem on Windows servers, probably select is implemented differently.

Solution? use poll (or may be epoll when available). These 2 system calls are not as popular and available in most operating systems as select is, but whenever possible, I recommend using poll. There may be differences in the performance, for example, using FD_ISSET() is faster (just checking if a given bit is set in the bitmap) than iterating over the poll array, but I have found in my applications that the difference is just not critical.

In short, next time you find using select(), think it twice before deciding that is what you need.

http://www.moythreads.com/wordpress/2009/12/22/select-system-call-limitation/