I/O multiplexing in Python -- detailed explanation of select, poll and epoll

select, poll and epoll are all I/O multiplexing mechanisms. I/O multiplexing lets a single process monitor multiple descriptors; once a descriptor becomes ready (generally ready for reading or ready for writing), the program is notified so that it can carry out the corresponding read or write.

select, poll and epoll are synchronous I/O in essence: after a read or write event is reported as ready, the program itself still has to perform the read or write, and that call blocks while the data is copied.

With asynchronous I/O the program does not perform the read or write itself; the asynchronous I/O implementation takes care of copying the data from the kernel to user space.


Differences among select, poll and epoll:

select:

Supported on almost all platforms at present

By default, there is a limit on the number of file descriptors a single process can monitor with select; on Linux the default is 1024 (FD_SETSIZE)

This limit can be raised by modifying the macro definition and recompiling the kernel

After the kernel has data ready, it notifies the user that there is data but does not say which connection has it; the user has to find out by polling

Suppose select asks the kernel to monitor 100 socket connections. When one of them has data, the kernel only reports that something among the 100 connections is ready

It does not say which one, so the user has to check the connections one by one and then read the data

That is with only 100 socket connections. What if there are tens of thousands, or hundreds of thousands?

Then every call means checking tens or hundreds of thousands of connections to get a single result, which wastes a great deal of work

Only horizontal (level) triggering is supported

Every call to select copies the fd set from user space to kernel space, which is expensive when there are many fds

In addition, on every call the kernel has to traverse all the fds passed in, which is also expensive when there are many fds


poll:

No essential difference from select, except that there is no limit on the maximum number of file descriptors

Only horizontal (level) triggering is supported

Just a transitional version, rarely used


epoll:

epoll, which appeared in Linux 2.6, has all the advantages of select and poll and is widely regarded as the best I/O readiness notification mechanism on Linux

There is no limit to the maximum number of file descriptors

Supports both horizontal (level) trigger and edge trigger

The Windows platform is not supported

After the kernel has data ready, it tells the user which connections have data

I/O efficiency does not decline linearly as the number of fds grows

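It is often said to use mmap to speed up message passing between kernel and user space; in the mainline Linux implementation the saving comes from copying only the ready events to user space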



Horizontal (level) trigger and edge trigger:

Horizontal trigger, usually called level trigger: after the ready file descriptors are reported to the process, if the process does not perform I/O on them, they are reported again the next time epoll is called.

Edge trigger: the process is only told which file descriptors have just become ready, and it is told only once. If the process takes no action, the descriptor is not reported again until new activity arrives on it.

In theory edge trigger performs better, but the code is considerably more complex to get right.
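The difference between the two modes can be observed directly from Python. The following is only a minimal sketch, assuming Linux (select.epoll is not available elsewhere); count_reports is a hypothetical helper name, and a local socket pair stands in for a real client connection. It sends data once, never reads it, and counts how many of three consecutive polls report the descriptor.

```python
import select
import socket


def count_reports(edge_triggered):
    """Count how many of three consecutive polls report data that is never read."""
    a, b = socket.socketpair()          # two connected sockets in one process
    ep = select.epoll()
    try:
        mask = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
        ep.register(b.fileno(), mask)
        a.send(b"ping")                 # b becomes readable; we deliberately never read it
        reports = 0
        for _ in range(3):
            if ep.poll(0.1):            # wait up to 0.1 s for an event
                reports += 1
        return reports
    finally:
        ep.close()
        a.close()
        b.close()


level_reports = count_reports(edge_triggered=False)
edge_reports = count_reports(edge_triggered=True)
print("level:", level_reports, "edge:", edge_reports)
```

In level mode the unread data is reported on every poll; in edge mode it is reported once, which is why edge-triggered code has to drain the socket until it would block.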


Features of select and epoll:


select:

select monitors an array of file descriptors through the select() system call. When select() returns, the kernel has set the flag bits for the ready file descriptors in that array, so the process can find them and carry out the subsequent reads and writes.

Because of network latency, a large number of TCP connections are inactive at any moment, yet every call to select() performs a linear scan over all the sockets, so some of that work is wasted.


epoll:

epoll also reports only the ready file descriptors. When epoll_wait() is called, its return value is not a descriptor but the number of ready descriptors; you then read that many entries in turn from an array you passed to epoll_wait(). Because only the ready entries are returned, the cost of copying every monitored descriptor on each system call is avoided (this is often attributed to mmap, although the mainline kernel simply copies the ready events to user space).

Another essential improvement is that epoll uses event-based readiness notification. With select/poll, the kernel scans all the monitored file descriptors only when the process makes the call. With epoll, a file descriptor is registered in advance through epoll_ctl(); once it becomes ready, the kernel uses a callback mechanism to put it on the ready list, and the process is notified when it calls epoll_wait().



select(rlist, wlist, xlist, timeout=None)

The file descriptors monitored by select fall into three categories: readfds, writefds and exceptfds (rlist, wlist and xlist in the Python signature above).

After the call, select blocks until a descriptor becomes ready (readable, writable, or in an exceptional condition) or until the timeout expires. timeout specifies how long to wait: a timeout of zero makes select return immediately, and a null timeout (None in Python) blocks until something is ready. When select returns, you traverse the fd sets to find the ready descriptors.
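As an illustration of the call, here is a small sketch that uses a local socket pair in place of real client connections (the names a, b, idle and data are chosen only for this example). It shows a zero timeout returning immediately with three empty lists, and the three result lists once data has been sent.

```python
import select
import socket

a, b = socket.socketpair()      # two connected sockets in one process

# Nothing has been sent yet: a timeout of 0 checks once and returns immediately.
idle = select.select([b], [], [], 0)

a.send(b"hello")

# b now has data to read; a is writable because its send buffer has room.
readable, writable, exceptional = select.select([b], [a], [b], 1.0)
data = b.recv(1024)
print(idle, data)

a.close()
b.close()
```

The three lists returned by select.select() contain the same objects that were passed in, so the program can act on them directly.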



int poll (struct pollfd *fds, unsigned int nfds, int timeout);

Unlike select, which uses three bitmaps to represent three fd sets, poll takes a pointer to an array of pollfd structures.

struct pollfd {
    int fd; /* file descriptor */
    short events; /* requested events to watch */
    short revents; /* returned events witnessed */
};

The pollfd structure holds both the events to monitor (events) and the events that actually occurred (revents), so the value-result parameter passing used by select is no longer needed.

There is also no limit on the number of pollfd entries (although performance declines when there are too many).

As with select, after poll returns you have to scan the pollfd array to find the ready descriptors.
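In Python the same interface is exposed as select.poll() (not available on Windows). The sketch below again assumes a local socket pair instead of real clients: the mask passed to register() plays the role of events, and the pairs returned by poll() carry revents. Note that the timeout of poll() is given in milliseconds.

```python
import select
import socket

a, b = socket.socketpair()
b_fd = b.fileno()

poller = select.poll()
poller.register(b_fd, select.POLLIN)    # "events": what we want to be told about

before = poller.poll(0)                 # timeout in milliseconds; nothing is pending yet
a.send(b"x")
after = poller.poll(1000)               # list of (fd, revents) pairs

# As with the C API, the caller scans the result to pick out what it needs.
ready_for_read = [fd for fd, revents in after if revents & select.POLLIN]
print(before, ready_for_read)

poller.unregister(b_fd)
a.close()
b.close()
```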


As described above, with both select and poll you have to traverse all the file descriptors after the call returns to find the ready sockets.

In practice, of a large number of simultaneously connected clients, only a few may be ready at any given moment, so efficiency declines linearly as the number of monitored descriptors grows.



epoll was introduced in the 2.6 kernel as an enhanced version of select and poll. Compared with them, epoll is more flexible and has no limit on the number of descriptors.

epoll uses a single file descriptor to manage many descriptors and stores the events for the descriptors the user cares about in an event table inside the kernel, so each descriptor only has to be copied from user space to kernel space once.

epoll operation process

epoll operation requires three interfaces, as follows:

int epoll_create(int size); // Create an epoll handle; size tells the kernel roughly how many descriptors will be monitored
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event * events, int maxevents, int timeout);

  1. int epoll_create(int size);

Create an epoll handle. size tells the kernel roughly how many descriptors will be monitored. This differs from the first parameter of select(), which is the highest monitored fd plus one. size does not limit the maximum number of descriptors epoll can monitor; it is only a hint for the kernel's initial allocation of internal data structures.

Once created, the epoll handle itself occupies an fd. On Linux you can see it by looking in /proc/<pid>/fd/. Therefore, when you are finished with epoll you must call close() on it, otherwise fds may be exhausted.
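From Python, the handle is created with select.epoll() (Linux only). A short sketch of the two ways to make sure the descriptor it occupies is released; the variable names are chosen only for this example.

```python
import select

ep = select.epoll()             # wraps epoll_create(); the size hint is optional
epoll_fd = ep.fileno()          # the descriptor that the epoll object itself occupies
ep.close()                      # release it explicitly when finished
closed_after = ep.closed

# The context-manager form closes the descriptor automatically on exit.
with select.epoll() as ep2:
    open_inside = not ep2.closed
closed_outside = ep2.closed

print(epoll_fd, closed_after, open_inside, closed_outside)
```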


  2. int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

The function performs the operation op on the specified descriptor fd.

epfd: the return value of epoll_create().

op: the operation to perform, given as one of three macros:

EPOLL_CTL_ADD, EPOLL_CTL_DEL, EPOLL_CTL_MOD

These add, delete and modify the monitored events for fd respectively.

fd: the file descriptor to be monitored

epoll_event: tells the kernel which events to monitor
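Python's select.epoll object exposes these three operations as register(), modify() and unregister(), corresponding to EPOLL_CTL_ADD, EPOLL_CTL_MOD and EPOLL_CTL_DEL. The following is a sketch only, assuming Linux and using a local socket pair in place of a client connection.

```python
import select
import socket

a, b = socket.socketpair()
b_fd = b.fileno()

with select.epoll() as ep:
    ep.register(b_fd, select.EPOLLIN)       # EPOLL_CTL_ADD: watch b for readability
    a.send(b"x")
    after_add = ep.poll(0.1)                # b is reported as readable

    ep.modify(b_fd, select.EPOLLOUT)        # EPOLL_CTL_MOD: watch writability instead
    after_mod = ep.poll(0.1)                # b is reported as writable

    ep.unregister(b_fd)                     # EPOLL_CTL_DEL: stop watching b
    after_del = ep.poll(0.1)                # nothing is registered, so nothing is reported

print(after_add, after_mod, after_del)
a.close()
b.close()
```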


  3. int epoll_wait(int epfd, struct epoll_event * events, int maxevents, int timeout);

Wait for I/O events on epfd, returning at most maxevents of them.

The parameter events receives the set of events from the kernel, and maxevents tells the kernel how large that array is. maxevents must be greater than zero (older descriptions say it should not exceed the size passed to epoll_create(), but that size is only a hint). The parameter timeout is the timeout in milliseconds: 0 returns immediately and -1 blocks indefinitely. The function returns the number of events to be handled; a return value of 0 means the call timed out.
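In Python, epoll_wait() corresponds to the poll() method of select.epoll, with one difference worth noting: the timeout is given in seconds rather than milliseconds. The sketch below assumes Linux and uses three local socket pairs as stand-ins for client connections; it shows an empty result on timeout and the effect of maxevents.

```python
import select
import socket

pairs = [socket.socketpair() for _ in range(3)]

with select.epoll() as ep:
    for _, rx in pairs:
        ep.register(rx.fileno(), select.EPOLLIN)

    timed_out = ep.poll(0.05)       # seconds, not milliseconds; empty list on timeout

    for tx, _ in pairs:
        tx.send(b"x")               # make all three receiving sockets readable

    capped = ep.poll(0, 2)          # maxevents=2: at most two events in this call
    everything = ep.poll(0)         # level-triggered, so all three are still reported

print(timed_out, len(capped), len(everything))

for tx, rx in pairs:
    tx.close()
    rx.close()
```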


A simple concurrent socket server based on select looks like this:


import select
import socket
import queue

server = socket.socket()
HOST = 'localhost'
PORT = 8080
print("start up %s on port: %s" % (HOST, PORT))

server.bind((HOST, PORT))
server.listen()
server.setblocking(False)   # Non-blocking

msg_dic_queue = {}    # Maps each connection to a queue of data waiting to be sent back to that client

inputs = [server]   # Connections the kernel should watch for readability; server itself is included so new connections are detected
# inputs = [server, conn]
outputs = []    # Connections that have data waiting to be sent back to the client

while True:
    print("waiting for next connect...")
    readable, writeable, exceptional = select.select(inputs, outputs, inputs)   # Blocks here until at least one fd is ready
    # print(readable, writeable, exceptional)
    for r in readable:  # Each r is a socket object that is ready to be read
        if r is server: # The listening socket is readable: a new connection has arrived
            conn, client_addr = server.accept()
            print("arrived a new connect: ", client_addr)
            conn.setblocking(False)
            inputs.append(conn) # The new connection has not sent data yet; calling recv() on it now would raise an exception,
            # so let select monitor conn and tell the server when the client sends data
            msg_dic_queue[conn] = queue.Queue()   # Queue for the data to be sent back to this client
        else:   # r is not the server, so it is a connection established with a client
            # Data has arrived from the client; receive it here
            data = r.recv(1024)
            if data:
                print("received data from [%s]: " % r.getpeername()[0], data)
                msg_dic_queue[r].put(data)  # Queue the data first and send it back later
                if r not in outputs:
                    outputs.append(r)   # Mark the connection as having data to send; replying later keeps other clients from being held up
            else:   # No data means the client has disconnected
                print("Client is disconnect", r)
                if r in outputs:
                    outputs.remove(r)   # Clean up the disconnected connection
                inputs.remove(r)
                r.close()
                del msg_dic_queue[r]

    for w in writeable: # Connections that are ready to have data sent back
        if w not in msg_dic_queue:  # Closed earlier in this iteration
            continue
        try:
            next_msg = msg_dic_queue[w].get_nowait()
        except queue.Empty:
            print("client [%s]" % w.getpeername()[0], "queue is empty...")
            outputs.remove(w)   # Nothing left to send, so stop watching w for writability
        else:
            print("sending message to [%s]" % w.getpeername()[0], next_msg)
            w.send(next_msg)    # Echo the data back to the client

    for e in exceptional:   # Connections in an exceptional condition
        if e in outputs:
            outputs.remove(e)
        if e in inputs:
            inputs.remove(e)
        e.close()
        msg_dic_queue.pop(e, None)

The client code for the select server is as follows:


import socket
import sys

msgs = [ b'This is the message. ',
             b'It will be sent ',
             b'in parts.',
             ]
SERVER_ADDRESS = 'localhost'
SERVER_PORT = 8080  # Assumed: matches PORT in the select server listing

# Create several TCP/IP sockets
socks = [ socket.socket(socket.AF_INET, socket.SOCK_STREAM) for i in range(500) ]

# Connect the sockets to the port where the server is listening
print('connecting to %s port %s' % (SERVER_ADDRESS, SERVER_PORT))
for s in socks:
    s.connect((SERVER_ADDRESS, SERVER_PORT))

for message in msgs:

    # Send the message on every socket
    for s in socks:
        print('%s: sending "%s"' % (s.getsockname(), message))
        s.send(message)

    # Read the response on every socket
    for s in socks:
        data = s.recv(1024)
        print('%s: received "%s"' % (s.getsockname(), data))
        if not data:
            print('closing socket', s.getsockname(), file=sys.stderr)
            s.close()


A concurrent socket server based on epoll looks like this:


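import socket, logging
import select, errno

logger = logging.getLogger("network-server")

def InitLog():
    logger.setLevel(logging.DEBUG)

    fh = logging.FileHandler("network-server.log")
    fh.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.ERROR)

    formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    ch.setFormatter(formatter)
    fh.setFormatter(formatter)

    logger.addHandler(fh)
    logger.addHandler(ch)


if __name__ == "__main__":
    InitLog()

    try:
        # Create TCP socket as listening socket
        listen_fd = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    except socket.error as msg:
        logger.error("create socket failed")
        raise

    try:
        # Set SO_REUSEADDR option
        listen_fd.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    except socket.error as msg:
        logger.error("setsocketopt SO_REUSEADDR failed")
        raise

    try:
        # Bind -- no ip address is specified here, that is, bind to all local addresses
        listen_fd.bind(('', 8008))
    except socket.error as msg:
        logger.error("bind failed")
        raise

    try:
        # Set the listen backlog (the value 10 is an assumption; the original line is missing)
        listen_fd.listen(10)
    except socket.error as msg:
        logger.error(msg)
        raise

    try:
        # Create epoll handle
        epoll_fd = select.epoll()
        # Register the readable event of the listening socket with the epoll handle
        epoll_fd.register(listen_fd.fileno(), select.EPOLLIN)
    except select.error as msg:
        logger.error(msg)
        raise

    connections = {}
    addresses = {}
    datalist = {}
    while True:
        # epoll waits for ready fds here -- with no timeout specified, it blocks
        epoll_list = epoll_fd.poll()

        for fd, events in epoll_list:
            # The listening fd is ready: a new connection has arrived
            if fd == listen_fd.fileno():
                # accept -- get the ip and port of the connected client and the socket handle
                conn, addr = listen_fd.accept()
                logger.debug("accept connection from %s, %d, fd = %d" % (addr[0], addr[1], conn.fileno()))
                # Set the connection socket to non-blocking
                conn.setblocking(False)
                # Register the readable event of the connection socket with the epoll handle
                epoll_fd.register(conn.fileno(), select.EPOLLIN | select.EPOLLET)
                # Save conn and addr information respectively
                connections[conn.fileno()] = conn
                addresses[conn.fileno()] = addr
            elif select.EPOLLIN & events:
                # Readable event activation
                datas = b''
                # Assumed completion: the original listing breaks off at this point.
                # With EPOLLET the socket must be drained until recv() would block.
                closed = False
                while True:
                    try:
                        data = connections[fd].recv(1024)
                    except socket.error as msg:
                        if msg.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                            closed = True   # A real error: drop the connection
                        break
                    if not data:
                        closed = True       # The peer closed the connection
                        break
                    datas += data
                datalist[fd] = datas
                logger.debug("%s, %d recv %r" % (addresses[fd][0], addresses[fd][1], datas))
                if closed:
                    epoll_fd.unregister(fd)
                    connections[fd].close()
                    del connections[fd], addresses[fd], datalist[fd]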

Posted by metuin on Wed, 11 May 2022 07:33:23 +0300