A thought on how the engine sends packets


Looking at the strace output of the engine, you can see that the engine calls the write system call repeatedly when replying to a message. The message content is an HTTP response.

Normally the reply should need only one write call, but in practice write is called many times: the pieces of the HTTP message, such as the response line and the response headers, are sent one after another.

So the optimization is to combine the multiple replies into one.

Option 1: assemble the multiple buffers into one buffer and then write it out. This requires a memcpy to copy the message data in advance.
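As a rough sketch of this first option (the helper name write_joined and its parameters are made up for illustration, they are not from the engine), the pieces are copied into one heap buffer and handed to the kernel with a single write:

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copy n separate pieces into one heap buffer and
 * pass it to the kernel with a single write() call.
 * Returns what write() returned, or -1 if the allocation fails. */
ssize_t write_joined(int fd, const char *parts[], const size_t lens[], int n)
{
    size_t total = 0;
    for (int i = 0; i < n; i++)
        total += lens[i];

    char *buf = malloc(total ? total : 1);
    if (buf == NULL)
        return -1;

    size_t off = 0;
    for (int i = 0; i < n; i++) {
        memcpy(buf + off, parts[i], lens[i]);   /* the extra copy this option costs */
        off += lens[i];
    }

    ssize_t written = write(fd, buf, total);    /* one system call instead of n */
    free(buf);
    return written;
}
```

The price is one allocation and one copy per reply, which is what makes one wonder whether the kernel can do the gathering instead.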

Is there no other way??

Recall that the advanced I/O chapter of APUE covers readv and writev: scatter read and gather write!

Now let's take a look at the socket receive and send interfaces and how they differ:

#include <unistd.h>
 ssize_t read(int fd, void *buf, size_t count);
 ssize_t write(int fd, const void *buf, size_t count);

read and write simply read data from and write data to fd. They give the kernel and the user-space process no way to exchange any extra information, and they require that the connection on fd has already been established.

So there are recv and send, which add a flags argument. But there is a basic problem in the design of flags: it is passed by value, not as a value-result parameter. Therefore flags can only be passed from the process to the kernel; the kernel cannot pass flags back to the process.

send/recv and write/read do basically the same thing, except for the additional flags parameter. When flags is set to 0, they behave the same!
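To make the role of flags concrete, here is a small sketch (the function names peek_bytes and recv_plain are invented for this example). MSG_PEEK is one flag the process can pass to the kernel, and it does something plain read cannot: it looks at pending data without consuming it.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Look at pending data without removing it from the socket's receive queue.
 * Plain read() has no way to ask for this. */
ssize_t peek_bytes(int sockfd, void *buf, size_t len)
{
    return recv(sockfd, buf, len, MSG_PEEK);
}

/* With flags == 0, recv() on a connected socket behaves like read(). */
ssize_t recv_plain(int sockfd, void *buf, size_t len)
{
    return recv(sockfd, buf, len, 0);
}
```

Calling peek_bytes and then recv_plain on the same socket returns the same bytes twice, because only the second call consumes them.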

To remove the requirement that fd be in the connected state, there are recvfrom and sendto, which can receive and send on an unconnected fd, such as a UDP socket.

When the address pointer passed to sendto/recvfrom is NULL and the address length is 0 (for recvfrom, the address length pointer is NULL as well), they behave the same as send/recv.

sendto (recvfrom) writes data to (reads data from) a socket. On a socket that is already connected, the address and address length parameters should be left out, that is, the address pointer is set to NULL and the address length to 0. With UDP, for example, if you have not called connect you must specify the address parameter; if you have called connect, you omit it.
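The two UDP cases might look like the following sketch. The wrapper names udp_send_unconnected and udp_send_connected are hypothetical, and IPv4 is assumed only to keep the example short.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Unconnected UDP socket: the destination has to be given on every call. */
ssize_t udp_send_unconnected(int fd, const struct sockaddr_in *dst,
                             const void *buf, size_t len)
{
    return sendto(fd, buf, len, 0, (const struct sockaddr *)dst, sizeof *dst);
}

/* Connected UDP socket: connect() has already recorded the peer, so the
 * address is omitted. This has the same effect as send(fd, buf, len, 0). */
ssize_t udp_send_connected(int fd, const void *buf, size_t len)
{
    return sendto(fd, buf, len, 0, NULL, 0);
}
```

For UDP, connect sends nothing on the wire; it only stores the peer address in the socket so later calls can leave it out.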

To get around the limit of a single read/write buffer, there are readv and writev (scatter read and gather write). Their drawback is that the kernel and the process still cannot exchange extra information through them.
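A minimal gather-write sketch, assuming the reply consists of three separate strings (the function name send_response and the three-part split are illustrative assumptions, not the engine's actual layout):

```c
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: hand three separate buffers to the kernel with one
 * writev() call. No user-space memcpy is needed; the kernel gathers them. */
ssize_t send_response(int fd, const char *status, const char *headers,
                      const char *body)
{
    struct iovec iov[3];

    /* iov_base is a plain void *, so the const has to be cast away;
     * writev() only reads from these buffers. */
    iov[0].iov_base = (void *)status;
    iov[0].iov_len  = strlen(status);
    iov[1].iov_base = (void *)headers;
    iov[1].iov_len  = strlen(headers);
    iov[2].iov_base = (void *)body;
    iov[2].iov_len  = strlen(body);

    return writev(fd, iov, 3);   /* one system call for all three pieces */
}
```

The buffers are written in array order, so the peer sees one contiguous byte stream.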

recvmsg and sendmsg solve the problem of passing ancillary data between the kernel and the process.

sendmsg writes the data from multiple buffers to the socket file descriptor, and recvmsg reads data from the socket file descriptor into multiple buffers. A msghdr message header must be filled in before sending (receiving).


#include <sys/types.h>
#include <sys/socket.h>

ssize_t send(int sockfd, const void *buf, size_t len, int flags);

ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
                      const struct sockaddr *dest_addr, socklen_t addrlen);

ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

ssize_t recv(int sockfd, void *buf, size_t len, int flags);

ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                        struct sockaddr *src_addr, socklen_t *addrlen);

ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);


The msghdr parameter is complex

struct msghdr {
    void          *msg_name;            /* protocol address */
    socklen_t     msg_namelen;          /* size of protocol address */
    struct iovec  *msg_iov;             /* scatter/gather array */
    int           msg_iovlen;           /* # elements in msg_iov */
    void          *msg_control;         /* ancillary data (cmsghdr struct) */
    socklen_t     msg_controllen;       /* length of ancillary data */
    int           msg_flags;            /* flags returned by recvmsg() */
};

msg_name and msg_namelen specify the address of the receiving source or sending destination when the socket is not connected (mainly an unconnected UDP socket). The two members are the socket address and its size, similar to the fifth and sixth parameters of recvfrom and sendto. For connected sockets, you can simply set them to NULL and 0. For recvmsg, msg_name is a value-result parameter that returns the socket address of the sender.
The msg_iov and msg_iovlen members specify the array of data buffers, that is, an array of iovec structures. iovec is defined as follows:

#include <sys/uio.h>
struct iovec {
    void    *iov_base;      /* starting address of buffer */
    size_t  iov_len;        /* size of buffer */
};

Here iov_base points to one buffer, which is itself an array of bytes, and iov_len gives the size of its data. In effect the buffers form a two-dimensional array in which each row can have a different length.
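Putting msghdr and iovec together, a sendmsg call on a connected socket could be sketched like this (sendmsg_gather is a made-up name; no ancillary data is sent in this example):

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: gather-send iovcnt buffers over a connected socket. */
ssize_t sendmsg_gather(int sockfd, struct iovec *iov, int iovcnt)
{
    struct msghdr msg;

    /* Zeroing the header leaves msg_name = NULL and msg_namelen = 0
     * (connected socket) and msg_control = NULL (no ancillary data). */
    memset(&msg, 0, sizeof msg);
    msg.msg_iov    = iov;       /* the scatter/gather array */
    msg.msg_iovlen = iovcnt;    /* number of elements in it */

    return sendmsg(sockfd, &msg, 0);
}
```

For an unconnected UDP socket, msg_name and msg_namelen would be filled with the destination address instead of being left zeroed.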


Therefore, the simplest way to get rid of the engine's multiple system calls is to send the data with the writev interface. One thing has to be considered, though: a writev asked to send 1000 bytes may actually send only 500, and the remaining 500 bytes have to be sent in a later call.

In other words, writev needs partial-write handling: after each writev you need to work out which part of the data has not been sent, wait for the fd to become writable again, and then continue sending. The buffers holding data that has already been sent can be released.
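One possible shape for that handling is sketched below. The names iov_advance and writev_some are invented for this example, and the event loop that waits for the fd to become writable is left out; the sketch only shows how to work out what is still unsent.

```c
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

/* Skip `sent` bytes at the front of an iovec array. Fully sent entries are
 * dropped, a partly sent entry is trimmed in place. Returns the first entry
 * that still has unsent data and updates *cnt to the number remaining. */
struct iovec *iov_advance(struct iovec *iov, int *cnt, size_t sent)
{
    while (*cnt > 0 && sent >= iov->iov_len) {
        sent -= iov->iov_len;
        iov++;
        (*cnt)--;
    }
    if (*cnt > 0) {
        iov->iov_base = (char *)iov->iov_base + sent;
        iov->iov_len -= sent;
    }
    return iov;
}

/* Returns 0 when everything was sent, 1 when a non-blocking fd would block
 * (*piov and *pcnt then describe the unsent remainder, to be retried once
 * the fd is writable again), and -1 on any other error. */
int writev_some(int fd, struct iovec **piov, int *pcnt)
{
    while (*pcnt > 0) {
        ssize_t n = writev(fd, *piov, *pcnt);
        if (n < 0) {
            if (errno == EINTR)
                continue;                       /* interrupted, just retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return 1;                       /* wait for writability */
            return -1;
        }
        *piov = iov_advance(*piov, pcnt, (size_t)n);
    }
    return 0;
}
```

Because iov_advance edits the entries in place, the caller should keep its own copy of the original buffer pointers if it needs them later to free the buffers.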


Posted by Rottingham on Mon, 09 May 2022 15:46:01 +0300