I/O multiplexing refers to the concept of processing multiple input/output events from a single event loop, with system calls like poll and select (Unix). –Wikipedia
kqueue(on MacOS) and epoll(on Linux) are kernel system calls for scalable I/O event notification mechanisms in an efficient manner. In simple words, you subscribe to certain kernel events and you get notified when any of those events occur. These system calls are desigend for scalable situations such as a webserver where
10,000 concurrent connections are being handled by one server.
In order to watch one file descriptor, say
777, you need to write something like this in C:
When you call
select, you pass
nfds = 778 which is the “number of file descriptors”. What
select does under the hood is, it simply loops through all the file descriptors from
nfds - 1. It checks if you have set them and if the desired event occured. So, the runtime is
n is the largest file descriptor you’re watching!
This line of the man page, explains it well:
The first nfds descriptors are checked in each set; i.e., the descriptors from 0 through nfds-1 in the descriptor sets are examined. (Example: If you have set two file descriptors “4” and “17”, nfds should not be “2”, but rather “17 + 1” or “18”.)
Aside from performance, a bigger problem is, using
select you may destroy your call stack and crash your process! Let’s again look at how we set our desired file descriptor:
fds is the set of file descriptors we want to watch, and according to the man page:
The descriptor sets are stored as bit fields in arrays of integers.
It also later states that:
The default size FD_SETSIZE (currently 1024) is somewhat smaller than the current kernel limit to the number of open files.
Now, if you try to watch file descriptor
select will loop over fds from
1999 and will read garbage. The bigger issue is when it tries to set results for a file descriptor past
1024 and tries to set that bit field in say
errorfds field. At this point it will write something random on the stack eventually crashing the process and making it very hard to debug what happened since your stack is randomized.
After the introduction of select in 1983, poll was released in 1986 (and came to libc in linux in 1997) to address the shortcomings of select. poll solved the fd limit of 1024, introduced more flavours of events to watch and slight changes to the timeout in the api, such as using milliseconds instead of microseconds.
You can see that
poll’s api makes a lot more sense:
You pass in the list of fds you’re interested in, in a sparse format, and
ndfs is actually the number of fds you are interested in. Here’s the structure of
In order to watch fds
777 for read, you can simply do:
poll is an improvement to
select, it still loops over the all the provided file descriptors hence runs in
n is the number of file descriptors we are watching.
Kqueue is a scalable event notification interface introduced in 2000. It provides a standard API for applications to register their interest in various events/conditions and have their notifications delivered efficiently. It is designed to be scalable, flexible, reliable and correct.
In order to understand how to use kqueue, let’s look at a few concepts.
A kevent is identified by an
<ident, filter> pair, where ident can be a file/socket descriptor and filter is the kernel filter used to process the respective event. There are some pre-defined system filters, such as
EVFILT_WRITE, that are triggered when data exists for read or write.
For exmaple, if we want to be notified when there is data available for reading in a socket, we need to specify a kevent in the form of
We use flags to add, delete, enable or disable a kevent from the queue (
The kqueue holds all the events we are interested in. So, to start we can simply create an empty kqueue.
kevent system call
We can use the
kevent(...) system call to:
- add/modify/delete events on the queue
- wait/read occured events
It also allows for setting a timeout but for now we put in
NULL which means no timeout (wait indefinitely).
Note: re-adding an existing event will modify the parameters of the original event and won’t result in a duplicate entry.
After creating the queue and registering our events, we wait for the events in an infinite loop. Once they occur, the call to
kevent(...) unblocks and we receive these events in
evList. We can iterate over these events and handle them however we wish. For example, if we receive a new connection from a client, we can accept the connection or if we receive new data, we can read it from the socket.
For a more practical use case, the next post Streaming Server Using Kqueue shows how to write a simple server using kqueue in only 100 lines in C!