This is a short explainer of event-driven servers, intended to help readers gain an intuitive understanding of event loops. It could be useful when:
- deciding between different MPMs for Apache HTTP Server
- comparing Apache HTTP Server and NGINX
- choosing concurrency models for gunicorn
- troubleshooting an event loop
Concurrent Servers
In the classic client-server architecture, a server accepts connections from clients, receives data on the new socket, forwards it to the application for processing, and then sends data back to the client on the same socket.
To achieve concurrency, the simplest multi-threaded implementation could create a new thread for each new connection. Once created, the thread is bound to the connection, processing all requests and responses on this connection sequentially. The thread is then destroyed and freed when the connection is closed.
This describes the worker Multi-Processing Module of the popular Apache HTTP Server.
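As a concrete sketch of this 1:1 model, consider the following thread-per-connection handler. It is illustrative only: the names and the echo behavior are mine, not taken from Apache or any particular server.

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative handler: echoes bytes back until the client closes.
 * A real server would parse requests and generate responses here. */
static void *handle_connection(void *arg) {
    int conn_fd = (int)(intptr_t)arg;
    char buf[4096];
    ssize_t n;
    while ((n = read(conn_fd, buf, sizeof buf)) > 0)
        write(conn_fd, buf, (size_t)n);
    close(conn_fd);
    return NULL;              /* thread exits; its stack is reclaimed */
}

/* 1:1 model: one thread per accepted connection */
static void serve_forever(int listen_fd) {
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0)
            continue;
        pthread_t tid;
        pthread_create(&tid, NULL, handle_connection,
                       (void *)(intptr_t)conn_fd);
        pthread_detach(tid);  /* free thread resources when it exits */
    }
}
```

Each thread's stack (often around 1 MB by default) is allocated whether or not the connection is active, which is exactly the per-connection overhead at issue.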
Depending on the specifics of each implementation, this model can have a few drawbacks, including the memory overhead incurred per connection. This scales poorly in a deployment that expects a large number of concurrent clients, each maintaining persistent connections (e.g. HTTP/1.1 Keep-Alive, long polling, or reusing a TLS socket). Precious system resources would be wasted on threads that are idle most of the time.
memory required = memory required per thread * number of connections

To drive this intuition a little further, consider this example: if a server implemented using this 1:1 model allocates 1 MB of memory per connection, it would need 10 GB of memory to serve 10 thousand concurrent connections.
This is the gist of the C10K problem, and we want an implementation with a resource consumption profile that scales sub-linearly with the growing number of concurrent connections.
Toward an event-driven implementation
To conserve memory and CPU while handling a large number of idle connections, we can improve upon the above implementation by doing the following:
1. reduce the amount of memory required for each new connection by allowing the main thread to handle multiple connections, and
2. reduce CPU cycles wasted on polling by using event notifications from the kernel to trigger callbacks when sockets are ready to be read or written.
Without getting into precise definitions, Step 1 introduces us to asynchronous programming, where each thread is no longer bound to a single connection, but instead multiplexes over multiple ones. For this to succeed, the main thread needs to be aware of the readiness and states of each socket, reading from sockets once they are ready to be read, and quickly writing to sockets when there is data to be sent.
Furthermore, since we are sharing the main thread across many clients, the application needs to ensure that no single request can block the entire thread, starving the other clients of their time. Anything that is using the main thread needs to yield its control as soon as it can.
Step 2 introduces us to kernel event notifications, such as I/O readiness, UNIX signals, and timeouts. These can be used via kqueue on BSD, or epoll on Linux. The following example integrates with libevent, an OS-independent library that supports kqueue, epoll, select, and other notification mechanisms.
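To make the kernel's role concrete, here is a minimal Linux-only sketch that uses epoll directly; libevent wraps this same mechanism (or kqueue, or select) behind a portable API. The function name is mine, for illustration.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Block until fd is readable, or until timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_until_readable(int fd, int timeout_ms) {
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;              /* interest: read readiness */
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }
    struct epoll_event ready;
    /* The kernel parks this thread until the fd is ready;
     * no CPU cycles are burned polling. */
    int n = epoll_wait(epfd, &ready, 1, timeout_ms);
    close(epfd);
    return n;
}
```

An event loop is essentially this call inside a `for (;;)`, with many registered file descriptors and a callback dispatched for each one that becomes ready.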
A Small Example
event-proxy is a small program I wrote to demonstrate the fundamentals of event-based servers and the use of libevent. It is minimal by design, to aid instruction and study. It is also written in C, which I have found to be the best language for learning systems engineering.
Here is a short demo of event-proxy in action:
event-proxy is structured as follows:
- main.c, which provides the program’s entry point, initializes the logger, and invokes proxy.c#proxy to start the proxy;
- proxy.c, which creates the listening socket, initializes the event loop, and blocks until the loop exits;
- io.c, which provides proxy.c with the do_accept callback for accepting connections; and
- client.c, which creates a socket and connects to the proxied server.
proxy.c
In proxy.c#proxy, we first create a TCP socket, bind it to an IP address and port number, and use the listen system call to wait for incoming connections. We have not done anything unique so far, and these steps are separated into _init_listen_fd, so we can focus on _init_event_loop.
To start using libevent, we first allocate an event_base structure using event_base_new. This structure will be used to hold the events we are interested in, as well as register pointers to callbacks. Later, it is used to start the event loop. For simplicity, we create a default event_base without any custom configuration:
ev_base = event_base_new();

Next, we register the do_accept callback against a read event on listen_fd, which is the listening socket from before. This tells the event loop to invoke do_accept whenever there is a new incoming connection at that socket.
ev_listen = event_new(ev_base, listen_fd, EV_READ|EV_PERSIST, do_accept, …);
event_add(ev_listen, NULL);
// NULL here means no timeout
// without EV_PERSIST, this event will be triggered at most once

Then, we register the quit_cb callback against a SIGQUIT signal.
ev_quit = evsignal_new(ev_base, SIGQUIT, quit_cb, ev_base);
event_add(ev_quit, NULL);
// NULL here means no timeout

Finally, we start the event loop and block indefinitely:
event_base_dispatch(ev_base);
Roughly, event_base_dispatch starts an infinite loop that blocks until at least one event is triggered, then invokes the activated events’ registered callbacks. These callbacks are all invoked on the same thread.

Callbacks that take too long to return will penalize the entire event loop and undermine fairness. Thus, callbacks must return as quickly as possible and yield control of the main thread back to the event loop. For this reason, event-based programming generally requires avoiding blocking calls, such as I/O waits, explicit sleeps, and mutex waits.
Notice that the last argument to both event_new and evsignal_new is passed to the callback functions when they are invoked. For example, when SIGQUIT is sent to the program, we pass ev_base to quit_cb, so that it can tell the event loop to exit. Unfortunately, since we are working in C, the type of the argument is void *, and some casting is necessary:
struct event_base *ev_base = arg;
event_base_loopexit(ev_base, NULL);
// NULL here means exit after all current events are complete

io.c
io.c provides do_accept, a callback that is invoked whenever a new incoming connection arrives at the listener socket. When invoked, we first use the accept system call to create a new socket for each connection (named accept_fd). Then, we connect to the proxied server, creating a new socket named client_fd. We have not done anything new yet; the novelty begins in _init_bufferevents.
At this point, we have created:
- accept_fd, a socket that can be used to read and write bytes to the connected client, and
- client_fd, a socket that can be used to read and write bytes to the proxied server.
The remaining tasks are:
- read bytes from accept_fd and write them to client_fd
- read bytes from client_fd and write them to accept_fd
This can be done by registering a callback against read events on both sockets. To do this, we use libevent’s bufferevent API, which provides a convenient interface for working with buffers and stream sockets. We start by creating a bufferevent structure that is associated with the main event loop and our new sockets:
bev = bufferevent_socket_new(ev_base, fd, BEV_OPT_CLOSE_ON_FREE);
// create a bufferevent structure

Then, we register the read callback and the error callback:
bufferevent_setcb(bev, readcb, NULL, errorcb, arg);

By default, a new bufferevent has writing enabled, but not reading, so the following is necessary:
bufferevent_enable(bev, EV_READ | EV_WRITE);

The final piece of the puzzle is the implementation of readcb, which looks like:
static void readcb (struct bufferevent *bev, …) {
…
input = bufferevent_get_input(bev);
…
bufferevent_write_buffer(output, input);
…
}

Notice that we are reusing the same ev_base as before, which lets us plug into the same event loop. The newly registered callbacks will be invoked by the same event loop that was dispatched in proxy.c#proxy, and on the same thread.
Beyond Servers
The design of batch jobs can benefit from a discussion of threads and event loops as well.
Suppose, for example, we have a batch job running on a single machine that retrieves a large number of records from many databases and aggregates the data. To achieve concurrency, we could implement this batch job using multiple threads, with one thread for each database. This setup looks familiar, except that our batch job is acting as a client that is connected to multiple database servers.
This implementation suffers from the same drawbacks described above: each new thread could consume quite a bit of memory. Furthermore, if we assume that the SELECT queries take a long time on the database hosts, then we are also managing a large number of mostly idle threads and connections.
In cases like this, an event loop could be a useful optimization. After all database connections have been created, we could register callbacks on I/O events from them. The event loop would then run on one main thread, invoking the associated callbacks when results from the databases are available to be read on their respective sockets. Each invoked callback would pull the query results into process memory for aggregation.
In this model, we reduced a large number of idle threads to one fully utilized main thread; however, we also forfeited the ability to utilize multiple cores on the machine. It might be advantageous to optimize further by distributing the work over multiple processes on each machine: one process for each core, and each with its own event loop.
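A sketch of that last arrangement, assuming a run_worker function that would start a per-process event loop. The names here are illustrative; in practice the worker count could come from sysconf(_SC_NPROCESSORS_ONLN).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int demo_pipe[2];   /* used by demo_worker to signal the parent */

/* Stand-in for a worker that would create its own event_base and
 * dispatch; here it just reports that it ran. */
static void demo_worker(int worker_id) {
    (void)worker_id;
    char c = 'x';
    write(demo_pipe[1], &c, 1);
}

/* Pre-fork one worker per core; each child runs its own event loop. */
static void spawn_workers(int nworkers, void (*run_worker)(int)) {
    for (int i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid == 0) {        /* child */
            run_worker(i);
            _exit(0);
        }
    }
    while (wait(NULL) > 0)     /* parent: reap all workers */
        ;
}
```

This is, in spirit, how NGINX and gunicorn arrange their worker processes: the master forks, and each worker multiplexes its own set of connections.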
In my next article, I will explore the details of how Linux schedules threads and processes that are waiting on I/O, and discuss some downsides of using event-driven implementations.
Thanks to Gabbi Fisher for reviewing an early draft.