The Gerbil httpd · mighty-gerbils gerbil · Discussion #1148

Gerbil has sported an embedded httpd since v0.12; the package
provides HTTP/1.1 server functionality, with dynamic handlers
dispatched by a request multiplexer (Mux). The embedded httpd is quite
capable, but it requires writing some code to get running and leaves
several tasks to the programmer.

In the v0.18.2 release cycle we are introducing a standalone
httpd (runnable with gerbil httpd), which packages the embedded
httpd into a (somewhat low level) web application framework
distributed with Gerbil. The server can serve static files, direct
requests to compiled dynamic handlers and also supports servlets,
which are interpreted dynamic handlers from gerbil source modules
residing inside the server root. See the PR for code details.

The server also supports an ensemble supervisor that can
trivially spawn a number of httpd workers bound on the same port with
SO_REUSEPORT, which lets you utilize all cores in the system.

Finally, as we will see below, the server is quite performant, with a
performance envelope approaching high-performance servers written in C,
like lighttpd.

Configuring and running the gerbil httpd

Configuring the server is quite simple. The configuration format is a
flat plist, which allows you to extend it with sections for handler
specific configuration without any difficulty.

Here is an example configuration:

$ cat server.config
config: httpd-v0
root: "content"
handlers: (("/handler" . :test/site/handler))
enable-servlets: #t
server-log: "/tmp/server.log"
request-log: "/tmp/request.log"
listen: ("127.0.0.1:8080")

This configuration specifies a site with the document root in
content, enables servlets and also specifies a compiled handler for
a path.

Here is how an example server root looks like:

$ tree content
content
├── files
│   └── hello.txt
├── index.html
└── servlets
    └── hello.ss

Here, we have an index.html for our root, and an additional static
file in files -- this could be your assets directory for instance. The
server also has a servlet in /servlets/hello.ss.

Here is what the servlet looks like:

$ cat content/servlets/hello.ss
(import :std/net/httpd
        :std/format)
(export handler-init! handle-request)

(def state "not initialized")

(def (handler-init! cfg)
  (set! state 'initialized))

(def (handle-request req res)
  (http-response-write res 200 '(("Content-Type" . "text/plain"))
                       (format "hello! I am a servlet and my state is ~a~n" state)))

The servlet is a module, which exports a handle-request method for
its handler, and optionally a handler-init! procedure called at load
time with the server configuration.

Our compiled dynamic handler is just the same, but it is precompiled:

$ cat site/handler.ss
(import :std/net/httpd
        :std/format)
(export handler-init! handle-request)

(def state "not initialized")

(def (handler-init! cfg)
  (set! state 'initialized))

(def (handle-request req res)
  (http-response-write res 200 '(("Content-Type" . "text/plain"))
                       (format "hello! I am a dynamic handler and my state is ~a~n" state)))

So let's run our little site:

# first build the dynamic handler
$ gerbil build

# and run the server
$ gerbil httpd -c server.config

And in another terminal we can poke it with curl:

# get the root
$ curl http://127.0.0.1:8080/
<html>
  <head>
    <title>hello</title>
  </head>
  <body>
    hello, world!
  </body>

# get the asset file
$ curl http://127.0.0.1:8080/files/hello.txt
hello, world!

# get the compiled handler
$ curl http://127.0.0.1:8080/handler
hello! I am a dynamic handler and my state is initialized

# get the servlet
$ curl http://127.0.0.1:8080/servlets/hello.ss
hello! I am a servlet and my state is initialized

Configuring and running the gerbil httpd ensemble

The ensemble configuration is just as simple:

$ cat ensemble.config
config: httpd-ensemble-v0
ensemble-servers: (httpd1 httpd2)
ensemble-request-log: #t
server-configuration:
(root: "content"
 handlers: (("/handler" . :test/site/handler))
 enable-servlets: #t
 listen: ("127.0.0.1:8080"))

In this config, we specify we want two workers (httpd1 and httpd2),
with request logging enabled, and the same server configuration as the
standalone server.

You can run the ensemble as follows:

# first create an ensemble cookie
$ gerbil ensemble admin cookie

# run the ensemble registry
$ gerbil ensemble registry

# and start the httpd ensemble
$ gerbil httpd -e -c ensemble.config

Performance

We mentioned performance, and indeed is quite good.

First here is a baseline from lighttpd-1.4.63-1ubuntu3.1:

$ ./hey_linux_amd64 -c 10 -n 100000 http://localhost:80/index.html

Summary:
  Total:	1.1123 secs
  Slowest:	0.0040 secs
  Fastest:	0.0000 secs
  Average:	0.0001 secs
  Requests/sec:	89901.9951

And here is how the single worker httpd fares:

$ ./hey_linux_amd64 -c 10 -n 100000 http://localhost:8080/index.html

Summary:
  Total:	1.2072 secs
  Slowest:	0.0173 secs
  Fastest:	0.0000 secs
  Average:	0.0001 secs
  Requests/sec:	82834.8050

And here is an ensemble with two workers:

$ ./hey_linux_amd64 -c 10 -n 100000 http://localhost:8080/index.html

Summary:
  Total:	1.0287 secs
  Slowest:	0.0196 secs
  Fastest:	0.0000 secs
  Average:	0.0001 secs
  Requests/sec:	97213.7561

Notice the performance increase; it is not linear because the load
test is running in the same laptop so there is competition between the
two, but it's there and edges comfortably over what you can get with
lighttpd. Further exploring the performance envelope with multiple
workers would require running the ensemble on a different box than the
load tester.