In part I, part II and part III I talked
about the changes to DNS resolution in curl and why we make them. In this
post I cover the performance- and resource-related changes in the
threaded resolver. This is the most common build option, deployed
by many distros.
Previously…
In curl 8.19.0 and earlier, the “threaded resolver” means libcurl
starts a new thread for each resolve to call getaddrinfo(), as I explained in
part II. In addition, it opens a socketpair
(or an eventfd on modern Linux), so the resolver thread can notify
the “main” thread that it is done.
When the application uses parallel processing in a multi handle and
adds 50 easy handles to different hosts, libcurl starts 50 threads and
opens 50 socketpairs. Meh!
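To make the cost of the old model concrete, here is a minimal Python sketch of "one thread plus one socketpair per lookup". It is not libcurl code: the names `resolve_in_own_thread` and `run_old_model` are made up for this illustration, and Python's `socket.getaddrinfo()` stands in for the C call.

```python
import socket
import threading


def resolve_in_own_thread(host, port):
    """Start one thread and open one socketpair for a single lookup."""
    notify_rd, notify_wr = socket.socketpair()
    result = {}

    def worker():
        try:
            result["addrs"] = socket.getaddrinfo(
                host, port, type=socket.SOCK_STREAM)
        except socket.gaierror as exc:
            result["error"] = exc
        finally:
            notify_wr.send(b"\x01")  # tell the "main" thread we are done

    thread = threading.Thread(target=worker)
    thread.start()
    return thread, notify_rd, notify_wr, result


def run_old_model(hosts, port=80):
    """Resolve all hosts in parallel; return results and resource counts."""
    lookups = [resolve_in_own_thread(host, port) for host in hosts]
    threads_started = len(lookups)
    socketpairs_opened = len(lookups)
    results = []
    for thread, notify_rd, notify_wr, result in lookups:
        notify_rd.recv(1)   # block until this lookup's thread signals
        thread.join()       # this is the join that stalls on a hung lookup
        notify_rd.close()
        notify_wr.close()
        results.append(result)
    return results, threads_started, socketpairs_opened
```

Calling `run_old_model()` with 50 hosts starts 50 threads and opens 50 socketpairs, which is the resource pattern described above.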
This may not really concern you when using the curl command line tool,
but libcurl is used in places with thousands of parallel transfers.
There, it can add up.
An additional complication of the resolving thread being tied to an
easy handle is that it may block remove/cleanup of the easy. When
the DNS resolving “hangs”, joining the thread blocks. The application
then stalls.
There is CURLOPT_QUICK_EXIT to detach any thread, avoiding a wait. This
is convenient when the application wants to terminate anyway and
waiting on threads does not make sense. But if the application does not
exit, detached threads may accumulate without any limit.
Pooling Resources
With curl 8.20.0 we add a thread pool for the resolver. The pool
is owned by a multi handle and used for all easy handles processed
by this multi. The multi has a single socketpair that the threads use
for notification, no matter how many easy handles there
are.
The thread pool has a configurable maximum number of threads, which are started on demand and shut down after an idle period. DNS resolves are placed in an inbound queue which the threads feed upon, placing results on an outbound queue and notifying the multi. The multi then empties the outbound queue and dispatches the results to the proper easy handles.
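The queue flow described above can be modelled in a few lines of Python. This is only a sketch under my own assumptions, not how libcurl implements it: the class `ResolverPool`, its methods and the `transfers` dictionary (standing in for easy handles) are all hypothetical names, and results for ids that are no longer tracked are simply dropped.

```python
import queue
import socket
import threading


class ResolverPool:
    """Toy model of a shared resolver pool (not libcurl's code)."""

    def __init__(self, max_threads=4, idle_timeout=0.5):
        self.max_threads = max_threads
        self.idle_timeout = idle_timeout
        self.inbound = queue.Queue()    # lookups waiting for a thread
        self.outbound = queue.Queue()   # finished lookups
        # One notification channel for the whole pool.
        self.notify_rd, self.notify_wr = socket.socketpair()
        self.lock = threading.Lock()
        self.threads = 0                # threads currently alive
        self.idle = 0                   # threads waiting for work
        self.threads_started = 0        # total ever started

    def submit(self, transfer_id, host, port):
        with self.lock:
            self.inbound.put((transfer_id, host, port))
            # Start a thread only when no idle one can take the job.
            if (self.inbound.qsize() > self.idle
                    and self.threads < self.max_threads):
                self.threads += 1
                self.threads_started += 1
                threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            with self.lock:
                self.idle += 1
            try:
                job = self.inbound.get(timeout=self.idle_timeout)
            except queue.Empty:
                job = None
            with self.lock:
                self.idle -= 1
                if job is None and self.inbound.empty():
                    self.threads -= 1   # idle for too long: shut down
                    return
            if job is None:
                continue                # a job arrived just now, retry
            transfer_id, host, port = job
            try:
                outcome = socket.getaddrinfo(
                    host, port, type=socket.SOCK_STREAM)
            except socket.gaierror as exc:
                outcome = exc
            self.outbound.put((transfer_id, outcome))
            self.notify_wr.send(b"\x01")    # wake the "multi"

    def dispatch(self, transfers):
        """Run by the "multi" side once notify_rd is readable."""
        self.notify_rd.recv(4096)           # clear pending wake-ups
        delivered = 0
        while True:
            try:
                transfer_id, outcome = self.outbound.get_nowait()
            except queue.Empty:
                return delivered
            if transfer_id in transfers:    # unknown ids are dropped
                transfers[transfer_id] = outcome
                delivered += 1
```

An application-side loop would wait for `notify_rd` to become readable alongside its transfer sockets and then call `dispatch()`. The point of the design is that the number of threads and notification sockets no longer grows with the number of transfers.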
This puts the resources used by threads and socketpairs under the control
of the libcurl application. The new CURLMOPT_RESOLVE_THREADS_MAX can
be used to set the maximum number of threads in the pool. The default
is now 20, which may change when we get feedback on this.
The side effect of limiting the number of threads is that resolve attempts that stall can eventually occupy all threads and prevent progress. We think this risk is preferable to having no control over resource consumption.
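The stall risk is easy to reproduce with any bounded pool. The following Python sketch uses the standard `ThreadPoolExecutor` as a stand-in for the resolver pool; `stalled_pool_demo` and the two fake resolve functions are hypothetical, and an event that is never set until the end plays the role of a hanging getaddrinfo() call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def stalled_pool_demo(max_threads=2):
    """Show that stalled lookups can hold back a quick one."""
    release = threading.Event()

    def stalled_resolve():
        release.wait()          # stands in for a lookup that hangs
        return "late"

    def quick_resolve():
        return "127.0.0.1"      # a lookup that would finish at once

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        try:
            # Fill every thread in the pool with a stalled lookup.
            for _ in range(max_threads):
                pool.submit(stalled_resolve)
            quick = pool.submit(quick_resolve)
            # The quick lookup sits in the queue, no thread is free.
            _, not_done = wait([quick], timeout=0.2)
            blocked = quick in not_done
        finally:
            release.set()       # let the stalled lookups finish
        return blocked, quick.result(timeout=5)
```

With every thread occupied, the quick lookup makes no progress until one of the stalled ones returns. That is the trade-off: a bounded pool caps resource use, at the price that stalls can queue up behind each other.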
There is also the new CURLMOPT_QUICK_EXIT, which controls how the
thread pool is shut down when the multi handle is cleaned up. Similar
to before, the default behaviour is to join all threads in the pool. Setting
this new option on the multi handle detaches the threads instead.
Since the easy handles are no longer “owning” any threads, they can
be removed/cleaned up without delay. Late arrivals of DNS resolves after
the easy has gone are just discarded.
Performance
Besides using fewer resources, many DNS resolves are now done in already running threads. This saves some time, memory and system calls. Whether that becomes noticeable depends very much on the application and the system it runs on. But it should in any case be better than before.
Bugs
There is quite a lot of new code and many changes involved. No doubt we have added bugs. Hopefully nothing major, but these changes were made by humans, who make mistakes.
Summary
This should be the last part in my “curl dns 2026” series, unless you have further questions or topics I should shed more light on. If you do, leave me a message on Mastodon or open an issue at curl’s GitHub.
Thanks for your time.