HAProxy 2.7
HAProxy is great. I use the Data Plane API with our releases to drop servers from the group for zero-downtime releases. Easy curl command, which they provide, and I switched it to PowerShell too. I have another script that downloads our SSL/TLS certs and reloads HAProxy using their no-downtime reload option. I ran a load test earlier this year using loader[.]io and managed 57,782 requests in 20 seconds using a single HAProxy server (TLS) and two .NET Core web servers. Around 300mb/sec of data after brotli compression. The bottleneck was clearly the two web servers and we could have scaled further, but it was on a 1gb link and testing 10gb links over the internet is not something I'm prepared to do. HAProxy was basically idle the whole time.
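In case it helps anyone, the drain step is roughly the following over the Runtime API admin socket; the backend/server names and socket path here are just placeholders:

    # Put app1 into maintenance so haproxy stops sending it traffic
    echo "set server web/app1 state maint" | socat stdio /var/run/haproxy.sock

    # ... deploy the new release on app1 and wait for it to come up ...

    # Re-enable the server; health checks take over from here
    echo "set server web/app1 state ready" | socat stdio /var/run/haproxy.sock

The workflow above does the equivalent through the Data Plane API over HTTP instead; the effect is the same.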
> However, due to the fast, atomic operations involved at many places, HAProxy was previously limited to 64 threads, and therefore 64 CPU cores, on 64-bit machines. This limit is now raised to 4096 threads by the introduction of thread groups.
This is such a welcome change! I've been in more than one situation where I rediscovered this limitation while trying to scale things up and stay monolithic (for cost and performance reasons), only to find that HAProxy couldn't utilize the newly upgraded machine. This will make a huge difference!
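For reference, a rough sketch of what this looks like in the global section of a 2.7 config; the core count and group split below are made up, so check the 2.7 docs for the exact knobs on your build:

    global
        # Assumed example: a 128-core box split into two thread groups,
        # since a single group is still limited to 64 threads
        nbthread 128
        thread-groups 2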
Less than you think. You only need a few cores to saturate tens of gigabits of traffic; HAProxy is already very performant.
It's future-proofing basically.
The page is ancient (https://www.haproxy.org/10g.html) but they were doing 10 Gbit/s on a Core2Duo in '09 (sans SSL, but back then AES acceleration wasn't very common either).
Indeed, we did 100 Gbps about a year ago on an 8-core machine.
Usually the only reason to use many cores is SSL, but since OpenSSL 3.0 totally collapses under load, even then you're forced to significantly lower the number of threads (or to downgrade to 1.1.1, which just about any high-traffic site does).
Do you know what's behind the performance degradation of OpenSSL 3.0? Has the problem been documented anywhere?
Horrible locking. 95% CPU spent in spinlocks. We're still doing measurements that we'll report with all data shortly. Anyway many of them were already collected by the project; there are so many that they created a meta-issue to link to them: https://github.com/openssl/openssl/issues/17627#issuecomment...
3.1-dev is slightly less bad but still far behind 1.1.1. They made it too dynamic, and certain symbols that were constants or macros have become functions running over lists under a lock. We noticed the worst degradation in client mode, where performance was divided by 200 for 48 threads, making it literally unusable.
Maybe I'm biased by what I was doing with it and when, but HAProxy seemed better suited to processes than threads; there's not much state that's useful to share, and sharing the single process list of FDs is a scalability challenge (and presents opportunities for bugs around reuse; I fixed one of those).
Also, I found it best to avoid cross-CPU communication entirely, making sure everything happens on the CPU core that handles the RX queue for the incoming connection. My NICs only did 16 queues, so more cores than that weren't useful for me. Do they make NICs with 128 queues now? (Selecting outgoing ports so traffic to/from the origin stays on the same CPU is tricky, but it also solves the problem of making sure there's no port contention between the processes.)
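In haproxy config terms, that pinning is roughly what cpu-map gives you; a hedged sketch assuming a 16-queue NIC whose queue IRQs are already affined to cores 0-15:

    global
        nbthread 16
        # Pin threads 1..16 to cores 0..15 so each thread sits on the core
        # that owns one NIC RX queue (the queue-to-core mapping itself is
        # done outside haproxy, e.g. with ethtool / IRQ affinity scripts)
        cpu-map auto:1/1-16 0-15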
Actually that's what I'd been saying for years, until users complained that processes had different views of server states in fast-changing environments, that it was not possible to perform dynamic updates at runtime because all of them had to be synchronized, that stick-tables lived on only one process, that they had to collect their stats and sum them externally, that the LB algorithms were uneven with many processes because roundrobin starts from the same server on every process and leastconn didn't know the other processes' load, that the maxconn setting was unmanageable because it couldn't account for connections from other processes, that rate counters weren't shared, that maps and ACLs had to be loaded and managed in every process, eating a lot of memory or not being updatable, that memory usage was going through the roof because unused memory from one process was not usable by another one, etc.

The list was so long that it became clear at some point that we had to move to threads to resolve all of this at once. And we're doing really well. I still hate sharing data, so we're extremely careful to share very little. Look at the pools, the thread_info struct and the thread_group struct to see how much is local to the thread or to the group. Even the idle server connections are thread-local but can be stolen if needed (i.e. almost never any sharing unless really needed). So we've kept the old practices of processes with the facilities offered by threads, and that's what brought us this great scalability.
Yeah, that's fair. That's a big list of features I didn't need ;) and I was working on it when threads were new.
And admittedly threads didn't work great 5 years ago when we put them into 1.8. Fortunately 1.8 is dead soon, one less thing to worry about :-)
>Also, I found it best to work with everything avoiding cross CPU communication; making sure everything happens on the CPU core that handles the RX queue for the incoming connection.
Yes, this is absolutely critical. Atomic ops (including taking a lock) when you have a large number of cores just completely kills performance.
>Do they make nics with 128 queues now?
Yep, the Intel E810 100gbit controller supports 256 queue pairs for just one example.
Hope you can process that traffic pretty fast, the amount of memory dedicated to packet buffers with that many queues is making my head spin a bit.
Oh nice, in that case, time for 256 core haproxy box ;)
My use case was just tcp proxy, inspecting the first few bytes to determine destination (http vs https vs proprietary) and add proxy protocol headers to the origin. Setup is somewhat expensive (tcp table manipulation isn't cheap), but data on established connections is simple and easy.
IMO for super high packet rate cases, DPDK would make more sense. For proxy protocol injection, this would be fine since you don't really care about a lot of the TCP stack, just keeping track of new connections, and doing the sequence number rewriting to account for the injected bytes. All of the congestion control etc can be handled by the endpoints.
Of course if HAProxy performs well enough, then that's still a million times easier.
Yeah, I had lots of fun ideas, but I was under a time crunch. HAProxy basically just worked to get started and took well to optimizing, and pretty soon I had more capacity than would ever be needed, so I never had a chance to rebuild it with exotic technology ;)
For pure TCP you rarely need threads, indeed! You may benefit from TCP splicing, though, if you're dealing with high bandwidth.
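For anyone curious, enabling it is just an option on the proxy; a minimal sketch for a plain TCP listener (names and addresses are placeholders):

    listen tcp-fwd
        mode tcp
        bind :8443
        # Let the kernel splice data between the two sockets without
        # copying through userspace (Linux-only; haproxy falls back to
        # regular copies when splicing isn't possible)
        option splice-auto
        server origin1 10.0.0.10:8443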
Yeah, TCP splicing would have been neat. The use case wasn't high bandwidth, and it was transitional: client software has IP addresses to try when DNS fails, but we moved to new hosting related to an acquisition, so HAProxy stayed at the old host until all clients in the field had updated.
The bottleneck was connection establishment in the kernel, and as traffic decreased it became harder to run meaningful tests without great risk (sending all the traffic to one server to see if it can handle it is fun, but unsafe). We also had to keep a fair number of servers to hold on to the IPs from the hosting provider, so we had way more capacity than needed and optimizing further wasn't a good use of time :(
Indeed, that's fun at the beginning, but past a certain point, when you'd like to stop the servers taking 3 connections a day but can't because those are the 3 most business-critical connections, it starts to become much less fun.
Connection establishment in the kernel was an issue before kernel 3.8, I think, though I'm not sure. The problem is finding a spare port when you use a lot of concurrent connections; at some point it costs a lot. The "source" directive in haproxy was made to overcome this (you can define a per-server range so that the kernel doesn't have to scan its lists). But nowadays more efficient mechanisms have been implemented in the kernel, which we support in haproxy (using IP_BIND_ADDRESS_NO_PORT), and we haven't witnessed these painful moments for a while now.
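The per-server range mentioned above looks roughly like this (addresses and range are placeholders):

    backend app
        # Bind outgoing connections to this address and explicit port range,
        # so the kernel doesn't have to hunt for a free ephemeral port itself
        source 10.0.0.2:10000-60000
        server s1 10.0.1.10:80 check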
I was running on FreeBSD, and using port selection inside HAProxy to avoid the kernel selecting it, but still had contention on locks when inserting connections into the table. I can't remember if that was FreeBSD 11 or 12, but the next version made improvements which looked nice, but I wasn't able to test.
Somewhere my (not very great, but working) patch set is floating around, where I added RSS-aware port selection to HAProxy, so client and origin connections would both hash to the same NIC RX queue, and HAProxy never had to deal with cross-CPU communication for reading sockets.
This interests me. I've long been wondering how we could do that without knowing the NIC's hash and/or without having to brute-force the ports using the NIC's hash to reverse-map them. But maybe there are kernel facilities I'm not aware of to do that. If you happen to stumble upon it and share it to the mailing list, that would be great!
https://www.mail-archive.com/haproxy@formilux.org/msg34548.h...
I haven't looked at this code in a long time! I haven't gotten to a computer and it's hard to read this on a phone, so... And I've since left the place where I wrote it, so I can't confirm whether there were any substantial changes after these. My fuzzy memory says this may be all of the HAProxy changes; after that I focused on kernel-level changes.
Many thanks! Too bad this thread wasn't noticed and was left unanswered. I'll have a look. Thanks again!
Professionally, I've recently used HAProxy with the Data Plane API to provide programmatically controlled layer 7 load balancing.
Personally, I also use HAProxy as a load balancer to terminate HTTPS and proxy traffic to internal services.
Its features, such as logging and debugging, are far superior to nginx's. While the HAProxy config can be a little cryptic at times, you can do a lot more with it too.
We'd love to hear about different use cases at our next HAProxyConf, from both community and enterprise environments. We just wrapped up HAProxyConf 2022 in Paris (https://www.haproxy.com/blog/haproxyconf-2022-recap/) and are starting to plan the next one. The call for papers hasn't been announced yet, but feel free to shoot us an email at submission@haproxy.com.
Congratulations to the team on this release! I switched to haproxy for load balancing our production traffic 5-6 years ago and it's proven incredibly reliable. Creating a ticket to update our dev & staging environments to 2.7 right now.
I've spent the last decade on Nginx, and then of course spent lots of money on Nginx Plus for its upstream health checking. I knew HAProxy was a possible solution but was too happy with Nginx. The slow evolution of open-source Nginx has been frustrating to watch, and the lack of basic features like upstream health checking in the open-source project is now ridiculous considering the excellent competition.
Is there a good alternative to OpenResty (aka nginx w/ Lua)?
I too have fought nginx on upstream health checking and have avoided the cost of Nginx Plus. I'd love a good alternative, but we have a number of lua plugins we depend on for rate limiting, metrics, auth, etc.
Well, HAProxy Community can do all this :)
I'm using Caddy because it takes care of Let's Encrypt TLS certs. I'd use HAProxy but I don't know if it can do this without additional scripts.
I also use Caddy for my personal sites, and at work I have a site that is a bunch of TLS redirectors for domains the company owns, redirecting to other domains. It works spectacularly in those use cases. At the time I set it up it wasn't possible to do that with an S3 endpoint, but now it is.
I do have an haproxy setup that does the right thing with Let's Encrypt, but it's just a path-based forwarder that works with an acme script on the system, not directly built in.
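For anyone wanting the same thing, the usual pattern is to route the ACME HTTP-01 challenge path to whatever the acme script listens on; a minimal sketch where the port and backend names are assumptions:

    frontend http-in
        bind :80
        # Send Let's Encrypt HTTP-01 challenges to the local acme client,
        # everything else to the normal backend
        acl is_acme path_beg /.well-known/acme-challenge/
        use_backend acme if is_acme
        default_backend web

    backend acme
        # assumed: acme.sh / certbot in standalone mode on 127.0.0.1:8888
        server acme_local 127.0.0.1:8888

    backend web
        server app1 192.168.0.10:8080 check

A small script then typically concatenates the key and cert into a single .pem and triggers haproxy's no-downtime reload.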
Hopefully it comes soon: https://github.com/haproxy/haproxy/issues/1864
ACME support needs to be integrated; it has become a mandatory feature nowadays…
Stay tuned (tip: dataplaneapi)
What's the current (2022) view on HAProxy vs. Nginx?
I know the sentiment has changed over the years. Curious what's the opinion today on which is the best tool for a new project.
(Yes, I realize they do different things but there's also considerable overlap as well)
I've been using Varnish instead; while "caching proxy" was/is the main tagline of Varnish, there's significant overlap with HAProxy and Nginx as well, and Varnish works well even if you just want to do generic proxying/load balancing without any caching.
I really like Varnish's VCL configuration; something like HAProxy's more declarative configuration is easier to get started with, but in more advanced configurations the HAProxy declarative configuration becomes "declarative in name only" where you're embedding logic in the "declarative" configuration. I found it pretty awkward and VCL just becomes so much easier because the logic is expressed "directly" rather than through a declarative configuration. Varnish also comes with some tools to inspect/log requests, which can be handy for debugging Weird Issues™ at times.
Nginx kind of sits in between VCL's full-on DSL and HAProxy's configuration.
I don't know about the exact performance comparisons; Varnish has always been "fast enough" for me. I have a $5/month Linode VPS running Hitch and Varnish and thus far that's been enough for my ~1M requests/day with plenty of room to spare.
I agree with you. Varnish is made to be a website cache that's a real part of the application. It needs to be extremely flexible and programmable. HAProxy needs less flexibility on the caching side but more on the conditions to switch a server on/off, regulate traffic or deal with DDoS.

Nowadays we see a lot of combinations of the two, initially to get SSL offloading at the haproxy layer, but also for cache deduplication: when you configure haproxy to perform load-bounded consistent hashing to your Varnish caches, you get a very efficient cache layer that avoids duplication, unless the load starts to grow for some objects, in which case these objects will be fetched from adjacent caches. That delivers the best of both: unified storage, with replication of highly loaded objects only.

The cache in haproxy was initially called the "favicon cache" because we didn't want it to become a complex cache. And we've succeeded at this: nobody ever manages nor flushes that cache. It's strict, and in case of doubt it will refrain from caching. You can use it to avoid passing certain tiny requests to Varnish when the cost of passing them over the network outweighs their benefit (e.g. the favicon or the tiny gif bullets that compose pages). But Varnish definitely is the de-facto standard cache. The 3 components you name are so commonly found together that some people tend to confuse them (due to the small overlap), but they all excel in their respective areas.
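For the curious, the load-bounded consistent hashing part is only a few lines of config; a hedged sketch with made-up names and thresholds:

    backend varnish_caches
        # Hash the URI so a given object always lands on the same cache node
        balance uri
        hash-type consistent
        # Bound the load: when a node exceeds ~125% of the average load,
        # requests for it spill over to the next node on the ring
        hash-balance-factor 125
        server cache1 10.0.2.11:6081 check
        server cache2 10.0.2.12:6081 check
        server cache3 10.0.2.13:6081 check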
My point was mainly that Varnish can often serve as a complete replacement for HAProxy, rather than a complement to it. No doubt there are features/things in HAProxy that are not easily expressed in Varnish (although in principle you can do pretty much anything in Varnish since you can extend it), but for many common use cases Varnish will work as a fine load balancer, eliminating the need for HAProxy altogether. The only additional thing you need is something like Hitch or Pound to do TLS.
OK, I see. Yes, for some use cases that's possible. But that seriously limits your load balancing capabilities. And having to deploy Hitch or Pound for TLS still requires an extra component that's not designed to deal with the traffic spikes, denial of service, etc. that are often encountered in environments where you deploy caches.
I don't know HAProxy that well yet, but I think that Nginx "the company" has degraded severely since being purchased by F5.
I used to use Nginx Plus, but after moving to a new company and trying to purchase it again, not only had the price quadrupled, I also wasn't able to actually buy a license without four meetings, a shitload of mails and a lot of frustration.
In the end I gave up and kept the community edition while I explore the options.
Nginx has clearly moved from the cheap and performant segment to the expensive and cumbersome one.
HAProxy has always been the better proxy (front the traffic, distribute it via rules to other stuff, mangle some headers and such to fix things the app refuses to, etc.), but it lacked a lot in terms of programmability; with the addition of Lua and SPOE [1] it gained a lot on that front.
Then again, if you already use nginx (say for serving static files), using it to also front your app is the easiest solution and you only need to start wondering about alternatives once you hit a lot of traffic. Any benefits or drawbacks would only show once you start shoving in massive amounts of traffic, and your app will be bottleneck way before that.
* [1] https://www.haproxy.com/blog/extending-haproxy-with-the-stre...
> Then again, if you already use nginx (say for serving static files), using it to also front your app is the easiest solution and you only need to start wondering about alternatives once you hit a lot of traffic. Any benefits or drawbacks would only show once you start shoving in massive amounts of traffic, and your app will be bottleneck way before that.
This. Unless you're serving to tens of millions of users just use the thing that you're comfortable with.
Second release of HAProxy with HTTP/3 support. Nginx doesn't support it.
HAProxy has supported 103 Early Hints since 2018. Nginx doesn't support it.
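For reference, both features are a couple of lines of config on a QUIC-enabled build; a hedged sketch where the cert path and asset names are placeholders:

    frontend https-in
        bind :443 ssl crt /etc/haproxy/certs/site.pem alpn h2,http/1.1
        # QUIC/HTTP3 listener on the same port
        bind quic4@:443 ssl crt /etc/haproxy/certs/site.pem alpn h3
        # Advertise h3 to clients that arrived over TCP
        http-response set-header alt-svc "h3=\":443\"; ma=3600"
        # 103 Early Hints: hint a stylesheet before the real response arrives
        http-request early-hint Link "</style.css>; rel=preload; as=style"
        default_backend web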
I would say HAProxy is a better load balancer than a web server. I like to have a few NGINX instances as the actual web servers sitting behind an HAProxy instance that spreads load across them.
I definitely agree. If you need the best load balancer, take haproxy. If you need the best web server, take nginx. The two combined work amazingly well together, that's why they're very often found together :-)
HAProxy is not a web server in the first place.
The only things it serves, from memory only (a reload is needed to change them), are error pages.
Or the cache :-) (also "return" directives but that doesn't count).
haproxy has a lot of features in the free edition that nginx only offers in its enterprise edition, like balancing load by number of open sessions (sketched below), or via OpenResty, like Lua scripting.
These days I always go for haproxy for reverse proxying.
"paid edition" AFAIK is mostly stuff around haproxy, management and analytics mostly.
OpenResty is only available via Nginx Enterprise?
Definitely not, Kong (for example) is "OpenResty plus a management plane" and they're Apache 2: https://github.com/kong/kong#license
OpenResty itself is BSD https://github.com/openresty/openresty#copyright--license
I mean that some features you get with haproxy are only available in nginx with either the enterprise edition or OpenResty.
We've used HAProxy for load balancing and High availability systems with failover for postgres and redis.
Then we've used Nginx behind that as the Reverse Proxy, static media serving, and basic static caching.
Basically HAProxy seems better for, well... high availability and as a load-balancing proxy. And Nginx seems to be more suited for the HTTP(S)-type pipeline.
If all you need is basic load balancing, or some types of HA, then nginx will work fine. Last I checked (a few years ago), nginx had a lot more HTTP routing tools and HTTP-focused settings, headers, static file serving, etc. But HAProxy might have improved on those in the last 3 years.
I just switched from nginx to haproxy. Nginx just serves http now. Makes things easier for me.
I like having haproxy in front of nginx (and other services) as a load balancer.
It's fast, easy to tune/configure and cheap to run.
This is very biased because it's published by NGINX[1]:
"NGINX suffers virtually no latency at any percentile. The highest latency that any significant number of users might experience (at the 99.9999th percentile) is roughly 8ms.
What do these results tell us about user experience? As mentioned in the introduction, the metric that really matters is response time from the end‑user perspective, and not the service time of the system under test."
[1] https://www.nginx.com/blog/nginx-and-haproxy-testing-user-ex...
Hehe, read it through to the end, including the comments, especially the collapsed graphs!
There were big mistakes in this test, failing on a large number of settings, and even the NGINX tests were not as good as they ought to have been in some respects. This led Amir, Libby and me to join efforts on documenting a better methodology for benchmarks, which we published under the Data Plane Benchmark project at https://github.com/dpbench/dpbench/ and later used for the new test in this article: https://www.haproxy.com/blog/haproxy-forwards-over-2-million...
This was a great and productive cooperation between engineers, despite working on different (and sometimes competing) projects, resulting in more reliable methods for everyone in the end.
Hey, Willy, I just wanted to take a moment of my day to thank you for haproxy. It is among my most favorite tools in my tool belt!
Thank you, you're welcome!
At HAProxyConf last month, a lot of the community members presenting on stage explained why they chose HAProxy over alternatives like Nginx. Benchmark performance was a big part of it, but so was the community. LinkedIn said they compared HAProxy with Nginx, Envoy, and Zuul, and chose HAProxy because of the open-source model, the roadmap and release cycle, and all the features available in the non-Enterprise version. You can see some of their slides and performance benchmark results in this conference recap: https://www.haproxy.com/blog/haproxyconf-2022-recap/
Disclosure: I am part of the HAProxy team.
haproxy is awesome. I have a Raspberry Pi where it runs dedicated and it never causes me issues. The best software is the kind you can forget about.
That's a good example. I was at HAProxyConf in November, and Willy showed HAProxy running on a solar-powered device to demonstrate its efficiency. Disclosure: I'm part of the HAProxy team.
For those interested, the device is a breadbee. (https://github.com/breadbee/breadbee).
If we forget about it, it might be because it gives us so little issue. Software you don't think about is a treat :)
I agree. I'm used to saying this to the rest of the development team: we are very critical about our own code because we receive all the bug reports, which tends to give us a feeling that there's always something to fix. But seeing it another way, with hundreds of thousands of deployments, having 2-3 issues a week is ridiculously low and means that the vast majority of deployments will never face a bug in their life.
It still poses a problem for us, which is that users don't upgrade. For example, 1.6 is still routinely found in production despite having been unsupported for 2 years, sometimes with 3 or 4 years of uptime, because users forget about it or just don't want to risk an upgrade for no perceived benefit. I can't blame them honestly, as long as they upgrade before reporting problems or asking for help!
I've used HAProxy in several roles over the years; it's been pretty much bullet-proof everywhere.
I don't know if my testing was right. Maybe someone from the HAProxy team is reading this.
Say the retry count is 3, you have 5 servers in the backend plus 1 backup server, and health checks are enabled.
If all servers are down, requests will be forwarded to the backup server.
But what if all the servers are down and the health checks haven't caught up yet (extreme timing)?
The request will be retried 3 times and the servers will then be marked down. Since all 3 attempts failed, HAProxy will return a 503.
I think the request should go to the backup server: even though the retry limit was 3, HAProxy was unable to connect to any of the servers, and they really were down.
You can think of it as having layers of redundancy.
* Retries are one layer. By default set to 3 retries. HAProxy will retry the failed connection or request with the same server.
* "option redispatch" is another layer. If HAProxy can't connect to a server that is reporting as healthy, it will send the request to a different server.
* health checks are a layer. HAProxy removes unresponsive servers from load balancing.
* Backup servers are another layer. If all servers fail their health checks and are down, then backup servers come online to service requests.
All these things can be enabled in combination (see the sketch after this reply), reducing the chance of a client getting a server error.
To answer your question, HAProxy will not connect with a server that is down (failed all its health checks). It will not retry with it either.
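A hedged sketch of those layers combined in one backend (names, addresses and check timings are placeholders):

    backend app
        # Layer 1: retry a failed connection up to 3 times
        retries 3
        # Layer 2: if the chosen server can't be reached, redispatch to another
        option redispatch
        # Layer 3: health checks mark unresponsive servers as down
        default-server check inter 2s fall 3 rise 2
        server s1 10.0.0.11:80
        server s2 10.0.0.12:80
        server s3 10.0.0.13:80
        server s4 10.0.0.14:80
        server s5 10.0.0.15:80
        # Layer 4: only used once all of the above are marked down
        server spare 10.0.0.20:80 backup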
One approach that some users who want this mechanism use is the "on-error sudden-death" mechanism: the server responses are inspected, and upon error (from selectable types), the server can be marked dead. If all of them are dead the backup server will come into the dance and the last attempt will end up on it.
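In config form that's a per-server (or default-server) setting; a minimal hedged sketch with placeholder names:

    backend app
        # Observe live layer7 responses; after error-limit errors, sudden-death
        # treats the server as one failed health check away from being down,
        # so the next failure takes it out quickly
        default-server check observe layer7 error-limit 10 on-error sudden-death
        server s1 10.0.0.11:80
        server s2 10.0.0.12:80
        server spare 10.0.0.20:80 backup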
This looks like a fantastic update for bigger servers.
I've been running HAProxy on my home server for almost 2 years. I don't use the load-balancing features though.
I do the same for my personal machine. It's acting as the router/gateway in front of several Docker containers (both HTTP and also other protocols, such as SMTP or IRC).
Disclosure: I'm a community contributor to HAProxy.
I need to start using a reverse proxy and such things. Did you decide on HAProxy over Nginx Proxy Manager or Caddy, or are you using HAProxy to solve other challenges?
I use it for a few things other than reverse proxying. HAProxy seemed like a lightweight solution to what I was looking for: parse an incoming request and hand it to something else as fast as possible, without doing any web hosting itself.
Are you only using it for reverse proxying/rate limiting?
I primarily use it to terminate SSL, route based on hostname (subdomain), and cache.
I use mine to also terminate TLS and proxy mosquitto (MQTT) traffic. mosquitto's TLS configuration is a pain, and it's so much easier to have it all in one place.
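That setup is only a handful of lines; a rough sketch of terminating TLS for mosquitto in pure TCP mode (paths and ports are assumptions):

    frontend mqtts
        mode tcp
        # Terminate TLS here so mosquitto itself can stay plaintext-only
        bind :8883 ssl crt /etc/haproxy/certs/mqtt.pem
        default_backend mosquitto

    backend mosquitto
        mode tcp
        server broker 127.0.0.1:1883 check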
Love HAProxy, the most reliable service in my homelab :)
just curious, what do you use it for?
Sorry for the late response. I use it to load balance http traffic to my kubernetes cluster. Also for http -> https redirect
With the recent discussions about memory-safe languages, HAProxy is, perhaps surprisingly, still written in C [0].
Nothing surprising here; it's an old project, and a very performance-conscious one.
Written in C and probably one of the most rock-solid pieces of user level software anyone could imagine. I doubt, connection-per-connection, any other piece of software is more battle hardened than HAProxy.
The vast majority of the software running on your machine is still written in C.
Why is that surprising.
You'd be surprised what's still written in C.