Faster MySQL with HTTP/3 (planetscale.com)
Love love love PlanetScale.
When I found it (you can thank Theo), I was shocked this isn't what AWS' serverless DB offering already was.
I agree with what the author mentioned in another comment: not dropping performance for non-serverless use cases is a decided win. I deeply appreciate the work being done to enable serverless applications, so thank you for the work and thank you for sharing your findings, OP.
I've barely heard of PlanetScale, but from the looks of it, it's interesting. However, is it compatible with "normal" MySQL, meaning that I can just change my connection information from `currentMySQLhost` to `planetscaleMySQLhost` and it just works (after creating the database, tables, etc., of course)?
Yup! There are a few caveats, but for the most part it'd be compatible.
That is pretty great, especially with such a generous free plan for testing.
However, no support for FOREIGN KEYs is a bit of a bummer. That said, they explain it very well.[0]
Thanks for the reply! Will definitely give PlanetScale a try over the weekend!
[0] https://planetscale.com/docs/learn/operating-without-foreign...
Much love. <3
Awesome work! I'm loving PlanetScale for lighting a new fire under the butt of MySQL again. Have been demoing Vitess (PlanetScale's underlying software) and have only positive things to say so far - the Kubernetes operator is wonderful.
I'm also curious about the comparison to the MySQL Classic protocol - it would be interesting to have an "as-close-as-possible" benchmark between Aurora MySQL "Serverless V2" and PlanetScale. Even if it was as naive as "Given $100 of credits, how many reads can you do, and at what average latency?"
Would be nice to see a comparison with MySQL's X Protocol (protobuf based) as well as the "Classic" protocol. I don't think I've seen any MySQL compatible databases implement this protocol yet, so that's probably why it wasn't included.
https://dev.mysql.com/doc/dev/mysql-server/latest/page_mysql...
I have vaguely looked into it, but since we don't support it at PlanetScale and it's unlikely we will, it didn't seem worth it. It'd be hard to construct a practical test.
Similarly, since adoption is so low, it didn't make a lot of sense to double down and support it when we could do what works for us.
I'm a huge fan of PlanetScale. I've been using it for a few projects recently and can't wait to try out the "Boost" feature when it becomes available. I think it will remove one more thing from my stack (caching/Redis)!
The developer experience with PlanetScale has been my favorite so far, I use it with a few Next.js apps and the "scaling" part has been the easiest as I haven't had to think about a burst of traffic b/c PlanetScale handles it without me lifting a finger.
I can see how you might want HTTP/3 across data centers, but you may want to stick to HTTP/2 for intra-DC workloads.
There's a ton of work to optimize TCP including hardware offloads that help push higher throughput. Basically we're talking library + kernel + hardware changes. It might be possible to get some of these into QUIC, but since QUIC is most compelling for WAN traffic, there's probably not much incentive for that.
Mostly agree here. In practice, I don't think you'd want to default to using HTTP/3. As a technology, it's still just very immature for the reasons you mentioned.
In our case though, lots of our customers and lots of use cases do communicate over a WAN, potentially across large geographic distances. Having this as an option is super interesting, and I want to see what we can do with it in the future.
What's so great about PlanetScale?
I feel like I live under a rock because I just don't get what's so great.
They're a hosted version of Vitess, which is sharded MySQL (and more), which is generally a pain in the butt to implement / gets built in-house at major tech companies over and over.
PlanetScale is doing for Vitess what Fastly does for Varnish, if that makes sense. Or maybe, what Datadog does for statsd? It's a hosted platform around an awesome and complex-to-maintain bit of open source software.
One of the best ways I've managed latency with MySQL is basically this:
1) Use persistent connections: let the OS handle them and tweak it to allow them (on both the connecting server and the MySQL server). And never close the connection on the application side. (This could lead to potential deadlocks, but there are ways around it, like closing bad connections to clear thread info on MySQL.)
2) Run the whole thing in a transaction: simply begin a transaction, or autocommit if allowed (same thing).
That way, when you are done rendering the content, you flush it and send the signal to nginx or Apache saying it's done (like PHP's fastcgi_finish_request when working with FPM), and then run your commit. Obviously this is only for when you can safely disregard failed inserts.
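For the PHP-FPM case specifically, a rough sketch of what I mean (host, credentials, and the query are placeholders, error handling omitted):

```php
<?php
// Persistent connection: the "p:" host prefix tells mysqli to reuse a
// connection left over from a previous request instead of opening a new one.
$db = new mysqli('p:db.example.com', 'app_user', 'secret', 'app_db');

// One transaction for this request's writes.
$db->begin_transaction();
$db->query("INSERT INTO page_views (viewed_at) VALUES (NOW())");

echo "...rendered page...";

// Tell PHP-FPM the response is complete so nginx/Apache can return it to the
// client right away...
fastcgi_finish_request();

// ...and only then pay for the commit, after the user already has the page.
$db->commit();
```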
> 1) Use persistent connections: let the OS handle them and tweak it to allow them (on both the connecting server and the MySQL server). And never close the connection on the application side. (This could lead to potential deadlocks, but there are ways around it, like closing bad connections to clear thread info on MySQL.)
This is definitely ideal, but one thing that you can't entirely control is the server side or what's in between. Sometimes your connections get interrupted, and it's not possible to maintain a connection forever. That said, yes, this is the ideal thing to do with a connection pooler.
> 2) Run the whole thing in a transaction: simply begin a transaction, or autocommit if allowed (same thing).
This shouldn't really help with latency. Being in a transaction doesn't reduce latency. If we're being pedantic, it would likely increase latency due to having to execute a BEGIN and COMMIT query, which is typically two more round trips, one per query.
I think what you're getting at is something like pipelining, where you can send multiple queries in one request, and get multiple results back. This is technically supported over the mysql protocol, but isn't very widely adopted in practice.
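For what it's worth, something in that direction is reachable from PHP via mysqli's multi_query(), though that's statement batching rather than true protocol pipelining. A hedged sketch (host, credentials, and queries are placeholders):

```php
<?php
// Send two statements in one round trip and read both result sets back.
$db = new mysqli('db.example.com', 'app_user', 'secret', 'app_db');

$db->multi_query("SELECT 1; SELECT NOW()");

do {
    if ($result = $db->store_result()) {
        while ($row = $result->fetch_row()) {
            print_r($row);
        }
        $result->free();
    }
} while ($db->more_results() && $db->next_result());
```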
> but one thing that you can't entirely control is the server side or what's in between.
Why?
If you're not running stuff on other people's computers you're very much in control.
What am I missing?
> If you're not running stuff on other people's computers you're very much in control.
We're a service provider. As a client and customer, you connect to us as a third party service. You don't control our uptime or connectivity. Nor do you control whatever network hops may be between.
This is optimizing edge cases. The handshake is done once per execution, and it's constant time, so "connect + select 1" should be measured as "connect", then "select 1". Other than that, 5ms latency is too high anyway for modules doing hundreds of small requests, so you'd better have read replicas real close to your code and shard your masters.
Maybe in your case, but not always. I do explicitly call out that I intentionally wanted to test a "cold start". This is extremely relevant for short-lived applications and processes: think PHP, serverless environments, etc.
The other tests are already measuring a warmed-up connection.
There's also a reason why I intentionally coupled "connect + select 1" as the test: I wanted to make it as close a comparison as possible. If it were simply a "connect", our HTTP API would look even more favorable, since connecting over HTTP doesn't do authentication or anything else the way the MySQL protocol does.
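To be clear about what's being timed, the cold-start number covers both steps as one unit, roughly like this. This is just the shape of the measurement, not the actual benchmark code; host and credentials are placeholders:

```php
<?php
// Time a cold start: open a fresh connection and run one trivial query,
// measured together, which is what a short-lived PHP or serverless process
// pays on every invocation.
$start = hrtime(true);

$db = new mysqli('db.example.com', 'app_user', 'secret', 'app_db');
$db->query('SELECT 1');

printf("connect + select 1: %.1f ms\n", (hrtime(true) - $start) / 1e6);
```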
You call out PHP in both the blog post and this comment, which is interesting, because persistent pooling has been a part of PHP (and advised) for years.
Which leaves serverless and scripts (your other example from the blog post). Which, let’s be honest, are both edge cases at this point in time. Maybe that’ll change, but today it’s true.
Twenty year SRE here backing up the person you’re dismissing: you’re optimizing an edge case. Literally step one of operationalizing every system in existence is burying your DB behind a pooler. 100ms off a connect call in a script is not useful. The serverless improvement has some potential, but one would be forgiven for asking why you’d use an environment which doesn’t let you speak network protocols you’d like to speak.
I'd be curious to hear about experiences with persistent pooling usage within PHP, since I don't feel I've heard it being advised much myself over the years, but that could perhaps be due to old misperceptions about it.
I know there is the "Persistent Database Connections" section of the PHP manual and the mysqli extension within PHP supports connection pooling / persistent connections, but in my own experiences I've rarely seen them utilized, especially by the bigger open source projects out there such as WordPress, which has an 8 year old enhancement topic on the subject: https://core.trac.wordpress.org/ticket/31018. Putting your database behind a pooler, like ProxySQL let's say, is another option as the level of sophistication for a company/application increases, but most typical PHP setups I've used don't have that immediately available.
I've generally been under the impression that most projects/applications don't use the built-in pooling features for some of the reasons discussed in the link above, leading to those applications being more impacted by lengthier connection times, since a new connection is created at the beginning of each request and then closed at the end of it.
Now I'm inclined to experiment a bit with the built-in mysqli persistent connections, though, since it seems a worthwhile feature for developers to explore more if it lessens the connection-time impact of each PHP request, particularly for databases that are farther away and require secure connections (a quick sketch of turning it on is below).
Shaving off 100ms for a connection would be significant for most PHP users if they are currently having to open fresh connections on each request, especially if they were previously used to connection times of < 1ms when connecting to a local MySQL database.
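For reference, turning it on looks like this (hosts and credentials are placeholders); whether it actually helps will depend on the FPM setup and the caveats in that WordPress ticket:

```php
<?php
// mysqli: prefix the hostname with "p:" to request a persistent connection.
$mysqli = new mysqli('p:db.example.com', 'app_user', 'secret', 'app_db');

// PDO: request a persistent connection via an attribute instead.
$pdo = new PDO('mysql:host=db.example.com;dbname=app_db', 'app_user', 'secret', [
    PDO::ATTR_PERSISTENT => true,
]);
```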
> because persistent pooling has been a part of PHP (and advised) for years.
It may be advised, but I can assure you that it's not very common! I would guess that the vast, vast majority of PHP applications are _not_ pooling their connections. Especially when it comes to PHP hosted on Lambda, which is surprisingly a non-trivial number of applications at this point.
I'm not a PHP expert, so I don't know the landscape there fully. I do know our customer complaints and can say people care about cold start times in the PHP space and others.
So while it may be an edge case for you, it's not for others. It also doesn't discredit any of the other testing that doesn't focus on cold starts.
Edit since you edited yours after I posted:
I'm not going to argue the merits of what platforms people choose and it's not really our position as PlanetScale to do that. We serve our customers.
No, it objectively remains an edge case, no matter whether it is for me or you or the person you’re replying to or whatever. Your customers doing inefficient things does not change the fundamentals of the situation. And the other testing that didn’t focus on cold starts looked like basically a non-result looking at the graphs, so I didn’t bother talking about it. I can, though, with the understanding that it will strengthen my point.
> fundamentals of the situation
On what grounds? Why are you able to unilaterally decide what the situation is?
Sorry about that.
TCP connect is not constant time. It stands a surprisingly large chance of failing and waiting TWO SECONDS to start again.
Most likely the parallel requests are done using a single connection.
So of course HTTP/2 will outperform, that's what it's designed to do.
Now try again, but use one connection per thread, and connect it before you start benchmarking, i.e. use it the way it's meant to be used.
The tests cover both cases if you read.
But either way, yes, that's fundamentally a benefit of being able to use HTTP. We can multiplex multiple sessions over one underlying connection.
Multiplexing is already done for you by your kernel; it's called having multiple TCP sessions.
The whole premise of HTTP/[23] is to do the same thing as you do with N TCP sessions, but paying for the session establishment latency only once instead of N times.
And most applications couldn't care less about that latency, because you only do it once.
My apologies for not meeting your bar. I guess you missed the parts where it's faster in a lot of other cases too, and not slower in any.
To me, the fact that it's not slower at all is the big win. I didn't anticipate that the results of this were going to say "this is 5x better". The stereotype is that if it's over HTTP, it must be slower.
And by every measure, it's not slower. Even the cases that may be edge cases to you, or where you don't care about the extra latency, are still improved. Why would you not want something that's generically better?
There are many other things that are beneficial with using HTTP as a transport that haven't even been discussed here since this was entirely focused on performance. Without at least matching in performance, not many of the other things would matter.
99% of benchmarks are wrong and lead to misleading conclusions. You didn't provide the code so it cannot be independently reviewed.
This article IMO contributes to spreading the misinformation that HTTP/[23] is useful for many applications, when it is actually a very niche protocol only useful to web browsers, or other similar applications that continuously need to connect to endpoints they don't know in advance.
Web tech has already done sufficient damage by pushing HTTP/1.1 and SSL everywhere in IT, we don't need to force those protocols onto everything.
Would you mind expanding on other protocols that would be better?
I personally can't think of a protocol I'd rather have go everywhere than QUIC/HTTP3.
Forced TLS is disappointing (there’s some discussion), but the upside is that this means SNI works and you can now very easily route UDP traffic via SNI.
Given the flexibility of HTTP/3 I wouldn't mind having it everywhere.
HTTP/3 is just a way to have multiple reliable connections with less latency in establishing them than with TCP.
What you can use instead is simply TCP.
TCP is simple, mature, has all kinds of support in software and hardware.
The only advantage of HTTP/3 doesn't matter to most use cases and doesn't warrant its complexity and throwing away all the networking ecosystem built on TCP.
Sure -- but I do value the fact that UDP + some upper layer option is better than plain TCP (essentially you can always build TCP on top of UDP but not UDP on top of TCP, per se). There's no reason that advancements in TCP cannot be "backported" to QUIC or UDP.
I'd go so far as to say that it's possible TCP should always have been another layer up the stack rather than an alternative to UDP.
But coming back to concrete use cases, here's one I've been kicking around in my head lately -- it would be great to build a video calling application that relied exclusively on SFUs (so just redistributing frames, no WebRTC stuff) but only needed one port for all the traffic. It seems that HTTP/3 brings benefits to this use case, two things in particular:
- Running all off of ~one port (443/tcp or 443/udp)
- The ability to easily take advantage of a bidirectional stream with UDP semantics (I guess HTTP/2 could have done this), something we rely on WebRTC to do now.
I say this because I've felt the pain of trying to set up complicated video conferencing software, and while WebRTC is awesome, it is a bit heavyweight if you're alright with an intermediary forwarding the frames.
[EDIT]: as far as perf goes, Cloudflare has a good blog post on this: https://blog.cloudflare.com/cubic-and-hystart-support-in-qui...
You can run all your WebRTC traffic off a single port. You use the remote 3 tuple (IP, Port, Protocol) to demux traffic.
https://github.com/pion/webrtc/tree/master/examples/ice-sing... is one example of that.
Check your bias. This was clearly communicated as an experiment with surprising results. In some contexts, we don't have a choice but to support HTTP. So simply wanted to see how it stood up to a traditional mysql client.
But you do you.
Also ever heard of gRPC? It's pretty popular and widely used. Whether you like it or not or whether it's the most optimal in theory, in practice HTTP is heavily heavily used for communicating between components.
To me, HTTP sits in a similar boat as JSON. Is it perfect? Is it good? Not necessarily. But it's extremely, extremely scrutinized and optimized due to its ubiquity, in ways that other protocols and formats haven't been.
This was the entire point of the experiment, and it proved successful. The bias I wanted to challenge is exactly what you mentioned. Turns out, using HTTP as a transport and protobuf for encoding (which is basically gRPC) is comparable.
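To make that concrete, here's a hedged sketch of what "queries over HTTP" looks like from a client's perspective. The endpoint and payload here are entirely hypothetical, not our actual API, and JSON stands in for protobuf purely for readability:

```php
<?php
// Hypothetical endpoint and payload, just to illustrate HTTP as the query
// transport: POST a statement, get rows back in the response body.
$ch = curl_init('https://db.example.com/query');
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
    CURLOPT_POSTFIELDS     => json_encode(['sql' => 'SELECT 1']),
    CURLOPT_RETURNTRANSFER => true,
]);
$rows = json_decode(curl_exec($ch), true);
curl_close($ch);
print_r($rows);
```

A stateless request/response like this is what lets the usual HTTP machinery (pooling, multiplexing, compression) apply to database traffic.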
This may be a stupid question, but wouldn't connection pooling offer the same benefit?
It helps some cases for sure, and it's what I'd strongly recommend in practice. Connection pooling isn't always a viable option depending on the application, though.
Connection pooling also doesn't solve all the things we can improve by using HTTP as a base. We can make the data transfer itself faster through compression, for example.
Using HTTP/3 starts to help tail latency that we can't solve with TCP. Unreliable networks with packet loss suffer greatly with TCP and not as badly with QUIC.
Isn't that solved by this? https://dev.mysql.com/doc/refman/5.6/en/connection-compressi...
In theory. In practice nothing implements this.
But in any case, even if your client did support this and the server supported it, we still need HTTP for other things. I don't think it's particularly a "gotcha". HTTP is also stateless, which has lots of benefits for us.
MySqlConnector (the most popular .NET library for MySQL, and the one that I authored) has supported protocol compression for many years: https://github.com/mysql-net/MySqlConnector/issues/31.
but it's here?
https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-con...
And it's in the C API too...
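It's exposed from PHP as well, for what it's worth. A rough sketch (host and credentials are placeholders):

```php
<?php
// PDO: ask for protocol compression when opening the connection.
$pdo = new PDO('mysql:host=db.example.com;dbname=app_db', 'app_user', 'secret', [
    PDO::MYSQL_ATTR_COMPRESS => true,
]);

// mysqli: pass the compression flag to real_connect().
$db = mysqli_init();
$db->real_connect('db.example.com', 'app_user', 'secret', 'app_db', 3306, null, MYSQLI_CLIENT_COMPRESS);
```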
Crazy results, really interesting experiment!
Could the same also be done for Postgres?
Anything is possible with computers.