Understanding the Go Runtime: The Scheduler

internals-for-interns.com

156 points by valyala 11 days ago · 49 comments

pss314 8 days ago

I enjoyed both these GopherCon talks:

GopherCon 2018: The Scheduler Saga - Kavya Joshi https://www.youtube.com/watch?v=YHRO5WQGh0k

GopherCon 2017: Understanding Channels - Kavya Joshi https://www.youtube.com/watch?v=KBZlN0izeiY

withinboredom 7 days ago

My biggest issue with go is its incredibly unfair scheduler. No matter what load you have, P99 and especially P99.9 latency will be higher than in any other language. The way that it steals work guarantees that requests “in the middle” will be served last.

It’s a problem that only go can solve, but fixing it means giving up some speed: requests that are currently handled immediately, but shouldn’t be, would have to wait their turn. So overall latency will go up and P99 will drop precipitously. Thus, they’ll probably never fix it.

If you have a system that requires predictable latency, go is not the right language for it.

  • mknyszek 7 days ago

    > Thus, they’ll probably never fix it.

    I'm sorry you had a bad experience with Go. What makes you say this? Have you filed an issue upstream yet? If not, I encourage you to do so. I can't promise it'll be fixed or delved into immediately, but filing detailed feedback like this is really helpful for prioritizing work.

  • _rlh 5 days ago

    “It’s a problem that only go can solve”

    I had this discussion a decade ago and concluded that a reasonably fair scheduler could be built on top of the go runtime scheduler by gating the work presented. The case can be made that the application is the proper, if not the only, place to do this. Performance aside, if you encounter a runtime limitation, filing an issue is how the Go community moves forward.

  • melodyogonna 7 days ago

    > If you have a system that requires predictable latency, go is not the right language for it.

    Having a garbage collector already makes this the case; it is a known trade-off.

  • pjmlp 7 days ago

    Go misses having a custom scheduler option, like the Java and .NET runtimes offer; unfortunately that is too many knobs for the usual Go approach to language design.

    An interface for how it is supposed to behave, a runtime.SetScheduler() or something, would be enough, but it won't happen.

    • MisterTea 7 days ago

      I find it hard to believe the people who built Go, coming from designing Plan 9 and Inferno, would build a language where it is difficult to swap out a component.

      I have this feeling that in their quest to make Go simple, they added complexity in other areas. Then again, this was built at Google, not Bell Labs so the culture of building absurdly complex things likely influenced this.

      • pjmlp 6 days ago

        The same people refused to support generics for several years, and the current design still has some issues to iron out.

        Go also lacks some of Limbo's features; e.g. the plugin package is kind of abandoned, so even though dynamic loading is supported, it is hardly usable.

  • kjksf 7 days ago

    > No matter what load you have, P99 and especially P99.9 latency will be higher than any other language

    I strongly call BS on that.

    That's a strong claim, and the evidence seems to be a hallucination in your own head.

    There are several writeups of large backends ported from node/python/ruby to Go which resulted in dramatic speedups, including drops in P99 and P99.9 latencies of 10x.

    That's empirical evidence your claim is BS.

    What exactly is so unfair about Go scheduler and what do you compare it to?

    Node's lack of multi-threading?

    Python's and Ruby's GIL?

    Just leaving this to OS thread scheduler which, unlike Go, has no idea about i/o and therefore cannot optimize for it?

    Apparently the source of your claim is https://github.com/php/frankenphp/pull/2016

    Which is optimizing for a very specific micro-benchmark: hammering the std-lib HTTP server with concurrent requests. That is not what 99% of Go servers need to handle. And it exercises way more than the scheduler. And it is not benchmarking against any other language, so the sweeping statement about "higher than any other language" is literally baseless.

    And you were able to make a change that trades throughput for P99 latency without changing the scheduler, which kind of shows it wasn't the scheduler but an interaction between a specific HTTP server implementation and the Go scheduler.

    And there are other HTTP servers in Go that focus on speed. It's just that 99.9% of Go servers don't need any of that, because the baseline is 10x faster than python/ruby/javascript and on par with Java or C#.

    • jerf 7 days ago

      "There are several writeups of large backends ported from node/python/ruby to Go which resulted in dramatic speedups, including drop in P99 and P99.9 latencies by 10x"

      But that's not comparing apples to apples. When you get a dramatic speedup, you will also see big drops in the P99 and P99.9 latencies because what stressed out the scripting language is a yawn to a compiled language. Just going from stressed->yawning will do wonders for all your latencies, tail latencies included.

      That doesn't say anything about what will happen when the load increases enough to start stressing the compiled language.

    • withinboredom 7 days ago

      Do I need to share the TLA+ spec that shows it's unfair? Or do you have any actual proof of your claims?

      • 9rx 7 days ago

        It would be helpful for you to share a link to the Github issue you created. If the TLA+ spec you no doubt put a lot of time into creating is contained there, that would be additionally amazing, but more relevant will be the responses from the maintainers so that we're not stuck with one side of the story.

        Of course, expecting you to provide the link would be incredibly onerous. We can look it up ourselves just as easily as you can. Well, in theory we can. The only trouble is that I cannot find the issue you are talking about. I cannot find any issues in the Go issue tracker from your account.

        So, in the interest of good faith, perhaps you can help us out this one time and point us in the right direction?

        • withinboredom 7 days ago

          I’m not interested in contributing to go. I tried once and was basically ignored. I have contributed to issues there where they impacted projects I’ve worked on. But even then, it didn’t feel collaborative; it mostly felt like dealing with a tech support team instead of other developers.

          That being said, I love studying go and learning how to use it to the best of my ability, because I work on sub-µs networking in go.

          When I get home, I’ll dig it up. But if you think it’s a fair scheduler, I invite you to just think about it on a whiteboard for a few minutes. It’s nowhere near fair and should be self-evident from first principles alone.

          • withinboredom 7 days ago

            Here’s a much better write up than I’m willing to do: https://www.cockroachlabs.com/blog/rubbing-control-theory/

            There are also multiple issues about this on GitHub.

            And there's an open issue that has basically been ignored: golang/go#51071

            Like I said, go won’t fix this because they’ve optimized for throughput at the expense of everything else, which means higher tail latencies. They’d have to give up throughput for lower latency.

            • 9rx 7 days ago

              > And there's an open issue that has basically been ignored: golang/go#51071

              It doesn't look ignored to me. It explains that the test coverage is currently poor, so they are in a terrible position of not being able to make changes until that is rectified.

              The first step is to improve the test coverage. Are you volunteering? AI isn't at a point where it is going to magically do it on its own, so it is going to take a willing human hand. You do certainly appear to be the perfect candidate, both having the technical understanding and the need for it.

              • withinboredom 7 days ago

                Heh. I've had my fair share of mailing list drama. This is political AND technical. Someone saying "let’s cut throughput" is going to get shot down fast, no matter the technical merit. If someone with the political clout were to be willing to champion the work and guide the discussion appropriately while someone like me does the work, that's different. That's at least how things like this are done in other communities, unless go is different.

                • 9rx 7 days ago

                  > If someone with the political clout were to be willing to champion the work and guide the discussion appropriately while someone like me does the work, that's different.

                  There is unlikely to be anyone on the Go team with more political clout in this particular area than the one who has already reached out to you. You obviously didn't respond to him publicly, but did he reject your offer in private? Or are you just imagining some kind of hypothetical scenario where they are refusing to talk to you, despite evidence to the contrary?

                  • withinboredom 7 days ago

                    > You obviously didn't respond to him publicly, but did he reject your offer in private?

                    I literally have no idea what you're talking about here.

                    • 9rx 7 days ago

                      You must not have read all the comments yet? One of Go's key runtime maintainers sent you a message. Now is your opportunity to give him your plan so that he can give you the political support you seek.

  • red_admiral 7 days ago

    > If you have a system that requires predictable latency, go is not the right language for it.

    I presume that's by design, to trade off against other things google designed it for?

  • desdenova 7 days ago

    > If you have a system, go is not the right language for it.

    FTFY

Someone 7 days ago

> a goroutine’s state is surprisingly small. The mcall() assembly function only saves 3 values — the stack pointer, the program counter, and the base pointer — into a tiny gobuf struct. That’s it. Why so few? Because goroutine switches happen at function call boundaries, and at those points the compiler has already spilled any important registers to the stack following normal calling conventions.

Wouldn’t that mean go never uses registers to pass arguments to functions?

If so, that seems in conflict with https://go.dev/src/cmd/compile/abi-internal#function-call-ar..., which says “Because access to registers is generally faster than access to the stack, arguments and results are preferentially passed in registers”

Or does the compiler always use Go’s stable ABI, known as ABI0, in functions where it inserts code to potentially context switch, and only use the (potentially) faster ABI that passes arguments in registers elsewhere?

  • mknyszek 7 days ago

    The compiler generates code to spill arguments to the stack at synchronous preemption points (function entry). Signal-based preemption has a spill path that saves the full ABI register set.

avabuildsdata 7 days ago

The unfair scheduling point resonates. I run a lot of concurrent HTTP workloads in Go (scraping, data pipelines) and the scheduler is honestly fine for throughput-oriented work where you don't care about tail latency. But the moment you need consistent response times under load it becomes a real problem. GOMAXPROCS tuning and runtime.LockOSThread help in narrow cases but they're band-aids. The lack of priority or fairness knobs is a deliberate design choice but it does push certain workloads toward other runtimes.

  • valyalaOP 3 days ago

    If the server cannot keep up with the given workload because of some bottleneck (CPU, network, disk IO), then it cannot guarantee any response times - incoming queries will be either rejected or queued in a long wait queue, which will lead to awfully big response times. This doesn't depend on the programming language or the framework the server is written in.

    If you want response time guarantees, make sure the server has enough free resources for processing the given workload.

Horos 7 days ago

Isn't a dedicated worker pool with priority queues enough to get predictable P99 without leaving Go?

If you fix N workers and control dispatch order yourself, the scheduler barely gets involved — no stealing, no surprises.

The inter-goroutine handoff is ~50-100ns anyway.

Isn't the real issue using `go f()` per request rather than something in the language itself?

  • withinboredom 7 days ago

    No. Eventually the queues get full and goroutines pause waiting to place an element onto the queue, landing you right back at unfair scheduling.

    https://github.com/php/frankenphp/pull/2016 if you want to see a “correctly behaving” implementation that becomes 100% cpu usage under contention.

    • Horos 7 days ago

      Fair point on blocking sends — but that's an implementation detail, not a structural one.

      From my POV, the worker pool's job isn't to absorb saturation; it's to make capacity explicit so the layer above can route around it. A bounded queue that returns ErrQueueFull immediately is a signal, not a failure — it tells the load balancer to try another instance.

      Saturation on a single instance isn't a scheduler problem, it's a provisioning signal. The fix is horizontal, not vertical. Once you're running N instances behind something that understands queue depth, the "unfair scheduler under contention" scenario stops being reachable in production — by design, not by luck.

      The FrankenPHP case looks like a single-instance stress test pushed to the limit, which is a valid benchmark but not how you'd architect for HA.

  • vlowther 7 days ago

    My usecase was building an append-only blob store with mandatory encryption, but using a semaphore + direct goroutine calls to limit background write concurrency instead of a channel + dedicated writer goroutines was a net win across a wide variety of write sizes and max concurrent inflight writes. It is interesting that frankenphp + caddy came up with almost the same conclusion despite vastly different work being done.

    • Horos 7 days ago

      This makes sense for your workload, but might the right primitive be a function of your payload profile and business constraints?

      In my case the problem doesn't arise because the control plane and data plane are separated by design — metadata and signals never share a concurrency primitive with chunk writes. The data plane only sees chunks of a similar order of magnitude, so a fixed worker pool doesn't overprovision on small payloads or stall on large ones.

      I'm curious whether your control and data planes are mixed on the same path, or whether the variance is purely in the blob sizes themselves.

      If it's the latter, I wonder if batching sub-1MB payloads upstream would have given you the same result without changing the concurrency primitive. Did you have constraints that made that impractical?

      • vlowther 6 days ago

        In my case, "background writes" literally means "do the io.WriteAt for this fixed-size buffer in another goroutine so that the one servicing the blob write can get on with encryption / CRC calculation / stuffing the resulting byte stream into fixed-size buffers". Handling it that way lets me keep the IO to the kernel as saturated as possible without the added schedule + mutex overhead sending stuff thru a channel incurs, while still keeping a hard upper bound on IO in flight (max semaphore weight) and write buffer allocations (sync.Pool). My fixed-size buffers are 32k, and it is a net win even there.

        • Horos 4 days ago

          Right — no variance, so my question was off target. Worth noting, though: the sema-bounded WriteAt goroutines are structurally a fan-out over homogeneous units, even if the pipeline feels linear from the blob's perspective. That's probably why the channel adds nothing — no fan-in, no aggregation, just bounded fire-and-forget.

GeertVL 7 days ago

This is an excellent idea as a blog. Kudos!

capricio_one 7 days ago

Go missed a big opportunity to be Rust when we needed Rust more than anything. I have long since moved on from Go; C#/.NET is widely available nowadays and in many respects less held back by some strange political choices when it comes to DevEx (I am of course talking about generics).

  • 9rx 7 days ago

    Rust is the older project of the two, kicking off in 2006. Go, which set sail in 2007, duplicating the work of Rust would have been pointless. We already had Rust.

    Go's objective was to become a faster Python, which was something we also desperately needed at the time, and it has succeeded well on that front. Go has largely replaced all the non-data-science things people were earlier doing with Python.

    • za3faran 6 days ago

      If you saw the early presentations, they complained about the slow compile times and high complexity of C++. It seems that they were targeting that, not Python.

      • 9rx 6 days ago

        I did see the early presentations. And since you did too, you will recall that one of the primary priorities was for it to "feel like a dynamically-typed language". You know, because it was trying to be a faster Python.

        What you might be confusing that with is that their assumption was that Google services were written in C++ because those services needed C++ performance, not because the developers wanted to write code in C++, and that those C++ developers would jump at the chance to use a Python-like language that still satisfies them performance-wise. It turns out they were wrong — the developers actually did want to write C++ — but you can understand the thinking when Google was already using Python heavily in less performance-critical areas. Guido van Rossum himself was even on the payroll at the time.

        For what it is worth, Google did create "Rust" after learning that a faster Python doesn't satisfy C++ developers. It's called Carbon. But it is telling that the earlier commenter has never heard of it, and it is unlikely it will ever leave the heap of esoteric languages because duplicating Rust was, and continues to be, pointless. We already had Rust.
