The Green Tea Garbage Collector

go.dev

173 points by 0xedb 2 months ago · 38 comments

turtletontine 2 months ago

Wow… this is an excellent article. I’ve always been fascinated by GCs (well, as long as I’ve known what they are), and I just love seeing this kind of technical but accessible explanation of how they work, their bottlenecks, and a great new idea about solving those bottlenecks. This is exactly the kind of article that I hope to see every time I load up hacker news

sirwhinesalot 2 months ago

Congratulations to Michael Knyszek and Austin Clements for writing an absolutely top tier blog post that is as clear as it gets. I wish my writing was this good. I don't even use Go and it was still 100% a great use of my time to read this.

antonchekhov 2 months ago

Acceleration using the x86 AVX-512 extensions is especially compelling. Since ARM64 processors are becoming pervasive in server-side systems, is there, or will there be, any optimization using the ARM64 NEON vector instructions in current or future Go versions? (The NEON instructions are 128-bit, instead of 512-bit as in the AVX-512 set, but may still be useful.)

luafox 2 months ago

the two little slide decks showing each garbage collector in action are simply wonderful, and really help communicate how this improves go's GC situation

  • Cthulhu_ 2 months ago

    It's also a great CS primer on garbage collection; Go has made me interested in that aspect of software engineering again. It feels important again, unlike with higher-level languages like Java / JS.

pizlonator 2 months ago

This is very cool.

I've already been using bitvector SIMD for the sweep portion of mark/sweep. It's neat to see that tracing can be done this way.

VGF2P8AFFINEQB FTW

matthewmueller 2 months ago

Appreciated the human element paragraph at the end!

  • yvdriess 2 months ago

    Yep, it's awesome how Michael keeps crediting collaborators, given how much of the work is his. Good job!

jecel a month ago

If we label the combinations of the seen and scanned bits as:

00: white

10: gray

11: black

then we can describe it as a very cool variation of the tri-color GC algorithm.

https://en.wikipedia.org/wiki/Tracing_garbage_collection#Tri...
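
The labeling above can be sketched in a few lines of Go. This is only an illustration of the mapping described in this comment: the function name `color` and the "invalid" label for the unlisted 01 combination are assumptions, not anything taken from the Go runtime.

```go
package main

import "fmt"

// color maps the (seen, scanned) bit pair to a tri-color label.
// Hypothetical helper for illustration; not the Go runtime's code.
func color(seen, scanned bool) string {
	switch {
	case !seen && !scanned:
		return "white" // 00: not reached yet
	case seen && !scanned:
		return "gray" // 10: reached, but not scanned yet
	case seen && scanned:
		return "black" // 11: reached and scanned
	default:
		return "invalid" // 01: not part of the labeling above (assumed unused)
	}
}

func main() {
	fmt.Println(color(false, false))
	fmt.Println(color(true, false))
	fmt.Println(color(true, true))
}
```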

boris_m 2 months ago

What's a page?

  • hedgehog 2 months ago

    A (usually) small amount of memory that is the standard size all the memory management hardware and software use. Often 16 KB or 4 KB. If physical memory gets mapped to logical address space, address space gets marked read-only, data gets swapped in or out, or logical address space gets mapped to other hardware (say GPU memory or a network card's buffer), it's usually done by page.

    https://en.wikipedia.org/wiki/Page_%28computer_memory%29

  • aclements a month ago

    Thanks for this question! We added a couple sentences to the blog post to explain what a page is. In general, a page is a region of memory that has a large-ish fixed power-of-two size and is also aligned to its size. Virtual memory structures memory around pages, which are typically 4 KiB to 64 KiB depending on the hardware. The Go memory manager, and many other memory managers, also structure memory around pages, which may or may not match the hardware page size. In Go, pages are always 8 KiB and aligned to 8 KiB.
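
Because a page has a power-of-two size and is aligned to that size, the page containing an address can be found by masking off the low bits. A minimal sketch, assuming the 8 KiB page size mentioned above; the names `pageSize`, `pageBase`, and `pageOffset` are made up for illustration and are not the Go runtime's API.

```go
package main

import "fmt"

// pageSize is 8 KiB, the Go page size stated in the comment above.
const pageSize = 8 << 10

// pageBase returns the start address of the page containing addr.
// Hypothetical helper: clears the low 13 bits of the address.
func pageBase(addr uintptr) uintptr {
	return addr &^ (pageSize - 1)
}

// pageOffset returns the byte offset of addr within its page.
// Hypothetical helper: keeps only the low 13 bits of the address.
func pageOffset(addr uintptr) uintptr {
	return addr & (pageSize - 1)
}

func main() {
	addr := uintptr(0x12345)
	fmt.Printf("base=%#x offset=%#x\n", pageBase(addr), pageOffset(addr))
}
```

The mask trick only works because the size is a power of two and pages are aligned to it; with any other size, a division would be needed instead.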

cyberax 2 months ago

I wonder if it can be abused by malicious actors who can arrange for RAM to be filled with pages containing just one live object.

  • Someone 2 months ago

    FTA: “The implementation of Green Tea has a special case for pages that have only a single object to scan. This helps reduce regressions, but doesn’t completely eliminate them.”

    Also FTA: “One surprise result of this work was that scanning a mere 2% of a page at a time can yield improvements over the graph flood.”

    ⇒ I think you’d have to try and get two objects on each page, and they would have to be small (you’d have to be able to fit over 100 objects in a page to have 2 live objects be <2% of all objects in the page)
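
The arithmetic in that last step can be checked with a small sketch. It follows this comment's reading of the 2% figure as a fraction of the objects in a page; the helper name `belowPercent` is hypothetical.

```go
package main

import "fmt"

// belowPercent reports whether live out of total objects is strictly
// less than percent% of the page's objects, using integer math only.
// Hypothetical helper for illustration.
func belowPercent(live, total, percent int) bool {
	return live*100 < percent*total
}

func main() {
	// With 2 live objects, 100 objects per page gives exactly 2%,
	// so more than 100 objects are needed to drop below 2%.
	fmt.Println(belowPercent(2, 100, 2))
	fmt.Println(belowPercent(2, 101, 2))
}
```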

  • Cthulhu_ 2 months ago

    I think you've got other worries if malicious actors have that kind of influence over the internal memory usage of a running application.

btreecat 2 months ago

Really great read. Both as a refresher for GC and as an explanation on how approaches are having to change due to hardware.

rurban a month ago

Still not a state of the art copying collector.

  • itsTyrion a month ago

    what about recent JVM GCs? Shenandoah (incl generational) and ZGC?

  • mfru a month ago

    What would be and which language has it?

    • rurban a month ago

      Good collectors are language independent. Bad collectors like Mark & Sweep are just needed for stable extern pointers, like in ffi callbacks.

      All better languages use a modern copying collector, if they have enough memory. It's also compacting, and doesn't stop the world. I think lisps just do mark & sweep on phones or embedded, and the mentioned ffi callbacks.

thenthenthen 2 months ago

Curious where the name is coming from/hinting at?

  • aclements a month ago

    I'm a long-time fan of matcha and wrote the initial prototype that demonstrated Green Tea was viable while cafe crawling in Yokohama and drinking lots of matcha. "Matcha" didn't seem like a great name for a garbage collector, but matcha is a form of green tea and "Green Tea GC" rolled off the tongue, so I called my prototype Green Tea and the name stuck.

krbaccord94f 2 months ago

gc varx as an enumeration of a centrifugal cycle per average cost

dzonga 2 months ago

what revenue / profitable google services are actually relying on golang ?

  • parliament32 2 months ago

    Kubernetes as a whole is the best example I can think of, given that it's deployed in most modern tech companies and every cloud provider offers a managed service.

    • Cthulhu_ 2 months ago

      That's an application (as is Docker, also built in Go), but the question was about internal Google services and... we don't know because company secrets, but it's likely on the rise as it was written as a replacement for C++ which was their previous main language for backend services alongside Java/Kotlin. One source with the charming name "assbuttass" [0] says all new services are written in Go, with a follow-up by "deathmaster99" saying only 10% of code is Go, but this was a year ago and even 10% at Google's scale probably represents tens of millions of LOC.

      [0] https://www.reddit.com/r/golang/comments/1c9fhet/how_much_go...

  • kelseyhightower a month ago

    Google Cloud products including GKE (Kubernetes), Cloud Run/Functions, the gcloud CLI, and a number of other utilities and control plane components sit in direct revenue paths. In the case of Cloud Run/Functions (Go support) and GKE, those products generate direct revenue, and the amount is much higher than you would think.

  • 0xjnml a month ago

    YouTube is one such.
