High-performance header-only container library for C++23 on x86-64

78 points by mattgodbolt 5 months ago · 29 comments · 1 min read

From the readme:

The B+tree implementation provides significant performance improvements over industry standards for large trees. For some workloads with large trees, we've observed:

- vs Abseil B+tree: 2-5× faster across insert/find/erase operations
- vs std::map: 2-5× faster across insert/find/erase operations

plorkyeran 5 months ago

2-5x faster than both abseil's b+tree and std::map means that abseil's b+tree had to be the same performance as std::map for the tested workload. This is... very unusual. I have only ever seen it be much faster or moderately slower.

sedatk 5 months ago

Not necessarily. Insert could be 5x faster in one, and 2x faster in another, and there would still be orders of magnitude difference between both. 2x-5x is a long range.
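
The arithmetic in this exchange can be checked with a small sketch. It assumes both quoted ranges refer to the same operation on the same workload, which the readme excerpt does not actually state. `Range` and `baseline_ratio_bounds` are names made up for illustration.

```cpp
#include <utility>

// Speedup range of the new container over one baseline, e.g. {2.0, 5.0}.
struct Range {
    double lo;
    double hi;
};

// Assumption: the new container is between a.lo and a.hi times faster than
// baseline A, and between b.lo and b.hi times faster than baseline B, on the
// SAME operation. Then time(A) / time(B) must lie in [a.lo / b.hi, a.hi / b.lo].
constexpr std::pair<double, double> baseline_ratio_bounds(Range a, Range b) {
    return {a.lo / b.hi, a.hi / b.lo};
}
```

With both ranges at 2-5×, the bounds come out to 0.4 and 2.5, so under that assumption the two baselines differ by at most 2.5× on any single operation. If the ranges were collected across different operations or workloads, the per-operation gap is not pinned down by these numbers alone.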

the_arun 5 months ago

There is also a new Adaptive Radix Tree implementation, https://www.db.in.tum.de/~leis/papers/ART.pdf, which is supposed to be faster than a B-tree.

barishnamazov 5 months ago

Also want to share the B-tree implementation from the Algorithmica HPC book: https://en.algorithmica.org/hpc/data-structures/b-tree/

ognarb 5 months ago

> History/Motivations This project started as an exploration of using AI agents for software development. Based on experience tuning systems using Abseil's B+tree, I was curious if performance could be improved through SIMD instructions, a customized allocator, and tunable node sizes. Claude proved surprisingly adept at helping implement this quickly, and the resulting B+tree showed compelling performance improvements, so I'm making it available here.

It seems the code was written with AI, I hope the author knows what he is doing. Last time I tried to use AI to optimize CPU-heavy C++ code (StackBlur) with SIMD, this failed :/
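
For readers wondering what "SIMD instructions" can mean inside a B+tree node, here is a minimal, portable sketch of one common idea: find a key's position by counting smaller keys instead of branching. This is not the library's code. The node layout, `kNodeKeys`, and `lower_bound_in_node` are hypothetical, and a real implementation would likely use explicit intrinsics rather than rely on auto-vectorization.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical node size: a fixed number of keys stored contiguously.
constexpr std::size_t kNodeKeys = 16;

struct Node {
    // Sorted ascending; unused slots are padded with INT32_MAX so they never
    // compare as smaller than any target.
    std::array<std::int32_t, kNodeKeys> keys;
    std::size_t count;  // number of valid keys
};

// Returns the index of the first key >= target (a lower bound) by counting
// how many keys are smaller than the target. The loop has a fixed trip count
// and no data-dependent branch, which is a shape compilers can vectorize.
inline std::size_t lower_bound_in_node(const Node& node, std::int32_t target) {
    std::size_t smaller = 0;
    for (std::size_t i = 0; i < kNodeKeys; ++i) {
        smaller += static_cast<std::size_t>(node.keys[i] < target);
    }
    return smaller;
}
```

The design trades a few extra comparisons for predictable control flow. Whether that wins over a binary search depends on node size and key type, which is presumably where tunable node sizes come in.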

klaussilveira 5 months ago

Both Codex and Claude Code are terrible with C++. Not sure why that is, but they just spit out nonsense that creates more work than it saves me.
Have you tried to do any OpenGL or Vulkan work with it? Very frustrating.
React and HTML, though, pretty awesome.
- inetknght 5 months ago
  On the other hand, I've been using Claude Code for the past several months at work in several C++ projects. It's been fine at understanding C++. It just generates a lot of boilerplate, doesn't follow DRY, and gets persnickety with tests.
  I've started adding this to all of my new conversations and it seems to help:
  > You are a principal software engineer. I report to you. Do not modify files. Do not write prose. Only provide observations and suggestions so that I can learn from you.
  My question to the LLM then follows in the next paragraph. Forgoing most of the LLM's code-writing capabilities in favor of giving observations and ideas seems to be a much better choice for productivity. It can still lead me down rabbit holes or wrong directions, but at least I don't have to deal with 10 pages of prose in its output or 50 pages of ineffectual code.
  - tarnith 5 months ago
    
    Yeah, it's a decent rubber duck.
    As soon as it starts trying to write actual code or generate a bunch of files it's less than helpful very quickly.
    Perhaps I haven't tried enough, but I'm entirely unsold on this for anything lower level.
    
    dustbunny 5 months ago
    
    Gemini & ChatGPT have not done well at writing or analyzing OpenGL-like rendering code for me either. For many algorithms, they're not good at explaining them either. And for some of the classical algorithms, like cascaded shadow mapping, even the articles written by people and the example source code that I found are wrong or incomplete.
    Learning "the old ways" is certainly valuable, because the AIs and the resources available are bad at these old ways.
- simonw 5 months ago
  
  Which models?
  It's possible Opus 4.5 and GPT-5.2 are significantly less terrible with C++ than previous models. Those only came out within the past 2 months.
  They also have significantly more recent knowledge cut-off dates.
  - klaussilveira 5 months ago
    
    I'll be specific:
    I've been recently working with Opus 4.5 and GPT-5.2. Both have been unable to migrate a project from using ARB shaders to 3.3 and GLSL. And I don't mean migrating the shaders themselves, just changing all the boring glue code that tells the application to use GLSL and manage those instead of feeding the ARB shaders directly.
    They have also failed spectacularly at implementing this paper: https://www.cse.chalmers.se/~uffe/soft_gfxhw2003.pdf
    No matter how I sliced it, I could not get a simple cube to have the shadows as described in the paper.
    I've also recently tried to get Opus 4.5 to move the Job system from Doom 3 BFG to the original codebase. Clean clone of dhewm3, pointed Opus to the BFG Job system codebase, and explained how it works. I have also fed it the Fabien Sanglard code review of the job system: https://fabiensanglard.net/doom3_bfg/threading.php
    As well as the official notes that explain the engine differences: https://fabiensanglard.net/doom3_documentation/DOOM-3-BFG-Te...
    I did that because, well, I had ported this job system before and knew it was something pretty "pluggable" and could be implemented by an LLM. Both have failed. I'm yet to find a model that does this.
    
    dustbunny 5 months ago
    
    It's funny, I've also been trying to use AI to implement (simpler) shadow mapping code and it has failed. I eventually formed a very solid understanding of the problem domain myself and achieved my goals with hand written code.
    I might try to implement this paper, great find! I love this 2000-2010 stuff
    
    klaussilveira 5 months ago
    
    Oh, boy, then I have something for you:
    https://artis.inrialpes.fr/Publications/2003/HLHS03a/SurveyR...
    https://mrelusive.com/publications/papers/SIMD-Shadow-Volume...
    https://terathon.com/gdc05_lengyel.pdf
    
    dustbunny 5 months ago
    
    Perfect. Now I can continue to be confused! Beginner mindset!
    
    simonw 5 months ago
    
    Thanks, that's very specific! Sounds like that's out of reach of the current generation of models.
    Will be interesting to see if models in six months time can handle this, since they clearly can't do it today.
- DrBazza 5 months ago
  
  In what scenarios are they terrible? I hope not every scenario. I've found Codex adequate for refactoring and unit tests. I've not used it in anger to write any significant new code.
  I suppose part of the problem is that training a model on publicly available C++ isn't going to be great because syntactically broken code gets posted to the web all the time, along with suboptimal solutions. I recall a talk saying that functional languages are better for agents because the code published publicly is formally correct.
- FpUser 5 months ago
  
  I use ChatGPT with C++, but in a very limited manner. So far it has been an overall win. I watch the code very closely, of course, and usually end up doing a few iterations (mostly optimizing for speed, reliability, concurrency).
  I also use it to generate boilerplate / repetitive code.
  Overall I consider it a win.
- nurettin 5 months ago
  
  I use Claude to generate C++23, and it usually performs well. It takes a bit of nudging to get it to avoid repeating itself, reuse existing functionality, not alter huge portions without running tests, etc. But generally it is helpful and knows what to do.
- seg_fault 5 months ago
  
  I had the same experience. C++ doesn't even compile or I have to tell it all the time "use C++23 features". I tried to learn OpenGL with it. This worked out a bit, since I had to spot the errors :D
  - TingPing 5 months ago
    
    Same here. C++ changes fast and can be written in many styles so not a ton of training data I assume.

leopoldj 5 months ago

I apologize if this is common knowledge. Modern C++ coding agents need to have a deep semantic understanding of the external libraries and header files. A simple RAG on the code base is not enough. For example, GitHub Copilot for VS Code and Visual Studio uses IDE language services like IntelliSense. To that extent, using a proper C++ IDE rather than a plain editor will improve the quality of suggested code. For example, if you're using VS Code, make sure the C/C++ Extension Pack is installed.

LoganDark 5 months ago

Oh hey, I wrote a Stackblur implementation in Rust. The trick I used is to SIMD across multiple rows/columns of the image rather than trying to SIMD the algorithm itself.
https://github.com/logandark/stackblur-iter
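
The "SIMD across columns" idea can be sketched in C++ (the thread's main language) with a simplified stand-in. This is not the linked Rust implementation, and it uses a plain sliding-window sum rather than Stackblur's weighted window. `vertical_box_sum` and its parameters are hypothetical. The point is only that the serial dependency runs down each column, while neighbouring columns are independent and sit contiguously in memory.

```cpp
#include <cstddef>
#include <vector>

// Vertical sliding-window sum over a row-major image. Each output pixel is
// the sum of the input pixels in the same column for rows
// [y - radius, y + radius], truncated at the image edges.
std::vector<int> vertical_box_sum(const std::vector<int>& src,
                                  std::size_t width, std::size_t height,
                                  std::size_t radius) {
    std::vector<int> dst(src.size());
    std::vector<int> acc(width, 0);  // one running sum per column

    // Prime the accumulators with the window for row 0.
    for (std::size_t y = 0; y <= radius && y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) acc[x] += src[y * width + x];
    }

    for (std::size_t y = 0; y < height; ++y) {
        // Every inner loop below walks contiguous memory with no dependency
        // between columns, so it can be vectorized across x.
        for (std::size_t x = 0; x < width; ++x) dst[y * width + x] = acc[x];

        // Slide the window down by one row: add the entering row, if any...
        if (y + radius + 1 < height) {
            const std::size_t in = (y + radius + 1) * width;
            for (std::size_t x = 0; x < width; ++x) acc[x] += src[in + x];
        }
        // ...and subtract the leaving row, if any.
        if (y >= radius) {
            const std::size_t out = (y - radius) * width;
            for (std::size_t x = 0; x < width; ++x) acc[x] -= src[out + x];
        }
    }
    return dst;
}
```

A horizontal pass could reuse the same routine on a transposed image, or process several rows in lockstep, so the vector lanes always map to independent rows or columns.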
shihab 5 months ago

I'd love to see a breakdown of what exactly worked here, or better yet, PR to upstream Abseil that implements those ideas.
AI is always good at going from 0 to 80%; it's the last 20% it struggles with. It'd be interesting to see Claude-written code making its way into a well-established library.

dicroce 5 months ago

Ok, maybe someone here can clear this up for me. My understanding of B+trees is that they are good for implementing indexes on disk because the fanout reduces disk seeks... what I don't understand is in-memory B+trees... which most of the implementations I find are. What are the advantages of an in-memory B+tree?

wffurr 5 months ago

https://github.com/abseil/abseil-cpp/blob/master/absl/contai... mentions that b-tree maps hold multiple values per node, which makes them more cache-friendly than the red-black trees used in std::map.
You use either container when you want a sorted associative map type, which I have not found many use cases for in my work. I might have a handful of them versus many instances of vectors and unsorted associative maps, i.e. absl::flat_hash_map.

dataflow 5 months ago

Memory also has a seek penalty. It's called a cache miss penalty. It might be easier to think of them in general as penalties for nonlocality.
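
A rough way to put numbers on the nonlocality argument is to count how many nodes a lookup has to visit, treating each node as roughly one potential cache miss. The sketch below is idealized. `levels_needed` is a made-up helper, and the fanout of 64 is an arbitrary example, not a value from Abseil or the posted library.

```cpp
#include <cstdint>

// Smallest number of levels h such that fanout^h >= n, i.e. how many nodes a
// lookup descends through in a perfectly packed tree with the given fanout.
// Assumes fanout >= 2. Uses repeated ceiling division to avoid overflow.
constexpr unsigned levels_needed(std::uint64_t n, std::uint64_t fanout) {
    unsigned levels = 0;
    while (n > 1) {
        n = n / fanout + (n % fanout != 0 ? 1 : 0);  // ceil(n / fanout)
        ++levels;
    }
    return levels;
}
```

For one million keys this gives 20 levels at fanout 2 and 4 levels at fanout 64. Real trees differ in both directions: a red-black tree can be up to about twice as deep as a perfectly balanced binary tree, and B-tree nodes are rarely completely full.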
