C++: Associative map containers with compile-time lookup
github.comCppCon has super interesting content. That comes from a person who has written maybe two dozen lines of C++ code, mostly just hello world. E.g. this talk by Matt Godbolt or anything by Chadler Carruth.
Also highly recommended: Anything by Jason Turner and Ben Deane.
Examples:
- Ben Deane - Using Types Effectively [0]
- Jason Turner - Rich Code for Tiny Computers: A Simple Commodore 64 Game in C++17 [1]
- Ben Deane & Jason Turner - constexpr ALL the Things! [2]
[0] https://www.youtube.com/watch?v=ojZbFIQSdl8
Thanks! And just today, compile time regular expressions
Anything from Chandlery on C++ performance is worth gold.
This is a sore point for me in D. Assoc arrays in D don't work at compile time. It's a really longstanding issue:
https://issues.dlang.org/show_bug.cgi?id=1578
It's a real shame because I prefer just about everything else about metaprogramming and compile-time evaluation in D over C++.
You can write your own in D though.
I don't know how to yet. And if this is possible, is it also possible to fix standard assoc arrays?
> is it also possible to fix standard assoc arrays?
AFAIK, yes, it's just that nobody has done that yet.
The D runtime was written before there were templates, and it shows.
I can see how `semi::map` could be useful, but `semi::static_map` just seems to be an alternate syntax for global variables.
Yes, in a way you are right: the compiler really does boil it down to global variables. However, `semi::static_map` gives you some additional features, which may be useful for a lot of situations
1) `semi::static_map` gives you the optional run-time look-up with the same API (`semi::static_map::get`). Looking up a global variable with a runtime key isn't straight-forward. This makes it particularly nice to use an API where a function might take either a compile-time or run-time key. I'm thinking a Font API with a `getFont` method, for example. The way you use the method is the same regardless if you are using a compile-time key or not.
2) `semi::static_map` also gives you control over the lifetime of the objects. For example, it's easy to delete all the values in the map at once. Again, this is not straight-forward to do with a bunch of globals.
3) the keys are also values themselves and it's straight-forward to evaluate them at run-time. Printing the name of a global variable, for example, requires all sorts of hackery.
4) you do not need to know all your keys in advance. Simply using getFont with a unique constexpr key will create a new global variable.
Edit: added point (4)
Re: 3), here's a clean and technically humble way that I typically use.
It's both low-tech and maintainable. It compiles super quick. If one doesn't like that one has to type KEY_FOO twice (in the enum and in the array definition), one can use X-macros or code generation. But personally I don't care.enum { KEY_FOO, KEY_BAR, KEY_BLAH, NUM_KEYS, }; struct KeyInfo { const char *name; int info1; float info2; }; const static struct KeyInfo keyInfo[NUM_KEYS] = { #define MAKE(x, y, z) [x] = { #x, y, z } MAKE( KEY_FOO, 1, 1.0 ), MAKE( KEY_BAR, 7, 3.5 ), MAKE( KEY_BLAH, 42, 127.2 ), #undef MAKE }; void print_key(int key) { printf("%d's name is %s\n", key, keyInfo[key].name); }Yes, that's a nice approach. However, this approach requires you to list all your keys in advance (in the enum).
The `semi::static_map` does not require this.This is especially useful if you are writing library code: imagine you are programming a `getFont` method - you don't know with which constexpr keys your method will be called, so there is no way for you to list all these keys in an enum.
If you don't like pointlessly repeating occurrences of identifiers, you should hardly be favoring C++.
Ad nauseum.class foo { public: foo(); ~foo(); }; foo::foo() { } foo::~foo() { }Oh, but C++ can eliminate redundancy in this one cool case I'm thinking of!
C++ doesn't support designated initialisers.
Thanks for this clarification! I agree it can be useful.
True, but I think this could really be useful for cases similar to google flags (https://github.com/gflags/gflags) wherein each file in a library adds to the global configuration. Then your main can more easy parse out these options to be shared across the entire codebase.
But somehow slower! "In fact, when using semi::static_map and looking up a value with C++ literal as a key, then the value lookup is nearly as efficient as looking up a global variable."
Yes with the optional run-time lookup there is an extra two machine instructions (a cmp and jne) needed to check if this is the first time accessing the value. It is exactly as fast as a global variable if you don't need the optional run-time lookup.
As stated in the talk, the main use case for a map like this is to cache objects which are expensive to load/compute (again something like `getFont` comes to mind). In these use-cases you would likely also need a check like this anyway: a global pointer object which you need to check if it's a nullptr or not before using it.
Not exactly as fast. Without -Bsymbolic, a global variabile with global symbol visibility gets its own GOT entry, meaning access to your global goes through an indirection table that supports ELF symbol interposition. (Which, IMHO, was a bad idea, but you can't unbreak an egg.) With a data structure like this (or with a conventional "struct globals"), there's one symbol at the ELF level to look up instead of one per variable, so your code might end up doing fewer GOT lookups.
The last paragraph was just arcana though. You should be passing -fvisibility=hidden and explicitly exporting from your DSO only the symbols you might legitimately expect some other DSO to use, obviating the issue.
Thanks, I'll have a look at this!
Look at the approach from my other comment. Requires only 1 symbol.
Both your approach and the `semi::static_map` will generate the same type of load instruction. In both cases, the compiler knows the offset of the values in memory at compile-time. The number of symbols is not really important here.
C++ is getting more and more like Lisp.
The functional programming concepts which have made it into C++1[147] have made programming in C++ much easier, more concise, and quite fast. I think they’re among the better recent additions to the language. Lambdas, for_each, transform, and accumulate all see regular use in my code.
A secret cabal of Lisp programmers took over the C++ steering committee.
Is their goal to keep adding more and more angle brackets until one day C++ is completely homoiconic?
Yes, because they desire speed.
It has started long ago with STL.
Seems like Greenspun's tenth rule is in effect here.
#define ID(x) []() constexpr { return x; }
Interesting! Macro substitutions can expand to anything of course. I hadn't really considered adding lambda functions to that list.
I'm looking forward to C++-20 when this is not needed anymore :-) - it will support string literals as non-type template parameters.
are there fundamental differences with https://github.com/serge-sans-paille/frozen ?
The map is not frozen.
Yup. The frozen map cannot be updated once it is initialized:
<<Once initialized, the containers cannot be updated, and in exchange, lookups are faster. And initialization is free when constexpr is used :-).>>
This looks really cool. Why add a different type static_map instead of using a constexpr declaration?
Because with a constexpr declaration everything will be constexpr, i.e. you couldn't modify the values at run-time.
The idea here is that finding the storage of the value is done at compile-time, but the values themselves are run-time values.
Thanks, I wasn't very clear but now I see the problem. The C++ invocation syntax (basically the difference between expressions and statements inherited from Algol) makes it impossible to call if you want to update it at runtime.
You could declare `static static_map<>::Value& get()` constexpr. (Actually you wouldn't need to; I think you could simply declare `semi::map<>::get()` to be constexpr) and I believe it would allow calling to look up a value to inlined at compile time.
However using it as an lvalue isn't possible with current c++ syntax. As a long time Lisp programmer this limitation continually trips me up.
Sorry about the backquotes; I know that HN doesn't use MD but it was the simplest way to denote code inline.
Hmmm, I'm not sure I understand. You can't really mark `static static_map<>::Value& get()` constexpr because it returns a value which is only known at run-time - again the value is a run-time value, only the lookup (where in memory) is at compile-time. This has nothing to do with inlining.
Ahh, I looked in the source and see that runtime_map is simply a std::unordered_map.
I didn't look closely enough and assumed you'd made your own hash lookup so that parts of it could be run at compile time (in fact you could intern compile-time keys statically in the front of an object (in the initialized data section of the object file) adding a single additional probe at runtime) and let the linker patch up the locations in the usual fashion.
This is all a move in the right direction!
Thanks for sharing. I'm a big fan of your work on the JUCE framework!
Now even the linker gets reinvented as unreadable templates? Is there a benefit?
just tried C++ after some time and was set back by:
* no []= operator in custom maps
* no enum printing
* no backtrace
* other random desasters
At least they don't work without jumping through some hoops in all of your code. So for me c++ is about as fixable/appealing as Fortran.
https://stackoverflow.com/questions/3342726/c-print-out-enum...
https://stackoverflow.com/questions/691719/c-display-stack-t...
https://stackoverflow.com/questions/3581981/overloading-the-...