Settings

Theme

C++: Associative map containers with compile-time lookup

github.com

155 points by hogliux 7 years ago · 44 comments

Reader

drej 7 years ago

CppCon has super interesting content. That comes from a person who has written maybe two dozen lines of C++ code, mostly just hello world. E.g. this talk by Matt Godbolt or anything by Chadler Carruth.

https://www.youtube.com/watch?v=bSkpMdDe4g4

jordigh 7 years ago

This is a sore point for me in D. Assoc arrays in D don't work at compile time. It's a really longstanding issue:

https://issues.dlang.org/show_bug.cgi?id=1578

It's a real shame because I prefer just about everything else about metaprogramming and compile-time evaluation in D over C++.

  • atilaneves 7 years ago

    You can write your own in D though.

    • jordigh 7 years ago

      I don't know how to yet. And if this is possible, is it also possible to fix standard assoc arrays?

      • atilaneves 7 years ago

        > is it also possible to fix standard assoc arrays?

        AFAIK, yes, it's just that nobody has done that yet.

        The D runtime was written before there were templates, and it shows.

petters 7 years ago

I can see how `semi::map` could be useful, but `semi::static_map` just seems to be an alternate syntax for global variables.

  • hogliuxOP 7 years ago

    Yes, in a way you are right: the compiler really does boil it down to global variables. However, `semi::static_map` gives you some additional features, which may be useful for a lot of situations

    1) `semi::static_map` gives you the optional run-time look-up with the same API (`semi::static_map::get`). Looking up a global variable with a runtime key isn't straight-forward. This makes it particularly nice to use an API where a function might take either a compile-time or run-time key. I'm thinking a Font API with a `getFont` method, for example. The way you use the method is the same regardless if you are using a compile-time key or not.

    2) `semi::static_map` also gives you control over the lifetime of the objects. For example, it's easy to delete all the values in the map at once. Again, this is not straight-forward to do with a bunch of globals.

    3) the keys are also values themselves and it's straight-forward to evaluate them at run-time. Printing the name of a global variable, for example, requires all sorts of hackery.

    4) you do not need to know all your keys in advance. Simply using getFont with a unique constexpr key will create a new global variable.

    Edit: added point (4)

    • jstimpfle 7 years ago

      Re: 3), here's a clean and technically humble way that I typically use.

          enum {
              KEY_FOO,
              KEY_BAR,
              KEY_BLAH,
              NUM_KEYS,
          };
      
          struct KeyInfo {
              const char *name;
              int info1;
              float info2;
          };
      
          const static struct KeyInfo keyInfo[NUM_KEYS] = {
          #define MAKE(x, y, z) [x] = { #x, y, z }
              MAKE( KEY_FOO,   1,   1.0 ),
              MAKE( KEY_BAR,   7,   3.5 ),
              MAKE( KEY_BLAH, 42, 127.2 ),
          #undef MAKE
          };
      
          void print_key(int key)
          {
              printf("%d's name is %s\n", key, keyInfo[key].name);
          }
      
      It's both low-tech and maintainable. It compiles super quick. If one doesn't like that one has to type KEY_FOO twice (in the enum and in the array definition), one can use X-macros or code generation. But personally I don't care.
      • hogliuxOP 7 years ago

        Yes, that's a nice approach. However, this approach requires you to list all your keys in advance (in the enum).

        The `semi::static_map` does not require this.This is especially useful if you are writing library code: imagine you are programming a `getFont` method - you don't know with which constexpr keys your method will be called, so there is no way for you to list all these keys in an enum.

        • kazinator 7 years ago

          If you don't like pointlessly repeating occurrences of identifiers, you should hardly be favoring C++.

             class foo {
             public:
               foo();
               ~foo();
             };
           
             foo::foo() { }
          
             foo::~foo() { }
          
          Ad nauseum.

          Oh, but C++ can eliminate redundancy in this one cool case I'm thinking of!

      • IshKebab 7 years ago

        C++ doesn't support designated initialisers.

    • petters 7 years ago

      Thanks for this clarification! I agree it can be useful.

  • Nalta 7 years ago

    True, but I think this could really be useful for cases similar to google flags (https://github.com/gflags/gflags) wherein each file in a library adds to the global configuration. Then your main can more easy parse out these options to be shared across the entire codebase.

  • nightfly 7 years ago

    But somehow slower! "In fact, when using semi::static_map and looking up a value with C++ literal as a key, then the value lookup is nearly as efficient as looking up a global variable."

    • hogliuxOP 7 years ago

      Yes with the optional run-time lookup there is an extra two machine instructions (a cmp and jne) needed to check if this is the first time accessing the value. It is exactly as fast as a global variable if you don't need the optional run-time lookup.

      As stated in the talk, the main use case for a map like this is to cache objects which are expensive to load/compute (again something like `getFont` comes to mind). In these use-cases you would likely also need a check like this anyway: a global pointer object which you need to check if it's a nullptr or not before using it.

      • quotemstr 7 years ago

        Not exactly as fast. Without -Bsymbolic, a global variabile with global symbol visibility gets its own GOT entry, meaning access to your global goes through an indirection table that supports ELF symbol interposition. (Which, IMHO, was a bad idea, but you can't unbreak an egg.) With a data structure like this (or with a conventional "struct globals"), there's one symbol at the ELF level to look up instead of one per variable, so your code might end up doing fewer GOT lookups.

        The last paragraph was just arcana though. You should be passing -fvisibility=hidden and explicitly exporting from your DSO only the symbols you might legitimately expect some other DSO to use, obviating the issue.

        • hogliuxOP 7 years ago

          Thanks, I'll have a look at this!

        • jstimpfle 7 years ago

          Look at the approach from my other comment. Requires only 1 symbol.

          • hogliuxOP 7 years ago

            Both your approach and the `semi::static_map` will generate the same type of load instruction. In both cases, the compiler knows the offset of the values in memory at compile-time. The number of symbols is not really important here.

nemoniac 7 years ago

C++ is getting more and more like Lisp.

  • stochastic_monk 7 years ago

    The functional programming concepts which have made it into C++1[147] have made programming in C++ much easier, more concise, and quite fast. I think they’re among the better recent additions to the language. Lambdas, for_each, transform, and accumulate all see regular use in my code.

  • carlmr 7 years ago

    A secret cabal of Lisp programmers took over the C++ steering committee.

    • krapp 7 years ago

      Is their goal to keep adding more and more angle brackets until one day C++ is completely homoiconic?

    • w8rbt 7 years ago

      Yes, because they desire speed.

  • de_watcher 7 years ago

    It has started long ago with STL.

  • vaylian 7 years ago

    Seems like Greenspun's tenth rule is in effect here.

sriram_sun 7 years ago

#define ID(x) []() constexpr { return x; }

Interesting! Macro substitutions can expand to anything of course. I hadn't really considered adding lambda functions to that list.

  • hogliuxOP 7 years ago

    I'm looking forward to C++-20 when this is not needed anymore :-) - it will support string literals as non-type template parameters.

jcelerier 7 years ago

are there fundamental differences with https://github.com/serge-sans-paille/frozen ?

  • Jyaif 7 years ago

    The map is not frozen.

    • hogliuxOP 7 years ago

      Yup. The frozen map cannot be updated once it is initialized:

      <<Once initialized, the containers cannot be updated, and in exchange, lookups are faster. And initialization is free when constexpr is used :-).>>

gumby 7 years ago

This looks really cool. Why add a different type static_map instead of using a constexpr declaration?

  • hogliuxOP 7 years ago

    Because with a constexpr declaration everything will be constexpr, i.e. you couldn't modify the values at run-time.

    The idea here is that finding the storage of the value is done at compile-time, but the values themselves are run-time values.

    • gumby 7 years ago

      Thanks, I wasn't very clear but now I see the problem. The C++ invocation syntax (basically the difference between expressions and statements inherited from Algol) makes it impossible to call if you want to update it at runtime.

      You could declare `static static_map<>::Value& get()` constexpr. (Actually you wouldn't need to; I think you could simply declare `semi::map<>::get()` to be constexpr) and I believe it would allow calling to look up a value to inlined at compile time.

      However using it as an lvalue isn't possible with current c++ syntax. As a long time Lisp programmer this limitation continually trips me up.

      Sorry about the backquotes; I know that HN doesn't use MD but it was the simplest way to denote code inline.

      • hogliuxOP 7 years ago

        Hmmm, I'm not sure I understand. You can't really mark `static static_map<>::Value& get()` constexpr because it returns a value which is only known at run-time - again the value is a run-time value, only the lookup (where in memory) is at compile-time. This has nothing to do with inlining.

        • gumby 7 years ago

          Ahh, I looked in the source and see that runtime_map is simply a std::unordered_map.

          I didn't look closely enough and assumed you'd made your own hash lookup so that parts of it could be run at compile time (in fact you could intern compile-time keys statically in the front of an object (in the initialized data section of the object file) adding a single additional probe at runtime) and let the linker patch up the locations in the usual fashion.

          This is all a move in the right direction!

cjaybo 7 years ago

Thanks for sharing. I'm a big fan of your work on the JUCE framework!

jstimpfle 7 years ago

Now even the linker gets reinvented as unreadable templates? Is there a benefit?

singularity2001 7 years ago

just tried C++ after some time and was set back by:

* no []= operator in custom maps

* no enum printing

* no backtrace

* other random desasters

At least they don't work without jumping through some hoops in all of your code. So for me c++ is about as fixable/appealing as Fortran.

https://stackoverflow.com/questions/3342726/c-print-out-enum...

https://stackoverflow.com/questions/691719/c-display-stack-t...

https://stackoverflow.com/questions/3581981/overloading-the-...

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection