C++ Attribute: Likely, Unlikely
en.cppreference.com

What everybody should read before using these (Aaron Ballman is a Senior Staff Compiler Engineer for Intel and is the lead maintainer of the Clang open source compiler): https://blog.aaronballman.com/2020/08/dont-use-the-likely-or...
It's an odd article. The basic thesis is "use pgo instead", which is reasonable enough. It starts with a long diversion through edge cases of the attributes, none of which seemed particularly impactful in practice. Perhaps he was worried that just recommending pgo on its own wouldn't convince many people. There are many situations where you can't enable pgo organizationally, for example if you're part of a large company with a centralized build system, or if you package a library meant to be inlined and want your code to be optimized even when it's built without pgo.

The comparisons to `register` and `inline` are interesting, but not very useful imo. Whether a given variable will benefit from being put in a register, or a function from being inlined, is usually very local information: the compiler can see when the variable will be accessed down the road, and hence whether moving it to the stack will tend to slow down later code. Whether a branch is likely or unlikely will frequently depend on information the compiler doesn't have (sans pgo), such as the distribution of an argument variable. In fact, it seems like this very information would be useful to a compiler in determining whether it should inline a function or keep a variable in a register.
PGO also doesn't always optimize in the correct direction. If I have an error handling path in a hot loop, PGO can only optimize around its branch if it actually sees that error occur, and then it will draw wrong conclusions about the branch into the error handler because, absent fudging the tests, it will think the error path has higher importance than it does. I don't want the compiler to optimize for the error path at all, I want it to pessimize it to prioritize the non-error path. But the PGO analysis doesn't know that, it only sees branch patterns and probabilities, and not all error handling paths use exceptions.
PGO is also a pain to use in some situations. You need to be able to regularly exercise all of the main paths in the program under instrumentation, preferably automated, using a configuration as close to release build as possible. That's hard to do when your release build lacks automation support, has nondeterministic behavior by design, cross-compiles to another platform, or requires networked services to exercise main paths. I don't even know how people deal with PGO when there is a requirement for deterministic builds.
> Whether a branch is likely or unlikely will frequently depend on information the compiler doesn't have (sans pgo),
Exactly! Often it is simply impossible for the compiler to know. In this respect it is similar to std::unreachable.
I think that everybody should instead read the proposal that introduced this feature, P0479R2 [1]
[1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p04...
and look at code in the wild
[2] https://github.com/search?q=[[likely]]+language%3Ac%2B%2B&ty...
I loved it! Thanks! The last part sums up my thinking:

> These attributes are starting to look a bit more like some other code constructs we’ve seen in the past: the register keyword as an optimization hint to put things in registers and the inline keyword as an optimization hint to inline function bodies into the call site. Using register or inline for these purposes is often strongly discouraged because experience has shown …

My take is: 99.9% of the time, when you start shaving some CPU cycles here and there instead of doing algorithmic optimization, something is going wrong.
This is a pretty coherent argument that this new feature is broken and best ignored. Hopefully that's the approach clang will take.
PGO is a mixed blessing and detracts a bit from the thrust of the article. The more obvious conclusion is to continue using __builtin_expect (on the boolean guard of a branch), which works great and has for ages.
That is also covered by a new C++ attribute, namely [[assume(expr)]].
However, better not to give the function data that contradicts the condition if you don't want to fight nasal demons.
Glad assume(expr) is available if it maps onto __builtin_expect, but I'm confused by the UB reference. Why would taking the less likely path be undefined behaviour?

Edit: it's because assume does not map to __builtin_expect, it maps to __builtin_assume. So that's just another way to write undefined C++.
Ah, I thought it was the same, I spend most of my C++ coding time on VC++ anyway.
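To make the distinction in this sub-thread concrete, here is a rough sketch (the function names are made up for illustration): [[likely]]/[[unlikely]] are pure optimization hints, so taking the "unexpected" branch is still well-defined, whereas [[assume(expr)]] (C++23) is a promise to the compiler, and any call where expr is false is undefined behaviour.

    int sign_hint(int x) {
        if (x > 0) [[likely]]   // hint only; x <= 0 is still handled correctly
            return 1;
        return 0;
    }

    int divide(int a, int b) {
        [[assume(b != 0)]];     // promise: calling this with b == 0 is UB
        return a / b;
    }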
A common use case for these is to prevent the compiler from inlining the unlikely case to avoid thrashing the instruction cache.
    if (unlikely_condition) [[unlikely]] {
        // Don’t inline this
        expensive_operation();
    }
It’s a good idea to check the generated assembly when using these, as they can lead to weird reordering of the code.

It may also help a bit with a cold branch predictor and with icache hit rate.
The compiler can make sure that the body for the likely condition is inline with the rest of the code, while the unlikely condition (e.g. the else block of a likely if) can be outlined behind a forward branch.

Keeping the unlikely code out of the way and behind a branch helps the happy path stay hot and well-predicted.
In GCC you can already use (both on functions¹ and labels²)

    __attribute__((hot))

and

    __attribute__((cold))

1. <https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Common-Functio...>; Since GCC 4.3, released March 5, 2008
2. <https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Label-Attribut...>; Since GCC 4.8, released March 3, 2013
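A minimal sketch of those spellings (function and label names here are made up; note the double parentheses):

    __attribute__((hot))  void fast_path();    // optimize aggressively, grouped into the hot text section
    __attribute__((cold)) void error_path();   // optimize for size, grouped into the cold text section

    int process(int x) {
        if (x < 0)
            goto on_error;
        return x * 2;
    on_error:
        __attribute__((cold));  // label attribute (GCC 4.8+); the trailing semicolon is required
        return -1;
    }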
The typical expansion of pre-attribute likely/unlikely macros (for example from the Linux kernel) is __builtin_expect. Hot/cold should also work, if a bit extreme.
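For reference, the classic pre-[[likely]] macros look roughly like this (the well-known Linux-kernel-style spelling; the parse() function is just an illustrative caller):

    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int parse(int err) {
        if (unlikely(err))
            return -1;      // rarely-taken error path
        return 0;           // hot path
    }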
Note that those are ultimately dealing with different concepts:
likely/unlikely are ultimately about branches - predict that a likely branch is taken and an unlikely branch is not taken. Note that there is some default logic in GCC even if the branches aren't tagged (for example, pointers are assumed to usually not be NULL). Failing that it usually generates the code in the order the source was laid out, but I don't think there's any "probability" weight here, just inertia, so it's easy for the optimizer to change it even by accident.
Note that the default probabilities are 90% and 10% (I've seen other software use 93.75% = 15/16); you can specify other probabilities if meaningful. Notably, choosing 50% encourages the generation of `cmov`.
hot/cold is ultimately about code size and section layout. Keep the hot code sections in cache, keep the cold code sections out of cache (and optimize it for size more than speed). Branches from non-cold to cold code are automatically tagged unlikely (not sure about hot to non-hot, or cold to anything), which is what makes people think they're related.
I often write my code pessimally in this regard so have a note in the back of my mind to someday use these in a few hot paths.
When I say “pessimally” I mean I usually check the unlikely cases right away and then put the normal case last:
    Blah foo (something& arg) {
        if (is_invalid (arg)) return blah(0);
        if (is_inactive (arg)) return blah(1);
        // ok do all the normal stuff
    }
It makes the code clearer but slightly slower. I could always write a conditional for the hot path up front, but code is for human readers, right?

So when people say “this clutters the code” they are right, but most of the time you just don’t worry about it — it need only clutter a few functions in your hot loops, where you’re willing to rewrite it anyway regardless of how ugly it gets.
It’s like looking at the standard library source: super cluttered, but it has to handle all sorts of weird corner cases and is called a lot. Normal code can ignore all that in more than 99.99% of the cases.
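For comparison, the same sketch with the attributes applied (reusing the hypothetical types and helpers from the snippet above) might look like:

    Blah foo (something& arg) {
        if (is_invalid (arg)) [[unlikely]] return blah(0);
        if (is_inactive (arg)) [[unlikely]] return blah(1);
        // ok do all the normal stuff
    }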
Names like 'is_invalid' and 'is_inactive' will lead to a double negation.
Agreed, this reads so much easier...
    if (!is_valid)
    if (!is_active)
> I could always write a conditional for the hot path up front, but code is for human readers, right?
Does C or C++ actually make any promises that it’ll assume the “true” branch of the conditional will be taken? I always assumed that the compiler could make whatever weird decision it wants for that sort of thing.
TBH I’d probably just write normal stuff as a function, and then call that function directly in cases where performance is really crucial, if it can be done safely… if such a case exists…
No. The traditional compiler heuristic is to assume backwards branches are taken (loops) and forward branches are not.
I'm so confused by this. How would you avoid doing these checks? Aren't they invariants for your function?
I don’t think they suggested avoiding the checks, unless I’ve missed something?
The part that's confusing me is the part about avoiding clutter – if you have to do all the checks anyway, what clutter are you avoiding by changing the order of them?
That conditional would be checked, then if it failed, checked again after the normal case has run, in order to choose the arm to follow.
Not a huge deal execution-time-wise, but from a reader’s PoV, the way I write it says, “ok, the special cases don’t apply to the body so I don’t have to worry that the index will be out of range (or whatever) and can just focus on the logic”.
I believe the likelihood annotations are the things they are talking about, for cluttering the code. Not the argument checks.
Maybe the implied question is that the compiler can optimise the checks to occur in whatever order it wants.
> I usually check the unlikely cases right away and then put the normal case last.
Don't we all?
If only I lived in such a paradise!
I think this is likely (no pun intended) to be confusing in the case where we want to optimize for the case that is unlikely to occur. In HFT, you basically want to be fast in a few cases, and don't care that much about the cases where you would not act (which are the vast majority, compared to the number of market events where you actually need to make a decision).
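A hedged illustration of that point (all names are hypothetical): a profile would call the trading branch cold, but it is the path whose latency matters, so you may hint it as likely anyway.

    struct MarketEvent { bool actionable; };

    bool should_trade(const MarketEvent& e) { return e.actionable; }
    void send_order(const MarketEvent&) { /* submit the order */ }

    void on_market_event(const MarketEvent& e) {
        if (should_trade(e)) [[likely]] {   // statistically rare, but this is the path we optimize for
            send_order(e);
        }
        // the common "do nothing" case is deliberately deprioritized
    }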
HGO, Hunch-guided Optimisation
Replacing hard data with the One True Source: I sensed it.
Any ideas how "likely" it needs to be to benefit from likely? More than 50%? Or 75%? Or 90%? Can it be detrimental if it has higher chances but is still close to 50%?
For GCC at least, 90% is what the optimizer assumes by default. With the GCC-specific version you can specify any arbitrary chance; specifying 50% encourages `cmov`.
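A sketch of that GCC-specific builtin, __builtin_expect_with_probability (available since GCC 9); the thresholds and probabilities here are purely illustrative:

    long clamp_non_negative(long x) {
        if (__builtin_expect_with_probability(x >= 0, 1, 0.95))
            return x;    // assumed taken about 95% of the time
        return 0;        // rare path
    }

    // A stated probability of 0.5 marks the branch as unpredictable, which nudges
    // the compiler toward branchless code such as cmov.
    long pick(long a, long b, bool use_a) {
        return __builtin_expect_with_probability(use_a, 1, 0.5) ? a : b;
    }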
Yes, you could imagine a bunch of scenarios where this could hurt you. Imagine a compiler that outlines the rare branch in order to shrink the code size of the function so that the hot path has better icache performance. That function call you inserted is expensive.
This seems like it will clutter code. I wish it were more terse, as I find modern C++ code bases to be way too verbose already. It starts to become a strain when looking at new modules.
This is standardizing vendor-specific attributes that have existed for many years. Code that uses these probably already has some preprocessor macro to select the right builtins, and isn't going to gain much new clutter by replacing those macros.
I believe this is the proposal that added them:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p04...
The "references" section has links to GCC and Clang builtins:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
http://llvm.org/docs/BranchWeightMetadata.html#built-in-expe...
> This is standardizing vendor-specific attributes
Except the standard’s likely and unlikely attributes invented new syntax that is not drop-in compatible with clang’s and gcc’s attributes.
Where clang and gcc would use:

    if (__builtin_expect(x > 0, 1)) { … }

the standard uses:

    if (x > 0) [[likely]] { … }

It's definitely an unconventional syntax. In addition to the above, OpenMP and shader languages annotate the branch statement for parallelism or branch/predication hints. I can't think of precedent for C++ putting the hints in the branch targets. It does have some advantages, but it's not very intuitive. The first time I tried to use the new hints I did [[likely]] if(), which of course did nothing.
Doesn't look like it. The attribute goes on branches. This goes in the middle of basic blocks. Doesn't seem better to me, might try to find the rationale for the invention instead of standardising existing practice. That proposal shows the existing attributes with different syntax.
These annotations are really only of interest in performance-critical computations. It’s another knob for library writers to make their libraries magically faster for their users. And it should be quite rare outside of libraries.
And even then they should be handled with extreme care, as they can trigger UB if used incorrectly.
Compilers have always been making guesses about what the most likely code path is behind the scenes, but the generated code still needs to behave correctly in the case where the guess was wrong (that will just be the less-optimal code path). All these attributes are doing is helping the compiler know instead of guess what the hot path is. If there is any way to confuse the compiler into giving undefined behavior with hints like this, that's a compiler bug. (Not saying compiler bugs don't exist, but are you aware of a specific bug like this?)
Do you have examples? It's not clear from the article how you could have a UB with them.
If I remember correctly, Timur Doumler makes some remarks to that effect in his presentation:
"Standard Attributes in C and C++"
Could you at least give some timestamps? It's nearly two hours.
Anyway, while it is possible that some attributes can cause UB if misused, I very much doubt that's possible with [[likely]] and [[unlikely]], as they are just hints for the optimizer, and the optimizer is supposed to preserve semantic guarantees.
Some attributes and hints can cause UB (most obviously an unreachability hint like std::unreachable() placed in a spot that's actually reachable), but [[likely]] and [[unlikely]] can't.
I think C++ really just has bad defaults for many of its features. It's understandable given the age of the language, but I wish compiler developers would agree on a set of new default attributes for various language features and make a flag to enable them. That way, older style code can still compile but newer code isn't cursed with explicit attribute hell.
That will never happen given how many compilers exist, each with its own set of use cases.
They can agree at ISO level, but even then it isn't enough, as proven by the whole set of issues that are currently being ignored on platforms where breaking the ABI is taboo.
I suppose there is no reason you can’t profile your code and have a tool insert these hints based on actual statistics from execution.
If you are relying on profile-directed optimization, the hints are almost surely redundant.
These are useful when there’s static knowledge about control flow that could assist the optimizer, e.g. with inlining decisions.
For example: It’s not uncommon to have “bi-modal” functions, where simple checks guard simple actions, followed by much more complex logic to handle everything else.
Are those checks for exceptional cases like an invariant violation? Think of an I/O write function confirming the device is open.
Or are they the “fast path” for the most common invocations? Think of std::vector::push_back() checking for available capacity.
The answer helps the optimizer immensely in deciding whether to “partially inline” the simple code into callers or not.
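To make the push_back example concrete, here is a hedged, self-contained sketch (not the real std::vector implementation; names are illustrative) of such a bi-modal function with the fast path hinted:

    #include <cstddef>
    #include <vector>

    struct SmallBuffer {
        std::vector<int> storage;
        std::size_t used = 0;

        void push_back(int value) {
            if (used < storage.size()) [[likely]] {
                storage[used++] = value;   // simple, frequently-taken fast path
            } else {
                grow_and_push(value);      // complex, rarely-taken growth path
            }
        }

        void grow_and_push(int value) {
            storage.resize(storage.empty() ? 8 : storage.size() * 2);
            storage[used++] = value;
        }
    };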
Sometimes the important performance critical path is the one least taken. You can't profile because the profiler has no way to know you don't care about the common path.
In general the profiler is a better tool, but there are rare exceptions, and if those apply to you, C++ gives you the control you need.
In my experience PGO is absolute garbage (for languages like C and C++). For complex programs all it does is bloat the code with no measurable benefit. And for every set of inputs where it improves performance there is another where it introduces slowdowns.
Any suggestions on how it could be terser while still being readable? If you're reading a new module using the functionality, would you prefer seeing

    [[likely]] return 2;

or

    @!l return 2;

? Which one is more understandable if you're reading and not familiar with the syntax?

I understand the overall feeling but I’m not sure I understand the specific reason why you say this is making code bases more terse. Are you comparing this with the alternative of using GCC specific extensions or no definition of likely/unlikely code paths at all?
They're saying the opposite - that it makes code bases more verbose.
See also: https://blog.aaronballman.com/2020/08/dont-use-the-likely-or...
tl;dr: these attributes are absolutely full of footguns because the standard is not explicit about precedence and nesting, and you should probably avoid them and prefer to spend time investing in PGO. It’s very easy to make sane-looking code containing these attributes which does the exact opposite of what you intended.
Note that this issue does not exist with the equivalent C macros — those generally behave as expected. But you should probably just invest in PGO instead of static hints there, too.
My experience with PGO it that it's a much larger footgun. If you can't profile your actual production workload then it's very frequently just going to make your program slower. Even if you can profile production, it's still an easy way to completely blow out your long-tail latency for codepaths that PGO decides don't matter.
PGO is not a silver bullet. If you've identified a problem that can be solved by a simple static hint you should do that. I agree littering your code with likely/unlikely will probably make things worse, so it's best to save them for those exceptional cases where you know it will make an improvement.
It's not a silver bullet, but PGO does subsume these hints. Consider: what do you do if PGO conflicts with your static hints? My hypothesis is that in that case the static hint is most likely incorrect and contributing to slower code.
So either you aren't using PGO at all, which is fine if squeezing out these kinds of optimizations isn't that important to you, but then what's the point having these static hints?
Or you are using PGO, in which case there's no point in having these static hints because PGO will identify the likely and unlikely scenarios on your behalf. If PGO doesn't identify likely and unlikely branches, then the reason is because your profile isn't representative of how your program will actually be run in production, but in that case the solution is to provide a more representative profile instead of using [[likely]] and [[unlikely]].
If someone is going to go crazy and try mark the likelihood of all paths in their code then they clearly don't understand the feature.
That doesn't mean it shouldn't exist and isn't useful.
Oh, I didn't see this before posting. Well, at least the chance is higher that somebody reads the article ;)
Can this reasonably reliably steer speculative execution?
No, because speculative execution attacks can choose to deliberately mislead the dynamic branch predictor prior to the actual attack.
Use `__builtin_speculation_safe_value` to defend against that.