The Subtle Dangers of the Comma Operator (C++)
humanreadablemag.com> Overloading the Comma Operator Is Powerful
Clearly, it's over 9000!
... more seriously though, overloading the comma operator is a bad idea and you shouldn't do it. In fact, almost nobody does it.
Specifically, you don't want to break the principle of least astonishment with a command such as
v+= 1,2,3,4,5;
Assuming you want to add up a bunch of literals, what you could do is something like: std::array data { 1,2,3,4,5 };
v += std::reduce(data.begin(), data.end());
and if you've implemented some <algorithm> and <numeric> wrappers for containers, you would have something like an template<class Container>
typename Container::value_type
reduce(const Container& container);
and then you would write v += reduce(std::array{ 1,2,3,4,5 } );
which is terse and much clearer.An even more concise form is built in, and I think more obvious to readers:
v.insert(v.end(), { 1, 2, 3, 4, 5 });I was actually not describing adding elements to a container, but adding up values.
See? That's the problem with arithmetic on containers. Python has that already:
Comma doesn't do weird shit: that's just a tuple of numbers. But "+=" then does weird shit: it means concatenation when v is a standard library list, and element-wise addition when v is a numpy vector.v += 4, 5, 6
> Once the vector is constructed, we can only add elements one by one by using the push_back member function.
Nope... std::vector::insert() takes an std::initializer_list [1].
[1] https://en.cppreference.com/w/cpp/container/vector/insert
The example (v += 1,2,3,4,5) is neither "nice" nor "expressive". It's confusing and dumb. The reader of this code can't be expected to know what it does. And, looking at the implementation, there's not even an efficiency benefit from doing this. You'd have a more efficient and literate program with std::iota.
The rest of the article isn't wrong, but it fails to establish why anyone thinks this pattern is "nice".
It might be a bit extreme but the more I'm exposed to operator overloading the more I think that it's a bad idea 99% of the time.
The only use that seems absolutely in my opinion is when the overload is absolutely transparent mathematically, for instance to implement geometrical transformations on a Matrix class.
That works because in this case you don't actually end up with "custom" behavior, you just expand the standard and well understood notation of the language by plugging the standard and well understood mathematical notation for matrix operations. Anybody who understands this mathematical notation will be able to understand what the code does without additional context.
Anything beyond that is just asking for trouble IMO. It's basically code obfuscation.
Of course C++ has precedent for that sort of insanity, especially with the IMO absolutely bonkers use of the bit shift operators for... input/output processing. A notation that you'll note hasn't had a lot of success outside of C++.
> It might be a bit extreme but the more I'm exposed to operator overloading the more I think that it's a bad idea 99% of the time.
Quite a moderate position in practice. There are a substantial number of programmers who enjoy being able to tell what basic syntax does from memory without having to cross-reference documentation and source files.
And as you allude to, the example in question isn't a sum; it is an append. It would be far more appropriate to have an a = append(a, {1,2,3,4,5}) style construct. Or something with pointers if efficiency is important. Or construct the vector inline in older versions of C++. The += should be reserved for actual vector summing.
This whole article stands as a reminder that C++ is a remarkable language, and is starting to make up a lot of ground on achieving feature parity with Common Lisp.
I would add string + and += operators to list but that is it really.
That is pretty extreme. Where would we be without assignment and comparison overloads? Or dereference?
iostream hasn't had a lot of success inside C++ either. Operator overloading is one of those features that you wish for when you don't have it. Besides standard math operators, assignment also ends up playing a major role in making data structures behave well when managing resources.
It can be abused but most of the time is extremely convenient. Iostream and boost set a terrible example however. Boost especially goes into the deep end on templates, operators, crazy dependencies, unacceptable compile times etc. It really isn't fair to judge modern C++ on boost. Instead of being batteries included, boost is now more a last resort.
The "surprise" in the article is related to a very general class of mistake: supporting type something& but not also type something&& despite that parameter type being decisive for overload resolution purposes and despite the missing function being identical to the existing one.
Overloading the comma operator doesn't cause any particular danger, it merely provides a mildly unfavourable situation in which an unnecessary mistake is possible: there's "threat" of the default comma operator that does something completely different, it's not very obvious whether type something or something& or something&& is considered, and the involved overload resolution rules are somewhat lawyer-grade.
In some countries (such as Germany), a comma is the official decimal separator -- so v += 1,5 will look quite innocent...
That's a valid point, but
v += 1.500
may be just as confusing.
A valid point. Nice.
Only if you forget whether you're reading German or C++.
As another example of unexpected behavior:
(This assigns 10 to b but leaves a untouched)a, b = 10, 20;Not if you remember that the precedence of PROGN (“the comma operator”) is even lower than assignment.
What would you expect that to do instead? C++ doesn't have multiple assignment.
Well, the situation is even trickier:
is allowed code in C++ (not in C). The result is that 20 is assigned to b, and a is untouched.(a, b) = (10, 20)Contrast this to:
where (as above) 10 is assigned to b and a is left untouched, because it is parsed as:a, b = 10, 20
and I hope you agree that the behavior is very confusing.a, (b = 10), 20C++17 is capable of multiple variable declaration, which is a similar concept.
#include <tuple> #include <iostream> std::tuple<int, int> divide(int dividend, int divisor) { return {dividend / divisor, dividend % divisor}; } int main() { using namespace std; auto [quotient, remainder] = divide(14, 3); cout << quotient << ',' << remainder << endl; }Yes, the code
is not an assignment. In an assignment, you should write something likeauto [quotient, remainder] = divide(14, 3);
which is a tuple assignment written as a single assignment.std::tie(quotient, remainder) = divide(14, 3);This shows the kind of (imho) "ugly hacks" the C++ committee had to make to cover up the historical mistake with the comma operator.
Such misleading code should trigger an error.
Not really an error, since it is syntactically valid C++. It could definitely trigger a compiler warning, though.
Code that violates C++ static typing rules can be syntactically valid too, yet noone is proposing it should be only a warning.
(I find this the most weird thing about C/C++ culture: they are so proud it is static typed, yet oblivious to all the other problems and traps/memory unsafety/undefined behavior/etc.)
I have to agree.
I never liked the comma operator, even in C. It's that kind of thing that "looks nice" at first but it is confusing in the end. It's not syntactic sugar, it's syntactic raisins.
v += [1, 2, 3, 4, 5] makes sense to me
or, for a less ambiguous meaning
v.extend([1, 2, 3, 4, 5]);
is just fine.
I don't think it makes sense to view comma as an operator.
Today you can use initilizer list to support this syntax:
V+={1,2,3,4,5};
“there's not even an efficiency benefit from doing this. You'd have a more efficient and literate program with std::iota“
I would pick a different syntax, too, but this is more flexible than std::iota. You can do v+=a,b,c, for example.
I also would think any decent compiler would completely optimize away that appender instance, making it quite efficient for short lists of items (for longer lists, compiling a push_back call for each item would get inefficient, memory and cache-wise. I don’t see a compiler converting that into a loop)
An optimizer can do many things until it cannot. Since you have no easy way to assert the overall performance of your program, you may end up with broken optimizations at some point in the future due to unrelated changes, pretty much non-deterministically.
In addition, more templates and more types means way less compiler throughput. That is why major parts of Boosts are avoided.
Not to mention human throughput too.
> compiling a push_back call for each item would get inefficient, memory and cache-wise?
That depends a lot on what you are doing.
By the way, if you are dealing with tons of constants in your code, then it usually means the design of the application is likely inflexible and a code smell overall.
That sure is a weird usage
But it has a Boost package, which means someone thought it was reasonable!
I'm intentionally using a weak argument here...
Reasonable and "C++ library" in the same line is almost an oxymoron.
We like to complain about Java but the C++ guys went all in and apparently can't seem to write a simple array without inheriting from at least 3 primitives and using a couple of templates.
Yes please tell me how you follow the "SOLID" principles when this is as frail as a house of cards
Qt is very reasonable and the standard library also isn't too bad. Boost (some of its sub-projects) is by far the worst in making simple things complicated.
I didn't expect anyone to compliment Qt containers. Most of them are straight up not recommended for any situation because they waste memory, have unnecessary indirections that cause dcache misses, and aren't optimal for multithreading due to COW (and accidental COW due to bad APIs).
This was about ease of use and simplicity of implementation. By the way, QHash's performance will be fixed thoroughly in Qt6. It is significantly faster than std::unordered_map, whose performance is constrained by an API that effectively requires a sub-optimal implementation.
Yes, agreed, Qt is saner. Also the Borland C++ libraries were usually ok.
On the converse, why would you care about SOLID when implementing an array class? Not everything (or even most things) need to be OO.
>> The first one is that if we overload the comma operator, we need to be extra careful to cover all cases and think about lvalues and rvalues. Otherwise we end up with buggy code.
Extra careful to cover all cases is when I start looking at alternatives. In this case it is not even enhancing readability. Even if you may be an excellent programmer think whether your team can be that extra careful most of the times before introducing these quirks in your code.
“Extra careful to cover all cases is when I start looking at alternatives.“
IMO, an advantage of C++ is that, in well-written code, being extra careful is (mostly) limited to the implementer of code, not to its users.
It would be nice if nobody would have to be extra careful, but that’s the price you pay for power, I fear.
Rust shows that you can remove most of the footguns, and keep the power with a much smaller "extra careful" surface.
For example, instead of a comma operator and its overloads, you can have tuples. Then `let (a,b) = (b,a)` works as expected. If you want weird code, you can overload `vec += (1,2,3,4,5)` with a tuple, and that's a straightforward case with no hidden gotchas.
Boost Assign never seemed really useful to me. More like "Here's a zany thing you can do in C++."
Subsequent to the birth of Boost Assign, the language improved:
for (int i : {1,2,3,4,5})
v.push_back(i);
Simple, built in, anyone can understand it.In my life as a professional c++ programmer, overloading the comma operator has been much less useful than using goto. The operator itself is fine and using it in its default behaviour can be good, but overloading it is absolutely useless.
I would rather call this "the subtle dangers of function overloading".
One thing I like about the C (not ++) language is that it's always clear what function that is being called by looking at the function name. In C++ you have to guess a lot of times (or look very carefully).
Nah. The problem is that , doesn't look like a function, isn't treated as a function by most of your tooling, and so it's very surprising if it behaves like one (plus it's not at all clear what you'd expect a function called , to do in most cases). Polymorphic functions are fine when they have sensible names and are understood as functions by the reader and tooling (of course it helps if your language is actually parseable by those tools, which C++ isn't).
I don't think the main problem in this article is that "," doesn't look like a function because we are told that it is overloaded so we know it's a function.
The problem is that the behavior of the code silently changes when a function is moved and this is because of the overloading feature in C++. You can get those types of problems in other cases too when you overload functions that have a good name, but maybe it's less likely since I guess the comma operator is already defined for all different types but named functions have to be defined by the programmer.
> I don't think the main problem in this article is that "," doesn't look like a function because we are told that it is overloaded so we know it's a function.
Well it's a function in one of the code snippets and not in the other. That it can be both is definitely part of the problem.
> You can get those types of problems in other cases too when you overload functions that have a good name, but maybe it's less likely since I guess the comma operator is already defined for all different types but named functions have to be defined by the programmer.
The problem is that it's not just overloaded but overridden. Subtyping is problematic at the best of times, but the subtyping relationship of C++ lvalues and rvalues is particularly insidious since it is completely invisible in the code.
The built-in comma operator has strict sequencing semantics:
Every value computation and side effect of the first (left) argument of the built-in comma operator , is sequenced before every value computation and side effect of the second (right) argument. [1]
The same sequencing guarantee is not applied to a user-defined comma operator.
For the similar reasons it's not a great idea to overload && and || either.
[1] https://en.cppreference.com/w/cpp/language/eval_order (point 9)