The Subtle Dangers of the Comma Operator (C++)

39 points by pekalicious 6 years ago · 48 comments

Reader

> Overloading the Comma Operator Is Powerful

Clearly, it's over 9000!

... more seriously though, overloading the comma operator is a bad idea and you shouldn't do it. In fact, almost nobody does it.

Specifically, you don't want to break the principle of least astonishment with a command such as

    v+= 1,2,3,4,5;

Assuming you want to add up a bunch of literals, what you could do is something like:

   std::array data { 1,2,3,4,5 };
   v += std::reduce(data.begin(), data.end());

and if you've implemented some <algorithm> and <numeric> wrappers for containers, you would have something like an

    template<class Container>
    typename Container::value_type 
    reduce(const Container& container);

and then you would write

   v += reduce(std::array{ 1,2,3,4,5 } );

which is terse and much clearer.

jzwinck 6 years ago
An even more concise form is built in, and I think more obvious to readers:
```
    v.insert(v.end(), { 1, 2, 3, 4, 5 });
```
- einpoklum 6 years ago
  
  I was actually not describing adding elements to a container, but adding up values.
  - owl57 6 years ago
    
    See? That's the problem with arithmetic on containers. Python has that already:
    v += 4, 5, 6
    Comma doesn't do weird shit: that's just a tuple of numbers. But "+=" then does weird shit: it means concatenation when v is a standard library list, and element-wise addition when v is a numpy vector.

mehrdadn 6 years ago

> Once the vector is constructed, we can only add elements one by one by using the push_back member function.

Nope... std::vector::insert() takes an std::initializer_list [1].

[1] https://en.cppreference.com/w/cpp/container/vector/insert

jeffbee 6 years ago

The example (v += 1,2,3,4,5) is neither "nice" nor "expressive". It's confusing and dumb. The reader of this code can't be expected to know what it does. And, looking at the implementation, there's not even an efficiency benefit from doing this. You'd have a more efficient and literate program with std::iota.

The rest of the article isn't wrong, but it fails to establish why anyone thinks this pattern is "nice".

simias 6 years ago

It might be a bit extreme but the more I'm exposed to operator overloading the more I think that it's a bad idea 99% of the time.
The only use that seems absolutely in my opinion is when the overload is absolutely transparent mathematically, for instance to implement geometrical transformations on a Matrix class.
That works because in this case you don't actually end up with "custom" behavior, you just expand the standard and well understood notation of the language by plugging the standard and well understood mathematical notation for matrix operations. Anybody who understands this mathematical notation will be able to understand what the code does without additional context.
Anything beyond that is just asking for trouble IMO. It's basically code obfuscation.
Of course C++ has precedent for that sort of insanity, especially with the IMO absolutely bonkers use of the bit shift operators for... input/output processing. A notation that you'll note hasn't had a lot of success outside of C++.
- roenxi 6 years ago
  
  > It might be a bit extreme but the more I'm exposed to operator overloading the more I think that it's a bad idea 99% of the time.
  Quite a moderate position in practice. There are a substantial number of programmers who enjoy being able to tell what basic syntax does from memory without having to cross-reference documentation and source files.
  And as you allude to, the example in question isn't a sum; it is an append. It would be far more appropriate to have an a = append(a, {1,2,3,4,5}) style construct. Or something with pointers if efficiency is important. Or construct the vector inline in older versions of C++. The += should be reserved for actual vector summing.
  This whole article stands as a reminder that C++ is a remarkable language, and is starting to make up a lot of ground on achieving feature parity with Common Lisp.
- shultays 6 years ago
  
  I would add string + and += operators to list but that is it really.
- jeffbee 6 years ago
  
  That is pretty extreme. Where would we be without assignment and comparison overloads? Or dereference?
- BubRoss 6 years ago
  
  iostream hasn't had a lot of success inside C++ either. Operator overloading is one of those features that you wish for when you don't have it. Besides standard math operators, assignment also ends up playing a major role in making data structures behave well when managing resources.
  It can be abused but most of the time is extremely convenient. Iostream and boost set a terrible example however. Boost especially goes into the deep end on templates, operators, crazy dependencies, unacceptable compile times etc. It really isn't fair to judge modern C++ on boost. Instead of being batteries included, boost is now more a last resort.
HelloNurse 6 years ago

The "surprise" in the article is related to a very general class of mistake: supporting type something& but not also type something&& despite that parameter type being decisive for overload resolution purposes and despite the missing function being identical to the existing one.
Overloading the comma operator doesn't cause any particular danger, it merely provides a mildly unfavourable situation in which an unnecessary mistake is possible: there's "threat" of the default comma operator that does something completely different, it's not very obvious whether type something or something& or something&& is considered, and the involved overload resolution rules are somewhat lawyer-grade.
dukoid 6 years ago

In some countries (such as Germany), a comma is the official decimal separator -- so v += 1,5 will look quite innocent...
- einpoklum 6 years ago
  
  That's a valid point, but
  v += 1.500
  may be just as confusing.
  - thebruce87m 6 years ago
    
    A valid point. Nice.
- anoncake 6 years ago
  
  Only if you forget whether you're reading German or C++.
amelius 6 years ago
As another example of unexpected behavior:
```
    a, b = 10, 20;
```
(This assigns 10 to b but leaves a untouched)
- tempodox 6 years ago
  
  Not if you remember that the precedence of PROGN (“the comma operator”) is even lower than assignment.
- chrisseaton 6 years ago
  
  What would you expect that to do instead? C++ doesn't have multiple assignment.
  - amelius 6 years ago
    
    Well, the situation is even trickier:
    (a, b) = (10, 20)
    is allowed code in C++ (not in C). The result is that 20 is assigned to b, and a is untouched.
    Contrast this to:
    a, b = 10, 20
    where (as above) 10 is assigned to b and a is left untouched, because it is parsed as:
    a, (b = 10), 20
    and I hope you agree that the behavior is very confusing.
  - phoe-krk 6 years ago
    
    C++17 is capable of multiple variable declaration, which is a similar concept.
    #include <tuple> #include <iostream> std::tuple<int, int> divide(int dividend, int divisor) { return {dividend / divisor, dividend % divisor}; } int main() { using namespace std; auto [quotient, remainder] = divide(14, 3); cout << quotient << ',' << remainder << endl; }
    
    amelius 6 years ago
    
    Yes, the code
    auto [quotient, remainder] = divide(14, 3);
    is not an assignment. In an assignment, you should write something like
    std::tie(quotient, remainder) = divide(14, 3);
    which is a tuple assignment written as a single assignment.
    This shows the kind of (imho) "ugly hacks" the C++ committee had to make to cover up the historical mistake with the comma operator.
  - rini17 6 years ago
    
    Such misleading code should trigger an error.
    
    phoe-krk 6 years ago
    
    Not really an error, since it is syntactically valid C++. It could definitely trigger a compiler warning, though.
    
    rini17 6 years ago
    
    Code that violates C++ static typing rules can be syntactically valid too, yet noone is proposing it should be only a warning.
    (I find this the most weird thing about C/C++ culture: they are so proud it is static typed, yet oblivious to all the other problems and traps/memory unsafety/undefined behavior/etc.)
raverbashing 6 years ago

I have to agree.
I never liked the comma operator, even in C. It's that kind of thing that "looks nice" at first but it is confusing in the end. It's not syntactic sugar, it's syntactic raisins.
v += [1, 2, 3, 4, 5] makes sense to me
or, for a less ambiguous meaning
v.extend([1, 2, 3, 4, 5]);
is just fine.
I don't think it makes sense to view comma as an operator.
- gpderetta 6 years ago
  Today you can use initilizer list to support this syntax:
  V+={1,2,3,4,5};
Someone 6 years ago

“there's not even an efficiency benefit from doing this. You'd have a more efficient and literate program with std::iota“
I would pick a different syntax, too, but this is more flexible than std::iota. You can do v+=a,b,c, for example.
I also would think any decent compiler would completely optimize away that appender instance, making it quite efficient for short lists of items (for longer lists, compiling a push_back call for each item would get inefficient, memory and cache-wise. I don’t see a compiler converting that into a loop)
- jfkebwjsbx 6 years ago
  
  An optimizer can do many things until it cannot. Since you have no easy way to assert the overall performance of your program, you may end up with broken optimizations at some point in the future due to unrelated changes, pretty much non-deterministically.
  In addition, more templates and more types means way less compiler throughput. That is why major parts of Boosts are avoided.
  Not to mention human throughput too.
  > compiling a push_back call for each item would get inefficient, memory and cache-wise?
  That depends a lot on what you are doing.
  By the way, if you are dealing with tons of constants in your code, then it usually means the design of the application is likely inflexible and a code smell overall.
radres 6 years ago

That sure is a weird usage
- ahartmetz 6 years ago
  
  But it has a Boost package, which means someone thought it was reasonable!
  I'm intentionally using a weak argument here...
  - raverbashing 6 years ago
    
    Reasonable and "C++ library" in the same line is almost an oxymoron.
    We like to complain about Java but the C++ guys went all in and apparently can't seem to write a simple array without inheriting from at least 3 primitives and using a couple of templates.
    Yes please tell me how you follow the "SOLID" principles when this is as frail as a house of cards
    
    ahartmetz 6 years ago
    
    Qt is very reasonable and the standard library also isn't too bad. Boost (some of its sub-projects) is by far the worst in making simple things complicated.
    
    eska 6 years ago
    
    I didn't expect anyone to compliment Qt containers. Most of them are straight up not recommended for any situation because they waste memory, have unnecessary indirections that cause dcache misses, and aren't optimal for multithreading due to COW (and accidental COW due to bad APIs).
    
    ahartmetz 6 years ago
    
    This was about ease of use and simplicity of implementation. By the way, QHash's performance will be fixed thoroughly in Qt6. It is significantly faster than std::unordered_map, whose performance is constrained by an API that effectively requires a sub-optimal implementation.
    https://codereview.qt-project.org/c/qt/qtbase/+/287053
    
    raverbashing 6 years ago
    
    Yes, agreed, Qt is saner. Also the Borland C++ libraries were usually ok.
    
    gpderetta 6 years ago
    
    On the converse, why would you care about SOLID when implementing an array class? Not everything (or even most things) need to be OO.

indogooner 6 years ago

>> The first one is that if we overload the comma operator, we need to be extra careful to cover all cases and think about lvalues and rvalues. Otherwise we end up with buggy code.

Extra careful to cover all cases is when I start looking at alternatives. In this case it is not even enhancing readability. Even if you may be an excellent programmer think whether your team can be that extra careful most of the times before introducing these quirks in your code.

Someone 6 years ago

“Extra careful to cover all cases is when I start looking at alternatives.“
IMO, an advantage of C++ is that, in well-written code, being extra careful is (mostly) limited to the implementer of code, not to its users.
It would be nice if nobody would have to be extra careful, but that’s the price you pay for power, I fear.
- pornel 6 years ago
  
  Rust shows that you can remove most of the footguns, and keep the power with a much smaller "extra careful" surface.
  For example, instead of a comma operator and its overloads, you can have tuples. Then `let (a,b) = (b,a)` works as expected. If you want weird code, you can overload `vec += (1,2,3,4,5)` with a tuple, and that's a straightforward case with no hidden gotchas.

jzwinck 6 years ago

Boost Assign never seemed really useful to me. More like "Here's a zany thing you can do in C++."

Subsequent to the birth of Boost Assign, the language improved:

    for (int i : {1,2,3,4,5})
        v.push_back(i);

Simple, built in, anyone can understand it.

Asooka 6 years ago

In my life as a professional c++ programmer, overloading the comma operator has been much less useful than using goto. The operator itself is fine and using it in its default behaviour can be good, but overloading it is absolutely useless.

robalni 6 years ago

I would rather call this "the subtle dangers of function overloading".

One thing I like about the C (not ++) language is that it's always clear what function that is being called by looking at the function name. In C++ you have to guess a lot of times (or look very carefully).

lmm 6 years ago

Nah. The problem is that , doesn't look like a function, isn't treated as a function by most of your tooling, and so it's very surprising if it behaves like one (plus it's not at all clear what you'd expect a function called , to do in most cases). Polymorphic functions are fine when they have sensible names and are understood as functions by the reader and tooling (of course it helps if your language is actually parseable by those tools, which C++ isn't).
- robalni 6 years ago
  
  I don't think the main problem in this article is that "," doesn't look like a function because we are told that it is overloaded so we know it's a function.
  The problem is that the behavior of the code silently changes when a function is moved and this is because of the overloading feature in C++. You can get those types of problems in other cases too when you overload functions that have a good name, but maybe it's less likely since I guess the comma operator is already defined for all different types but named functions have to be defined by the programmer.
  - lmm 6 years ago
    
    > I don't think the main problem in this article is that "," doesn't look like a function because we are told that it is overloaded so we know it's a function.
    Well it's a function in one of the code snippets and not in the other. That it can be both is definitely part of the problem.
    > You can get those types of problems in other cases too when you overload functions that have a good name, but maybe it's less likely since I guess the comma operator is already defined for all different types but named functions have to be defined by the programmer.
    The problem is that it's not just overloaded but overridden. Subtyping is problematic at the best of times, but the subtyping relationship of C++ lvalues and rvalues is particularly insidious since it is completely invisible in the code.

leni536 6 years ago

The built-in comma operator has strict sequencing semantics:

Every value computation and side effect of the first (left) argument of the built-in comma operator , is sequenced before every value computation and side effect of the second (right) argument. [1]

The same sequencing guarantee is not applied to a user-defined comma operator.

For the similar reasons it's not a great idea to overload && and || either.

[1] https://en.cppreference.com/w/cpp/language/eval_order (point 9)

Settings

The Subtle Dangers of the Comma Operator (C++)

Keyboard Shortcuts