Bjarne Stroustrup, and Programmers With Class

You can never be indifferent to Bjarne Stroustrup’s C++. You either loathe it and swear revenge, show a sage indifference or welcome it with a benevolent smile. Linus Torvalds has called it a horrible language, and Don Knuth said it was too ‘baroque’ for him to use.

There are, of course, the many who use it as the obvious choice for developing demanding Windows system drivers and utilities. C++ is the main development language used by many of Google’s open-source projects. Last year the company released a research paper suggesting that C++ is the best-performing programming language on the market, after it had implemented a compact algorithm in four languages – C++, Java, Scala and its own programming language Go – and then benchmarked the results to find “factors of difference”.

Google hosted a live question and answer session on 20th August to increase the understanding of C++. Not surprisingly, Bjarne is still one of its strongest advocates.

RM:

Bjarne, you’re holding a live Google presentation on C++ followed by a question and answer session on August 20th. What’s the aim of the session?

BS:

My aim, and the aim of the online-community moderators, is to increase the understanding of C++. In particular, I’d like to see greater understanding of modern C++ (C++11 and C++14) and greater use of modern programming and design techniques. Today, we can write code that is much more elegant, shorter, and better performing than we could just a few years ago.

RM:

Several large computer companies are shifting towards functional programming languages such as F#. Do you think this will see C++ and other languages pushed aside in favour of functional programming languages?

BS:

I doubt it. Functional languages seem to be fashionable today in a way similar to the way object-oriented languages were in the 1990s. Eventually, people learned that huge class hierarchies weren’t the answer to every system design problem and similarly people will find that (say) higher-order functions and pattern matching aren’t the answer to everything.

In my opinion, a general-purpose language needs to support a variety of approaches and combinations of such approaches. C++ does that and so do some of the more popular languages based on functional programming techniques. Because of its huge code base, C++ evolves more slowly than more modern languages, but on the other hand it has more mature tool chains.

Like the OO purists of the 1990s (and today), FP purists will not agree with my analysis, but I think we are searching for a synthesis of language features, library facilities, and tool support that goes beyond the language wars. We need more communication among the language-design and user communities to help develop that synthesis. It is unlikely to emerge from a single community, a single application area, or from academia alone.

Incidentally, I recently had to adjust my estimate of C++ users upwards. I used to say “at least 3 million”, but people have convinced me that 4 or 5 million is likely to be more correct. Look under just about any “rock” in computing, and you find some C++. Unfortunately for the C++ community, much of that code is invisible to most of its users. It is really hard to count programmers. Most who try simply do a few unscientific web searches. They measure “noise” and hope that this has a good correlation with use. I’m not so sure. I wouldn’t dare to make a firm estimate of the number of C++ programmers to the nearest million, nor for any other major programming language.

RM:

Do you think it would be possible to design a language much smaller than C++ and still do what C++ does?

BS:

I still think that’s possible. The real problem is whether such a language could be more than a research toy because of the huge established base of C++ code “out there.” My ideals are (still) type safety, lack of garbage collection, and a direct map to hardware. One problem for a new, simpler C++-like language would be that C++ is itself evolving towards those ideals. For example, you can now write things like

    vector<string> names = { "CPL", "BCPL", "C", "C++" };
    for (auto& x : names) cout << x << '\n';   // write all names

and soon

    void sort(Sortable& c);   // sort anything that's Sortable; Sortable is a concept
    sort(names);
    for (x : names) cout << x << '\n';   // write all names in order

I guess I had better explain how “lack of garbage collection” can be considered an ideal. Garbage collection has two fundamental problems:

It does not handle resources in general; even with the best collector, you can leak file handles, thread handles, locks, etc., so that eventually the system grinds to a halt.
It is a global operation (are there any users of this object in the system?) in a world that is becoming more distributed and where locality is becoming more important; I’m thinking of non-uniform address spaces, cache architectures, cores, and even clusters.

Please note that I’m not saying that garbage collection is inherently bad, and also that performance is not the major part of my reasoning. I consider garbage collection a specialized tool and a last resort for poorly designed systems.

Instead, I prefer to rely on scoped objects, on explicitly represented resources, and well-defined (often implicit) transfer of ownership. Constructors, destructors, and user-defined assignment operators have been the core of that in C++ from the early 1980s. With C++11 came the ability to move objects (when we don’t want to copy), so that we can move large objects from scope to scope without copy overhead. For example:

    template<typename T, typename Pred>
    vector<int> indices(const vector<T>& v, Pred f)
        // find indices of elements x in v for which f(x) is true
    {
        vector<int> res;
        for (int i = 0; i != static_cast<int>(v.size()); ++i)
            if (f(v[i]))
                res.push_back(i);   // record the position, not the element
        return res;
    }

    auto nb = indices(names, [](const string& s) { return s.size() && s[0]=='B'; });
    auto nc = indices(names, [](const string& s) { return s.size() && s[0]=='C'; });

Please ignore that the predicates are silly and the vector too short to be interesting and note that I returned the result vector by value. In C++98, this would be a serious performance bug. In C++11, it is an elegant and efficient mechanism for transferring large amounts of data (we move the vector out rather than copy it). This “move semantics” saves us from explicitly messing with pointers and memory management. The resulting code looks more like functional programming than C-style code.

The constructs starting with [] are lambda expressions. They allow us to concisely define functions and function objects that are immediately used.
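To make the move out of a function a little more concrete, here is a minimal sketch. The names make_names and move_reuses_buffer are hypothetical helpers invented for this illustration; they are not part of the example above or of any library. The sketch assumes the default allocator, for which moving a vector hands over its heap buffer instead of copying the elements.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds a vector and returns it by value.
// The local `res` is moved (or its copy is elided) into the caller's
// object; the strings themselves are not copied one by one.
std::vector<std::string> make_names(int n)
{
    std::vector<std::string> res;
    for (int i = 0; i < n; ++i)
        res.push_back("name" + std::to_string(i));
    return res;
}

// Hypothetical helper: moves `src` into a new vector and reports whether
// the new vector took over the original heap buffer and left `src` empty.
bool move_reuses_buffer(std::vector<std::string>& src)
{
    const std::string* old_buffer = src.data();
    std::vector<std::string> dst = std::move(src);   // move construction
    return dst.data() == old_buffer && src.empty();
}
```

After the move, the source vector is still a valid object that can be assigned to or destroyed; it simply no longer owns the elements.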

This is not the place to explain the details of C++11 and C++14, so I’ll just take the opportunity to recommend my new books:

A Tour of C++ for experienced programmers who want an overview of C++
The C++ Programming Language (4th Edition) for programmers who want all the details and many of the techniques supported by C++
Programming: Principles and Practice using C++ (second edition) for novices and non-programmers.

In addition, the C++11 standard library provides unique_ptr to represent exclusive ownership and shared_ptr to represent shared ownership. The shared_ptr is a pointer with a use count for its object so that it can call the destructor when the last shared_ptr to an object is destroyed. That is, it is a form of GC that respects destructors.
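The two ownership models can be sketched as follows. Tracked is a hypothetical type, invented here only to count how many instances are alive; the function names are likewise illustrative. Only std::unique_ptr, std::shared_ptr, std::make_shared and use_count come from the standard library.

```cpp
#include <memory>

// Hypothetical resource type that counts how many instances are alive.
struct Tracked {
    static int& live() { static int n = 0; return n; }
    Tracked()  { ++live(); }
    ~Tracked() { --live(); }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};

// Exclusive ownership: the object is destroyed when its single owner,
// the unique_ptr, goes out of scope.
int live_inside_unique_scope()
{
    std::unique_ptr<Tracked> p(new Tracked);
    return Tracked::live();   // read while p still owns the object
}                             // ~Tracked runs here, after the value is read

// Shared ownership: the use count tracks the number of shared_ptrs,
// and the destructor runs when the last of them is destroyed.
long use_count_with_two_owners()
{
    std::shared_ptr<Tracked> a = std::make_shared<Tracked>();
    std::shared_ptr<Tracked> b = a;   // second owner of the same object
    return a.use_count();
}
```

In both functions no delete appears in user code; the destructor call is tied to the lifetime of the owning objects.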

And, having come full circle back to GC, GC has one property that you really miss in any language with an explicit free() or delete: With GC (in a non-distributed system) you cannot have a pointer/reference to a deleted object. That is a property that must be preserved in any fully type-safe system.

RM:

What do the lessons about the invention, further development and adoption of your language say to people developing computer systems today and in the future?

BS:

It seems that most new language ideas are of the form “I want a language just like X that’s smaller, a bit easier to use, more flexible, and more pure.” Such languages almost never succeed. In general, the success rate for new languages is low, but to succeed it seems that it is best to focus on a specific problem area and make sure that there really isn’t a well-known language “out there” that provides adequate support.

If a language succeeds on a significant scale, it usually does so at the cost of compromising the size, simplicity, and “purity” that were the key initial sales points. Java is an example: it has more than trebled in size (the size of the Java 7 language specification is within 5% of that of C++11). Usually, a language is better for that adaptation to the real world.

It is easier to design a potentially viable language than to convince people to use it. You need tool chains, interoperability with other languages, tutorials, community websites, and much more. Having the generous and enthusiastic backing of a large corporation can be essential.

Elegant expression, generality, maintainability, reliability, and performance. We need all five. Note that we cannot assume error-free hardware, infinite compute power, and a single address space.

We are nowhere near to having a perfect programming language, so there is room for initiative and ideas. Keep them coming!

RM:

In what way has objective criticism from the C++ community changed the development of the language over recent years? How do you choose what parts of the language to evolve and what ideas to throw away? Can you develop a language in a democratic fashion?

BS:

This is a very hard question to answer. The process of adding new language features or new library components to (ISO standard) C++ is messy and influenced by many people of widely varying backgrounds. I’d say that the ideas “bubble” up from the community at large.

This community includes many users of a wide variety of languages and backgrounds. We are not in a closed system. Ideas get traction in the standards committee based on inherent quality, on ability to integrate into the existing language and library, on the persuasiveness of the proposers, and on the persistence of proposers.

Note that there are about 100 people at a C++ standards meeting and that many more than that take part in the effort between meetings. This complicates decision making and makes it hard to build a consensus. Few people have a global view and care for the whole language.

There is no really solid technical standard for what it takes for a feature to get accepted. Conventional features and small modifications have an easier time than something with pervasive impact and/or dramatic improvements in programming style. IMO, it is the latter we want and the former might just add to the bulk of the language without really aiding users.

How democratic is the process? Very, I’d say. Pay $1200 a year and turn up to two meetings a year and you can vote. No practical or theoretical experience is needed, but don’t expect people to take particular notice if all you bring to the table is opinions. The committee values practical experience and knowledge of language/library details.

I suspect that the biggest problem is to maintain a comprehensive and coherent view of what the language is and where it should be going. I can recommend my two papers from HOPL, the ACM History Of Programming Languages conference (available on my website www.stroustrup.com) for an idea of what I mean by that.

Of course I have opinions about the way C++ ought to develop. However, I’m not a dictator, and not even a full time language designer. Furthermore, I’m rarely as certain of my opinions as most “true believers.” In language design, you have many alternatives and many thousands of details. It is quite hard to know what will be genuinely useful to millions in a decade’s time. Designing a language simply to directly address today’s problems for a few dedicated programmers is almost certainly foolish. You have to plan for long-term evolution and be acutely aware that you work with far less than complete information and understanding.

RM:

Would you say the successes at community building around C++ have been slower than other dominant languages?

BS:

Yes. In the early years, the C++ community was fractured by competing vendors. We have never really recovered from that. There is no solid, active, and well-funded center of the C++ community. The standards committee takes on some of that role, and there is a relatively new “C++ Foundation” that offers a nice website (www.isocpp.org) and sponsors the new CppCon conference (September 7-12, 2014, Bellevue, Washington, www.cppcon.org). However, there is no one place to go for C++ information, C++ tutorials, C++ projects, C++ libraries, C++ implementations, etc.

Part of the problem is lack of resources, but I suspect the major reason is that most people put their efforts into producing their own C++ sites and end up spending most of their time replicating the efforts of similar sites – all useful, but all too limited to make a major difference. I have some hopes for isocpp.org, but it remains to be seen who is willing to come under that umbrella: corporations, library developers, people financed through ads, and more seem to prefer to have an independent web presence.

RM:

Why do you think we haven’t seen a major new language in the C++ tradition?

BS:

Fundamentally because C++ is very good at what it does and it still evolves. This makes competing with C++ very hard. If you go “higher level,” there are a multitude of established and experimental languages, and if you try to go “lower level,” there are C and low-level-style C++.

What does it mean to be “in the C++ tradition?” Direct access to hardware plus zero-overhead abstraction are – I think – the distinguishing characteristics. I suspect that implies a heavy emphasis on compiler technology, on lexical scope, constructors and destructors, and lack of garbage collection. However, I can’t know for sure until someone has built an example or two.

A successful addition to “The C++ family” would have to be far smaller and far simpler to use than C++. I have conjectured that such a language can exist and still be as expressive and flexible as C++, run as fast, and be type safe. However, simplification is hard, and fashion dictates a focus on “advanced features.” Sophisticated functional type systems are fashionable and so is garbage collection.

And of course C++ is a moving target. We can write much better code in C++14 than we could in C++98. I have seen examples where code halved in size as people removed now-unnecessary boilerplate.

RM:

When I interviewed Chuck Moore he said that the average complexity of software grows year after year. Does OOP scale well to this situation, or does it make things just as complicated?

BS:

How do you define OOP? The answer to your question critically depends on the definition.

If you mean “large class hierarchies with lots of virtual functions”, the answer is no. The coupling between parts of a hierarchy is too high and the foresight needed to design a good set of interfaces is beyond us – at least given the time available and the scale needed.

If you mean “the full set of abstraction mechanisms including hierarchies, generic programming, functional techniques, and separation into concurrent computations” then maybe. At least we have a fighting chance. However, we need a synthesis and some reasonably simple guidelines for usage. The image of a programming language as a Swiss army knife is not a good one, and I think not an accurate one either.

I think that too few people use a programming language and its associated tools as a vehicle for good design. Some avoid all modern features, making some excuse for using decades-old techniques and language features. Others see a language as a solution rather than a tool, and obsess over using advanced features to their maximum. The rules for writing good code are rather similar to the rules for writing good English text: say what you want to say as simply and as clearly as you can.

RM:

Is there one ‘root of all evil’ in software design? Or do people underestimate the intellectual difficulties that are involved?

BS:

People consistently underestimate the intellectual effort needed to design, implement, and maintain software. Many consider it – and especially programming – a low-level, “manual” skill. People’s attitude to testing and system evolution (“mere maintenance”) is even more wrongheaded. We need more highly-skilled, well-educated people far more than we need more people. Adding semi-skilled people to a project or a community does more harm than good.

The current fashion of more focussed, more efficient, and cheaper education exacerbates this trend. Good system design and implementation is a highly skilled job deserving the respect granted to the highest levels of professions. However, the current educational systems and the almost complete lack of professional-level life-long education do not produce such professionals in high numbers and they often get drowned in a mass of clever, hardworking, but miseducated programmers caring less about the longer term of the systems they build. Many managers of IT projects are equally short-sighted.

Designing a system that is as simple as possible, but no simpler, and can stand the test of many years of maintenance is a rare stroke of genius. However, the world has not come to an end because of a collapse of our billions of lines of critical code – we must be doing something right! Obviously, we could do better still.

RM:

What is the link between the design of a language and the design of software written with that language?

BS:

For the first edition of TC++PL, I used the Whorf quote “Language shapes the way we think, and determines what we can think about.” Taken literally, that’s nonsense, but obviously the way we think about programming is affected by the programming languages we know. If we only know one language well the effect can be very strong. Consider a Haskell programmer and a Java programmer attacking the same problem. The likelihood that the two solutions, when stripped of their syntactic peculiarities, are similar is about zero.

Every language makes some things easier to express than others, makes some solutions run more efficiently than others, and makes certain errors harder to make than others. For example, it is hard to express a recursive algorithm in Fortran-77 and hard to express an iterative one in Haskell. Both reflect a view on what is important embedded in the language design. In addition to the fundamental design decisions in a language, familiarity and idioms shape programs. Usually, a programmer does not have the time to “think out of the box” and discover new techniques outside the common idioms of a familiar language. Typically, conventional solutions are considered “good enough” by default. Often they are, but when they are not, that leads to horrendous complexity.

From day one of C++, I used at least two languages for every real system: C++ plus some “scripting language” (e.g., Unix shell and AWK) to tie components together. That seems to surprise many who consider it obvious that there ought to be a single language that is best for everything. There obviously isn’t, but my point is that “a single best language for everything” isn’t even my ideal. I do not think there could be a single language that is best for everything and everybody. We should learn and use a variety of languages. Eventually, we’ll get a feel for what works best in various places.

Using many languages, we learn. Using a language for a variety of tasks, we learn. One interesting question is how much can the language communities learn from each other? There is clearly some kind of convergence going on. It seems that most modern languages that aim to be general purpose have lambda expressions, range-for statements, and some form of hierarchy (e.g., C++, Java, C#, Python, and Scala). Twenty years ago, they would all have been condemned as “hybrids” for that. I’m amazed and pleased to see that Python and Java are (finally) getting destructor-like constructs. Clearly, some cross-pollination is going on, and that is good. I am – as I have been for decades – in favour of increasing expressiveness of languages. Simple ideas should be simple to express.

To be a bit more concrete, consider this Python fragment:

    def mean(seq):
        n = 0.0
        for x in seq:
            n += x
        return n / len(seq)

In C++14 (with concepts), the rough equivalent is:

    auto mean(const Sequence& seq) {
        auto n = 0.0;
        for (auto x : seq)
            n += x;
        return n / seq.size();
    }

In both languages, the function is generic over sequences and in both a standard-library function can be used to reduce the size to a more maintainable minimum.
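On the C++ side, one way to do that reduction is sketched below. The choice of std::accumulate is an assumption, since the interview does not name the standard-library function, and a plain template parameter Seq stands in for the Sequence concept so that the code compiles without concepts support. The sketch assumes a non-empty sequence that provides begin(), end(), and size().

```cpp
#include <numeric>
#include <vector>

// Mean of a non-empty sequence of numbers.
// Seq is an unconstrained template parameter here, standing in for the
// Sequence concept used in the text.
template<typename Seq>
double mean(const Seq& seq)
{
    // The initial value 0.0 makes the accumulation happen in double.
    return std::accumulate(seq.begin(), seq.end(), 0.0) / seq.size();
}
```

For example, calling mean on a std::vector<int> holding 1, 2, 3, 4 sums to 10.0 and divides by 4.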

Some of that convergence is only skin deep. That is, the syntactic similarities hide fundamental differences in the underlying type systems and tool chains.

We rarely, if ever, write a program from scratch. If we did, we would still have to take the practical problems of construction into account: What parts do we have available to build our system? What useful skills do the builders have? We cannot avoid taking that into account. In the physical world, I would be unlikely to build out of stone if all the local workmen were carpenters. Similarly, I can’t imagine starting a significant software project without considering the language tool chains, the available libraries, and the skills of the likely developers. It is not so much the programming language in itself that influences our thinking as the total environment in which a language exists.

A programming language is the programmers’ user-interface to their tools. It helps and hinders them. It makes some things simple and others difficult. We always wish for a language that provides a better match for our ideas.

And thanks for the good hard questions.