Reflections on Software Performance

blog.nelhage.com

154 points by cespare 6 years ago · 50 comments

tluyben2 6 years ago

> It seems increasingly common these days to not worry about performance at all,

You don't even have to continue there. People who should know better assume that 'modern cloud stuff' will make this trivial: you just add some auto-scaling and it can handle anything. Until it grinds to a halt because it cannot scale past a bottleneck (most likely the relational database), or the credit card runs empty trying to pull in more resources on top of the ridiculous amount already being used for a (relatively) tiny number of users.

This will only get worse as people reach for 'premature optimization' (delivering software for launch is not premature!) and 'people are more expensive than more servers' (no they are not, once you have actual traffic and O(n^2)-performing crap) as excuses to not even try to understand this anymore. Same with storage space: with NoSQL, terabytes of data grow out of nowhere because 'we don't care, it works and it's fast to market; programmers are more expensive than more hardware!'. Just run a script to fire up 500 AWS instances backed by Dynamo and fall asleep.
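
To make the O(n^2) point concrete, here is a minimal sketch (hypothetical code, not from any real codebase) of how this kind of thing sneaks in: an Array.includes lookup inside a loop rescans everything on every iteration, while a Set does the same job in roughly linear time.

    // Quadratic: `seen.includes` walks the whole array for every element.
    function dedupeSlow(ids: string[]): string[] {
      const seen: string[] = [];
      for (const id of ids) {
        if (!seen.includes(id)) seen.push(id); // O(n) lookup -> O(n^2) overall
      }
      return seen;
    }

    // Linear: a Set gives constant-time (on average) membership checks.
    function dedupeFast(ids: string[]): string[] {
      return [...new Set(ids)];
    }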

I am not so worried about premature optimization; I am more worried about never optimization. And at that, I'm really worried about my (mostly younger) colleagues simply not caring because they believe it's a waste of time.

  • clarry 6 years ago

    > I am not so worried about premature optimization; I am more worried about never optimization. And at that, I'm really worried about my (mostly younger) colleagues simply not caring because they believe it's a waste of time.

    I'm simultaneously worried about both, because I've had to deal with poor architecture and unnecessarily convoluted & difficult-to-work-with code that was only justified by completely misguided optimization attempts (with no experimentation or profiling to back any of it up; and indeed, performance in practice was terrible!). At the same time, there's a constant stream of "oh no this convoluted mess has a bug in it" and "oh no we need a new feature, it can't take long?" tickets but never a ticket that says "profile and optimize the program because it's ridiculously slow, oh and refactor and undo the convoluted mess."

  • vlovich123 6 years ago

    There's also something to be said for building better tooling in this area. Not everyone can achieve expertise in everything. Better tooling helps level the playing field (and eventually lets people outperform experts once the tooling becomes indispensable).

    You may think that's a cop-out, but consider something like coz[1]. SQLite is managed and maintained by experts, with significant capital behind it and serious engineering effort invested. Even so, better tooling still managed to locate a 25% performance improvement[2], and a 9% one in memcached. Even experts have their limits, and of course these tools require expertise themselves, so something like coz is still an expert-only tool. The successful evolution of the underlying concept for mass adoption will happen when it's possible to translate "expert speak" into something that can be easily and simply communicated to people who aren't CPU or compiler experts, meeting users at their knowledge level so they can dig in as deep as they need or want to.

    [1] https://github.com/plasma-umass/coz [2] https://arxiv.org/abs/1608.03676

    • gameswithgo 6 years ago

      > Not everyone can achieve expertise in everything.

      But if every young or beginner programmer who asked a performance question on reddit, or stack overflow, could get good answers instead of lectures on how what they are doing is "premature optimization" every single time, the world would collect quite a bit more expertise on making things perform well.

      • BubRoss 6 years ago

        I try to remind people whenever I can that Knuth was talking about noodling with loops in minuscule ways. Between optimizing compilers, out-of-order superscalar CPUs, and the very different performance characteristics of modern hardware, what he was talking about basically doesn't exist anymore.

        • ska 6 years ago

          I don't think this is a helpful/accurate view. At a high level the type of optimization activity Knuth was talking about is alive and well, although the details of what people spend that time on have sometimes shifted.

          I agree this quote is often abused, but the fundamental idea behind it is intact and important. Sure, if you don't at least think architecturally about performance early on as your problem domain reveals itself, you can make some poor decisions with far-reaching performance implications. But on the other hand, if you spend a bunch of time tuning code when you don't yet know what the usage will look like, that time can be a dead loss.

          This latter point was what Knuth was referring to - and in 2020 teams are still prematurely optimizing; I suspect about as much as they were back then.

          • gameswithgo 6 years ago

            This idea is important when trying to finish an actual product of some kind, but when kids (or adults!) are learning, let them fiddle with the loops! Let them learn some intricacies and encourage the curiosity.

            • ska 6 years ago

              Nothing wrong with fiddling with loops either, in that context.

              I was objecting to the idea that premature optimization isn't a real problem anymore because technology. That's just not true.

              • BubRoss 6 years ago

                No one is saying that. Knuth was frustrated with people wasting time on micro optimizations. Those don't really exist in the same form any more. Architecture is far more important to the speed of software and that does need to be dealt with up front.

                The problem is when people YOLO their way through a program thinking optimization is for suckers because of an outdated quote from a different context.

branko_d 6 years ago

Yes, performance is a feature.

You have to plan and architect for it, and you can't just tack it on after the fact by profiling a few hot codepaths (though you should do that too).

Performance can be different from "scalability" though. Sometimes, there is tension between the two.

  • sokoloff 6 years ago

    As someone who has probably wasted more time than is optimal agonizing over performance (I used to be a game dev for console and PC), what you say is absolutely true, but I think engineers have a tendency to think about Facebook scale before they have triple digits of users. That is usually a mistake.

    • GlitchMr 6 years ago

      A web application doesn't have to be scalable. Stack Overflow, for instance, could run on a single web server (source: https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...), and that is a very popular website with an Alexa rank of 39.

      • hinkley 6 years ago

        It’s fundamentally about choosing features based on what can be done well versus what can be done at all. Lots of server bloat is caused by loading our applications up with features that are not cheap resource-wise. If you think about the billions of calculations and the hundreds of millions of bytes you can ship from one machine, we should be able to do something like Stack Overflow in one rack with room to spare.

        15 years ago we could sustain 1/10th the traffic on 1/40th of the hardware of my current project, and the old one was badly architected. If we fixed that and added all the redundancy and telemetry of today they might just about cancel out. But factor in all the hardware improvements and this is not a good look.

      • saagarjha 6 years ago

        Hacker News runs on a single machine, apparently.

        • karatestomp 6 years ago

          Your average "web scale" cloud system with Node and lambdas and VMs and distributed databases galore feels slow and clunky as hell before it's even under load, to those of us who remember "bad" old LAMP stacks running on a 1U server.

          Not that I want to go back to that, exactly, but our performance expectations have gotten really screwy.

          [EDIT] or, hell, take "Web 2.0". Piles of code and frameworks and shadow DOMs and shit all chasing and touting "performance" while full-page-loading low-JS sites like Craigslist and Basic HTML Gmail (or HN) leave them in the dust. Know what those are doing? Handing HTML to the browser and letting it render it. No JS render step, no fetching JSON then passing it through Redux and then making twenty function calls to eventually modify a shadow DOM to later apply to the real one. The browser is fast. Your JS is what's fucking slow and eating all my memory.

          • adossi 6 years ago

            I would argue JS is a lot faster than you think, and the sluggishness you feel is due to the massive number of files being downloaded: dozens of JS libraries (think jQuery, Bootstrap, etc.), several CSS stylesheets, and a hundred images or more. Even if each of those files is only a few kilobytes, there is still a 10ms (or even 100ms) download time for each of them, and unfortunately it's very common for these files to be downloaded sequentially. JS on its own is quite performant.

            • karatestomp 6 years ago

              Whatever some benchmarks (probably involving handing control off to C or C++ as quickly as possible, like any interpreted-language benchmark aiming to demonstrate its "blazing fast" speed) say, real-world experience demonstrates the opposite. Web 2.0 "webapps" are slow as dripping molasses and eat memory like they own my whole machine. Input lags, "loading" bars galore for the simplest thing, and that's even when supposed geniuses at Google or wherever are involved. That's JavaScript's fault, not HTML and CSS, since those demonstrably still work just fine and aren't that much more memory hungry than they were years ago.

              [EDIT] To be fair to JavaScript, there are few or no similarly robust scripting languages that'd fare much better at half-assedly reimplementing features of their host environment (the browser, in this case). I dislike it for other reasons, but not because it's slower than its peer languages. And I have plenty of complaints about HTML and CSS for modern "app development", since they've been ill-advisedly pressed into service for that purpose, but speed is not one of them: my browser renders a plain webpage in no time flat.

            • FridgeSeal 6 years ago

              Whilst JS might be somewhat fast, you know what’s even faster?

              Designing your application so that it doesn’t need it. If I never have to download, parse and execute the JS, I’m already way ahead. With better privacy to boot.

              • adossi 6 years ago

                I understand the appeal of rendering websites on the serverside, in their entirety (HTML and CSS), before the response is returned to the user's browser, which is what would need to happen if we all decided to stop using JavaScript today. However, using JavaScript in the user's browser to compute things like dynamic construction of the UI, animations, etc. has its benefits. For one, using the user's machine via JavaScript leverages their CPU and reduces the CPU consumption of the server. This can save cost, and when done correctly provides an overall better user experience. Things like AJAX (or XMLHttpRequests) are also a blessing and vastly improve the usability of websites. I'm comfortably sitting on the fence - I agree JS is used too often for things that don't need to be done on the user's machine, and anything that can be done on the serverside easily should be done there, but there are times when it is useful. Because of that I disagree with disabling it or not using it entirely.
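
                A minimal, hypothetical sketch of the two approaches being weighed here (made-up endpoint and names; escaping and error handling omitted): the server either returns finished HTML that the browser simply renders, or returns JSON that client-side code has to fetch and turn into DOM before anything appears.

                    type Item = { id: number; title: string };

                    // Server-side rendering: the response body is already displayable HTML.
                    function renderItemsHtml(items: Item[]): string {
                      const rows = items.map(i => `<li>${i.title}</li>`).join("");
                      return `<ul>${rows}</ul>`;
                    }

                    // Client-side rendering: fetch JSON, then build the DOM in the browser.
                    async function renderItemsClient(list: HTMLUListElement): Promise<void> {
                      const items: Item[] = await (await fetch("/api/items")).json();
                      for (const item of items) {
                        const li = document.createElement("li");
                        li.textContent = item.title;
                        list.appendChild(li);
                      }
                    }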

    • VBprogrammer 6 years ago

      I don't even think that is the problem. In my career I've seen full-blown arguments over loops which, in the worst case, could only ever contain a handful of items, or are only executed once per user action. That isn't about Facebook scale or not; it's just worrying about the wrong things.

      • thrower123 6 years ago

        Fundamentally, some people are allergic to actually profiling things and collecting data. I don't understand it, but there are lots of people who would rather spend hours talking things over in the abstract than spend half an hour coding it both ways and benchmarking it to get some actual data to base a decision on.
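
        A hypothetical sketch of "coding it both ways" in Node (made-up example of building an array two ways): time each variant over enough iterations to smooth out noise, then let the numbers decide.

            import { performance } from "node:perf_hooks";

            // Run a candidate many times and report the average time per call.
            function bench(label: string, fn: () => void, iterations = 500): void {
              const start = performance.now();
              for (let i = 0; i < iterations; i++) fn();
              const ms = (performance.now() - start) / iterations;
              console.log(`${label}: ${ms.toFixed(4)} ms/call`);
            }

            const data = Array.from({ length: 1_000 }, (_, i) => i);

            bench("concat in a loop", () => {
              let out: number[] = [];
              for (const x of data) out = out.concat([x]); // allocates a new array every iteration
            });

            bench("push in a loop", () => {
              const out: number[] = [];
              for (const x of data) out.push(x);
            });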

        • gpderetta 6 years ago

          On the other hand, you can't benchmark every single line of potentially problematic code, and even if you could, synthetic tests are not reliable.

          If you know from experience that a certain solution can be problematic and there is a different solution with reasonable implementation cost, you should do what your experience tells you, especially if fixing it after the fact would cost significantly more.

    • jcelerier 6 years ago

      > engineers have a tendency to think about Facebook scale before they have triple-digits of users.

      I don't think the number of users is what matters, except for web services. If you make, for instance, photo or audio editing software, there will never be enough performance.

      It isn't acceptable to tell your users that your software only works fine up to, say, 8192x8192 images: if you want to compete against other software, you have to consistently be the fastest at every task artists may throw at you (otherwise you will get bad reviews in the specialist and prosumer press / forums / blogs, which can kill your business pretty efficiently... it takes hundreds of people saying "it's fast" to offset the effect of a single press article saying "it's slow as shit" in art communities).

  • Koshkin 6 years ago

    Sometimes it is helpful to have two implementations - one being a “reference” implementation that may sacrifice performance in favor of guaranteed correctness, and the other being a high-performance, production-quality implementation which may have little in common with its reference counterpart except for the business logic they both implement.
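
    A minimal sketch of that pattern with made-up functions: an obviously correct (but slow) reference implementation, a faster production implementation, and a randomized check that they agree before the fast one is trusted.

        // Reference: obviously correct, O(n^2), used only to verify the fast path.
        function countPairsRef(xs: number[], target: number): number {
          let count = 0;
          for (let i = 0; i < xs.length; i++)
            for (let j = i + 1; j < xs.length; j++)
              if (xs[i] + xs[j] === target) count++;
          return count;
        }

        // Production: one pass with a frequency map, O(n).
        function countPairsFast(xs: number[], target: number): number {
          const seen = new Map<number, number>();
          let count = 0;
          for (const x of xs) {
            count += seen.get(target - x) ?? 0;
            seen.set(x, (seen.get(x) ?? 0) + 1);
          }
          return count;
        }

        // Cross-check the two implementations on random inputs.
        for (let trial = 0; trial < 1_000; trial++) {
          const xs = Array.from({ length: 50 }, () => Math.floor(Math.random() * 10));
          const target = Math.floor(Math.random() * 20);
          if (countPairsRef(xs, target) !== countPairsFast(xs, target))
            throw new Error("fast implementation disagrees with reference");
        }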

  • bachmeier 6 years ago

    > Performance can be different from "scalability" though. Sometimes, there is tension between the two.

    And extensibility. It's not necessarily fun trying to add a new feature to someone else's "highly optimized" code.

  • zzzcpan 6 years ago

    > You have to plan and architect for it

    And even that is not enough. You also have to know how to plan and architect for it, and have a well-developed mental model of what can get you there, which means you have to practice doing high-performance things, follow research and high-performance ideas, and generally have a habit of building things that are fast. Few people actually do that.

simonw 6 years ago

This piece is excellent. I really love how it challenges the "optimize last" philosophy by pointing out that performance is integral to how a tool will be used, and that designing it in as part of the architecture from the very start can produce a fundamentally different product, even if it appears to have the same features.

  • jmull 6 years ago

    I think premature optimization remains as bad as always.

    But you design for performance. The proper time to address it is at design time. That's not premature, that's the right moment.

    I wish we could reserve the word "optimization" for the kinds of things you can do after implementation to improve the performance without significantly changing the design.

    That is, let's continue to optimize last, but not try to make the word "optimize" mean "address performance in general". That's not what the word means, after all.

    • anarazel 6 years ago

      It's pretty easy to completely misjudge the real-world bottlenecks while still in the design phase. That can often lead to adding complexity to the design to alleviate a guessed-at performance problem, making it harder to fix the real problems later...

      • simonw 6 years ago

        I think the original piece makes a strong case that this isn't as true as we all think it is.

        If you have a strong enough understanding of both the technology and the problem space, you really can judge the most likely bottlenecks as part of the design phase. Which can lead to radically different and better software.

        You can always inform the design phase with some quick working prototypes that help validate some of the technical assumptions you are making.

    • smallstepforman 6 years ago

      Optimisation is redesign. I’m building a video editor and have to transition to an actor-based asynchronous design if I want the user experience to be fluid. Leaving optimisation for later is not possible.

    • gameswithgo 6 years ago

      Saying premature optimization is bad is a tautology.

      but does it ever happen? =)

      • zzzcpan 6 years ago

        Best to call it unneeded optimization then, not premature.

      • jmull 6 years ago

        I don't think you mean tautology.

        • gameswithgo 6 years ago

          I definitely do, but maybe I am wrong! If the optimization wasn't bad, it wouldn't have been premature.

          • jmull 6 years ago

            I see. Well, when I said it was as bad as always, I was referencing the quote, "Premature optimization is the root of all evil." I take that as hyperbole, but the point isn't merely that premature optimization is bad. (So I think I get what you mean: "premature" includes the idea of being bad; it's one way of being bad, so in a sense "premature optimization is bad" means "a bad kind of optimization is bad.") The point is that premature optimization is particularly to be avoided. Treating it as a tautology misses the main point.

            Suppose you're a snake charmer talking to an old-timer and the old-timer says, "The venomous king cobra is venomous." That's just a tautology. But if the old-timer says, "The venomous king cobra is the most venomous snake you'll ever handle and a single bite can kill an elephant," then hopefully you don't get stuck on the tautology and can see the important, actionable warning in there.

bcrosby95 6 years ago

I've heard that performance is a feature but I feel like that understates the effort involved in seeking performance for a piece of software.

If you want to call it a feature, it's closer to N features: one for each feature you already have. If you have 10 features and add performance, the effort involved isn't like having 11 features; it's like having 20. The effect is multiplicative.

This is because performance is a cross-cutting concern. Many cross-cutting concerns are easy to inject or to share effort across, but not performance. You can't just add an @OptimizeThis annotation to speed up your code. Performance tuning tends to be very specific to each chunk of code.

  • gameswithgo 6 years ago

    If everyone on the team makes a habit of worrying about it, everyone gets better at it. It becomes part of the review process - "this looks correct, but is there a faster way?" or "this looks very fast, but we could make it a LOT simpler and only a little slower, maybe we should."

luord 6 years ago

> And while the SQLite developers were able to do this work after the fact, the more 1% regressions you can avoid in the first place, the easier this work is.

That mention of regressions seems, IMO, a slightly out-of-left-field attempt at dismissing how the SQLite example shows that you can, in fact, "make it fast" later. Maybe he should've picked a different example entirely, because it undermines his point a little bit.[1]

All in all, his entire thesis comes from talking about a typechecker, which is indeed a piece of software in which each component generally contributes to the performance of the whole. It isn't a set of disparate moving parts (at least, from what I remember of my time studying parsers in college), so it's very hard to optimize section by section because all the components mostly feed off each other. Most software is not a typechecking tool; plenty (dare I say most) of software does have specific bottlenecks.

Though I do agree that, even if we aren't focusing on it right away, we should keep performance in mind from the beginning. If nothing else, by making the application/system as modular as possible, so it's easier to replace the slowest moving parts.

[1] Which is a good thing IMO, as it highlights how this is all about trade-offs. "Premature optimization is the root of all evil", "CPU time is always cheaper than an engineer’s time", etc. are, in fact, mostly true, at least when talking about consumer software/SaaS: it doesn't matter how fast your application is if crafting fast software is slower than crafting slow software and your very performant tool ends up used by no one, because everyone is already on the other tool that is slower but came out first.

magicalhippo 6 years ago

> What is perhaps less apparent is that having faster tools changes how users use a tool or perform a task.

The important thing here is that, for a user, "faster" means faster with respect to achieving their goal.

At work we've created a module where, instead of punching in line items by hand and augmenting the data from memory or web searches, the user can paste data from Excel (or import from OCR) and the system remembers mappings for the data augmentation.

After a couple of initial runs for the mapping table to build up, our users can process thousands of lines in 10 minutes or less, a task that could take the better part of a day.
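
A minimal sketch of the mapping-table idea with hypothetical names: pasted rows are looked up in a remembered mapping, only unseen values need manual input, and those answers are stored for the next run.

    type LineItem = { rawCode: string; quantity: number };
    type Augmented = LineItem & { productId: string };

    // Mapping from the customer's raw codes to our product ids; in the real
    // system this would be persisted between runs.
    const mappings = new Map<string, string>();

    function augment(rows: LineItem[], askUser: (rawCode: string) => string): Augmented[] {
      return rows.map(row => {
        let productId = mappings.get(row.rawCode);
        if (productId === undefined) {
          productId = askUser(row.rawCode); // manual step, only for unseen codes
          mappings.set(row.rawCode, productId);
        }
        return { ...row, productId };
      });
    }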

It's not uncommon for there to be some follow-up support after new customers start with this module, so I often get to follow the transformation from before to after.

They also quickly get accustomed. We'll hear about it quickly if those 10 minutes grow to 20 from one build to another; not much thought is given to how 20 minutes is still a lot faster than they could punch in those 8000 lines by hand :)

ken 6 years ago

> the SQLite 3.8.7 release, which was 50% faster than the previous release

Nit: the link says it’s 10% faster than the previous release. It’s 50% faster than some arbitrary point in the past, perhaps the time when they began their CPU-based profile optimization.

alexeiz 6 years ago

Nice and clean static layout. A rarity these days when blog post web pages tend to be overloaded with headers, footers, and various crappy interactive elements.

igouy 6 years ago

> I’ve really strongly come to believe that…

I’ve come to believe really strongly that…

PouyaL 6 years ago

Great stuff. We need to keep working on this, though that happens as time goes by.
