Comments on Comments
noncombatant.org
In the “Why” vein, some of the most important comments are “Why I made the compromises I made” aka “Why this looks dumb but is actually for the best” comments.
They can prevent someone, including myself, from undertaking a time-consuming rewrite of strange or bad-looking code before inevitably hitting the same wall I hit previously.
Bang on. This practice has changed in the hosted, PR-based collaboration world. Instead of embedding this kind of thing in the source itself, it's written up in these constructs that exist outside. Then they go missing or are just undiscoverable, at least until it's too late.
I quite often try to investigate those constructs before changing code I am not familiar with. Look at `git blame` to figure out commit(s) that touched the code. Try to figure out what PR they were part of to read the PR description.
This isn't as easy as I'd like, so I don't do it as much as I'd like. It's interesting to contemplate what sorts of UIs could make it easier.
At one point I distinctly recall the GitHub web UI for a commit showing you what PR(s) the commit was in -- but I'm having trouble reproducing that now, so maybe it no longer does? Or maybe I am confused and imagined it.
Yes, it's a lot of work, and it's very possible to just lose that information, even if you are committed to sticking with the platform.
It's also not unique to something like github. It was still possible to lose context when we used to email patches around. But I'm certainly encountering it more now, because I really do think the behavior has been influenced by the tooling. PRs are great, but in a case like this ("it looks dumb but it's not") the notes really ought to be inline. It's not practical to encourage a protocol wherein, upon every encounter with something fishy, we go traipsing all over trying to find out if someone, somewhere, sometime explained it.
I think my general point is that source management isn't the ideal place for the "whys" of things.
I'm not sure.
In the present case where it is very hard to track, ok. Of course we don't want to be forced to "go traipsing all around", and often won't bother if that's what it takes.
But comments are not a great place for it either. They can take up space and make source files harder to read as a whole (interrupting your flow when you don't need them).
And a comment in source code is a commitment (often unmet!) to keep it _up to date_ and matching the code it describes as that code changes (as OP mentions), when actually for "why"/motivation, point-in-time comments at a point of _history_ (or several points as it changes) would often be quite sufficient, without the maintenance burden.
A world where many (not all) can be kept out of band in the source management system, as point-in-time historical notes, rather than in the source code itself and where it is very easy to track them down in source management, to me seems actually ideal.
I realize of course that world is not quite what we've got.
But for instance, when I _do_ track down the actual relevant (eg) PRs, especially aged ones, the _entire discussion history_ captured in them (possibly multiple PRs and commits over time) can often be _super valuable_, giving context that there is no feasible way to embed in source code comments. Comments themselves are never going to be as good as we want, and we often end up doing "code archeology" regardless. What if it were super easy and frictionless to do?
I really wish (eg) github spent more time on UX to make this history easily followable without all the "traipsing around". But apparently there is not customer demand?
...and the epic "comment the empty `else`-block":
    ... } else { /* no action needed, enterprise users should not frib or frob... */ }
I always try to think through all of the walls before I start on something like that, but it’s usually not clear until you hit it.
Came here for this. The "why not" comments are the most valuable.
The advice in the 'Replace What Comments With Names' section was discussed here recently: https://news.ycombinator.com/item?id=37517329 in an article I agree with, 'Linear Code Is More Readable'.
If you're going to reuse those little tiny functionettes, then sure, it might be worth doing, but to do it for readability is misguided. Comments are a perfectly reasonable way of indicating logical blocks within code.
I've also used scopes for this:
    func DownloadAndVerifyThing(path string) error {
        var url string
        { // Build URL
            ...
            url = [..]
        }
        { // Fetch
            ...
        }
        { // Verify file.
            ...
        }
    }
I don't do this very often (and I'm having trouble locating an example off-hand, although I'm sure there must be a few in my public code), but it can be pretty useful at times.
I have and do use this for my 'normal' code, but not very often. I tend to prefer breaking things up into functions (usually named, not lambdas). Where I do use this style extensively is in self-testing code:
    // set up data to be tested
    var fred = ...
    var cathy = ...
    { // check fred does something correctly
        var result = fred.reproduce(2);
        assert(result.length() == 2, "fred should have 2 children");
    }
This prevents variable result from escaping and contaminating the next test, which happened an awful lot before I started doing this.
There may be better ways; suggestions welcome.
Both _awesome_ comments, I made a similar one up above, but hadn't considered the `testing` use case which is actually really really cool to "protect" your test scope like that. Almost like a fake "setup/teardown".
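For anyone who wants to try the scoped-check idea in Go, here is a minimal sketch. The `person` type, its `reproduce` method, and the `check` helper are hypothetical stand-ins for the `fred`/`cathy` objects and `assert` in the snippet above, not anything from the article.

```go
package main

import "fmt"

// person is a hypothetical stand-in for the fred/cathy objects above.
type person struct{ name string }

// reproduce returns n children named after the parent (illustrative only).
func (p person) reproduce(n int) []person {
	kids := make([]person, 0, n)
	for i := 1; i <= n; i++ {
		kids = append(kids, person{name: fmt.Sprintf("%s-child-%d", p.name, i)})
	}
	return kids
}

// check panics with msg when cond is false; a tiny stand-in for an assert.
func check(cond bool, msg string) {
	if !cond {
		panic(msg)
	}
}

// runChecks puts each check in its own block so that "result" cannot leak
// from one check into the next. It returns the number of checks that ran.
func runChecks() int {
	// set up data to be tested
	fred := person{name: "fred"}
	cathy := person{name: "cathy"}
	passed := 0

	{ // check fred does something correctly
		result := fred.reproduce(2)
		check(len(result) == 2, "fred should have 2 children")
		passed++
	}
	{ // check cathy does something correctly
		result := cathy.reproduce(3) // a fresh "result"; the one above is out of scope
		check(len(result) == 3, "cathy should have 3 children")
		passed++
	}
	// Referring to "result" here would be a compile error.
	return passed
}

func main() {
	fmt.Println("checks passed:", runChecks())
}
```

The setup variables stay visible to every block, which is the "fake setup/teardown" effect: shared fixtures outside, throwaway state inside.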
A different pattern I like in Ruby is the assignment with begin...end:
    fred = begin
      user_finder = ...
      user_finder.find('fred')
    end
Again, not always as "self-documenting" as whipping out a new function, but useful if you need to separate chunks and keep the code linear.
The only issue with Ruby itself is that the scope is not lexical, so you can still use `user_finder` outside the block above (linters can catch it, though). But it's still worth separating code without having a brand new method.
> Comments are a perfectly reasonable way of indicating logical blocks within code.
Comments like these very often end up lying to my face and make me lose precious time. I nowadays tend to semi-ignore them and just read the code anyway. YMMV, I guess.
If you consistently break functions down into tiny “functionettes” the names of the functions can as easily lie to you. You start with `buildURL` but it gets complicated, you break part out into validating the URL, now it should be `buildAndValidateURL` but it’s never updated and the function name “lies”.
I suppose if you prefer to ignore comments, they are more likely to get out of date, which may be why your mileage varies.
Doesn't even have to be an intentional lie. Sometimes there are just multiple "things" a function could be known as and only one can be chosen for the function name. I recently wrote a function crc() that goes and computes the thing. The comments above it talk about what CRC it is (non-obvious without domain knowledge of CRCs and the different forms polynomials can be written in) and why it was chosen. This is commonly done with cryptography as well. AES functions are commonly named rijndael with a comment about the standard or vice versa.
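To make that concrete, here is a rough Go sketch of a short-named `crc()` whose comment carries the "which CRC and why". It uses the standard `hash/crc32` package; the choice of CRC-32C and the stated reason are assumptions for illustration, not the commenter's actual function.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// castagnoliTable is built once and reused for every checksum.
var castagnoliTable = crc32.MakeTable(crc32.Castagnoli)

// crc returns the checksum of data.
//
// What: the name alone does not say which CRC this is. It is CRC-32C
// (Castagnoli). Its polynomial is written 0x1EDC6F41 in normal form and
// 0x82F63B78 in reversed form; Go's crc32.Castagnoli constant is the
// reversed form.
//
// Why (assumed for this sketch): a hypothetical wire format we have to
// interoperate with specifies CRC-32C instead of the IEEE polynomial.
func crc(data []byte) uint32 {
	return crc32.Checksum(data, castagnoliTable)
}

func main() {
	// "123456789" is the conventional check input used in CRC catalogues.
	fmt.Printf("0x%08x\n", crc([]byte("123456789")))
}
```

None of that context would fit in the identifier; `crc32cReversedPolyForWireFormat` is not a name anyone wants to type.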
Indeed I agree that it can be difficult or impossible to squeeze all the important information about a function into its signature! And why bother playing golf when that’s literally what the doc string is for?
(Obviously you should endeavor to write good names anyway).
No more often than the names of the sub-functions themselves becoming inaccurate ime.
At some point the system does depend on its authors doing the right thing.
The right thing, I can define with code and check for correctness with tests. I guess I mean that if it can be expressed clearly in code, it should.
But I don't completely disagree that they can have their place. They just get used as crutches for unreadable code a lot of the time, that's all.
100% agree with most of this, but I think the why can be dug into a little deeper. Under why comments I find they fit into two categories:
1) Why does the code exhibit behavior X? (~80%)
2) Why does the code do X this way? (~20%)
1 is usually a customer facing quirk which should be written down somewhere, but preferably not in the code directly. This stuff fits extremely well in tests which exercise the behavior ("why does a user get an ID before the creation of a ticket?"), in reference docs ("why country appears 3 times in this API schema") or in a glossary (e.g. a quirks section under the "admin user" wiki page). These explanations will get more eyes on them this way, will be more likely to be kept up to date and will be available to non developers.
2 should not be non-existent but should be rare. In practice I find 2s are almost entirely absent from most code bases because to the person writing the code it is obvious.
For 2, because of this, I find it's better to get somebody else to ask why and use answers to the whys on pull requests to write those comments: https://hitchdev.com/approach/communal-commenting/
The vast majority of my comments are left for future-me, because next time I'm on this code I likely won't recognise past-me or past-me's motivations.
For (1), anything outside of the code is going to be missed in the next code review. Eg:
// ACME-1275 Acme uses desc field to identify references
Perhaps you can use the reference to point to wherever you documented it. I'd rather just keep the ticket reference, which won't change. Who wants to read docs that get into that kind of minutiae anyway?
But without any comment, I'm wasting time looking for it, if I even realize that there is something to be looked up.
Missing is one other style of comment: how to use this API. This is not targeted at someone understanding or maintaining the code, but someone outside who just knows (or suspects) they want to use your API without understanding it. While this need not be with the code it generally is better that way because tools can extract information from the code (the comment is before the function foo, therefore it must be about foo, foo takes some parameters with some types - we can link to the documentation for those types...).
This style of comment is only needed if you expect your code to have users who are external and thus don't want to look into the details.
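As an illustration of "the comment is before the function foo, therefore it must be about foo", here is a small Go sketch that uses the standard `go/parser` and `go/ast` packages to pull out the comment attached to each function. The `netutil` source and its `ParsePort` function are invented for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// src is a hypothetical API file; ParsePort and its comment are made up.
const src = `package netutil

// ParsePort converts s to a TCP port number.
// It returns an error if s is not a number in the range 1-65535.
func ParsePort(s string) (int, error) { return 0, nil }

func undocumented() {}
`

// funcDocs maps each top-level function name in source to the text of the
// comment directly above it, or "" when there is none.
func funcDocs(source string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "api.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	docs := make(map[string]string)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // not a function declaration
		}
		text := ""
		if fn.Doc != nil { // Doc is the comment group attached to the function
			text = strings.TrimSpace(fn.Doc.Text())
		}
		docs[fn.Name.Name] = text
	}
	return docs, nil
}

func main() {
	docs, err := funcDocs(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("ParsePort: %q\n", docs["ParsePort"])
	fmt.Printf("undocumented: %q\n", docs["undocumented"])
}
```

Because the parser already knows the parameter and result types, a documentation tool built this way can link them without the comment having to repeat them.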
Tangential, but see:
https://news.ycombinator.com/item?id=37583258
Explaining "why" is often (revealing) more about how the programmer is thinking at that moment. It reveals hidden or ineffable knowledge about how the coder arrived at that point/design, and may reveal intent not explicit in the code itself.
Mismatches between the declarative and imperative have often been where I've found bugs. Especially in my own code when trying to explain it to myself in a comment opens my eyes to an error.
Those "why" comments are our stories about our code.
I typically write comments for "why" rather than "how". Or if I have to make what appears to be a weird hack or decision I put that too.
Another thing I really like to do is put GitHub issues in comments when I have to do some weird workaround in an API/library that I found from a GitHub issue. I'll put the issue URL there so it is easy to check the latest update from that library later on and see whether the workaround is still needed.
The fundamental issue with comments in programming is that they're part of the code, which is a ridiculous hack that somehow survives unquestioned. This is not how comments work in Google Docs, Microsoft Word, etc. Maybe the idea of implementing comments as a greyed out part of the main text did not occur to the designers of these apps?
Counterpoint: keeping comments in the code results in a simpler format (just plain text, which is battle-tested and requires no special tools or lock-in) and it keeps the comments next to their context.
I find Google Docs comments hard to find and navigate, and usually simply forget to read them. A usability nightmare for me.
This is not the natural state of things. That's also not the original state of things.
For the decades that software engineering has existed, people have been busy migrating more and more information from offline in a different context into inline right at the code. Every single one of those times, people have experienced huge gains on the quality of the resulting comments, with moderate gains on project organization and productivity.
So, the reason comments are a greyed text inline with the code is that it has worked better. Having the comments offline (like software used to have) is very likely a flaw of those platforms. But well, the text-creation people never looked at improving their tooling anyway.
Different context meaning random chats and mailing lists, not an integrated system like Google Docs or Microsoft Word.
Different contexts meaning integrated systems, official documentation repositories, parallel files included with the code, and well, almost everything you can think of.
People have not spent those decades doing nothing. They tried a huge diversity of stuff.
That's an interesting take.
Especially since more and more toolchains allow comments with formatted content, up to runnable example snippets that can be checked by tests and can be extracted and formatted to be consulted elsewhere.
I suspect the common practice of greying comments out exists because it's hard to concentrate on both reading code and reading comments, so priority is given to the code and the distraction is removed.
Maybe there's a dedicated UI to invent, something that would reduce comment distraction even further by just indicating their presence, and that would provide handy access to the content in a nicely formatted way without disturbing the main code view.
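One existing instance of "runnable example snippets that can be checked by tests" is Go's testable examples. A minimal sketch follows; the `Reverse` function is made up for illustration, and everything is kept in one file only so it is self-contained.

```go
package main

import "fmt"

// Reverse returns s with its runes in reverse order.
func Reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// ExampleReverse is documentation that doubles as a test. When a function
// like this lives in a _test.go file, "go test" runs it and compares what
// it prints with the "Output:" comment, and the documentation tools show
// it next to Reverse.
func ExampleReverse() {
	fmt.Println(Reverse("comment"))
	// Output: tnemmoc
}

func main() {
	ExampleReverse()
}
```

If someone changes `Reverse` and forgets the example, the test run fails, so this kind of comment cannot silently go stale.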
The default Vim colourscheme has comments in blue, and they stand out rather than fade away. I don't know if Vim is an exception or if this sort of thing was more common in the past, but this always seemed right to me (actually, being designed for 8/16 colour terminals means there probably wasn't really much of an option).
I do have a few lines in my vimrc to make "///" and "##" appear as greyed out; I use this mostly for "literate programming"-ish type of comments, which don't need to stand out so much. It works very well for me, although for other people it looks like it's a "mistake".
I have wondered what a rich text / markup approach to code would look like. Would remove tabs vs spaces if the code included alignment vs scope nesting marks and the reader could format them however they wanted.
Yes, related: Tables for conditionals. For example, for routing, have a table where the columns are methods and the rows are paths. Or for keyboard shortcuts, have a table where the columns are modifier keys and the rows are the modified key.
There are also comments about why something commented out exists. Don't make your fellow developers (including your future self) figure it out again.
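A quick Go sketch of the routing-table idea, with nested maps standing in for a real table. The paths, handler names, and the simplified `handler` type are all hypothetical; paths are matched literally, with no pattern support.

```go
package main

import "fmt"

// handler is a simplified, hypothetical handler type for this sketch.
type handler func() string

func listTickets() string  { return "list tickets" }
func createTicket() string { return "create ticket" }
func showTicket() string   { return "show ticket" }
func deleteTicket() string { return "delete ticket" }

// routes is the table: each row is a path, each column a method.
// A missing cell means that method is not allowed for that path.
var routes = map[string]map[string]handler{
	"/tickets":   {"GET": listTickets, "POST": createTicket},
	"/tickets/1": {"GET": showTicket, "DELETE": deleteTicket},
}

// dispatch looks up the cell for (path, method) instead of walking a chain
// of if/else or switch statements.
func dispatch(method, path string) (string, int) {
	row, ok := routes[path]
	if !ok {
		return "not found", 404
	}
	h, ok := row[method]
	if !ok {
		return "method not allowed", 405
	}
	return h(), 200
}

func main() {
	for _, req := range [][2]string{{"GET", "/tickets"}, {"DELETE", "/tickets"}, {"GET", "/nope"}} {
		body, status := dispatch(req[0], req[1])
		fmt.Println(req[0], req[1], "->", status, body)
	}
}
```

The table reads as its own documentation: the empty cells show at a glance which method/path combinations are unsupported.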
//// These lines are a very simple way to enable feature X on your machine:
// obscure.thing=false
// someVar = "probablyAthing";
You all would hate my code, and that's okay. Why? Because I do everything that is suggested, save one: I leave what comments in.
For some reason, it is easier and quicker for me to understand English prose than code, even if the code is simple. That includes checking the actual code after reading the comment to see if it matches; having a target for the code makes it orders of magnitude easier to read for me.
That said, I still weirdly find value in what comments. Even some of the simplest ones communicate intent for me.
For example, I have many comments that take one of the following forms:
// Cache this.
// Get the <item>.
These are the exact comments that people hate so much, but I like them because they communicate these things to me:
* The item is gotten for efficiency reasons. This means that I should check that it really is more efficient if something is slow.
* More importantly, the item is expected to not change, so if I'm digging around for a bug, I should check that the item actually does not change.
So these "useless" comments actually help me.
In addition, the fear that they will go out-of-date is less of a problem for me because I have a strict habit of updating comments with code.
Now, I don't suggest that everybody do what I do; I suggest the opposite, in fact. But I work alone, so I can do things that are ideal for me.
Yes: I find "what" comments critical. Naming things is hard; getting two people to agree on the concept the name represents is harder. I write "what" comments -- usually at a class, class field and method level -- religiously because it articulates the concept the class/field/method is supposed to represent. The whole point of writing code is to build a logical model of real-world concepts; if you can't articulate the concept, you can't write the code to model it. Sometimes these comments help others, but I find their greatest value comes when I finish writing the comment/documentation and compare it to the code. I often find that they don't quite match -- my code isn't doing what I just said it is -- and that forces me to either clarify the concept or fix the code. Either way, the model ... erm, system ... is better than it would be if I just relied on names alone.
An interesting go-ish thing to do regarding: """Now, DownloadAndVerifyThing is shorter, contains less state, and is more obviously a composition of several tasks."""
...I've experimented with "convert a long sequential, independent function into several internal anonymous functions" (ie: similar to IIFE in javascript).
This generally prevents (or contains) variable leakage and prevents (or contains) "complexity leakage".
In their example: `DownloadAndVerifyThing(...) { ... }`, those functions could be defined internal to the `DownloadAndVerifyThing` function, which is kind of a further form of comment: "thou shalt not be using these weird functions outside of this particular function which does the downloading and verifying..."
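A minimal Go sketch of that "helpers defined inside the function" shape. It is not the article's code: the extra `wantSum` parameter, the `sumHex` helper, the `example.invalid` host, and the canned download are all assumptions made so the sketch runs offline.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// sumHex returns the hex-encoded SHA-256 of data.
func sumHex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// DownloadAndVerifyThing "downloads" path and checks it against wantSum.
// The fetch step is stubbed with canned bytes; a real one would use HTTP.
func DownloadAndVerifyThing(path, wantSum string) error {
	// The helpers live inside the function, so nothing else can call them
	// and their temporaries cannot leak into the steps that follow.
	buildURL := func() (string, error) {
		if path == "" {
			return "", errors.New("empty path")
		}
		return "https://example.invalid/" + path, nil // hypothetical host
	}
	fetch := func(url string) ([]byte, error) {
		// Stub: url is ignored and fixed contents are returned.
		return []byte("thing contents"), nil
	}
	verify := func(body []byte) error {
		if got := sumHex(body); got != wantSum {
			return fmt.Errorf("checksum mismatch: got %s, want %s", got, wantSum)
		}
		return nil
	}

	url, err := buildURL()
	if err != nil {
		return err
	}
	body, err := fetch(url)
	if err != nil {
		return err
	}
	return verify(body)
}

func main() {
	good := sumHex([]byte("thing contents"))
	fmt.Println(DownloadAndVerifyThing("thing", good))
	fmt.Println(DownloadAndVerifyThing("thing", "bad") != nil)
}
```

The trade-off is that the closures can still see `path` and `wantSum`, so the containment is about what escapes each step, not about what each step can read.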
Even if it's done not in a function but in an anonymous block, "trapping" any state leakage or side-effects is nice for complexity reduction.
Example:
    func Foo() bar {
        {
            x := 1
            y := 2
            ...lots of math, etc...
        }
        abc := 123
    }
...you're not polluting the function internally with a lot of extra variables hanging around, which makes the inevitable refactoring "more clean" as you know at least the code block can be extracted independently w/o impacting anything _after_ it (although you still have to be a little careful with the "before" and "ordering" portion of it).
Tangential but I will say that C# is a great multi-paradigm language with a terrible code comment culture that still uses XML-based comments, ugh it drives me crazy. I've even looked into figuring out how to get the compiler to auto transform javadoc-style or implicit-style comments into the structured format it expects, but it's been tricky. I find the verbosity of the XML markup makes comments more difficult to read as a human and discourages good comments.
I wonder... things like this have been known (mostly from trial and error, but not only) for 50-60+ years. Why are they not studied?
And why does every new kid on the block have to find out/invent them?
I like the beware comments, where weird code is done in a certain way that won't make sense a few days after it's released and not touched again.
I'd like to add another comment type to this list, which is searchable #topics and alternative procedure names.
For example, if there's a feature around replies, I may put #replies in the key places in the code. If some code takes part in generating replies.html, I might put "#replies.html" as a comment.
A lot of my procedures have comments with alternative names aliased underneath:
sub SqliteQueryHashRef { # $query, @queryParams; calls sqlite with query, and returns result as array of hashrefs
#sub SqliteGetHash {
#sub SqliteGetHashRef {
#sub SqliteGetQueryHashRef {
#sub SqliteGetQuery {
#sub GetQuery {
#sub GetQueryAsHash {
#sub GetQueryAsArray {
#sub GetQueryAsArrayOfHashRefs {
Each of these represents a time in the past when I went into the global find tool and searched for e.g. "GetQuery ", didn't find anything, and then eventually tracked down this procedure.
BTW, if you only put a space after the procedure name in its definition, it's much easier to search for.
Code is logic. Comments are wisdom.
I have seen many comments that are the opposite of wisdom.
I have to agree. Even in some cases wisdom went into hiding and applied for witness protection when he saw what was being written.
But I think it's a good phrase to keep in mind when writing comments.
Indeed, and I have also seen much code which does not do what is alleged by the names given to its identifiers.
And sometimes it is severely lacking. I heard of this 5000 line "onClick()" function where most of the program, screaming and yelling, had been forcefully stuffed.
Literate programming is a nice middle ground but it doesn't work for every project and it definitely doesn't work for every person.
I like the idea of comments being impossible and one acts with that in mind. Along with the one about thinking the next person to view your code is a psychopath who knows where you live. Holding those 2 things in mind, you really can have your cake and eat it (too), you can have clean expressive code that covers a lot of the “why” without comments littering the screen.
How are you going to make the code itself scream of its “how and why”? Then failing that you can put a comment if you must.
I have worked with a codebase with incredible comments. Changing the code was really hard and laborious and far from a joy to work with. When PRs become back and forth about how to change the wordings of the comments in the code, it is soul destroying, it becomes very time consuming. Like writing a joint novel at the same time as writing the code. Some people like it that way and each to their own, I know I can’t convince them.