Commentary about Comments

For some reason, commenting is one of the more polarising topics in software development.

I am going to upset a lot of you right here, which is why it's basically the first actual chapter: you'll either read, nod along and realise I'm right, or you'll bail and ignore the rest of it, which is fine too.

You will spend more time reading code than writing it, and the code is like the second-hand science textbook you had in high school, with last year's student leaving helpful scribbles in the margin. In time you will find that these scribbles are usually helpful, often in ways you don't realise at first.

But comments are a smell

In my experience, the people who talk about comments being bad are the sort of people who encounter nonsense like the following and rage at it.

// start counting the total at 0
total = 0

// loop from 1 to 10
for (i = 1; i <= 10; i++) {
    // add this to the total
    total = total + i

    // print what we're doing this time
    print "Currently {i}"

    // print the new total
    print "Total {total}"
}

// output the final total
print "Final total {total}"

This is usually not helpful, because it's describing something that is painfully obvious. The comments are more visual noise than the code on its own would have been. And yes, I have seen 3+ year "senior developers" hand in code like this.

For the record: no-one, not even the people who like comments, wants this - and inevitably when people talk about comments being a smell, this is what they mean.
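For comparison, here is a sketch of the same loop with the comments stripped, written in Python since the original is pseudocode. The function name running_total and its upper parameter are my own assumptions, added only so the snippet stands on its own.

```python
def running_total(upper: int = 10) -> int:
    total = 0
    for i in range(1, upper + 1):
        total += i
        print(f"Currently {i}")
        print(f"Total {total}")
    print(f"Final total {total}")
    return total


if __name__ == "__main__":
    running_total()
```

Nothing was lost: every comment that was removed simply restated the line beneath it.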

But reaching for a comment is a smell

I don't know about you but I've written complex code in my time. Not 'fly this thing to Mars' complex, but I have written stuff that is thousands of lines of number crunching where legal compliance is a small factor, and where the calculations are a little more complex than addition.

That's especially true when you're wrapping everything in layers of a fixed-point math library; there's no nice way to deal with that. But sometimes, no matter what you're doing, you're not going to be able to nice-name your variables enough or make enough little functions to help you.

Case in point: I have a piece of code that is five nested if levels deep, crunching through some numbers. Wrapping it in functions makes it dramatically slower, because it then has to pack up the variables in something stateful to pass them back and forth.

Anyway, there are times when you get through the innermost loop and realise that you're done, and that you can safely skip not only the rest of the inner loop but the rest of the next loop out as well. When I then find myself hitting a break 2 or even break 3 style clause, I'd love to have a signpost comment that tells me what we're skipping, and ideally under what circumstances. It doesn't need to be in-depth, and it certainly doesn't need to be a novel, just a reminder: 'oh hey, we're done with this, and that means we can skip back to that'.
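As a rough sketch of what such a signpost might look like, here it is in Python, which has no break 2, so the two-level exit is spelled out with a flag. The data shape (accounts holding months of amounts, with None marking the point an account was frozen) and the name total_until_freeze are invented purely for illustration.

```python
def total_until_freeze(accounts):
    """Sum each account's amounts up to (not including) its freeze marker."""
    totals = {}
    for name, months in accounts.items():
        running = 0
        frozen = False
        for month in months:
            for amount in month:
                if amount is None:
                    # Signpost: None means the account was frozen here.
                    # Nothing after it counts, so we are done with this
                    # month AND every later month for this account.
                    frozen = True
                    break
                running += amount
            if frozen:
                # Second half of the "break 2": leave the month loop too
                # and move straight on to the next account.
                break
        totals[name] = running
    return totals


if __name__ == "__main__":
    sample = {"a": [[1, 2], [3, None, 100], [50]], "b": [[5], [5]]}
    print(total_until_freeze(sample))
```

Two short comments are enough for a reader to see which loops are being abandoned and why, without tracing the flag by hand.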

Isn't that what version control commit notes are for?

I mean, there's a thought: if you don't like comments, are you even putting in nice commit messages? But there are two problems with using commit messages for such things.

Good practice is that you put a short summary line as the note (convention says around 50 characters, 72 at the outside) and maybe some explanatory stuff underneath. But all too often 'fix' is deemed good enough.
Even if you get a good commit note, it's in the commit note and not right there in the code. Yes, yes, I know you can spruce up VSCode to show you the commit message for a given line, but very often a) people don't have VSCode set up for that and b) there's no guarantee the last commit gives you the context that would actually be useful.

If comments are margin scribbles, commit notes are post-it notes stuck to pages. Maybe they'll be there, maybe not, and maybe they'll even be on the right page, but no guarantees whatsoever.

Why, not what

This is the big thing, ultimately. The code, even the most obtuse and cranky code, will tell me what. But why?

Some of the time, the why is fairly obvious. If you're in a routine about dealing with a calendar and everything is grouped to 7, loops 0-6 or 1-7, it's fairly likely that this isn't a missing magic number as much as the obvious fact that a week has 7 days and you're looping over a week.

(I honestly don't know how I feel about using a magic number for how many days there are in a week vs a constant. After all, I don't expect the number of days in a week to change any time soon.)

Now if you're knee deep in some gnarly stuff and why you're doing anything isn't glaringly, painfully obvious, make a note. I don't care how much it hurts, do it anyway. Even if it's only a 'refer to ticket XYZ-24', anything that gives you a clear place to understand how you got there.
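To make that concrete, here is a minimal sketch of a 'why' comment in Python. Only the ticket id XYZ-24 comes from the text above; the rounding rule, the rate, and the function name late_fee are hypothetical and exist just to give the comment something to explain.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rate, for illustration only.
LATE_FEE_RATE = Decimal("0.025")


def late_fee(balance: Decimal) -> Decimal:
    # Why: the (hypothetical) rule behind ticket XYZ-24 says fees are
    # rounded half up to the nearest cent. Decimal's default rounding,
    # ROUND_HALF_EVEN, would turn 0.025 into 0.02 instead of 0.03, so
    # the explicit rounding mode here is deliberate. Refer to XYZ-24
    # before changing it.
    fee = balance * LATE_FEE_RATE
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(late_fee(Decimal("1.00")))
```

The code already says what it does (multiply, then round half up). Only the comment says that the rounding mode is a requirement and not an accident.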

Why does why matter

Because you can bet that in the future, someone will be tasked with going into that particular jungle and looking at it. Maybe it's a bug, maybe it's not but you need to investigate. Maybe it's, worse, a change. Maybe a particular calculation was done a certain way for legal reasons and now that's changing and you need to adjust it.

And, guaranteed, the person who goes in to do it won't be the person who wrote it. Guaranteed.

If it wasn't you: you aren't a mind reader, and you have no idea why someone did something the way they did it. You will appreciate the signpost.

And if it was you, unless you're a super genius (statistically, you're probably not) with a super memory (statistically, you're probably not), past-you who wrote it is not the same as present-you and present-you has no memory of why past-you did anything.

(You might be like me and consider that anything which happened before today is past history; after all, I've been to sleep since then.)

And present-you is now up the proverbial creek without a proverbial steering implement. And you're in a bind: you need to understand this logic, and why it came to be. Was it a business decision or a legal one or a technical one?

Is it safe to change on that basis? You can understand what the code does (especially if, you know, you have tests, but even if you don't, you can check), but not knowing whether a change has unintended and undocumented consequences is a place you don't really want to be.

Self-documenting code

Nonsense. Sorry, but it is.

I don't care how beautiful your code is, and how expressive it is. At best, your self-documenting code is capable of telling me in perfect clarity what it is doing.

Does that mean you shouldn't write nice code with good variable names and good function names and such like? Obviously you should, why is that even a debate?

But even the shonkiest code I've ever seen also covers... what it's doing because that's the nature of code. It does what it does. I've had the privilege and joy of reading undocumented assembly and figuring out what it's doing - no comments, but the code is the code. It means what it is.

Expressive, elegant code is lovely. It reduces my cognitive load - but it still is limited to telling me what it's doing.

I have yet to encounter self-documenting code that is capable of telling me why. See above for why the why matters.

Moreover, most of the real world code out there starts out pristine, and that pristine state never survives contact with the users. It is for these occasions that comments are worth gold.

It is also my experience that those who preach the gospel of self-documenting code either never write anything particularly complex (so the documentation step is genuinely verging on the unhelpful), or they write stuff so impenetrable that no-one wants to touch it, making it a piece of code with a bus factor of one that needs to be replaced at the earliest opportunity.

If they're too good to add comments, chances are they're too good to write good code anyway.

But what about comments getting out of date?

It happens. If you notice it, fix it. Be the change you want to see.

The reality is that a codebase gets out of step with reality eventually; as real world names for things change, the code often doesn't. You might have started with 'users' but at some point 'members' crept in as the nomenclature, and you fixed all the localisation strings but all the code still uses 'users'. Worse, some code might say users and some might say members and use them interchangeably. (They might even be interchangeable... at first.) This of course is also a thing you should fix where possible, but it is usually a larger refactor, so it gets put off because it doesn't visibly add value.

This is not to say that you shouldn't try. This is to say that you should leave the place at least as tidy as you found it. And if you spot a comment that is out of date while you're there... fix it. Future-you will thank you for it. Or if not future-you, the poor schmuck in there instead of future-you will appreciate it.

The final word

Don't be afraid of putting in comments.
Not every line needs a comment unless it's skull-jarringly complex code, and if in doubt ask someone else for guidance.
If the why is even remotely not-obvious, make a comment.
If the what is not immediately clear to a passing reader, make a comment summarising a chunk of logic.

Header picture by Artur Shamsutdinov on Unsplash
