Saturday, November 7, 2015

The Problem with x, y, z (Why Write Descriptive Code)

I was going through some algorithm books lately and had a somewhat hard time following the code samples. To help, I tried copying some onto my computer. While doing that I realized a lot of the variable names suck... and the authors also took some shortcuts/optimizations that make the code harder to read.

Perhaps they were trying to save pages... but their index is like 40 pages long and who ever reads that? Or maybe they want readers to figure out the mess to help them learn… I don’t think it’s a good strategy though because hard things just turn people away.

But this was also something I fought with at work. When I first took on the code base, it had a lot of variables that did non-trivial things but had very short names. You have 100 lines, but one of the important variables is called x or tmp.

Maybe to PhDs those variables are trivial… but I don’t think they are for someone who is trying to understand the stuff for the first time…

But anyway, yes: I usually give variables longer names, such as an actual word or phrase, or an abbreviation of one.

I am also a very liberal commenter, sometimes writing small comment blocks to explain what a somewhat large block of code does... usually the code is legacy, and I write the comments after spending 30 minutes figuring out what it actually does...

A common scenario is nested loops: I don’t want to worry about forgetting what i and j really mean. Yes, they are counters, but you’re probably also using them inside the loops. A common example is reading data out of a data table or 2-D array.

Usually you think in terms of rows and columns, but you decide to call the iterators x and y. Well, come back in a month and tell me within 2 seconds which one is the row and which is the column. In fact, you’re probably going to mix them up within the loop while you are coding it.

But how about r, c? For most cases this will do, but sometimes it’s just better to use row, col… especially if you have other variables in play.
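To make the row, col point concrete, here is a minimal sketch in Java. The class GridSum and the method columnTotals are made up for illustration; they are not from any of the books or code bases mentioned above.

```java
import java.util.Arrays;

public class GridSum {
    // Hypothetical helper: adds up each column of a rectangular 2-D array.
    static int[] columnTotals(int[][] table) {
        int rowCount = table.length;
        int colCount = rowCount == 0 ? 0 : table[0].length;
        int[] totals = new int[colCount];

        // With row and col there is no guessing which index goes first.
        for (int row = 0; row < rowCount; row++) {
            for (int col = 0; col < colCount; col++) {
                totals[col] += table[row][col];
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        int[][] table = { {1, 2, 3}, {4, 5, 6} };
        System.out.println(Arrays.toString(columnTotals(table))); // prints [5, 7, 9]
    }
}
```

Had the loops used x and y, a line like table[y][x] would be easy to write backwards as table[x][y] and hard to spot afterwards.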

I guess in the end it’s just readability: can I naturally understand the code and clearly tell what it does? Good variable naming is a big one... Organization/modularity is another.

Also, it avoids unnecessary work: why spend 10 minutes on something when it can be figured out in like 10 seconds, given good documentation and style?

Again… it's better to be long-term lazy than short-term lazy. Or as they say in The Phoenix Project: reduce and avoid technical debt, because one day it will come back... with a very big vengeance.

EDIT: Sort of... actually I wrote this in a draft probably over a year ago...

Variable Names Should Be Descriptive

Cryptic names may be good for job security, but more often than not you leave of your own free will. However, if you're still around, you're probably going to be banging your head sometime in the not-so-distant future when you need to make a change and you cannot figure out what this 'x' variable is for.

If you're just using it as a counter in a small loop, fine. If the variable is a class variable or the logic spans 20+ lines, you may want to think again.
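A rough sketch of that rule of thumb, in Java. The RetryPolicy class and its names are invented for this example: the short-lived loop counter stays as i, while the class variable gets a name that still makes sense when you meet it far from where it was declared.

```java
public class RetryPolicy {
    // A class variable is read from many places, so it gets a descriptive name.
    private final int maxRetryAttempts;

    RetryPolicy(int maxRetryAttempts) {
        this.maxRetryAttempts = maxRetryAttempts;
    }

    // Total wait in milliseconds if every attempt fails, doubling the delay each time.
    long totalBackoffMillis(long initialDelayMillis) {
        long totalMillis = 0;
        long delayMillis = initialDelayMillis;
        // i is only a counter in a tiny loop, so a one-letter name is fine here.
        for (int i = 0; i < maxRetryAttempts; i++) {
            totalMillis += delayMillis;
            delayMillis *= 2;
        }
        return totalMillis;
    }

    public static void main(String[] args) {
        System.out.println(new RetryPolicy(3).totalBackoffMillis(100)); // prints 700
    }
}
```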

Comments are Basically Free

A lot of people may disagree with me, but I tend to use comments pretty liberally. I understand code should be dead simple, but there are often times when you are unable to change existing code... or, at the time you are writing it, you cannot think of an elegant solution.

Even if the code is good, sometimes you do need a JavaDoc, or whatever they call it, to remind you or whoever works on the code what the method does and any small but important details that could be missed, like: "You know this code below looks obsolete but trust me if you remove it, people will complain." (This would have been nice to know in a few situations that I've come across.)
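Something like the following is what I have in mind. This is a made-up example (the InvoiceFormatter class, the "OLD-" prefix, and the scenario are all hypothetical), just to show the kind of warning a JavaDoc can carry.

```java
public class InvoiceFormatter {
    /**
     * Formats a raw invoice number for display, e.g. "42" becomes "INV-42".
     *
     * <p>Note: the "OLD-" branch below looks obsolete, but in this hypothetical
     * scenario some long-standing records still carry that prefix. If you remove
     * it, people will complain.
     *
     * @param rawNumber the invoice number as stored, possibly with stray whitespace
     * @return the display form, always starting with "INV-"
     */
    static String formatInvoiceNumber(String rawNumber) {
        String trimmed = rawNumber.trim();
        if (trimmed.startsWith("OLD-")) {
            // Strip the legacy prefix before adding the current one.
            return "INV-" + trimmed.substring("OLD-".length());
        }
        return "INV-" + trimmed;
    }

    public static void main(String[] args) {
        System.out.println(formatInvoiceNumber(" OLD-42 ")); // prints INV-42
    }
}
```

The comment costs a few lines and saves the next person from deleting the branch and finding out the hard way.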