Saturday 4 August 2012

Hungarian Notation

Here's another old-skool and divisive topic: the usage of Hungarian notation in variable-naming.

As a brief overview for those not interested in following the link, Hungarian notation is a convention that suggests adding a prefix to a variable name to identify the variable's type, eg "qRecords" instead of just "records", or "sFirstName" instead of just "firstName".


This article addresses the usage of "Systems Hungarian Notation", not "Apps Hungarian Notation" (see the distinction in this Wikipedia article: "Systems vs. Apps Hungarian"). Systems Hungarian Notation was not Simonyi's original intent; however, it seems to be the one that's been most adopted. That said, I don't think Apps Hungarian Notation has merit either in modern code. Its theory is reasonable, but its approach of using abbrevs. isn't great. If yer going for clean code, then clearly and fully articulate your variable names where you can.

I think this is what Adam Tuttle is alluding to in his comment below, but I don't think he quite nails it.

AC 2016-12-30

A quick aside: either "qRecords" or "records" is a really crap name for a recordset, unless the data describes your old vinyl collection. ALL compound data structures are "records" really. Your variable name should reflect the data, not its structure or where it came from (eg: the database).  I say this only because I see a lot of code with variables like that.

Anyhow, a quick summary of likely prefixes would be as follows:

Data Type               Prefix  Example
string                  s       variables.sName = "Zachary"
list                    l       variables.lColours = "red,green,blue"
integer                 i       variables.iAge = 42
float                   f       variables.fRatio = 3.1415
boolean                 b       variables.bShowHeader = true
array                   a       variables.aPeople = ["Adam", "Zachary"]
date                    d       variables.dBirthday = createDate(2011, 3, 24)
query                   q       variables.qPeople
struct                  st      variables.stParams
xml                     x       variables.xAddress
wddx                    w       variables.wInvoiceLines
instance of an object   o       variables.oInvoice
(this is a bit plagiarised from my workplace, but... err... yeah well OK you caught me there [gallic shrug]).

I did my formal programming education back at the beginning of the 90s, with a lot of focus on C (and - chuckle - COBOL). One of the recommendations made in the C classes was to use Hungarian notation to make it clearer what variables represented. The practice kinda stuck with me. I think this is at least partially due to my nature being slightly one of "a place for everything, and everything in its place": I like rules, and I have had a tendency to place adherence to the rules above giving thought to whether the rule is valid (either at all, or just in the given context). I've found myself reflecting on this - I consider it a significant logic failing in the way my brain works, and it's intellectual laziness - so I've been questioning some of my SOPs recently.

Anyway, I bought into the notion that carrying the data type around in the variable name was useful, so even in my home/scratch coding, I use H/N. Equally this wasn't entirely without assessing various nay-saying opinions, and basically discarding them as facile or specious (which mean much the same thing, I guess) in the context of ColdFusion code - especially as CFML is "forgivingly" typed.

Recently we had a new starter at work: "new to us", not "a newbie", and as with any new starter moving into a senior position (or, hey, any position), he started questioning pretty much everything we do. I think we get some things right in his view, but he did raise a question mark over our usage of Hungarian notation. And for once he made a compelling case against it. When I say "for once", that's not a reflection on how compelling his cases usually are; I mean it's compelling in the context of the other cases against Hungarian notation.

Basically it boils down to the notion that one doesn't actually need to know the type of the data until one comes to use it. And then the context of how it's being used will explain what the data type is. Or that - even then - one really doesn't need to know that info all the time. For example, given a variable "invoiceLines", what data type is it? One can't say. But what one can say about it is:
  • they're invoice lines;
  • there's clearly more than one of them (the name is plural)...
  • ... so it's a collection of some sort.
That's without any indication of data type; it is - however - a fair appraisal of what's going on.

But let's pretend we're looking through some code, and we're still not savvy as to the data type of these invoice lines. And all we have to go on is:

invoiceLines = invoice.getInvoiceLines();

We need to do stuff with the variable, so need to know what data type it is. OK, if you just made that "get" call you must've had a reason to do so, and you looked up the API to see what method to call, and should have paid attention to the method's return type. So you should already know.

What if it was already being set, and you just need to do stuff with it? Well the variable must've been created for a reason. Look at the code nearby:

numberOfLines = arrayLen(invoiceLines);

It's an array.

numberOfLines = invoiceLines.recordCount;

It's a query.

numberOfLines = invoiceLines.getLineCount();

It's an object.

Right, there you go: when the variable comes to being used, one can generally identify what type it is by how it's being used. If it's not being used adjacent (ie: within a few lines) to where it's being created, your code probably needs refactoring: why is it being created if it's not being used?

If you can't tell even after this point, just dump the thing out and then you know. Write your code, and then your code will be clear to the next person as to what the type is.
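As a rough sketch of that "just dump it out" step: CFML's built-in writeDump() shows the whole value, and the is*() functions answer the narrower "what type is it?" question. The describeType() helper and the sample query here are hypothetical, made up purely for illustration; they're not from any real codebase.

```cfml
<cfscript>
// Hypothetical helper (not from the original post): names the broad CFML type of a value.
// isObject() is checked before isStruct() because a component instance can also pass isStruct().
function describeType(required any value) {
    if (isQuery(value))       return "query";
    if (isArray(value))       return "array";
    if (isObject(value))      return "object";
    if (isStruct(value))      return "struct";
    if (isSimpleValue(value)) return "simple value";
    return "something else";
}

// Stand-in data: an empty query playing the part of the invoice lines
invoiceLines = queryNew("description,amount", "varchar,decimal");

// One-word answer to "what is this thing?"
writeOutput(describeType(invoiceLines));

// Or just dump the lot and look at it
writeDump(invoiceLines);
</cfscript>
```

This is strictly a debugging aid: once you've looked, write the code that uses the variable, and that code then documents the type for whoever comes next.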

So far this was all just an intellectual exercise, and I have no control over the code I write in this regard during my day job, but I'm gonna drop H/N from my own coding. I've not used it in my sample code on this blog, and I don't think it's suffered any readability issues, at least in that regard.

What are your thoughts on this topic (and the readability of my code, too, for that matter)?

I've had commuting dramas this morning: the tube was down at my station, so instead of one 30 min tube journey on a single line all the way to the office, it's been bus, train, tube, and has taken an hour now. Still: enough time to write this blog entry I guess (on my phone... :-S), but now I need to crack on with some work.

[24 hours pass]

Bloody hell. Yesterday was a bit of a 'mare, and I didn't get a chance to proofread this and press "send". Then I had to go to the pub with the lads because we all deserved a pint, then 4022 pints later I got home, and collapsed in a heap. Now it's Saturday morning and I've just had a chance to proofread this, and am about ready to press "send". Am very pleased, btw, that I am off to watch some Olympic boxing (of all things... I actually have no interest in boxing. Or any knowledge about it at all, other than it involving people bashing the sh!t out of each other. Should be fun) today.