Friday, 20 November 2015

ColdFusion: a piece of functionality should do one thing, and do it well

One of my pet hates is having a function called doX(), and doX() does X, but it also does Y.

An example of this is CFML's isValid() function which not only checks validity - as one might expect - but also trims any string values it's validating first. The function is not called isValidAfterTrim(), so it should not do the trimming. If one wants to trim a value before validating it... call trim() then isValid(). Simple. Another example is <cfdump> which picked up an abort attribute along the way. This was because some mungbean said "well usually I want to dump and abort, so why can't dump just abort too". And for some stupid reason Adobe went "oh yeah... let's do that". No. If you want to abort after you dump... do an abort after your dump. Don't change <cfdump> to be <cfdumpandabortifyoulike>. That's dumb. At least the Lucee ER to have <cfinclude> also do an abort got rejected ("Add optional abort Attribute to cfinclude"). FFS.
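To make the "call trim() then isValid()" point concrete, here's a minimal sketch. The variable names and the sample value are mine, purely for illustration; the only built-ins used are trim(), isValid(), <cfdump> and <cfabort>.

```cfml
<cfscript>
// Hypothetical input with stray whitespace (made up for this example).
rawEmail = "  someone@example.com  ";

// One job: trimming. The caller decides whether it happens at all.
email = trim(rawEmail);

// One job: validating the value it is actually given.
emailIsValid = isValid("email", email);
</cfscript>

<!--- Same idea for dumping: two tags, two jobs. --->
<cfdump var="#emailIsValid#">
<cfabort>
```

Each step is visible at the call site, so nobody has to remember which built-in quietly does a second thing.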

The reason for this is to keep units of functionality small, for starters. It makes them easier and more coherent to test. Also the more functionality one piles into a function (and for this purpose, a tag is just a function that looks funny ;-), the more potential for it being broken as it's maintained. Plus it makes code less clear and less clean.

A function should do one thing, and do it well.

Anyway, today I got wind of this ER: "encodeFor attribute for cfoutput, writeOutput". The intent is honourable, but it suffers from trying to do too much. The gist of it is:

While ColdFusion 10 added the various ESAPI encodeFor* functions, it is dependent upon the developer to properly wrap location where used with the appropriate function (e.g. <cfoutput>#EncodeForHTML(</cfoutput> ). Adding an attribute encodeFor negates the need for wrapping individual variables and would process the entire block contained within <cfoutput></cfoutput> for anything within #'s with the appropriate ESAPI EncodeFor* function specified.

On the face of it this sounds reasonable. It still smacks of "doing too much", so I said as much.

I sometimes wonder whether I'm being too dogmatic about these things, so I asked on the #CFML Slack channel, and John Whish came up with a great counter example:

<cfoutput encodeFor="HTML">
    <a href="whatever.cfm?" onclick="get(">Go to whatever.cfm?</a>
</cfoutput>

What's this gonna do? Encode all the expressions within it for HTML. That's not what's needed here: the expressions should be encoded for URL, JavaScript and URL for each of their three respective uses. However a less than discerning dev might not "get" that, and now their output is encoded incorrectly.

So what we'd need to then do is this:

<cfoutput encodeFor="HTML"></cfoutput>
    <a href="whatever.cfm?id=<cfoutput encodeFor="URL"></cfoutput>" onclick="get(<cfoutput encodeFor="JavaScript"></cfoutput>)">Go to whatever.cfm?id=<cfoutput encodeFor="URL"></cfoutput></a>

What a mess. What we could do is this:

    <a href="whatever.cfm?id=#encodeForUrl(" onclick="get(#encodeForJavaScript(">Go to whatever.cfm?id=#encodeForUrl(</a>

Look familiar? Yeah... we already have the correct (and safe) implementation in place.
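For what it's worth, here's that per-expression approach spelled out in full, as a sketch only. The variable someId and its value are hypothetical, and the quotes around the JavaScript argument are my assumption about what get() expects; the encodeFor*() calls are the ones ColdFusion 10 added.

```cfml
<!--- Hypothetical value containing characters that matter in each context. --->
<cfset someId = "a&b <c>'d">

<cfoutput>
    <!--- Each expression is encoded for the context it lands in: URL, JavaScript, URL. --->
    <a href="whatever.cfm?id=#encodeForURL(someId)#" onclick="get('#encodeForJavaScript(someId)#')">Go to whatever.cfm?id=#encodeForURL(someId)#</a>
</cfoutput>
```

The encoding decision sits right next to each expression, which is exactly where a single encodeFor attribute on the enclosing <cfoutput> can't put it.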

We don't need to mess with something that's supposed to just output, and make it do something different as well. It's a bad approach to designing the language.

I reckon this ER should be pulled. It's a pity the work has already been done...