Wednesday, 19 December 2012

When functions do too much, and how to get them to do more

G'day:
After soliciting more UDFs for CFLib a while ago, I've been dead slack at processing the inbound queue, and fixing some bugs people have found. Sorry about that. I started playing catch-up at lunchtime today: you might have noticed a coupla bugfixes bubbling through on Twitter if you follow the @cfbloggers feed.


I started the test/approval process for a new UDF "isArrayOfStructs()", which - on the face of it - performed a reasonably useful task: validate that an array is an array of structs. It's not the most useful function in the world, but I can see how it might help people from time to time. A UDF on CFLib does not need to be the greatest thing ever: if you have a generic UDF that does something handy, it is good to share that sort of thing.

Anyway, whilst checking the function out, I noticed as well as validating if the passed-in array was an array of structs (as per the UDF name), it also:
  • had an optional flag to permit an empty array to be considered "valid";
  • had an optional flag to only permit structs of simple values to be considered "valid".
This made me pause for thought, and have a word to the lads in the office.

Firstly, I rejected the notion that an empty array is a valid "array of structs". It's not. It has no structs in it, so it's not an array of structs. That was easy (and the people I polled unanimously agreed). So I ditched that.

The second point was a more interesting one. I can see why the author of the UDF could well have had this specific requirement, but it is too specific for a UDF called "isArrayOfStructs()". It makes the name of the function misleading: it's not simply testing if the passed-in object is an array of structs, it also does this additional validation, which - really - is unrelated to the main validation being done. The name of a function should always reflect what it does, which is a good self-limiting notion in my view. If the UDF needs to be called something like isValidArrayOfStructsWhereTheStructsHaveSimpleValues(), then the function is doing too much. It also suggests the function is too specialised.

So unfortunately we're going to get rid of that bit too. It still leaves a handy UDF though, and I'll get it up on CFLib this evening.

We continued to discuss this though, and agreed it's definitely legit to want to know the composition of the elements in an array: they might all need to be simple values, or might need to be instances of a Animal CFC, or might need to be a struct with keys "firstName", "lastName" and "emailAddress". Or anything like that. One of the problems with CF being so loosely-typed is there's no control over what goes into an array. Which bites if you need this control: one cannot make assumptions about the composition of an array.

What we decided would be cool is to make a generic isArrayOf() UDF, which takes an array and a callback which is then used to check that each element is indeed valid according to [some rule]. This proved to be remarkably easy:

function isArrayOf(object, validator){
    if (!isArray(object)){
        return false;
    }

    for (var element in object){
        if (!validator(element)){
            return false;
        }
    }
    return true;
}
That's that. The function itself is just the infrastructure for "validating an array", it does not busy itself with "an array of what?" That's up to the calling code to provide, eg:

numbers = [
    {number=1, english="one", maori="tahi"},
    {number=2, english="two", maori="rua"},
    {number=3, english="three", maori="toru"},
    {number=4, english="four", maori="wha"}
];

function numbersValdiator(number){
    return structKeyExists(number, "number") && structKeyExists(number, "english") && structKeyExists(number, "maori");
}

writeOutput(isArrayOf(numbers, numbersValdiator));

numbersValidator() is a very simple - and somewhat contrived - validator, but obviously it could be as complex as one likes.

Also note I've written that code to run on my CF9 server I have at work, but with CF10 one could write that validator inline with the function call if that was appropriate. I don't personally like that technique so much, but a lot of people do (and there is a time and place for it, even IMO).

Back to the original UDF candidate on CFLib: we could easily implement its additional validation rule, that the keys must all be simple values:

function allSimpleValues(struct){
    for (var key in struct){
        if (!isSimpleValue(struct[key])){
            return false;
        }
    }
    return true;
}

This demonstrates the neat decoupling of what's needed to validate an array, and what's needed to validate each element of the array.

Indeed - now that I think about it - one could use this to provide the isArrayOfStructs() functionality with a wrapper for CFML's isStruct() function:

function isStructWrapper(struct){
    return isStruct(struct);
}

Nice one.  I'll finish sorting out that isArrayOfStructs() UDF on CFLib later this evening, but first things first: I'm expected at the pub. So I better scarper.

Before heading off I'd like to thank a few of the guys in the office for helping me with this: Simon, Duncan and JayBo. It was a bit of a good team effort going over the code, coming up with the idea of the general validator, and basically the topic for this article. Cheers chaps.


--
Adam

Update, after the pub: the isArrayOf() function has been added to CFLib too.