Wednesday, 26 September 2012

Callbacks, function expressions, IIFEs, delegates (OK, and closure I s'pose)

G'day
Right, so after a coupla delays, here's the article I threatened you with that's not about closure. Obviously it actually is going to mention closure, but consigned to an afterthought (where, IMO, it kinda belongs).

So what is it about? Well it's about a technique that's been possible in ColdFusion since CF5 but is not obvious so is under-utilised, a handy new syntactical construct that was added in CF10, and another one that should have been in CF10 but isn't, something for CF11... and a footnote.


Callbacks

A callback is a function that one passes in to another function. Why would one want to do this? Well sometimes the function taking the callback is performing some generic action, but to perform that generic action in a useful fashion it also needs to have some specialisation logic passed in to it. Huh? Well consider a sorting process: as well as the actual sorting algorithm (eg: bubble sort or quicksort or something), which is the job of the sort function itself, to sort data one also needs to know "what it means to be sorted, as far as that data goes": some operation that compares one item to another and determines whether they are in order. An obvious example is that numbers sort differently than alphabetic data: this is why arraySort() - for example - takes an argument to specify "text", "textnocase" or "numeric".

But what about if you have a collection of "people" structs to sort? None of arraySort()'s sorting options will help here: ColdFusion cannot be expected to know that you want to sort the people first on family name, then on forename(s), then on date of birth (or something). The most obvious solution is to create a specialised function for sorting people structs, eg: sortArrayOfPeopleStructs(). You can hopefully already tell just from the function name that this is not the way to go: it's too specialised.

And what if you change the data structure to be an array of objects? Or need an additional sort to sort by name and location? It's not much hassle to duplicate the first function and change the name/name/DOB comparison logic to name/name/location comparison logic, but this approach sucks because it means duplicating the sort algorithm bit too. And what happens when you decide to upgrade your algorithm from a bubble sort to a quicksort? You need to refactor all your sort functions. Like I said: this approach sux.

Implementing your logic to decouple the sort algorithm from the algorithm that determines "what it means to be sorted" for your given data type:
  1. reduces the risk of refactoring
  2. simplifies your logic.

Consider this (pseudo ~) code:
sort(dataToSort, comparator){

    // sort algorithm stuff
    
    doINeedToSwapTheseItems = comparator(first, second);

    // swap the items if need be
    
    // more sort algorithm stuff
    return sortedData;
}

comparePeople(firstPerson, secondPerson){
    // I hope I get this right... ;-)
    if (firstPerson.familyName > secondPerson.familyName){
        return true;
    }else if (firstPerson.familyName == secondPerson.familyName) {
        if (firstPerson.foreName > secondPerson.foreName){
            return true;
        }else if (firstPerson.foreName == secondPerson.foreName) {
            if (firstPerson.dob > secondPerson.dob){
                return true;
            }
        }
    }
    return false;
}


sortedPeople = sort(unsortedPeople, comparePeople);


Now you have two simplified functions: one that sorts (and can sort anything), and one that knows how to compare two people and determine whether they are in order. Individually they don't achieve much, but because the sort function takes the comparator function as a callback, you're away laughing. And if the comparison requirements change, one can just write a new comparator. Nice.
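To make the pseudo-code above concrete, here's a minimal runnable sketch in JavaScript (not CFML). The bubble sort, the sample people and the second comparator (compareByLocation) are all assumptions made up for illustration; the point is only that sort() never changes when the comparator does.

```javascript
// Generic bubble sort: knows how to sort, but not what "sorted" means.
// needsSwap(a, b) returns true when a should come after b.
function sort(dataToSort, needsSwap) {
    const sorted = dataToSort.slice(); // copy, so the input is left alone
    for (let end = sorted.length - 1; end > 0; end--) {
        for (let i = 0; i < end; i++) {
            if (needsSwap(sorted[i], sorted[i + 1])) {
                const held = sorted[i];
                sorted[i] = sorted[i + 1];
                sorted[i + 1] = held;
            }
        }
    }
    return sorted;
}

// Comparator: family name, then forename, then date of birth.
function comparePeople(first, second) {
    if (first.familyName !== second.familyName) {
        return first.familyName > second.familyName;
    }
    if (first.foreName !== second.foreName) {
        return first.foreName > second.foreName;
    }
    return first.dob > second.dob;
}

// A second (hypothetical) comparator: no change to sort() required.
function compareByLocation(first, second) {
    return first.location > second.location;
}

// Made-up sample data.
const unsortedPeople = [
    {familyName: "Smith", foreName: "Zoe", dob: "1980-01-01", location: "Wellington"},
    {familyName: "Jones", foreName: "Bob", dob: "1975-06-15", location: "London"},
    {familyName: "Smith", foreName: "Amy", dob: "1990-03-20", location: "Auckland"}
];

const sortedPeople = sort(unsortedPeople, comparePeople);         // Bob, Amy, Zoe
const sortedByLocation = sort(unsortedPeople, compareByLocation); // Amy, Bob, Zoe
```

Swapping the bubble sort for a quicksort later would mean touching sort() only; both comparators would carry on working untouched.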

Callbacks are utilised extensively in collection-processing functions, where one wishes to perform a specific action - whatever that might be - on each element of the collection. ColdFusion 10 adds a bunch of callback-enabled functions to process various collection data types, eg: arraySort() for sorting an array, arrayFilter() for filtering elements out of an array based on criteria specified in a callback, and arrayEach() as a generic "iterate and do something" function. There are others for structs and lists too. And, of course, one can write one's own functions and methods that take callbacks too (Mark Mandel has done some excellent work with his Sesame project). A function is just a variable in ColdFusion, and can be passed around like any other variable.
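For comparison, here is the same sort / filter / each trio using JavaScript's built-in array methods. These are JavaScript's functions, not the CF10 ones, so treat this purely as a sketch of the shape of the idea: in every case the collection function does the iterating, and the callback supplies the specifics.

```javascript
const numbers = [5, 3, 8, 1];

// sort: the callback says how two elements compare
const ascending = numbers.slice().sort(function (a, b) { return a - b; });

// filter: the callback says which elements to keep
const bigOnes = numbers.filter(function (n) { return n > 4; });

// forEach: the callback is just "do this with each element"
const doubled = [];
numbers.forEach(function (n) { doubled.push(n * 2); });
```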

Function expressions

Prior to CF10, the only way to create a function variable in ColdFusion was to declare it using the function keyword, or a <cffunction> tag. For most purposes this is fine. If you're familiar with Javascript you are probably aware that as well as having this declarative syntax, one can also create a function variable via an expression:


// using a declaration
function doSomething(withSomething){
    // doSomething withSomething
    return somethingElse;
}

// the same function, using an expression
doSomething = function(withSomething){
    // doSomething withSomething
    return somethingElse;
}; // note the semi-colon (I always forget it)


CF10 added this syntax to CFML as well. This means that wherever one might have an expression, one can now define a function as the expression's value.
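As a rough illustration of "wherever one might have an expression", here's a JavaScript sketch with a function expression as an object value, as an array element and as a call argument. The names (maths, steps) are made up, and whether CF10 accepts a function expression in every one of these positions is an assumption that isn't verified here.

```javascript
// as a value in an object literal
const maths = {
    double: function (n) { return n * 2; }
};

// as an element of an array
const steps = [
    function (n) { return n + 1; },
    maths.double
];

// as an argument in a function call: run 5 through each step in turn
const result = steps.reduce(function (value, step) { return step(value); }, 5);
// (5 + 1) * 2 = 12
```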

This is of limited use in my opinion, because its primary usage is generally to define a callback inline with the call to a function that takes the callback. This is very common when using JQuery (for example). To be frank I think it's a pretty lazy and ill-organised coding style, and - to me - is roughly analogous to the idea of peppering UDFs around one's codebase in an ad hoc fashion rather than grouping them in at least a library, if not a CFC. However it's a really popular style, so it's important to be aware of it. Here's an example:


// definition
function doSomething(Data something, function someFunction){
    var somethingElse = someFunction(something);
    return somethingElse;
}

// usage
result = doSomething(
    someData,
    function(stuffToHandle){
        // process stuffToHandle
        // return something
    }
);


Personally - as I said - I'd tend to define the callback in a more organised way, and then pass it by name:

// function definition
function doSomething(Data something, function someFunction){
    var somethingElse = someFunction(something);
    return somethingElse;
}

// callback definition (in a CFC or a library or just anywhere else other than inline)
function stuffHandler(stuffToHandle){
    // process stuffToHandle
    // return something
}

// usage
result = doSomething(someData, stuffHandler);

But for scratch code the inline approach is OK.

Another area where I think function expressions are more useful is that one can have a function within a function. This isn't perhaps something that one would need to do that often, but it's handy to be able to do it. On the weekend I was writing a UDF for CFLib which handled conversions between arbitrary bases (here it is). Two of the arguments needed a small amount of parsing before use, the parsing being the same for both. The code to do it was a switch statement, thus:

baseDefinition = {};
switch (base){
    case "BIN":        baseDefinition.digits="01"; break;
    case "DEC":        baseDefinition.digits="0123456789"; break;
    case "HEX":        baseDefinition.digits="0123456789ABCDEF"; break;
    case "BASE36":     baseDefinition.digits="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; break;
    case "BASE62":     baseDefinition.digits="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"; break;
    default:           baseDefinition.digits=base; break;
}
baseDefinition.base = createObject("java", "java.math.BigInteger").init(javaCast("String", len(baseDefinition.digits)));



Now I could have just repeated the code, which would work but is not very DRY, or I could have farmed it out to an external private function if it was in a CFC. But this wasn't possible (CFLib UDFs need to be self-contained), and also the function would have no purpose outside of that precise context, so even in a CFC situation it was not a good fit for a private "helper" method. The correct (IMO) approach was to have an inner function within my UDF. Which function expressions enable me to do. Cool:

string function baseMToBaseN(required string number, required string fromBase, required string toBase){
    // other stuff snipped for brevity
    var getDigitsForBase = function(base){
        var result = {};
        switch (base){
            // etc
        }
        // etc
        return result;
    };

    var from    = getDigitsForBase(fromBase); 
    var to      = getDigitsForBase(toBase); 
    
    // rest of function
}



This will help me with stuff on CFLib because some of the functionality I've wanted to share never quite fit into a single UDF for much the same reason as with this example.
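Here's a cut-down, runnable sketch of that inner-function shape in JavaScript. It is not the CFLib UDF: the conversion logic, the use of BigInt in place of java.math.BigInteger, and the trimmed list of named bases are all assumptions made to keep the example short.

```javascript
function baseMToBaseN(number, fromBase, toBase) {
    // inner helper: only meaningful inside this function, used twice below
    const getDigitsForBase = function (base) {
        const named = {
            BIN: "01",
            DEC: "0123456789",
            HEX: "0123456789ABCDEF"
        };
        // anything that isn't a known name is treated as a custom digit string
        const isNamed = Object.prototype.hasOwnProperty.call(named, base);
        const digits = isNamed ? named[base] : base;
        return {digits: digits, base: BigInt(digits.length)};
    };

    const from = getDigitsForBase(fromBase);
    const to = getDigitsForBase(toBase);

    // source digits -> integer
    let value = 0n;
    for (const digit of number) {
        const position = from.digits.indexOf(digit);
        if (position < 0) {
            throw new Error("Invalid digit: " + digit);
        }
        value = value * from.base + BigInt(position);
    }

    // integer -> target digits
    if (value === 0n) {
        return to.digits[0];
    }
    let result = "";
    while (value > 0n) {
        result = to.digits[Number(value % to.base)] + result;
        value = value / to.base; // BigInt division truncates
    }
    return result;
}

const asHex = baseMToBaseN("255", "DEC", "HEX");    // "FF"
const asBinary = baseMToBaseN("FF", "HEX", "BIN");  // "11111111"
```

The helper is invisible to the calling code, which is the whole point: it has no life outside baseMToBaseN().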

Immediately-invoked function expression

CF10 doesn't support these. I asked for them "at the appropriate time" (NDA prevents me from saying more on that), but they missed the cut. Maybe for CF11 (enhancement request raised as 3346435).

But what are they?

In Javascript, one can do this:

(
    function(message){
        var uppered = message.toUpperCase();
        document.write("Inside: "+ uppered + "<br />");
    }
)("Hello World");



// demonstrate the variable is gone
if ("uppered" in window){
    document.write("After: "+ uppered);
}else{
    document.write("After: not set");
}

Output:

Inside: HELLO WORLD
After: not set


Huh? (Well that was my reaction when I first saw this code, but I am not that experienced with JS... something I am actively working on these days).

What's going on here is that we're defining a function, executing it immediately and discarding it. The syntactical conceit here is the double usage of parentheses. The first set wrap the function so that it's treated as an expression rather than a declaration (it's only called once, so it doesn't need a name), and the second set are the normal parentheses one uses in a function call. This anonymous function call does the same thing as this named-function call (the only difference being that here the function hangs around afterwards as greeting):

greeting = function(message){
    var uppered = message.toUpperCase();
    document.write("Inside: "+ uppered + "<br />");
};

greeting("Hello World");


What would be cool about this if ColdFusion supported them is that all the variables declared as local within the function only live for the lifetime of the function's execution, then fall out of scope. This means one wouldn't pollute the variables scope of one's CFM code with random variables that were only ever needed in one file. This is not a huge concern, but if you're like me and don't like clutter, this is a godsend. Well it would be if ColdFusion supported it. One thing I've always wanted is a file-local scope (like the local scope in functions, except accessible only to the CFM file they're declared in: not included files, not subsequently executed files, just the very file they're declared in). This would be a work-around for that.  And even within a single file, it's quite nice to have the ability to have a localised "scope".
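A minimal JavaScript sketch of that localised "scope" idea: the IIFE hands back only the value one actually wants, and the working variable never escapes. This is JavaScript behaviour only; nothing here is claimed about how a future CFML implementation would work.

```javascript
const greeting = (function (message) {
    var uppered = message.toUpperCase(); // lives only inside the function
    return "Inside: " + uppered;
})("Hello World");

// typeof is safe to use on a name that was never declared out here
const upperedEscaped = typeof uppered !== "undefined";
// greeting is "Inside: HELLO WORLD", upperedEscaped is false
```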


Delegates

This is something else not implemented in CF10 although perhaps should have been (enhancement request: 3346444).  This time it did not occur to me to request them (mostly because I was only superficially aware of the concept at the time). One shortfall of CFML's support for callbacks is that one can only be as precise as saying the callback needs to be a function.  However it's seldom going to be the case that that is the full extent of the requirements of the callback function.  Taking the earlier sorting function as an example, the comparator callback function needs to fulfil this function prototype:

Boolean function(required any, required any)

This ain't a great example because the two arguments can be anything, but the key thing is it takes two required arguments, and returns a boolean.  If the function does not do that, then it cannot be used as this callback (because the function using it will break).  The way callbacks are implemented in ColdFusion makes it impossible to specify the callback's requirements.

This is where delegates come in.  A delegate is to a function what an interface is to a CFC: it specifies a minimum requirement level.

So one might have this:

public Boolean delegate Comparator(any firstElement, any secondElement);

And in the function requiring the callback, have this:
public any sorter(any toSort, Comparator comparator);

This means that if we tried to call sorter() like this:

String function comparator(string first, string second){
    // etc
}

sorted = sorter(myData, comparator);

It'll error with something along the lines of "invalid callback: the callback passed to sorter() was not of type Comparator" (because a Comparator needs to return a boolean, and the function we're trying to use returns a string).  Just the same as if one passes any other sort of invalid argument type to a function.

Not earthshattering, but it would be useful for tightening up code, and for reporting any errors closer to their source than to their side-effect.
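Without language support, the closest one can get is checking the contract by hand. The sketch below does that in JavaScript with a hypothetical asComparator() helper: argument count is checked up front, the return type each time the callback runs. It's a runtime approximation only; a real delegate would be enforced by the language when the argument is passed.

```javascript
// Hypothetical helper: wraps a callback and enforces the Comparator contract.
function asComparator(callback) {
    if (typeof callback !== "function" || callback.length !== 2) {
        throw new TypeError("Comparator must be a function taking two arguments");
    }
    return function (first, second) {
        const result = callback(first, second);
        if (typeof result !== "boolean") {
            throw new TypeError("Comparator must return a boolean");
        }
        return result;
    };
}

function sorter(toSort, comparator) {
    const needsSwap = asComparator(comparator); // fails here, not mid-sort
    const sorted = toSort.slice();
    for (let end = sorted.length - 1; end > 0; end--) {
        for (let i = 0; i < end; i++) {
            if (needsSwap(sorted[i], sorted[i + 1])) {
                const held = sorted[i];
                sorted[i] = sorted[i + 1];
                sorted[i + 1] = held;
            }
        }
    }
    return sorted;
}

const sortedNumbers = sorter([3, 1, 2], function (a, b) { return a > b; });
```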

Closure

One of the side effects of the way ColdFusion has implemented function expressions is that functions declared that way form a closure in some situations.  The Wikipedia link explains what closure is far better than I can be bothered trying to, but my reading of it is that if you define a function via an expression within another function, then any variables from the parent function that are referenced in the inner function are "closed over", and their value at the point in time the function is defined persists for the life of the function.  Make sense?  Yeah, probably not.  Here's a quick demonstration.

makeGreeting = function(greeting){
    return function(who){
        writeOutput(greeting & ", " & who & "<br />");
    };
};

sayHello = makeGreeting("Hello");
sayHi    = makeGreeting("Hi");

sayHello("Adam");
sayHi("Zachary");

Here's the perennial "make a greeting function using a closure", which is a banal example, but it does the trick.

Here the value of greeting (which is a local variable in makeGreeting()) is closed over or enclosed or whatever one wishes to say within the function makeGreeting() returns, so when sayHello() or sayHi() are called later, they still have the value that was passed into makeGreeting() when they were created. Yeah: neat.
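One wrinkle worth seeing in code: in JavaScript, at least, what gets closed over is the variable itself rather than a frozen copy of its value, so the inner function can update it between calls. The makeCounter() below is a made-up example, and whether CF10 behaves identically is an assumption not tested here.

```javascript
function makeCounter(label) {
    let count = 0; // closed over: private to each counter that gets made
    return function () {
        count = count + 1;
        return label + ": " + count;
    };
}

const apples = makeCounter("apples");
const pears = makeCounter("pears");

apples();
apples();
const applesNow = apples(); // "apples: 3"
const pearsNow = pears();   // "pears: 1"
```

Each call to makeCounter() gets its own count, in the same way each function made by makeGreeting() gets its own greeting.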

To be honest: this really is not very interesting or useful in ColdFusion. In other languages closures make sense or are necessary to do stuff, but I have not found a situation where this would be the case in CF. That is not to say there is no situation in which they'll be the best approach, but in all the scouring of Google I have done, the things Javascript or other languages quite legitimately use closure for don't apply so much to ColdFusion.  Javascript people seem to mostly use them to force a square peg into a round hole: pretending JS is object-oriented.  In functional languages they seem to be useful for stuff I don't really understand (which describes anything to do with functional programming, in the context of "me" and "understanding it"), but are down to vagaries of functional programming which don't apply to CF.  Caveat: this is a nebulous assertion, so please prove / demonstrate me wrong here.

I've also scoured Google for good, real world examples of how closure might actually be useful in a CF context, and I have drawn mostly a blank.  There's a few other CF blogs that have articles on them, but either a) they are not really discussing the closure part of what's being described in CF10 as "Closures" (they discuss function expressions and callbacks, mostly), or b) the examples demonstrate how closure works, but are more "proof of concept" than anything one might actually want to do in CF.  Most examples are based on how to say hello in different ways (as per the example in the Adobe docs, which I actually gave to Adobe as an example of the best I could come up with, and hoped they could do better.  Apparently not.  My version was saying hello in Maori though, obviously, not Hindi ;-).

Here are a coupla examples anyhow.  I hasten to add I probably wouldn't actually use such code, but it's a more "real" example than saying hello in Hindi or Maori.

makeTimer = function(name){
    var startTick = getTickCount();
    return function(milestone){
        return "#name# (#milestone#): #getTickCount()-startTick#ms<br />";
    };
};

And an example of using it:

firstTimer = makeTimer("Started first");
writeOutput(firstTimer("Start"));

sleep(1000);

secondTimer = makeTimer("Started later");
writeOutput(firstTimer("First Milestone"));
writeOutput(secondTimer("Start"));

sleep(1000);

writeOutput(firstTimer("Second Milestone"));
writeOutput(secondTimer("First Milestone"));

And the output:

Started first (Start): 0ms
Started first (First Milestone): 1000ms
Started later (Start): 0ms
Started first (Second Milestone): 2000ms
Started later (First Milestone): 1000ms

So what this does is when the timer functions are created the current tick count is "recorded", and used as a baseline for each subsequent call to the function.  It demonstrates that each timer function is independent of the others, and remembers what the startTick was when it was individually created.  This is not entirely useless, I guess.

Here's another pretty contrived example.  This one creates functions which wrap text in the pre-specified HTML formatting (that'll make sense when you see the code):
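A minimal sketch of that sort of wrapper factory, in JavaScript rather than CFML. This is a stand-in to show the idea, not the CFML listing itself: makeWrapper(), the tags chosen and the sample text are all assumptions.

```javascript
// Returns a function that wraps text in the tags fixed at creation time.
function makeWrapper(openTag, closeTag) {
    return function (text) {
        return openTag + text + closeTag; // both tags are closed over
    };
}

const heading = makeWrapper("<h1>", "</h1>");
const paragraph = makeWrapper("<p>", "</p>");
const listItem = makeWrapper("<li>", "</li>");

const items = ["Tahi", "Rua", "Toru", "Wha"].map(function (word) {
    return listItem(word);
});

const markup = heading("Hello World")
    + paragraph("A contrived example")
    + "<ul>" + items.join("") + "</ul>";
```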


This is a bit silly, but... well... it works.  And I could kinda see how one might maybe do this sort of thing to simplify & standardise outputting mark-up. If, like, just doing the mark-up wasn't an adequate approach of arriving at the same end.  Anyway, it's a better example than saying hello or doing a count down.

Oh, to prove it works, the output is:

Hello World

This is a fairly contrived example of closure in CFML
TahiRuaToruWha

Do you know what?  I spent three separate sessions looking through Google search results trying to find good web-centric (so suitable for CF) examples of closure use which weren't just Javascript examples working around the limitations of JS, or using JQuery.  And I saw no good examples from other languages which apply to a non-functional language like CF.  And these were not 5min sessions.  Each was spread over a coupla hours.  I read a lot of stuff and checked an awful lot of search results.

I think they're mostly hype, to be honest.  Do not get me wrong: I like function expressions and callbacks, and even the ability to use function expressions to write inline callbacks to a certain extent.  And if CF could have proper anonymous functions and delegates too: cool!  But all this could have been effected without the function expressions being implemented to use closure.  So I think the marketing message on this new functionality has been overstated by both Adobe and people having written up their investigations (the latter to a lesser extent).

But, hey, I could be missing something.  And secretly hope I am (OK: it's now not a secret), because I'd love to have that watershed moment in which I go "oh right: actually closure in CF merits the hype".  So lemme know.  I'd love to see some decent, practical, real world examples of them being put into use.  And - oddly - I rather like being proven wrong.  Because it means I'm about to learn something new.  Cool.

Right.  Enough.  Time for food.

--
Adam

PS: oh, hey, all this stuff works in Railo 4.x too.