Thursday 25 July 2013

CFML: Scoping or not scoping?

G'day:
I'm back to talking about CFML today, rather than griping about stuff.

The other day I blogged about posting my UDF replaceWithCallback() on CFLib. Scott Busche made an observation on Code Review that they were surprised I don't scope the references to arguments within the code, eg:

string function replaceWithCallback(required string string, required string regex, required any callback, string scope="ONE", boolean caseSensitive=true){
    if (!isCustomFunction(callback)){ // for CF10 we could specify a type of "function", but not in CF9
        throw(type="Application", message="Invalid callback argument value", detail="The callback argument of the replaceWithCallback() function must itself be a function reference.");
    }
    if (!isValid("regex", scope, "(?i)ONE|ALL")){
        throw(type="Application", message="The scope argument of the replaceWithCallback() function has an invalid value #scope#.", detail="Allowed values are ONE, ALL.");
    }
    var startAt    = 1;

    while (true){    // there's multiple exit conditions in multiple places in the loop, so deal with exit conditions when appropriate rather than here
        if (caseSensitive){
            var found = reFind(regex, string, startAt, true);
        }else{
            var found = reFindNoCase(regex, string, startAt, true);
        }
        // etc

So I'm just using caseSensitive rather than arguments.caseSensitive. This is not a case of me being slack and not scoping my variables - I'm pretty obsessive about that - it's a conscious decision.

When functions are... err... executed (I guess), all the arguments are copied into the unnamed scope as well. So in that code I'm not referring to arguments.regex and relying on ColdFusion to find it for me when I just cite regex; I'm specifically referring to the variable regex. Depending on how CF does the copying, that could be a discrete copy of the initially passed-in argument, or it could just be an additional reference, but either way regex is a separate variable from arguments.regex.
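If you want to poke at that claim yourself, something like the following would do as a starting point. It's only a rough sketch: probe() and its variable names are made up for illustration and have nothing to do with replaceWithCallback(), and I'm deliberately not stating what the dump will show, as that's exactly the thing being tested and it may differ between CFML engines and versions.

```cfml
<cfscript>
// Hypothetical helper for illustration only
function probe(x){
    // assign to the argument WITHOUT a scope prefix
    x = "changed via the unscoped name";

    // capture what each style of reference sees after that assignment
    var result = {};
    result.unscoped = x;
    result.scoped   = arguments.x;
    return result;
}

// compare the two keys in the dump to see whether they have diverged
writeDump(probe("original"));
</cfscript>
```

If the two keys hold the same value, the unscoped name and arguments.x are pointing at the same thing; if they differ, they're discrete variables.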

I'm all for scoping variables as a matter of course, even generally variables-scoped variables in CFM files. I don't do this because of concerns over scope-hunting performance hits, or collisions in same, I just think it's good to know where a variable resides when I'm using it.

The one exception I have to this is in short tracts of code (all onscreen at once, and fairly straightforward): there I will not variables-scope things if the variable lives and dies within that file. You'll see this in my code examples on this blog. However if a variable comes from a different file (eg: via an include), then I will variables-scope it. This is just my annotation to myself as to what's going on. And this is just for my own code: at work we have a more rigid coding standard, and that stipulates all variables must be scoped all the time. So I follow that for my 9-5 code.



I did stop to think about the whole performance thing of not scoping my arguments in my functions. I'm not concerned about whatever performance penalty there might be, because it would be so inconsequential as to not matter. But once my interest is piqued, I like to know these things.

So, I banged out this code:

function f(x){
    return x;
}

function g(x){
    return arguments.x;
}

f(1);
g(2);

The difference here is the scoping on the reference to x.

I ran the code, grabbed the compiled Java, and decompiled it to see what was going on. It was interesting! Here's the relevant bit of f()'s decompiled code:

protected final Object runFunction(LocalScope __localScope, Object instance, CFPage parentPage, ArgumentCollection __arguments) {
    Object value;
    parentPage.bindImportPath("com.adobe.coldfusion.*");
    Variable ARGUMENTS = __localScope.bindInternal(Key.ARGUMENTS, __arguments);
    Variable THIS = __localScope.bindInternal(Key.THIS, instance);
    JspWriter out = parentPage.pageContext.getOut();
    Tag parent = parentPage.parent;
    Variable X = __arguments.getVariable(0);
    return parentPage._autoscalarize(X);
    return null;
}

And g()'s:

protected final Object runFunction(LocalScope __localScope, Object instance, CFPage parentPage, ArgumentCollection __arguments){
    Object value;
    parentPage.bindImportPath("com.adobe.coldfusion.*");
    Variable ARGUMENTS = __localScope.bindInternal(Key.ARGUMENTS, __arguments);
    Variable THIS = __localScope.bindInternal(Key.THIS, instance);
    JspWriter out = parentPage.pageContext.getOut();
    Tag parent = parentPage.parent;
    Variable X = __arguments.getVariable(0);
    return parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "X" });
    return null;
}

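There's a swag more code than that in each class, but this is the bit that represents the actual function code. And the only meaningful difference between the two is the first return statement: f() calls _autoscalarize(X), whereas g() calls _resolveAndAutoscalarize(ARGUMENTS, new String[] { "X" }).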

So the properly-scoped version - g(), the latter - is actually doing more work than the unscoped version, f()! That's counter-intuitive to say the least.

That said, there's not much in it, and I would not change any code or any coding practices based on this info. I'll be sticking to my coding standard(s)... and when coding, how clear the code is matters more than minutiae like this going on under the hood.
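If you did want to put a rough number on "not much in it", a crude timing loop along these lines would be one way to go about it. This is a sketch under assumptions, not a proper benchmark: the iteration and warm-up counts are arbitrary figures I've picked for illustration, getTickCount() only has millisecond resolution, and the results will vary with the CF version, the JVM and whatever else the box is doing.

```cfml
<cfscript>
// the same two functions as above
function f(x){
    return x;            // unscoped reference
}

function g(x){
    return arguments.x;  // scoped reference
}

// ASSUMPTION: arbitrary counts, chosen only to make the loop take measurable time
warmUp     = 1000;
iterations = 1000000;

// call both a few times first so neither pays a first-call cost inside the timed loops
for (i=1; i <= warmUp; i++){
    f(i);
    g(i);
}

// time the unscoped version
start = getTickCount();
for (i=1; i <= iterations; i++){
    f(i);
}
fTime = getTickCount() - start;

// time the scoped version
start = getTickCount();
for (i=1; i <= iterations; i++){
    g(i);
}
gTime = getTickCount() - start;

writeOutput("f() (unscoped): #fTime#ms<br>g() (scoped): #gTime#ms");
</cfscript>
```

Run it a few times and swap the order of the two loops before reading anything into the figures: with functions this trivial, most of the elapsed time is likely to be loop and call overhead rather than the scope lookup itself.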

I've got some more to say on scopes within functions, but that'll need to wait until after work, later today.

Righto.

--
Adam