Friday, 26 July 2013

ColdFusion hoists VAR declarations

G'day:
Whilst messing around with how scopes work in ColdFusion functions, I came across something I didn't know: like JavaScript, ColdFusion hoists its VAR declarations. This could be common knowledge, but I certainly didn't know.
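For anyone who hasn't run into hoisting on the JavaScript side, here's a minimal sketch of what it looks like there, written as TypeScript so the types are explicit. The function name hoistDemo is made up for illustration. The point is only that the declaration of foo is visible from the top of the function, whilst the assignment stays where it was written.

```typescript
// Illustrative sketch of JavaScript-style `var` hoisting (not CFML).
// hoistDemo is a hypothetical name used only for this example.
function hoistDemo(): string[] {
    const seen: string[] = [];

    // The `var foo` declaration further down is hoisted to the top of the
    // function, so referencing foo here is legal. The assignment is NOT
    // hoisted, so the value at this point is undefined.
    seen.push(String(foo));

    var foo: string | undefined = "var";
    seen.push(String(foo));

    return seen;
}

console.log(hoistDemo());
```

The first entry comes back as the string "undefined" and the second as "var": the name exists throughout the function, but it only gets its value on the line where the assignment sits.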

Here's some code that shows how an unscoped variable within a function interacts with a same-named variable in the arguments scope, and what unscoped references refer to.

Update:
Ray pointed out the way I had originally done the code was very hard to read (harder than usual ;-). I'd tried to take a sort of two-column approach so all the writeOutput() statements were off to the right next to the statement they were outputting stuff for, not underneath. This made it much easier to follow the code in a text editor, but it looked whack when it was in the blog. I didn't notice when I was proofreading. Anyhow, I've sorted it out now. Cheers for the nudge, Ray. And note to any other readers: do let me know if I screw anything up like this!


function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    var foo    = "var";
    writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    return {arguments=arguments,foo=foo};
}

result = f(foo="arg");
writeDump(var=result);

So we have an argument / variable called foo, which we variously set by passing it as an argument, declaring it as a VAR, or assigning it unscoped. The output is this (today, brought to you via cflive.net):

Top: arguments: arg; unscoped: arg
After var: arguments: arg; unscoped: var
After unscoped: arguments: arg; unscoped: unscoped
object
    arguments [object]
        object
            foo [string] arg
    foo [string] unscoped

So we see that a VAR-defined variable is a different variable from a same-named arguments-scoped variable. And, unsurprisingly, unscoped references are references to the VAR-defined variable. See, Adobe, how it's completely legal to have a VAR with the same name as an argument? Hmmm.

So far, so good. As a contrast, here's a variation of that code without the VAR statement:

function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    //var foo    = "var";
    //writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    return {arguments=arguments,foo=foo};
}

result = f(foo="arg");
writeDump(var=result);

This demonstrates slightly different behaviour on ColdFusion:

Top: arguments: arg; unscoped: arg
After unscoped: arguments: unscoped; unscoped: unscoped
object
    ARGUMENTS [object]
        object
            foo [string] unscoped
    FOO [string] unscoped

So in this case - where there's no VAR-declared variable - an unscoped reference to foo refers to the arguments-scoped one. Fair enough.

Interestingly, Railo behaves differently here:

Top: arguments: arg; unscoped: arg
After unscoped: arguments: arg; unscoped: unscoped
object
    arguments [object]
        object
            foo [string] arg
    foo [string] unscoped

Here initially an unscoped reference finds the arguments scope - using scope hunting - but declaring a variable without a scope reference creates a new variable. I think this is actually correct behaviour (if Adam Tuttle's commentary on scopes is correct, and I have no reason to think it isn't): an unscoped variable look-up will scope hunt, but an unscoped variable assignment should always create a variable in the variables scope. Let's modify that code to look at what goes into the variables scope:

function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    //var foo    = "var";
    //writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    return {arguments=arguments,foo=foo,variables=variables};
}

result = f(foo="arg");
writeDump(var=result);

Ooh dear... that just gives a stack overflow error on Railo! I guess it's because result ends up with a reference to variables in it, and variables has a reference to result in it... which kinda makes <cfdump> swallow itself. So another quick modification of the code will work around this:

function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    //var foo    = "var";
    //writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    var={arguments=arguments,foo=foo,variables=variables, local=local};
    return {arguments=arguments,foo=foo};
}

result = f(foo="arg");
writeDump(var=result);

OK, the result perplexes me. On cflive, I get this:

Top: arguments: arg; unscoped: arg
After unscoped: arguments: arg; unscoped: unscoped
object
    arguments [object]
        object
            foo [string] arg
    foo [string] unscoped
    local [object]
        object
            FOO [string] unscoped
    variables [object]
        object
            F [object]


Railo is putting unscoped references into the local scope by default.

But on my local install, I get this:

Top: arguments: arg; unscoped: arg
After unscoped: arguments: unscoped; unscoped: unscoped
Struct
    ARGUMENTS
        Scope Arguments
            foo1: string unscoped
    FOO: string unscoped
    LOCAL
        Scope
    VARIABLES
        Scope
            F
                Public Function f
                source:C:\Apps\railo-express-jre-win64\webapps\railo\www.scribble.local\shared\git\blogExamples\functions\hoist5.cfm

The variable is not in the local scope; it has just been set via the unscoped reference, which resolves to the arguments scope.

[furrows brow]

Ah... I remember now... there's a setting on Railo:

Local scope mode
Defines how the local scope of a function is invoked when a variable with no scope definition is used.
  • the local scope is always invoked
  • the local scope is only invoked when the key already exists in it

So I must have the second option set on my local install, and I guess Russ has it the other way around on cflive. Hurrah for Railo doing this, btw!
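To make the difference between those two options concrete, here's a toy model of an unscoped assignment under each mode. It's TypeScript rather than CFML, and it's only a sketch inferred from the two outputs above, not how Railo is actually implemented: the names Scopes, readUnscoped, assignUnscoped and simulate, and the mode labels "always" and "existingKeyOnly", are all invented for the example.

```typescript
// Toy model of Railo's "local scope mode" for UNSCOPED assignments.
// Everything here is hypothetical and inferred from the outputs shown above.
type Scope = Map<string, string>;
type LocalScopeMode = "always" | "existingKeyOnly";

interface Scopes {
    local: Scope;
    args: Scope;      // stands in for the arguments scope
    variables: Scope;
}

// Unscoped read: hunt through local, then arguments, then variables.
function readUnscoped(s: Scopes, name: string): string | undefined {
    for (const scope of [s.local, s.args, s.variables]) {
        if (scope.has(name)) return scope.get(name);
    }
    return undefined;
}

// Unscoped write: where the value lands depends on the mode.
function assignUnscoped(s: Scopes, mode: LocalScopeMode, name: string, value: string): void {
    if (mode === "always" || s.local.has(name)) {
        s.local.set(name, value);      // as seen on cflive: a new local variable
    } else if (s.args.has(name)) {
        s.args.set(name, value);       // as seen on the local install: argument overwritten
    } else {
        s.variables.set(name, value);
    }
}

// Mimics calling f(foo="arg") and then running foo = "unscoped".
function simulate(mode: LocalScopeMode): string {
    const s: Scopes = {
        local: new Map<string, string>(),
        args: new Map<string, string>([["foo", "arg"]]),
        variables: new Map<string, string>(),
    };
    assignUnscoped(s, mode, "foo", "unscoped");
    return `After unscoped: arguments: ${s.args.get("foo")}; unscoped: ${readUnscoped(s, "foo")}`;
}

console.log(simulate("always"));
console.log(simulate("existingKeyOnly"));
```

Under "always" the argument keeps its value of arg, as on cflive; under "existingKeyOnly" the argument is overwritten with unscoped, as on the local install.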

Yeah, after a Railo restart, I now see the same behaviour on local as I do on cflive. Anyway, that's not the point of this.

OK, so to get back on track, what we have is that if one has a VAR, then unscoped references to a variable refer to that VAR, and it's a separate variable from a same-named argument.

Just to remind you, here are the first two code examples again:

function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    var foo    = "var";
    writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    return {arguments=arguments,foo=foo};
}

result = f(foo="arg");
writeDump(var=result);

The VAR creates a variable distinct from the argument:

Top: arguments: arg; unscoped: arg
After var: arguments: arg; unscoped: var
After unscoped: arguments: arg; unscoped: unscoped
struct
    ARGUMENTS
        struct
            foo: arg
    FOO: unscoped

function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    //var foo    = "var";
    //writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    return {arguments=arguments,foo=foo};
}

result = f(foo="arg");
writeDump(var=result);

An unscoped reference refers to the arguments scope:

Top: arguments: arg; unscoped: arg
After unscoped: arguments: unscoped; unscoped: unscoped
struct
    ARGUMENTS
        struct
            foo: unscoped
    FOO: unscoped

OK, so now I will try an unscoped reference first, and then a VAR:

function f(){
    writeOutput("Top: arguments: #arguments.foo#; unscoped: #foo#<br>");
    foo        = "unscoped";
    writeOutput("After unscoped: arguments: #arguments.foo#; unscoped: #foo#<br>");
    var foo    = "var";
    writeOutput("After var: arguments: #arguments.foo#; unscoped: #foo#<br>");
    return {arguments=arguments,foo=foo};
}

result = f(foo="arg");
writeDump(var=result);

So in this case, the initial unscoped reference should behave the same as in the second example, and refer to the arguments-scoped variable.
And then the VAR variable should be a discrete variable (as per the first example).

Right?

Well:

Top: arguments: arg; unscoped: arg
After unscoped: arguments: arg; unscoped: unscoped
After var: arguments: arg; unscoped: var
struct
    ARGUMENTS
        struct
            foo: arg
    FOO: var

No. Even though when we set the unscoped variable there's not yet a VAR, in this example the unscoped variable creates a different variable from the argument. This is contrary to the preceding example in which an unscoped variable refers to the arguments scoped one.

This code is behaving like the first example, as if the VAR declaration had already taken effect before the unscoped assignment ran.

If I go into the code and comment-out the VAR statement, the behaviour of the unscoped statement reverts to interfering with the arguments scope.

Weird. So the very presence of a VAR declaration of a variable in the code means unscoped references refer to a separate (VAR-scoped) variable from a same-named arguments-scoped one, even if the VAR declaration is after unscoped references. Basically the VAR declaration is being hoisted.

I decompiled the Java code generated for each of these flavours of function, and this is borne out:

This is the one with the VAR statement first:

  protected final Object runFunction(LocalScope __localScope, Object instance, CFPage parentPage, ArgumentCollection __arguments)
  {
    Object value;
    parentPage.bindImportPath("com.adobe.coldfusion.*");
    Variable ARGUMENTS = __localScope.bindInternal(Key.ARGUMENTS, __arguments);
    Variable THIS = __localScope.bindInternal(Key.THIS, instance);
    Variable FOO = __localScope.bindInternal("FOO");
    JspWriter out = parentPage.pageContext.getOut();
    Tag parent = parentPage.parent;
    parentPage._setCurrentLineNo(2);
    parentPage.WriteOutput(new StringBuffer("Top: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize(FOO))).append("<br>").toString());
    FOO.set("var");
    parentPage._setCurrentLineNo(3);
    parentPage.WriteOutput(new StringBuffer("After var: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize(FOO))).append("<br>").toString());
    FOO.set("unscoped");
    parentPage._setCurrentLineNo(4);
    parentPage.WriteOutput(new StringBuffer("After unscoped: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize(FOO))).append("<br>").toString());
    Variable ___IMPLICITARRYSTRUCTVAR0 = __localScope.bindInternal("___IMPLICITARRYSTRUCTVAR0");
    ___IMPLICITARRYSTRUCTVAR0.set(CFPage.StructNew());
    parentPage._structSetAt(___IMPLICITARRYSTRUCTVAR0, new String[] { "ARGUMENTS" }, parentPage._autoscalarize(ARGUMENTS));
    parentPage._structSetAt(___IMPLICITARRYSTRUCTVAR0, new String[] { "FOO" }, parentPage._autoscalarize(FOO));
    return parentPage._get(___IMPLICITARRYSTRUCTVAR0);
    return null;
  }

And this is when the VAR statement comes after the unscoped assignment:

  protected final Object runFunction(LocalScope __localScope, Object instance, CFPage parentPage, ArgumentCollection __arguments)
  {
    Object value;
    parentPage.bindImportPath("com.adobe.coldfusion.*");
    Variable ARGUMENTS = __localScope.bindInternal(Key.ARGUMENTS, __arguments);
    Variable THIS = __localScope.bindInternal(Key.THIS, instance);
    Variable FOO = __localScope.bindInternal("FOO");
    JspWriter out = parentPage.pageContext.getOut();
    Tag parent = parentPage.parent;
    parentPage._setCurrentLineNo(2);
    parentPage.WriteOutput(new StringBuffer("Top: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize(FOO))).append("<br>").toString());
    FOO.set("unscoped");
    parentPage._setCurrentLineNo(3);
    parentPage.WriteOutput(new StringBuffer("After unscoped: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize(FOO))).append("<br>").toString());
    FOO.set("var");
    parentPage._setCurrentLineNo(4);
    parentPage.WriteOutput(new StringBuffer("After var: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize(FOO))).append("<br>").toString());
    Variable ___IMPLICITARRYSTRUCTVAR0 = __localScope.bindInternal("___IMPLICITARRYSTRUCTVAR0");
    ___IMPLICITARRYSTRUCTVAR0.set(CFPage.StructNew());
    parentPage._structSetAt(___IMPLICITARRYSTRUCTVAR0, new String[] { "ARGUMENTS" }, parentPage._autoscalarize(ARGUMENTS));
    parentPage._structSetAt(___IMPLICITARRYSTRUCTVAR0, new String[] { "FOO" }, parentPage._autoscalarize(FOO));
    return parentPage._get(___IMPLICITARRYSTRUCTVAR0);
    return null;
  }

And for comparison, without the VAR statement at all:

  protected final Object runFunction(LocalScope __localScope, Object instance, CFPage parentPage, ArgumentCollection __arguments)
  {
    Object value;
    parentPage.bindImportPath("com.adobe.coldfusion.*");
    Variable ARGUMENTS = __localScope.bindInternal(Key.ARGUMENTS, __arguments);
    Variable THIS = __localScope.bindInternal(Key.THIS, instance);
    JspWriter out = parentPage.pageContext.getOut();
    Tag parent = parentPage.parent;
    parentPage._setCurrentLineNo(2);
    parentPage.WriteOutput(new StringBuffer("Top: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize("FOO"))).append("<br>").toString());
    parentPage._set("FOO", "unscoped");
    parentPage._setCurrentLineNo(3);
    parentPage.WriteOutput(new StringBuffer("After unscoped: arguments: ").append(Cast._String(parentPage._resolveAndAutoscalarize(ARGUMENTS, new String[] { "FOO" }))).append("; unscoped: ").append(Cast._String(parentPage._autoscalarize("FOO"))).append("<br>").toString());
    Variable ___IMPLICITARRYSTRUCTVAR0 = __localScope.bindInternal("___IMPLICITARRYSTRUCTVAR0");
    ___IMPLICITARRYSTRUCTVAR0.set(CFPage.StructNew());
    parentPage._structSetAt(___IMPLICITARRYSTRUCTVAR0, new String[] { "ARGUMENTS" }, parentPage._autoscalarize(ARGUMENTS));
    parentPage._structSetAt(___IMPLICITARRYSTRUCTVAR0, new String[] { "FOO" }, parentPage._autoscalarize("FOO"));
    return parentPage._get(___IMPLICITARRYSTRUCTVAR0);
    return null;
  }

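Here there's no local FOO binding at all: the assignment goes through parentPage._set("FOO", ...), so the name is resolved at runtime rather than being bound up front. Going by the output earlier, on ColdFusion that ends up setting the arguments-scoped variable.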

But anyway, the key point is demonstrated in the first two examples. Irrespective of the order of the CFML statements, the VAR declaration (but not the assignment) is hoisted to the top of the code.
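If it helps, the decompiled code above boils down to a two-pass model: first bind every VAR-declared name in the local scope, then run the statements in the order they were written. Here's a rough TypeScript sketch of that model. It is not ColdFusion's real implementation: the Stmt type and this runFunction are invented for the example, and the rule that an unscoped read skips a bound-but-unassigned local is an assumption, inferred from the "Top:" line still reporting arg.

```typescript
// Two-pass toy model of VAR hoisting. All names here are hypothetical.
type Stmt = { kind: "var" | "unscoped"; name: string; value: string };

function runFunction(body: Stmt[], args: Record<string, string>): string[] {
    const local = new Map<string, string | undefined>();
    const argScope = new Map<string, string>();
    const variables = new Map<string, string>();
    for (const key of Object.keys(args)) argScope.set(key, args[key]);

    // Pass 1 (the hoist): bind every VAR-declared name before running anything,
    // like the bindInternal("FOO") call at the top of the decompiled method.
    for (const stmt of body) {
        if (stmt.kind === "var") local.set(stmt.name, undefined);
    }

    // ASSUMPTION: an unscoped read ignores a local that has no value yet.
    const read = (name: string): string | undefined =>
        local.get(name) ?? argScope.get(name) ?? variables.get(name);

    const report = (label: string): string =>
        `${label}: arguments: ${argScope.get("foo")}; unscoped: ${read("foo")}`;

    const log: string[] = [report("Top")];

    // Pass 2: execute the statements in source order.
    for (const stmt of body) {
        if (local.has(stmt.name)) {
            local.set(stmt.name, stmt.value);      // a hoisted binding wins
        } else if (argScope.has(stmt.name)) {
            argScope.set(stmt.name, stmt.value);   // no VAR anywhere: argument overwritten
        } else {
            variables.set(stmt.name, stmt.value);
        }
        log.push(report(`After ${stmt.kind}`));
    }
    return log;
}

// Unscoped assignment first, VAR second.
console.log(runFunction(
    [
        { kind: "unscoped", name: "foo", value: "unscoped" },
        { kind: "var", name: "foo", value: "var" },
    ],
    { foo: "arg" },
));
```

With the VAR present anywhere in the body, the argument stays as arg throughout; drop the VAR statement from the list and the same unscoped assignment overwrites the argument instead, which matches the ColdFusion outputs above.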

Phew. Got there.

Anyway, as fascinating as this all is, I don't see that there's really much scope for this to cause problems in CFML, but it's probably good to tuck away in the back of yer mind, should you see odd behaviour. Remember that VARs are declared at the beginning of a function, even if the assignment of them occurs later.

Righto. Back to doing something interesting.

--
Adam

PS: Adam Tuttle, you might be pleased to know that googling for "adam tuttle nazi" does not reference you at all. "fusion grokker nazi" though... different story...
PPS (later): of course one now can google for "Adam Tuttle nazi", and there is a match. Chuckle/oops.