Wednesday 27 February 2013

How ColdFusion makes a pig's ear of ordered-argument argument collections

G'day:
Apropos of nothing, I'll just point out I'm writing this en route back from Auckland to London, via Kuala Lumpur. This entails a 10 and a 14 hour flight, with a stop to change planes at KLIA (the break is about three hours, I think). It's a bloody long haul. Worse: I did the reverse trip only a fortnight ago, so I've already seen the decent movies they have on offer for Feb (I watched Lincoln, Dredd, Seven Psychopaths, Alex Cross, Taken 2, in decreasing order of quality. The first three were good; the rest watchable). I'm writing this on my wee netbook, which runs for about six hours if I turn the screen brightness down a bit. It's awkward to type on, but it gets the job done. And happily runs CF10, Railo 4.0.2 and CFB simultaneously.


Over the Tasman Sea: one hour out of Auckland

 This article is a gripe. You've been warned.

Dale Fraser - a person I follow on Twitter - ran into some weirdness with numerically-keyed argument collections the other day, and I started looking into it by way of helping him get to the bottom of it. Whilst investigating, I recalled something Ben Nadel wrote about a coupla years ago, and revisited it. And this is related to his findings in that article. Credit for the basis for this article goes to Dale and Ben. I'm just adding all the extraneous verbiage. Playing to my strengths, one might say ;-)

You probably know that when calling a function in CFML, one can pass arguments in a number of ways. But in case it's not occurred to you, here's a recap.

// named arguments
result = myFunction(one="tahi", two="rua", three="toru");

// ordered arguments
result = myFunction("tahi", "rua", "toru");

// argument collection
args = {
        one   = "tahi",
        two   = "rua",
        three = "toru"
};
result = myFunction(argumentCollection=args);

It's handy to have those three options, especially the argumentCollection option, as one can build the struct using conditional logic to decide if a given argument should be passed:

// conditional arguments via argument collection
args = {
        one = "tahi",
        two = "rua"
};
if (hasThree()){
       args.three = "toru";
}

result = myFunction(argumentCollection=args); 

All good.

There's one potential option missing from this: passing an argument collection of positional arguments (as opposed to named ones). This requirement could come up if your code is fairly dynamic, and for a given function call you might not know the argument names the function uses, but you know you need to pass a varying set of arguments into the function. Here's some pseudo-code to demonstrate what I mean:

// conditional arguments via positional argument collection (PSEUDOCODE)
if (hasFirst()){
        args[1] = "tahi";
}
if (hasSecond()){
        args[2] = "rua";
}
if (hasThird()){
        args[3] = "toru";
}

result = myFunction(argumentCollection=args); 

Looking at this, one would quickly think "right, well it makes perfect sense that - in this case - args is an array". Hmmm. Before I start griping, let me digress for a second.

One thing all ColdFusion developers should be aware of is that the arguments scope that a function has is neither fish nor fowl: it seems like a struct, and it also kinda seems like an array. But it's actually neither. This code demonstrates:

// completely vanilla struct
args = {
       1 = "tahi",
       2 = "rua",
       3 = "toru"
};
outputDetails(obj=args, message="Before");

// pass said struct into a function
result = checkArgsType(argumentCollection=args);       

// re-check what's returned
outputDetails(obj=result, message="After");
       

// simple function which just checks its arguments, and returns 'em
any function checkArgsType(){
        outputDetails(obj=arguments, message="Within");
        return arguments;
}


// as we do this three times, refactor as a function
void function outputDetails(required any obj, required any message){
        writeOutput("#message#<br />");
        writeOutput("isArray(): #isArray(obj)#<br />");
        try {  // we can predict this'll fail on the vanilla struct
               writeOutput("arrayLen(): #arrayLen(obj)#<br />");
        }
        catch (any e){
               writeOutput("arrayLen(): #e.message# #e.detail#<br />");
        }      
        writeOutput("isStruct(): #isStruct(obj)#<br />");
        writeOutput("structCount(): #structCount(obj)#<br />");
        writeOutput("getClass().getName(): #obj.getClass().getName()#<br />");
        writeOutput("<hr />");
}

There's a lot there for what I'm wanting to demonstrate, but basically all it does is:
  1. creates a struct;
  2. outputs some metadata of the struct;
  3. passes the struct to a function as an argument collection;
  4. outputs the same metadata on the arguments scope within the function;
  5. returns the arguments scope;
  6. outputs the same metadata on the returned value.
What we see is this:

Before
isArray(): NO
arrayLen(): Object of type class coldfusion.runtime.Struct cannot be used as an array 
isStruct(): YES
structCount(): 3
getClass().getName(): coldfusion.runtime.Struct

Within
isArray(): NO
arrayLen(): 3
isStruct(): YES
structCount(): 3
getClass().getName(): coldfusion.runtime.ArgumentCollection

After
isArray(): NO
arrayLen(): 3
isStruct(): YES
structCount(): 3
getClass().getName(): coldfusion.runtime.ArgumentCollection

There's a few things to pay attention to here:
  • a struct is a struct: no surprises;
  • the arguments scope claims not to be an array: OK, but;
  • we can call array functions on it. Odd.
  • It does claim to be a struct, so we can call struct functions on it;
  • but in actual fact it's an object of a different class entirely: an ArgumentCollection.
The only thing I have a problem with here is that if one can treat the thing as an array, then isArray() should return true. Analogous with how isStruct() returns true. I call that a bug (I might have already raised one for this, I can't remember & am offline at the mo'. If I dig up a reference, I'll post it).

The summary of the digression is that the arguments scope can be treated as an array. So it stands to reason the incoming argument collection should also be able to be an array. That would be logical.
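
To make that concrete, here's a minimal sketch of reading the same argument by name and by position from the arguments scope. The function name showBoth() is made up purely for this example:

```cfml
// hypothetical function, purely for illustration
void function showBoth(string first, string second){
        // struct-style access, by argument name
        writeOutput("By name: #arguments.first#<br>");
        // array-style access, by argument position
        writeOutput("By position: #arguments[1]#<br>");
}

showBoth("tahi", "rua");
```

Both lines should output "tahi": the one value is reachable either way.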

Somewhere over the Outback: four hours in

I couldn't stay awake (my body clock thinks this is currently about 4:30am, NZDT), so had a kip for a while. I'm in the emergency exit seat against the window, which means sleeping is fairly easy (I can't seem to sleep sitting straight - I need to be on my side-ish - so only the window seats really work for me, as I don't like "cuddling up" with the person next to me. Well, more that they probably wouldn't appreciate it much ;-).

OK, so back to the positional-argument argument collection. Can you guess what data type one of these needs to be? An array, right? Because - like - that would make sense. And making it a struct with numeric keys would just be dumb, and no-one with any sense would do that. But CF doesn't implement it that way... OK, so have a second guess: what type of data structure does a positional-argument argument collection need to be? Yeah. Adobe decided the way to go there was to use a struct. The mind boggles.
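
For what it's worth, if your positional values start life in an array, turning them into the numeric-keyed struct CF wants only takes a few lines. This is just a rough sketch: toPositionalArgs() is a name I've made up for illustration, not anything built in to CFML.

```cfml
// hypothetical helper (not part of CFML): converts an array of values
// into a struct keyed on each value's position
struct function toPositionalArgs(required array values){
        var args = {};
        for (var i = 1; i <= arrayLen(values); i++){
                args[i] = values[i];
        }
        return args;
}

values = ["tahi", "rua", "toru"];
args = toPositionalArgs(values);

// the struct has the keys 1, 2 and 3 (key order is not guaranteed)
writeOutput(structKeyList(args));
```

One would then pass the result as argumentCollection=args. Whether the values end up where you'd expect is another matter.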

What's even better is that it barely works.

This is turning into a very code-heavy article, but here's more:

// we'll pass these args into each test function
args = {
       1 = "one",
       2 = "two",
       3 = "three"
};
iterateObject(obj=args, label="The argument collection");


// call a variety of functions, each with a differing number of arguments defined
noArgsDefined(argumentCollection=args);

oneArgDefined(argumentCollection=args);

allArgsDefined(argumentCollection=args);

extraArgsDefined(argumentCollection=args);


// the functions, as called from above
void function noArgsDefined(){
        iterateObject(obj=arguments, label="No arguments defined");
}

void function oneArgDefined(string one){
        iterateObject(obj=arguments, label="One argument defined");
}

void function allArgsDefined(string one, string two, string three){
        iterateObject(obj=arguments, label="All arguments defined");
}

void function extraArgsDefined(string one, string two, string three, string four){
        iterateObject(obj=arguments, label="Extra arguments defined");
}


// we call this a few times, so factor it out into a function
void function iterateObject(required any obj, required string label){
        writeOutput("#label#<br>");
        for (var k in obj){
               writeOutput("Key: #k#; value: #obj[k]#<br>");
        }      
        writeOutput("<hr>");
}

So here we have a struct which we use as an argument collection to pass into a bunch of functions, with each function having an increasing number of arguments defined (0, 1, 3, and 4). The functions just dump out their arguments scope. And the results:

The argument collection
Key: 3; value: three
Key: 2; value: two
Key: 1; value: one

No arguments defined
Key: 3; value: one
Key: 2; value: two
Key: 1; value: three

One argument defined
Key: 3; value: two
Key: 2; value: three
Key: ONE; value: one

All arguments defined
Key: ONE; value: one
Key: TWO; value: two
Key: THREE; value: three

Extra arguments defined
Key: ONE; value: one
Key: TWO; value: two
Key: THREE; value: three
Key: FOUR; value: 

What this shows is that CF messes this up unless we have every argument defined. Obviously this is how one should write one's functions, but equally it's completely legit for a function to take an undefined (and accordingly unnamed) number of arguments. And given CF apparently supports positional-argument argument collections, and given we make a point of telling CF which position each argument is for, there's no real excuse for getting this wrong.

Railo, incidentally, gets this right.

Somewhere over Indonesia

I stopped for a while as I could barely keep my eyes open, so watched a movie: Sinister. It was OK, and had potential to be a bit chilling, but never quite got there. The cabin lights are on again, and it's almost feeding time at the zoo. Battery on 45%.

As the cliche goes: "but wait! There's more!". How about if we just have arguments that have numeric names (ie: named arguments with numeric names, as opposed to being an "array" of positional args), mixed in with arguments with string names, eg:

args = {
       1 = "one",
       2 = "two",
       third = "three"
};

include "functionDefinitionsAndCalls.cfm";

The include file here just contains all the code from the previous example, subsequent to the declaration of the args struct. This yields:

The argument collection

Key: THIRD; value: three
Key: 2; value: two
Key: 1; value: one

No arguments defined
Key: THIRD; value: three
Key: 2; value: two
Key: 1; value: three

One argument defined
Key: THIRD; value: three
Key: 2; value: three
Key: ONE; value: one

All arguments defined
Key: THIRD; value: three
Key: ONE; value: one
Key: TWO; value: two
Key: THREE; value: 

Extra arguments defined
Key: THIRD; value: three
Key: ONE; value: one
Key: TWO; value: two
Key: THREE; value: 
Key: FOUR; value: 

Note how the argument values actually get swapped / duplicated unless all the arguments are defined. Sloppy.

Last but not least (well: OK, it actually is least in this case), experimentation with this also shows up a bug in writeDump(). Consider this argument collection struct, which doesn't have contiguous numeric keys:

// we'll pass these args into each test function
args = {
       1 = "one",
       3 = "three"
};
safeDump(var=args, label="The argument collection");


// call a variety of functions, each with a differing number of arguments defined
noArgsDefined(argumentCollection=args);

allArgsDefined(argumentCollection=args);


// the functions, as called from above
void function noArgsDefined(){
        safeDump(var=arguments, label="No arguments defined");
}

void function allArgsDefined(string one, string two, string three){
        safeDump(var=arguments, label="All arguments defined");
}


void function safeDump(required any var, required string label){
        try {
                writeDump(var=var, label=label);
        }
        catch (any e){
                writeDump(var=e, label=label);
        }
        writeOutput("<hr>");
}

I've pared this example back slightly to only call noArgsDefined() and allArgsDefined(), as the behaviour of the other test functions we were calling before doesn't vary from these two. The conceit is that we're error-trapping writeDump() to demonstrate its bug. The results are:

The argument collection - struct
1: one
3: three

No arguments defined - struct
Detail: [empty string]
ErrNumber: 0
Message: Variable KEYVALUE is undefined.
StackTrace: coldfusion.runtime.UndefinedVariableException: Variable KEYVALUE is undefined. at coldfusion.runtime.CfJspPage._get(CfJspPage.java:316) at coldfusion.runtime.CfJspPage._get(CfJspPage.java:296) at coldfusion.runtime.CfJspPage._autoscalarize(CfJspPage.java:1522) at coldfusion.runtime.CfJspPage._autoscalarize(CfJspPage.java:1486) at cfdump2ecfm1568701689$funcDUMPSTRUCT.runFunction(E:\cf10_final\cfusion\wwwroot\WEB-INF\cftags\dump.cfm:1961) [etc]
TagContext:
    No arguments defined - array
    1:
        No arguments defined - struct
        COLUMN: 0
        ID: CFDUMP
        LINE: -1
        RAW_TRACE: at cfdump2ecfm1568701689._factor1(E:\cf10_final\cfusion\wwwroot\WEB-INF\cftags\dump.cfm)
        TEMPLATE: E
        TYPE: CFML

    [etc]
Type: Expression
name: KEYVALUE

All arguments defined - struct
ONE: one
THREE: three
TWO: undefined

I've truncated the stack trace and tag context there as it gets tedious very quickly ;-)

I dunno what writeDump() is up to here, but I think I'm in good company: writeDump() itself doesn't seem to know either. Again, Railo doesn't have this issue.


39min out of Kuala Lumpur

Right, so that's about that. There's no big revelation, this is pretty much just some info in case you ever come across this... maybe it'll stick in the back of your mind like Ben's earlier article did for me. I might snooze my way in to KLIA, then seek out a beer. Or two. Or three.

And if I find some free wireless, I'll even try to get this dollied-up so it's suitable for posting, and post it.

Dale has, btw, raised a bug for at least the weirdness with the argument collections with numeric / string arguments... go have a vote if you think it's a shit state of affairs: 3506225.

Righto.

--
Adam


PS (at KLIA)

Free wireless located, fortuitously right next to where the beer comes out of the tap and into pint glasses, and thence down my gullet. I'm posting this now (apologies for shoddy formatting in places. Shoddier than usual, I mean). It's not very proofread, but I need to focus on beer, not blog. Should probably find out which gate my plane's at too, at some stage. After another pint...