Saturday 22 February 2014

CFML: Expressions and operators and doing weird s***

G'day:
I want to just return to the topic I touched on here briefly the other day in the article "ColdFusion 11: good stuff", and discussed more thoroughly on the Railo Google Group as "Probable bug in null-coalescing operator". I have moved my opinion here from "probable bug" to "definite bug". And the buggy behaviour relating to this extends further, and into both Railo and ColdFusion.

My contention is that Railo have implemented the null-coalescing operator (tritely referred to as the "Elvis operator" by some. Spare me) slightly incorrectly, as it confuses the notions of "null" and "non-existent" which are two distinct concepts. Also, in implementing this "non-existence-coalescing operator", it's done some rather fishy things in the process.

All of this has also now found its way into ColdFusion 11, unfortunately.

Operators

Firstly, let's recap about operators. An operator is basically a function, but is implemented with different syntax, using some sequence of characters to reflect the operator itself (eg: + or EQ or ?:), and the operator acts on one or more operands. An expression is the usage of operators and operands to express a value, and in this context, the statement is the expression and its assignment to a variable. (I fully expect Sean to read this very layperson's definition, and offer a better one!). EG:

one = 1
two = one + one

Here = and + are operators, and one and 1 are operands. I guess - strictly speaking - one is an operand in both statements, and two is an operand as well. In the first statement the assignment operator assigns the value of the second operand to the first operand. And in the second, the addition operator adds the first and second of its operands, and then that result becomes the second operand of the assignment operator as its value is assigned to two.

Most operators are obvious, and listed in the doc. However other things that are less obvious are also - for all intents and purposes - operators as well. For example:

myVar = functionName
myResult = functionName()

The first statement just assigns the function reference to another variable (making another function reference, pointing to the same function in memory). The second statement - via the () operator - executes the function. Here the parentheses operator says "execute the left operand as a function, and pass it the rest of the operands - down to the closing parenthesis - as argument values".

Similarly when we have this:

someObject.someProperty

the dot here is an operator too. It is the property accessor operator, and accesses the property reflected by the right-hand operand from the object reflected by the left-hand operand.

One important thing to note here is that an operator in and of itself does not change the operands in any way. And in a compound statement (with a bunch of operators each with their own operands), one operator does not change how another operator works. All an operator does is act on operands. For example:

one = 1
two = 2
three = 3
five = two * three - one

The - operator doesn't change three or one, and it doesn't alter what the * operator does.

So operators have constant and predictable behaviour. If you create and expession with given a set of operands and a set of operators, then each operation will always end up with the same result, and the result of the entire expression will always be the same.

isNull()

Here is where ColdFusion - and by following its precedent Railo - starts going wrong. Consider this code:

a = {};
writeDump(a); 
writeOutput("<hr>");
try {
    writeDump(a.b);
}catch(any e){
    writeOutput("#e.type# exception: #e.message# #e.detail#");
}
writeOutput("<hr>");
try {
    a.b = javacast("null", "");
    writeDump(a);
}catch(any e){
    writeOutput("#e.type# exception: #e.message# #e.detail#");
}
writeOutput("<hr>");

This outputs:

struct [empty]

Expression exception: Element B is undefined in A.

struct
Bundefined


Note how the dot operator behaves when the right-hand operand doesn't exist. It errors. This is what one would expect.

I know I have cheated with the second dump, in that I am only dumping a, not a.b. Because of CFML's inability to deal with null properly, even the second dump of a.b would error. But the dump does demonstrate that after the assignment of null to a.b, it does exist; it's just null.

The chief point here is that the expression a.b results in an error.

OK, so let's have a look at some code using isNull():

a = {};
writeDump(a); 
writeOutput("<hr>");

writeOutput("isNull(): #isNull(a.b)#<hr>");

a.b = javacast("null", "");
writeOutput("isNull(): #isNull(a.b)#<hr>");

Here we are using the same value for a, and the same expression a.b. And that should - prior to actually setting it - result in an error, right? But CF (and Railo) both do this:

struct [empty]

isNull(): YES

isNull(): YES


Huh? What has ColdFusion done here? How is the first call to isNull(a.b) (or any usage of a.b at that point) not erroring?

I can only surmise that Adobe have done something truly daft, and instead of the parser processing each expression in a formalised, predictable fashion, they are doing something like "oh, if it's the special case of the isNull() function, then the expression being passed into it doesn't need to behave like it normally does. Which is random, stupid, and just wrong. That a special function is being called should not mean that expressions being passed as its arguments are somehow messed with before being passed. Equally, no matter how the function's been implemented to treat the argument expression "specially"... it's still broken because a.b isn't null, it doesn't exist. Those are two different things. So isNull() has tried to be special and clever, and as with a lot of times Adobe tries to be special and clever... has messed things up in the process.

Sigh.

Railo have copied ColdFusion's behaviour here, for the sake of cross-compatibility. I used to agree with this approach to things, but I really think they should simply do things right, from now on. And offer a compatibility mode to do things "wrong, but like how ColdFusion does it" to deal with the cross-compatibility thing.

?:

Unfortunately the null-coalescing operator engages in the same shenanigans. Here's roughly equivalent code, using the null-coalescing operator:

a = {};
writeDump(a); 
writeOutput("<hr>");

writeOutput("#a.b ?: 'a.b was treated as null'#<hr>");

a.b = javacast("null", "");
writeOutput("#a.b ?: 'a.b was treated as null'#<hr>");

This outputs:

struct [empty]

a.b was treated as null

a.b was treated as null


Again, the ?: operator has been implemented the same as the isNull() function: it doesn't just do what it's supposed to ("return" the value of the left operand if it's not-null, otherwise return the right one), it also messes around with the left-operand's expression, and also gets the test wrong in the process. In the first example a.b doesn't exist. It's not null. So the expression should error. Simple.

Other languages

I'm not making this up as I go along, btw (well: some of my wording is a bit off the cuff, sure). Here's how equivalent code works in other languages.

Groovy

This is where Railo copied the idea from. So it would make sense if they copied it properly.

a = null
println(a ?: "a was null")
println(a.b ?: "a.b was null")

Result:

a was null
Caught: java.lang.NullPointerException: Cannot get property 'b' on null object
java.lang.NullPointerException: Cannot get property 'b' on null object
    at main.run(main.groovy:3)


JavaScript

Code:
var a
console.log(a==null)
console.log(a.b==null)

Result:
true
TypeError: Cannot read property 'b' of undefined


Ruby

Code:
a = nil
puts a.nil?
puts a.b.nil?

Result:
true
(eval):2: undefined method `b' for nil:NilClass (NoMethodError)


PHP

Code:
$a = null;
echo(is_null($a));
echo (is_null($a->b));

Result:
Line 4: Trying to get property of non-object

Or:

$a = ["b"=>"set"];
echo($a["b"] ?: "it was null");
echo($a["c"] ?: "it was null");

Result:
Line 4: Undefined index: c

All these languages understand how "things" are supposed to work, and just do what they're told without any "special treatment" going on.

CFML, on the other hand, just acts like an adolescent sitting at the grown-ups' table. It's a bit embarrassing, and it shouldn't do it.

Response

I've not heard back from Adobe on this, but there was a lively discussion about it on the Railo Google Group.

Unfortunately the response was disheartening from both Railo (Igal, Mischa, and Gert), as well as community members. Basically the engineers willfully implemented this code the way they did, knowing it wasn't actually really "right", but just did it anyhow. FFS.

And Gert and one of the guys in the community pretty much went "well yeah, but I'd rather have it convenient than right". This is also disappointing. (In Gert's defence he has subsequently reversed on this a bit, and has come up with a solution to work for everyone).

Solution

The thing is: the problem is easily solved. Just make the null-coalescing operator do what it's supposed to do (without monkeying with any of its operands), and then also implement the safe navigation operator. The safe navigation operator (?.) basically does what the ?: operator in CFML is already trying to do, but does it in the right way. It works like this:

a?.b?.c?.d

And what it does is work like a hybrid of the . and ?: operator. It returns the property if it exists, otherwise returns null. So it's safe chaining it together like that, as - for example - if b wasn't defined there, a?.b would be null, so then calling ?.c on null wouldn't exist, so it would instead return null. etc.

This is fine, this is coherent, and it doesn't require any special monkeying.

Obviously isNull() should be reimplemented correctly too.

These two changes in Railo will break existing code, so perhaps there should be a feature toggle in place (this was Gert's idea for the universal fix for this), so that when toggled-on existing incorrect usage still "works", but logs something to stderr to the effect of there being bung code being executed.

It's a good time for Adobe to fix the ?: operator as it's still in beta. But they will have the same challenge with isNull() as Railo does. But it should be similarly fixed.


Bottom line

My proposal is this (and I'll raise tickets for all these, but after I publish so I can cross-ref back to this article):
  • fix ?: (3710381, RAILO-2924)
  • fix isNull() (3712057, RAILO-2925)
  • implement ?. (3614459, RAILO-2580)
  • implement a feature toggle so existing code on each platform can use the toggle to continue to work as now (but the toggle should default to breaking existing code until the toggle is on) (3712059, RAILO-2926)
  • log any calls to these bits of functionality which only work due to the feature toggle. So as to remind the developer they've got flaky code in place (this is part of the preceding tickets).
Thoughts?

--
Adam