Wednesday 8 April 2015

Lucee 5 beta: "lambda" syntax

G'day:
Right, so the second thing I'm gonna look at (after yesterday's "Lucee 5 beta: static methods & properties") is Lucee's new "Lambda" syntax.

This is simply a terser syntax for defining functions via function expressions. TBH, it's not very interesting.

Here's some code demonstrating it:

// baseline.cfm

function f(x,y){
    return x*y
}

echo("Using function statement: #f(2,3)#<br>")

g = function(x,y) {
    return x*y
}
echo("Using function expression with traditional literal syntax: #g(5,7)#<br>")


h = (x,y) -> x*y
echo("Using function expression with 'lambda' syntax: #h(11,13)#<br>")

j = (x,y) -> {
    return x * y
}
echo("Using 'lambda' syntax with block: #j(17,19)#<br>")

The general syntax is:
  • arguments in parentheses
  • arrow
  • body of function

If the function body is a single expression, then neither the braces nor the return statement is necessary: the result of the expression is simply returned.

The perceived benefit is, I think, that it's less typing. I think this is a specious gain, as it reduces the clarity of the code, which should be the main consideration. Code is read far more often than it is written.

I think there's also a degree of "let's play with the big boys" going on here, as someone who uses a language which has this arrow-oriented syntax was impressed by it, so decided to suggest it make its way into CFML. However they didn't quite get the idea right. A lambda is simply an anonymous function. This is a lambda:

g = function(x,y) {
    return x*y
}

So differentiating this syntax from the arrow-based syntax by using the word "lambda" isn't very helpful.

JavaScript calls them "arrow functions", so perhaps that's the way to go there, so as to not sound ignorant. (BTW: am happy to be proven to be ignorant myself if someone can cite another language which has both verbose and arrow syntax for function expressions, and uses "lambda" to specifically identify the latter).

There is one significant implementation cock-up that one should be aware of. This is not an implementation bug because it's been implemented this way by design. But the design is a cock-up. Arrow functions do not implement closure. Here's a demonstration:

// C.cfc
component {

    variables.someVar = "Inside component"

}

// closure.cfm

someVar = "In calling code"

function getVariableByNameUsingStatement(name){
    return variables[name]
}

getVariableByNameUsingLiteral = function(name){
    return variables[name]
}


getVariableByNameUsingLambda = (name)->variables[name]


o = new C()

o.getVariableByName = getVariableByNameUsingStatement
echo("Using statement: #o.getVariableByName('someVar')#<br>")

o.getVariableByName = getVariableByNameUsingLiteral
echo("Using traditional struct literal: #o.getVariableByName('someVar')#<br>")

o.getVariableByName = getVariableByNameUsingLambda
echo("Using arrow function: #o.getVariableByName('someVar')#<br>")

Here I have three function definitions again:

  • using a function statement
  • using a function expression employing traditional function-literal syntax
  • using a function expression employing arrow-function syntax

A function expression should employ closure, meaning the reference to variables[name] should be bound to the variable of that name in the context where the function expression was defined (ie: where the expression itself was evaluated), not the context the function is later called from. This should apply to both function expressions. Only the function-statement version should not use closure.

But here's the output:

Using statement: Inside component
Using traditional struct literal: In calling code
Using arrow function: Inside component


Hmmm. Not great.

Ryan has already raised a bug for this - LDEV-250 (lambdas are not closures) - I recommend you go vote for it. It does not make any sense for arrow functions to not employ closure. Indeed the definition of anonymous functions on Wikipedia mandates it:

"they allow access to variables in the scope of the containing function (non-local variables). This means anonymous functions need to be implemented using closures."

Not that Wikipedia is the anonymous function RFC or anything, but this is the accepted way these things work. There is no precedent I am aware of (not that my awareness is boundless) for varying the closure behaviour of anonymous functions based solely on syntax style.

To me this is simply wrong, in Lucee.
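
For reference, here's a minimal sketch of the behaviour that Wikipedia definition describes, using the traditional function-literal syntax (the one that does employ closure in the test above). The names makeGreeter and sayGday, and the file name, are hypothetical: I've made them up for the illustration, and this is not code from the ticket.

```cfml
// closureSketch.cfm (hypothetical file, not one of the tests above)

function makeGreeter(greeting){
    // the function expression below closes over makeGreeter()'s "greeting" argument
    return function(name){
        return "#greeting#, #name#"
    }
}

sayGday = makeGreeter("G'day")

// makeGreeter() has already returned by now, but the inner function
// still has access to the "greeting" value it was defined alongside
echo(sayGday("Zachary"))
```

The argument being made here is that an arrow function written in the same position ought to bind its variables in exactly the same way; the syntax used to write the function expression should not come into it.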

Lastly, Lucee have kinda messed up identifying what an arrow function is. Check this out:

// identify.cfm

function f(x,y){
    return x*y
}

g = function(x,y) {
    return x*y
}

h = (x,y) -> x*y

dump(var={
    isCustomFunction=isCustomFunction(f),
    isClosure=isClosure(f)
}, label="Function statement")


dump(var={
    isCustomFunction=isCustomFunction(g),
    isClosure=isClosure(g)
}, label="Function expression")


dump(var={
    isCustomFunction=isCustomFunction(h),
    isClosure=isClosure(h)
}, label="Arrow function")


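The output here is:

Function statement: isCustomFunction = true; isClosure = false
Function expression: isCustomFunction = false; isClosure = true
Arrow function: isCustomFunction = true; isClosure = false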


So let me get this straight.

  • A function statement defines a custom function (yes it does), and it's not a closure (no it isn't). Tick.
  • A function expression is not a custom function (oh... yes it is), and it is a closure (yes, it is). Fail.
  • An arrow function is a custom function (OK, but odd that it disagrees with the function expression result), and it isn't a closure (true, I guess). Conditional pass.

That's all a bit of a mess, really.

Summary:

  • I don't know why they bothered.
  • They need to revise their understanding of industry jargon when it comes to describing this syntax. "Lambda" is not correct (LDEV-251).
  • Arrow functions need to use closure (LDEV-250).
  • And the identification functions need to work (LDEV-252).

I'm unimpressed with the implementation here. Still: this is a beta, so I'm sure they'll sort it out.

Righto, back to PHP for me...

--
Adam