Tuesday, 10 May 2016

CFML (yes, shut-up): sorted struct literal syntax

G'day:
Yeah, all right. I'm gonna be admonished by at least one of my mates (that's you, Brian) for writing another article about CFML, but... oh well. This is not gonna become a habit as I feel bad writing it.

You might remember that CF2016 promised to have sorted structs. I documented this ages ago "ColdFusion 2016: ordered/sorted structs". One thing you might have noticed is that whilst ordered structs made the cut; sorted ones did not. Adobe's briefly documented the ordered ones here: "Collection support (Ordered)".

To recap: structs in CFML have always been intrinsically unordered collections. If you want an ordered collection: use an array. Structs to preserve an internal ordering, but how that's done is specifically undocumented and one should never rely on it. Previously if one's needed to access a struct's values in a specific order, one's needed to maintain a separate index of what that order is, or just use the sort method to return an array of they keys in a prescribed order and then iterate over that, using each array element value to access the struct. It's all in the docs. The chief problem with sorting a struct was it was only ever based on the keys, and only ever in a lexical ordering. Which is almost never a sensible ordering to use. But this is just the penalty one pays for trying to use an intrinsically unordered data type as an ordered one.

ColdFusion 2016 added ordered structs, and was supposed to add sorted ones too, but sorted ones did not make the cut for release (and that's as much as I can say about that due to NDA reasons). I had pointed out to Adobe that "ordered" and "sorted" means the same thing for all intents and purposes... I don't actually know if they've taken this on board. I hope so, so they don't just look like dicks when they do get around to releasing the sorted variety.

I do feel safe to say they do intend to finish this particular job... I don't think that's breaking any NDA. I cannot say when any updates to ColdFusion 2016 might come out though, as I'm completely out of that loop now. Except for elements of the matter that I will not refer to directly, but you can read between the lines that they're perhaps under discussion somewhere.

Now... the contrived difference in definition that Adobe have invented between "ordered" and "sorted" is that an ordered struct preserves the order in which the keys are added:

person = [firstName="Kiri", lastName="Te Kanawa"];

If one iterates over that, we get Dame Kiri's name-order preserved:

C:\Users\adam.cameron>cf
person = [firstName="Kiri", lastName="Te Kanawa"];
person.each(function(key, value){
CLI.writeln("#key#: #value#");
});
^Z

FIRSTNAME: Kiri
LASTNAME: Te Kanawa
C:\Users\adam.cameron>

Compare that to a normal struct:


C:\Users\adam.cameron>cf
person = {firstName="Kiri", lastName="Te Kanawa"};
person.each(function(key, value){
    CLI.writeln("#key#: #value#");
});
^Z

LASTNAME: Te Kanawa
FIRSTNAME: Kiri
C:\Users\adam.cameron>

You see the key order is not preserved.

OK, this is all lovely, but this is about sorted structs. So what's the diff? To recap, a Sorted struct is one in which one can specify the order the keys should iterate in. This is a bit edge-case-y, but one might want to iterate over they values in some different order other than insertion or key-name order. Perhaps alphabetically by value. Or by some other enumeration, or by length of value or something. There's a use case there, even if it's also edge-case.

One of the big challenges with sorted structs is the literal expression syntax for them. You know: array literals are expressed like this:

["tahi", "rua", "toru", "wha"]

Structs:

{red="whero", green="kakariki", blue="kikorangi"}

And now ordered structs:

[firstName="Kiri", lastName="Te Kanawa"]

Sorted structs need to impart two different things:
  • the key/value pairs
  • the sorting mechanism

Within the sorting mechanism, there are some variations:
  • declaratively: sorting ascending, alphabetically, using a specific locale as the collation order
  • functionally: using a callback 

So how the hell to impart all that info in a literal?

There's been a bunch of suggestions (none of which I can share, sorry), but none have been complete or "natural-seeming". Abram floated some suggestions in a bug ticket - 4126544 - which are good suggestions, but not really solving the "literal" problem.

But I was discussing this tonight with my mate and ColdFusion testing machine Aaron Neff and we finally nailed a decent syntax I think (I await Sean to tell me otherwise ;-)

For the declarative approach:

sortedStruct = [key=value, key=value, struct]

Where the struct argument has the config details for the sort:

{direction="ASC|DESC", on="key|value", type="text|numeric", caseSensitive=boolean(false), locale="fr_FR"}

Put together as an example:

sortedStruct = [january=29, february=17, march=31, {type="numeric", on="value"}];

This would yield a struct that when iterated over would return the key/values in order Feb, Jan, March, as it's sorting on value (ascending is implicit given it'd be the default for that setting).

And for using a callback, instead of a struct one just gives the function:

sortedStruct = [key=value, key=value, function(element1, element2, struct)]

 eg:

sortedStruct = [key=value, key=value, function(element1, element2, struct)

{
    var monthLengths = [january=31, february=28, march=31];
    return element1.value / monthLengths[element1.key] - element2.value / monthLengths[element2.key];
}];

This sorts the struct on the proportionate daily value per month. Oh of course the comparator function returns one of <1, 0 or >1 as per all sorting comparator functions. I don't use the third argument to the callback in this example, but that's there to mirror other iteration callbacks which take a reference the whole collection as the last argument (should one need it).

I think that's a reasonable literal syntax for sorted structs, don't you? The only slight quirk is that there's the "special treatment" of the "last" item of the list of elements in the brackets, but I don't mind that. I suppose we could do this:

sortedStruct = [key=value, key=value](struct|callback)

?


But I think either syntax would be unambiguously parseable by ColdFusion, and I think it's clear what's going on when one eyeballs the code.

What do you think? Shall I float this as an idea?

Now... back to doing something that is a good use of my time [cracks another beer, starts Fallout 4].

Righto.

--
Adam