Monday, 25 May 2015

Some CFML code that doesn't work

G'day:
I was sitting at Lord's yesterday watching England v NZ (and, um, we'll have no comments about that, thank-you very much), and a sudden thought popped into my head "Adam Presley might've been onto something there... if I leverage that....I wonder if I could get that code down to one statement?"

And that of course will mean nothing to anyone.

I had been thinking I had better get back to looking at the answers to my "Something for the weekend? A wee code puzzle (in CFML, PHP, anything really...)" puzzle. I posted that back in November, and I'm still only halfway through the answers. PHP and stuff like that got in the way.

To recap, the question is:

Problem
One has an array of numbers, eg:

[100, 300, 100, 50, 50, 50, 50, 50, 500, 200, 100]

One also has a threshold number, for example 500.

For a given array and a given threshold, return the subarray which contains the longest run of consecutive numbers which - in total - are equal-to or less than the threshold.

For example:

series = [100, 300, 100, 50, 50, 50, 50, 50, 500, 200, 100]
threshold = 500
subseries = getSubseries(series,threshold) // [100, 50, 50, 50, 50, 50]
There's a few technicalities, but the full details are in that article I linked to. That's the general gist of it, anyways.

Right, so the next solutions I was gonna look at was Charlie Griefer's and Sean's Clojure examples, and I can recall Sean saying he was able to do it in one statement with Clojure. And I remembered Adam's (Go) solution involved finding all the relevance subseries, then grabbing the longest one. Or something like that. I was at the cricket and not in front of the code.

It suddenly occurred to me that I should be able to do a variation of Adam's idea, use some mapping, reducing and sorting on it, and come up with a single-statement CFML solution. I kinda worked it out in my head, then drank too much beer so left it until today to look at.

I've spend a few hours now fighting with Lucee and ColdFusion to try to make my solution work. But I can't. The logic is sound, but the bloody language implementations aren't.

function getSubseries(series, threshold){
    return series.map(function(v,i,a){
        return a.slice(i,(a.len()-i)+1).reduce(function(best,current){
            return best
                .update("running", best.running && best.run.sum() + current <= threshold)
                .update("run", best.running ? best.run.append(current) : best.run);
        },{run=[], running=true});
    }).sort(function(e1,e2){
        return (sgn(e2.run.len() - e1.run.len()) || (sgn(e2.run.sum() - e1.run.sum());
    })[1].run ?: [];
}

Now... the logic of that is fine. Well: it passes all my tests. And the CFML is fine. However it doesn't bloody work. Because CFML engines.

  • ColdFusion returns a pointless boolean from update(), so calls to it cannot be chained like this.
  • As it does for append(). 3844976, which Adobe claims is fixed! Nuh-uh.
  • Lucee doesn't preserve the numerical values in boolean expressions like these, so doesn't return -1,0,1 from the comparisons, but just true or false. Rendering the returned values useless. LDEV-366.
  • Lucee's ?: operator doesn't work properly when a potentially null array is returned from a function. LDEV-368.

I also found another coupla Lucee bugs whilst working on earlier prototypes of this, and even one whilst creating the repro fo LDEV-368!

Irritatingly, neither platform has both sets of bugs. The Lucee ones are only bung on Lucee, and the ColdFusion ones are only bung on ColdFusion. but still: FFS.

In case yer interested in what the code would have been doing:

  • The function takes an array of numbers (eg: [100,200,300]).
  • That array is used to map to another array which is the sub-array starting from each element of the array in succession (eg: [[100,200,300], [200,300], [300]]).
  • Each of those arrays is reduced to only being the longest run of sequential values which tally to be within the specified threshold. So if our threshold was 300, we'd end up with an array: [[100,200], [200], [300]].
  • Whilst locating the longest run, we pay attention to whether we overflow the threshold, and if we do we stop looking to build the run any further.
  • We will be left with the longest runs starting from each position in the original array.
  • Next we sort that array according to the business rules we have: a longer run is "greater than" a shorter run; if two runs are equal-length, then we sort by tally. So from our example, [100,200] wins. It's equal in total to [300], but we want the longest one.
  • After the sort, the longest result is the one we want, and that's the first item.
  • If we started with an empty array, or there were no sequences within the threshold we won't get anything returned from the reduction operation, so we default to an empty array.
  • By the by... this is the first time I've ever used structUpdate(), and this is a reasonable use case for it, I think?

That works fine, and it's a single statement. Admittedly there's a lot of method-chaining, callbacks and a coupla compound expressions in there, but that was kinda what I came up with at the cricket, and it worked.

Well in theory. It doesn't actually work. Thanks Lucee and ColdFusion.

--
Adam


PS: I hasten to add I would almost certainly never write code like that as it's complication for the sake of complication, but it was a good mental exercise.