Wednesday, 30 June 2021

CFML higher-order functions compared to tag-based code: sort function

G'day:

OK so you've probably got the gist of things with these articles, with my previous treatments of comparing "modern" to "old school" with map, reduce, filter operations. On to sorting now.

I think this is going to involve some awful code.

I don't think I need to explain why we might need to sort a collection, or what "sorting" is. It's really easy using higher-order functions. The need to write the sorting algorithm has been removed, and only a function to compare to elements needs to be provided:

months = [
    {id=1, miSequence=8, mi="Kohi-tātea", anglicised="Hānuere", en="January"}, 
    {id=2, miSequence=9, mi="Hui-tanguru", anglicised="Pēpuere", en="February"}, 
    {id=3, miSequence=10, mi="Poutū-te-rangi", anglicised="Maehe", en="March"}, 
    {id=4, miSequence=11, mi="Paenga-whāwhā", anglicised="Āperira", en="April"}, 
    {id=5, miSequence=12, mi="Haratua", anglicised="Mei", en="May"}, 
    {id=6, miSequence=1, mi="Pipiri", anglicised="Hune", en="June"}, 
    {id=7, miSequence=2, mi="Hōngongoi", anglicised="Hūrae", en="July"}, 
    {id=8, miSequence=3, mi="Here-turi-kōkā", anglicised="Akuhata", en="August"}, 
    {id=9, miSequence=4, mi="Mahuru", anglicised="Hepetema", en="September"}, 
    {id=10, miSequence=5, mi="Whiringa-ā-nuku", anglicised="Oketopa", en="October"}, 
    {id=11, miSequence=6, mi="Whiringa-ā-rangi", anglicised="Noema", en="November"}, 
    {id=12, miSequence=7, mi="Hakihea", anglicised="Tihema", en="December"}
]
monthsInMaoriOrder = duplicate(months).sort((e1, e2) => e1.miSequence - e2.miSequence)

writeDump(monthsInMaoriOrder)

Here I have a list of the months of the year, ordered according to the Gregorian calendar. The Maori calendar has the same ordering, but the year starts around when the Gregorian calendar considers June. So the exercise here is to re-order the array to respect that ordering. The code for the sorting is just the comparator function.

One thing to note here is that despite appearances given we're assigning the return value of the sorting operation to a new variable, the original array is modified when you call sort on it. I think this is less than ideal, but it's the way it works on both ColdFusion and Lucee. If you want you're original array left alone, then duplicate it first like I have here.

If we're going old school procedural: it's a bit of a nightmare. We need to write our own sorting implementation. Well: we grab one from cflib.org anyhow. But even then, the original leverages a callback function, so I've modified this to be truly procedural and have that embedded in the implementation.

<cffunction name="monthsSortedByMaoriSequence" returntype="array" output="false">
    <cfargument name="arrayToCompare" type="array" required="true">

    <cfset var lesserArray = arrayNew(1)>
    <cfset var greaterArray = arrayNew(1)>
    <cfset var pivotArray = arrayNew(1)>
    <cfset var examine = 2>
    <cfset var comparison = 0>
    <cfset pivotArray[1] = arrayToCompare[1]>

    <cfif  arrayLen(arrayToCompare) LT 2>
        <cfreturn arrayToCompare>
    </cfif>

    <cfset arrayDeleteAt(arrayToCompare, 1)>
    <cfloop array="#arrayToCompare#" item="element">
        <cfset comparison = element.miSequence - pivotArray[1].miSequence>

        <cfswitch expression="#sgn(comparison)#">
            <cfcase value="-1">
                <cfset arrayAppend(lesserArray, element)>
            </cfcase>
            <cfcase value="0">
                <cfset arrayAppend(pivotArray, element)>
            </cfcase>
            <cfcase value="1">
                <cfset arrayAppend(greaterArray, element)>
            </cfcase>
        </cfswitch>
    </cfloop>

    <cfif arrayLen(lesserArray)>
        <cfset lesserArray = monthsSortedByMaoriSequence(lesserArray)>
    <cfelse>
        <cfset lesserArray = arrayNew(1)>
    </cfif>

    <cfif arrayLen(greaterArray)>
        <cfset greaterArray = monthsSortedByMaoriSequence(greaterArray)>
    <cfelse>
        <cfset greaterArray = arrayNew(1)>
    </cfif>

    <cfset arrayAppend(lesserArray, pivotArray, true)>
    <cfset arrayAppend(lesserArray, greaterArray, true)>

    <cfreturn lesserArray>
</cffunction>
<cfset sorted = monthsSortedByMaoriSequence(months)>

It's hard to see the bit that the modern implementation needs, but it's buried here.

Note: to an clever clogs who spot the odd shortcoming in that implementation of quicksort: you're missing the point of the article, and also yer talking to the wrong person because I didn't write it. But - yes yes - you're very clever.

The point is: that's awful. Writing old-school tag-based procedural code one needs to re-implement (and re-test!) the sorting function every time you need one. This is an extreme example and only a lunatic would not use the callback approach even with tag based code:

<cffunction name="comparator">
    <cfargument name="e1">
    <cfargument name="e2">
    <cfreturn e1.miSequence - e2.miSequence>
</cffunction>

<cfset sorted = duplicate(months)>
<cfset arraySort(sorted, comparator)>

But still: it's just better to get with the programme (or the decade) and use the modern version for this.

Righto.

--
Adam

Tuesday, 29 June 2021

CFML higher-order functions compared to tag-based code: filter function

G'day

This one will be pretty short I think. It's the next effort in going over how these higher-order functions work compared to writing procedural code in CFML tags. I've previous covered map and reduce. There's less intricacy to filter, so I won't have so much to say.

Yesterday I showed an example of how not to remove records from a collection using reduce

numbers = [1,2,3,4,5,6,7,8,9,10]
evens = numbers.reduce((evens=[], number) => number MOD 2 ? evens : evens.append(number))

This works, but it's not how one ought to do it. It's putting a square peg in a round hole, and it's gonna cause a small amount of FUD when someone comes back to review the code later ("why are they using reduce here? What am I missing?"). So… use the correct tool for the job. The idiomatic way to filter our elements from a collection is with a filter operation. Here's the equivalent operation using filter:

evens = numbers.filter((number) => number MOD 2 == 0)

Filter's callback receive the value of the collection element (and additionally its index/key, as well as the whole collection as additional parameters, if you need to use those too). If the logic in the callback returns true? The element is preserved in the result collection. if it's false? It's filtered out. That's it. The callback logic can be a one-liner like it is here, or as convoluted as it needs to be. As long as it boils down to a true or a false, you'll get your filtered collection. As with the other collection higher-order functions: it does not change the original collection; it returns a new one.

The tag-based equivalent is simple:

<cfset evens = []>
<cfloop array="#numbers#" item="number">
    <cfif number MOD 2 EQ 0>
        <cfset arrayAppend(evens, number)>
    </cfif>
</cfloop>

Just slightly more verbose, and it's mostly boilerplate.

The concept here is simple, and the object of the exercise for these articles is to just show the difference between using the higher-order functions and using a procedural approach with tags, and that's pretty much it.

Righto.

--
Adam

Monday, 28 June 2021

CFML: function expression syntax

G'day:

Just super quickly. One of the newer feaures in CFML is that it now supports arrow-function syntax for function expressions. I say "newer". They were apparently added to ColdFusion in 2018, and they're in Lucee: I dunno from what version.

I've been using arrow functions a bit in my code recently, cos they're just less typing for no loss of clarity compared to function expressions using the function operator. in case yer not used to them, these two code snippets are functionally equivalent:

adder = function(operand1, operand2) {
    return operand1 + operand2
}


adder = (operand1, operand2) => {
    return operand1 + operand2
}

Arrow functions offer some shortcuts though. If the body of the function is a single expression and it's the returned value, then one doesn't need to specify the block braces, or the return keyword. So the above arrow function could simply be:

adder = (operand1, operand2) => operand1 + operand2

What's more, if the function expression has only parameter, then one doesn't even need to specify the parentheses:

double = operand1 => operand1 * 2

(this is currently broken on Lucee: https://luceeserver.atlassian.net/browse/LDEV-2417).

There's no tag-based equivalent of either syntax for function expressions. That said, if one does not care about the closure side of things that function expressions utilise (and neither of these examples do), then the two function expressions above are equivalent to these two <cffunction>-based function statement declarations:

<cffunction name="adder">
    <cfargument name="operand1">
    <cfargument name="operand2">
    <cfreturn operand1 + operand2>
</cffunction>
<cffunction name="doubLe">
    <cfargument name="operand1">
    <cfreturn operand1 * 2>
</cffunction>

In reality though, these are closer to these equivalent statements in CFScript:

function adder(operand1, operand2) {
    return operand1 + operand2
}

function double(operand1) {
    return operand1 * 2
}

The difference is that these are statements, not expressions.

OK. That's enough bloody CFML tags for one evening.

Righto.

--
Adam

CFML higher-order functions compared to tag-based code: reduce function

G'day:

Here's the next effort in going over how these higher-order functions work compared to writing procedural code in CFML tags. The previous one was "CFML higher-order functions compared to tag-based code: map function". Today I'm looking at the reduce method. As per yesterday, I've discussed this before in ColdFusion 11: .map() and .reduce().

So what does reduce to? It helps if we compare it to map. Remember how I said this yesterday:

A mapping operation takes one collection and remaps the values for each key into a different value. The keys and the overall size and order (if it has a sense of order) of the collection is preserved. Also the original collection is not altered; an entirely new collection is returned.

A reduce operation is used to return a different data structure. It doesn't mean "reduce" in the sense of "make smaller"; the resultant data structure might be "bigger" (for some definition of bigger). Or it might be the same length, but a different type.

An example of returning the same length but different type would be similar to yesterday's example of mapping an array of records to an array of objects:

records = [
    {id=1, mi="whero", en="red"},
    {id=2, mi="kakariki", en="green"},
    {id=3, mi="kikorangi", en="blue"}
]
objects = records.map((record) => new Colour(record.id, record.mi, record.en))

A more likely scenario in CFML is for the records to be a query. But one still wants to pass an array of objects back from the storage tier to the application, so we use reduce to make the type conversion:

records = queryNew(
    "id,mi,en",
    "integer,varchar,varchar",
    [
        [1, "whero", "red"],
        [2, "kakariki", "green"],
        [3, "kikorangi", "blue"]
    ]
)
objects = records.reduce((objects=[], record) => objects.append(new Colour(record.id, record.mi, record.en)))

Note the way reduce works. The first argument is an "accumulator" that is passed into every iteration, and is ultimately returned to the calling code. One builds the return value iteration at a time into that. Here I'm appending to the array of objects each iteration. Whatever is returned from each iteration is the first argument of the next iteration. So as I iterate over the query, I start with an empty array. I append the first object to it, and that one-element array is then passed into the accumulator of the second call to the callback in the next iteration; and so on for all iterations so ultimately I have an array that I've appended three objects to. Some pseudo-code might make this more clear. Let's consider the iterations as they progress:

1: objects argument=[]; append Red; return value=[Red]
2: objects argument=[Red]; append Green; return value=[Red, Green]
3: objects argument=[Red, Green]; append Blue; return value=[Red, Green, Blue]
result: [Red, Green, Blue]

We start empty, we append red, we append green, we append blue.

After that first argument, the subsequent arguments follow the same pattern as with map: the second argument is a row of the query (passed as a struct). The callback can also receive the current index / key (or currentRow equivalent to a query loop in this case), and the last argument is the entire query. I don't need these here, so do not mention them in the callback's function signature.

The tag version of this is actually round about the same amount of code (109 bytes vs 112 bytes it seems):

<cfset objects = []>
<cfloop query="records">
    <cfset arrayAppend(objects, new Colour(id, mi, en))>
</cfloop>

Another case is shown here:

transactions = [
    {id=1, amount=.1},
    {id=2, amount=2.2},
    {id=3, amount=33.3},
    {id=4, amount=44.44}
]

sum = transactions.reduce((sum=0, transaction) => sum += transaction.amount)

We're summing the transactions. We are reducing the collection to a single value, I guess.

Oh one thing maybe work making very clear: it's complete coincidence that the final variable is called sum, and the accumulator parameter is called sum. They don't need to be, it just makes sense to me to match them up given we're kinda building the end result in that accumulator argument, and accordingly it's going to be the same sort of values, so makes sense it's called the same thing.

The tag-based version for this is simple again:

<cfset sum = 0>
<cfloop array="#transactions#" item="transaction">
    <cfset sum = sum += transaction.amount>
</cfloop>

Another more complicated example of script-vs-tags when reducing is in yesterday's article "CFML: tag-based versions of some script-based code". There I am reducing a query to a struct, then reducing that struct into another query. Both CFScript and tag versions of the code are in that.


One thing to not use reduce for is to actually reduce the size of a collection by removing records from it, eg:

numbers = [1,2,3,4,5,6,7,8,9,10]
evens = numbers.reduce((evens=[], number) => number MOD 2 ? evens : evens.append(number))

One would not use reduce for that. One would use filter. I guess I'll get to that tomorrow.

Righto.

--
Adam

Do as I say, not as I do

G'day:

One of the more attentive people on the CFML Slack channel observed I had some unVARed variables in the last couple of my articles. And quite possibly in other ones I've written recently.

This was unintentional. I'm being caught out by Lucee's setting to not need to var local variables, which I have switched on in my dev environment. Also I'm still in the throes of porting my brain back from PHP which doesn't have such dumb-arsed notions about needing to be explicit about these things.

To be clear, my position on variable-scoping is as follows:

  • always constrain one's variables to the "nearest" scope. This would mean one ought always use function-local variables in one's functions.
  • If using any other scope: actively scope it. So if a function needs to set something in the variables scope, don't simply rely on not VARing it; explicitly refer to it as variables.myVar.
  • Only use explicit scoping on variables when it would otherwise be unclear which scope is being referenced. If you still to small simple functions, data encapsulation, and clean code, this generally means there's no need to scope stuff. It's just clutter.
  • If you see some code I've written that doesn't do this, presume it's by accident. It won't be the important thing about the code you're looking at, unless the topic under discussion is "let's look at variables scoping" or some such.

That last point is not meant to dismiss the observation our colleague made of my code: I mean I hastily went and fixed it! But just I'm gonna sometimes forget to dot my Ts and cross my Is&hellips; it's safe to assume that's just me being inattentive. For my day job, I take this stuff very seriously.

Please do point out when I mess stuff up! I'll fix it.

Cheers to all who actually read this shite, and pay that amount of attention to it. I really appreciate it :-)

Righto.

--
Adam

CFML higher-order functions compared to tag-based code: map function

G'day:

As I mentioned yesterday ("CFML: tag-based versions of some script-based code") I've been asked by a couple of people to show the tag-based version of the script-based CFML code. This has ben particularly in reference to my typical approach of using higher-order functions to perform data transformation operations on iterable objects (eg: arrays, structs, lists, etc). Here I will briefly do that for some examples of using mapping functions. The process is the same each time, so I'll not dwell on it too much.

I have already written about the nuts and bolts of mapping higher-order functions in CFML back in 2014 in "ColdFusion 11: .map() and .reduce()". I also looked at how to implement arrayMap in older versions of CFML: "arrayMap(): a reverse CFML history".

In short, these collection-iteration higher order functions work on the premise that most looping operations exist solely to perform data transformation, and it makes sense to encapsulate that into a function, rather than having to hand-crank it. Obviously every data transformation is specific to its circumstance, so the collection-iteration functions take a callback as an argument (thus making them higher-order functions), where the callback defines the data transformation operation. Taking this approach makes the code clearer as to what the intent of the transformation is, and also encapsuates the implementation in its own functions, so its variables are all well encapsulated and don't impact the rest of the calling code. It's just a tider way of doing data transformation.

A mapping operation takes one collection and remaps the values for each key into a different value. The keys and the overall size and order (if it has a sense of order) of the collection is preserved. Also the original collection is not altered; an entirely new collection is returned.

That's enough of an explanation. This article is about comparing code styles. Here goes.

keys = ["ONE", "TWO", "THREE", "FOUR"]

translationLookup = {
    "ONE" = {mi = "tahi", jp = "一"},
    "TWO" = {mi = "rua", jp = "二"},
    "THREE" = {mi = "toru", jp = "三"},
    "FOUR" = {mi = "wha", jp = "å››"}
}


maori = keys.map((key) => translationLookup[key].mi)

writeDump(maori)

Here we have a one-liner that takes an array of translation keys and maps them to their actual translations.

Equivalent tag-based code is a bit more effort. We need to hand-crank our array construction:

<cfset japanese = []>
<cfloop array="#keys#" item="key">
    <cfset arrayAppend(japanese, translationLookup[key].jp)>
</cfloop>
<cfdump var="#japanese#">

In the next example I am being less literal about the "key mapping" idea, in case one got a sense that that sort of thing is inate to a mapping operation. I'm doubling each element in the array:

values = [1, 22, 333, 4444]
doubled = values.map((n) => n*2)
writeDump(doubled)

And the tags version (although here I'm halvig the values, for the hell of it). Same as the previous exercise really: just a wee bit clunkier than using the dedicated mapping function:

<cfset halved = []>
<cfloop array="#values#" item="value">
    <cfset arrayAppend(halved, value / 2)>
</cfloop>
<cfdump var="#halved#">

A more real-world example would be when yer getting an array of raw data values back from some sort of data-retrieval operation, and you want to properly model those as objects before returning them to your business logic:

records = [
    {id=1, mi="whero", en="red"},
    {id=2, mi="kakariki", en="green"},
    {id=3, mi="kikorangi", en="blue"}
]
objects = records.map((record) => new Colour(record.id, record.mi, record.en))

vs:

<cfset objects = []>
<cfloop array="#records#" item="record">
    <cfset arrayAppend(objects, new Colour(record.id, record.mi, record.en))>
</cfloop>
<cfdump var="#objects#">

You get the idea.

To show how strings can be remapped too, I knocked-together a quick example of String.map, but then remembered Lucee does not support String.map yet, so needed to use a list instead:

s = "The Quick Brown Fox Jumps Over The Lazy Dog"

a = asc("a")
z = asc("z")

rot13 = s.listToArray("").map((c) => {
    var checkCode = asc(lcase(c))

    if (checkCode < a || checkCode > z) {
        return c
    }
    var offset = (checkCode + 13) <= z ? 13 : -13

    return chr(asc(c) + offset)
}).toList("")
writeOutput(rot13)

And I tested this by feeding the result back into a tag-based version of the operation, to make sure it returned to the original string:

<cfset a = asc("a")>
<cfset z = asc("z")>

<cfset s2 = "">
<cfloop array="#listToArray(rot13, "")#" item="c">
    <cfset checkCode = asc(lcase(c))>

    <cfif checkCode LT a OR checkCode GT z>
        <cfset s2 &= c>
        <cfcontinue>
    </cfif>
    <cfset offset = 13>
    <cfif checkCode + 13 GT z>
        <cfset offset = -13>
    </cfif>
    <cfset s2 &= chr(asc(c) + offset)>
</cfloop>
<cfoutput>#s2#</cfoutput>

All in all using the specific iteration function is slightly clearer as to what sort of transformation is taking place, plus it saves you from having to write the looping and assignment scaffolding that a tags-based / hand-cranked version might. Often remappings are one-liners, and it's just more readable to do it as a simple assignment epression than having to hand-crank the boilerplate looping code.

The code for this article is all munged together in public/nonWheelsTests/higherOrderFunctionsDemonstration.

I'll have a look at how reduce operations work, tomorrow.

Righto.

--
Adam

Sunday, 27 June 2021

CFML: tag-based versions of some script-based code

G'day:

OMFG the things I do for my CFML community colleagues.

I've been asked by a couple of people to show the tag-based version of the script-based CFML code I have been showing as examples when helping people recently. This is so people who are less familiar with CFScript can compare the two, and perhaps get a better handle on the script code.

Editorialisation

I have not written new tag-based code in CFML in probably 15 years, other than when it's been absolutely unavoidable like back before queryExecute existed, so we still needed to use <cfquery> (and similar stuff like <cfhttp>, and what-have-you). I have maintained old tag-based code, but I've been lucky in that I've always been in the position to implement new code using modern practices.

Some CFML History

Since ColdFusion 9 was released in 2009 (that's over a decade ago, yeah?), it's been largely unnecessary to write any business logic in tag-based code, as script-based CFCs were added to the language. The only real relics of tag-only functionality were stuff like the afore-mentioned DB and external system access functionality that was all tags still. But that stuff should be hidden away in adapter CFCs anyhow, so any necessary tag-based code should be well isolated.

It has not been necessary to write CFML in tags at all since 2014 (over seven years ago), when - in ColdFusion 11 - the last bits of tag-only functionality were ported to CFScript.

The only place any tags ought to have been used since then are in views. And really these days your views should probably be being handled by a client-side framework anyhow, so - in my opinion - no new tag-based CFML code should be being written in 2021, and shouldn't have been for over half a decade now. All new CFML code should be written in CFScript. All CFML developers must be fluent in CFScript.

Reality for a lot of people

That's all good in theory, but in practice there is a lot of legacy code out there. We don't all get to choose what codebases we work on daily, and I know some CFML devs don't get to work with modern code much, so: tags it is. And this also means some devs don't get exposed to CFScript as much as they could be, so it could all seem a bit foreign to them. Fair enough.

The code

A week or so ago, I did an exercise "CFML: emulating query-of-query group-by with higher-order functions". The final version of the code for this was (tagsVsScriptDemonstrations/groupByViaCfml/ScriptVersion.cfc):

component {

    public query function groupByYearAndMonth(required query ungroupedRecords) {
        return ungroupedRecords.reduce((grouped={}, row) => {
            var y = row.settlementDate.year()
            var m = row.settlementDate.month()
            var key = "#y#-#m#"
            grouped[key] = grouped[key] ?: {stgl = 0, ltgl = 0}
            grouped[key].stgl = grouped[key].stgl + row.ShortTermGainLoss
            grouped[key].ltgl = grouped[key].ltgl + row.LongTermGainLoss

            return grouped
        }).reduce(
            (records, key, values) => {
                records.addRow({
                    month = key.listLast("-"),
                    year = key.listFirst("-"),
                    ltgl = values.ltgl,
                    stgl = values.stgl
                })
                return records
            },
            queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")
        ).sort((r1, r2) => {
            var yearDiff = r1.year - r2.year
            if (yearDiff != 0) {
                return yearDiff
            }
            return r1.month - r2.month
        })
    }
}

I think a direct analogue of this in tags would be (tagsVsScriptDemonstrations/groupByViaCfml/TagsVersion.cfc)

<cfcomponent output="false">

    <cffunction name="groupByYearAndMonth" returntype="query" access="public">
        <cfargument name="ungroupedRecords" type="query" required="true">

        <cfset grouped = structNew()>
        <cfloop query="ungroupedRecords">
            <cfset var y = year(settlementDate)>
            <cfset var m = month(settlementDate)>
            <cfset var key = "#y#-#m#">

            <cfif not structKeyExists(grouped, key)>
                <cfset grouped[key] = structNew()>
                <cfset grouped[key].stgl = 0>
                <cfset grouped[key].ltgl = 0>
            </cfif>
            <cfset grouped[key].stgl = grouped[key].stgl + ShortTermGainLoss>
            <cfset grouped[key].ltgl = grouped[key].ltgl + LongTermGainLoss>
        </cfloop>

        <cfset var records = queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")>
        <cfloop collection="#grouped#" item="local.key">
            <cfset queryAddRow(records)>
            <cfset querySetCell(records, "month", listLast(key, "-"))>
            <cfset querySetCell(records, "year", listFirst(key, "-"))>
            <cfset querySetCell(records, "ltgl", grouped[key].ltgl)>
            <cfset querySetCell(records, "stgl", grouped[key].stgl)>
        </cfloop>
        <cfset querySort(records, sorter)>

        <cfreturn records>
    </cffunction>

    <cffunction name="sorter" returntype="numeric" access="private">
        <cfargument name="r1" required="true">
        <cfargument name="r2" required="true">

        <cfset var yearDiff = r1.year - r2.year>
        <cfif yearDiff NEQ 0>
            <cfreturn yearDiff>
        </cfif>

        <cfreturn r1.month - r2.month>
    </cffunction>

</cfcomponent>

I'm not going to go through and cross-annotate anything, but I've used analogous variable names, and kept the logic in the exact order where I could. I've also tried to keep the same level of verboseness (or lack thereof) in both examples, so that it's as true to a like-for-like as I can muster. BTW I'm also not using any member functions or other newer CFML constructs / features in these examples.


John Whish gave me a good exercise to do this morning which I'll also reproduce here. In this example we're taking an array, and deriving the two-element combinations of all the elements. For example if we start with this: ["A", "B", "C", "D", "E"], the expected result would be this: ["AB", "AC", "AD", "AE", "BC", "BD", "BE", "CD", "CE", "DE"]

In CFScript it's this (tagsVsScriptDemonstrations/combinations/ScriptVersion.cfc):

component {

    public array function getCombinations(required array array) {
        var working = duplicate(array)
        return array.reduce((combinations=[], prefix) => {
            working.deleteAt(1)
            return combinations.append(working.map((element) => "#prefix##element#"), true)
        })
    }
}

And the tag version (tagsVsScriptDemonstrations/combinations/TagsVersion.cfc):

<cfcomponent output="false">

    <cffunction name="getCombinations" returntype="array" access="public" output="false">
        <cfargument name="array" type="array" required="true">

        <cfset var working = duplicate(array)>
        <cfset var combinations = arrayNew(1)>
        <cfloop array="#array#" item="local.prefix">
            <cfset arrayDeleteAt(working, 1)>
            <cfset var subCombinations = arrayNew(1)>
            <cfloop array="#working#" item="local.element">
                <cfset arrayAppend(subCombinations, "#prefix##element#")>
            </cfloop>
            <cfset arrayAppend(combinations, subCombinations, true)>
        </cfloop>
        <cfreturn combinations>
    </cffunction>

</cfcomponent>

As a last example, I decided to see if I could port the actual test class for the combinations exercise to tags. And - yes - I could. It's really clumsy, but it works. First here's the original script version (tagsVsScriptDemonstrations/combinations/CombinationsTest.cfc):

import testbox.system.BaseSpec
import cfmlInDocker.miscellaneous.tagsVsScriptDemonstrations.combinations.ScriptVersion
import cfmlInDocker.miscellaneous.tagsVsScriptDemonstrations.combinations.TagsVersion

component extends=BaseSpec {

    function beforeAll() {
        variables.testArray = ["A", "B", "C", "D", "E"]
        variables.expectedCombinations = [
            "AB", "AC", "AD", "AE",
            "BC", "BD", "BE",
            "CD", "CE",
            "DE"
        ]
    }

    function run() {
        describe("Testing script version", () => {
            it("returns the expected combinations", () => {
                var sut = new ScriptVersion()
                var result = sut.getCombinations(variables.testArray)

                expect(result).toBe(variables.expectedCombinations)
            })
        })
        describe("Testing tags version", () => {
            it("returns the expected combinations", () => {
                var sut = new TagsVersion()
                var result = sut.getCombinations(variables.testArray)

                expect(result).toBe(variables.expectedCombinations)
            })
        })
    }
}

And the tags version (tagsVsScriptDemonstrations/combinations/CombinationsTestUsingTags.cfc):

<cfimport path="testbox.system.BaseSpec">
<cfimport path="cfmlInDocker.miscellaneous.tagsVsScriptDemonstrations.combinations.ScriptVersion">
<cfimport path="cfmlInDocker.miscellaneous.tagsVsScriptDemonstrations.combinations.TagsVersion">

<cfcomponent extends="BaseSpec" output="false">

    <cffunction name="beforeAll">
        <cfset variables.testArray = ["A", "B", "C", "D", "E"]>
        <cfset variables.expectedCombinations = [
            "AB", "AC", "AD", "AE",
            "BC", "BD", "BE",
            "CD", "CE",
            "DE"
        ]>
    </cffunction>

    <cffunction name="run">
        <cfset describe("Testing script version", testingScriptVersionHandler)>
        <cfset describe("Testing tags version", testingTagsVersionHandler)>
    </cffunction>

    <cffunction name="testingScriptVersionHandler">
        <cfset it("returns the expected combinations", returnsTheExpectedCombinationsScriptVersionHandler)>
    </cffunction>

    <cffunction name="returnsTheExpectedCombinationsScriptVersionHandler">
        <cfset var sut = new ScriptVersion()>
        <cfset var result = sut.getCombinations(variables.testArray)>

        <cfset expect(result).toBe(variables.expectedCombinations)>
    </cffunction>

    <cffunction name="testingTagsVersionHandler">
        <cfset it("returns the expected combinations", returnsTheExpectedCombinationsTagsVersionHandler)>
    </cffunction>

    <cffunction name="returnsTheExpectedCombinationsTagsVersionHandler">
        <cfset var sut = new TagsVersion()>
        <cfset var result = sut.getCombinations(variables.testArray)>

        <cfset expect(result).toBe(variables.expectedCombinations)>
    </cffunction>

</cfcomponent>

Yikes.


And indeed "yikes" was my reaction to each of those examples. The tag-based code is just full of unnecessary and obstructive bloat, and just a mess to read. And a bit clunky to implement.

Ugh. However if there's any other code I've done recently that you'd find helpful to read as tag-based code, let me know, and I'll see if I can do a port. But the quid pro quo is that if yer currently still writing CFML in tags, and have it within your control to stop doing that and join the direction CFML has been taking since mid-last-decade… please try to move on.

PS: also I'm intending to do another article that takes a more focused look on understanding how CFML's collection-iteration higher-order functions (you know; Array.map, Struct.reduce, Query.filter etc) work, and comparing back to tag-based implementations.

Righto.

--
<cfadam />

Saturday, 26 June 2021

CFWheels: an even easier way of dealing with all that plugins pollution

G'day:

In the last article ("CFWheels: improving the usability of plugins), I mentioned this part-way through:

although as I type this paragraph, I think I have worked out a better way than what I'm about to show you

Well. In my never-ending quest to provide quantity rather than quality, as soon as I pressed "send" on that, I tried the idea I had. And it worked. An it's so bloody simple.

Go read that other article for full context, but the object of the exercise here is to write CFWheels plug-in implementations so they:

  1. don't pollute every object in the app with all its public functions;
  2. and don't actually require the plugin to be stateless and have only public functions.

I dicked around using closure to solve this. But it's more easily solved by just leveraging… the file system.

I've created another plugin with file structure thus:

../src/plugins/
└── colourPlugin
    ├── ColourPlugin.cfc
    └── implementation
        └── ColourPlugin.cfc

And code thus:

// src/plugins/colourPlugin/ColourPlugin.cfc

import cfmlInDocker.plugins.colourPlugin.implementation.*

component  {

    function init() {
        this.version = "2.2"
        return this
    }

    function ColourPlugin() {
        return new implementation.ColourPlugin()
    }
}
// src/plugins/colourPlugin/implementation/ColourPlugin.cfc

import cfmlInDocker.dao.ColourDao
import cfmlInDocker.models.Colour

component {

    function init() {
        variables.dao = new ColourDao()

        return this
    }

    public function getColourById(id) {
        record = variables.dao.selectColourById(id)
        if (record.recordCount) {
            return new Colour(record.id, record.en, record.mi)
        }
        throw(type="ColourNotFoundException", message="Colour not found", detail="Data for Colour with ID #id# not found")
    }
}

CFWheels only loads the outer CFC as the plugin, so only finds the one public method, so only pollutes everything with that. That method returns an object that actually implements the plugin. Using it in a controller method looks like this:

function testColourPlugin() {
    colourPlugin = ColourPlugin()
    viewVariables = {
        colour = colourPlugin.getColourById(params.id)
    }

    renderView(template="plugin", values=viewVariables)
}

I've tested (see test/functional/plugins/ColourPluginTest.cfc) that CFWheels doesn't load everything recursively, and there's no sign of implementation.ColourPlugin's getColourById method in the pollution space:

import testbox.system.BaseSpec

component extends=BaseSpec {

    function run() {
        describe("Tests for ColourPlugin", () => {
            it("does not pollute the application with implementation methods", () => {
                expect(application.wheels.dispatch).toHaveKey("ColourPlugin")
                expect(application.wheels.dispatch).notToHaveKey("getColourById")
            })
        })
    }
}

This is still more messing around than one ought to have to do in a framework whose job it is to make coding easier, but it's less complicated than the other approaches I took in the previous article. I'd even be happy (-ish) doing CFWheels plugin development this way.

Righto.

--
Adam

CFWheels: improving the usability of plugins

G'day:

That's perhaps a slightly strange article title. I was trying to be diplomatic, and anyone who's read anything I write here will know: this is not my strong suit. Hopefully it'll make sense by the time I finish.

I recently did that series on mixins in CFML: "CFML: messing around with mixins (part 1)" (and parts 2 & 3). This all stemmed from trying to make head or tail of an architectural choice CFWheels has made, which has resulted in a lot of mixed-in functions polluting everything (view functions in models, controller functions in views, helper functions all over the place), and a necessity to eschew private functions in favour of using a "special code" that when a function is prefixed with $, then despite it being public yer supposed to pretend it's private and not use it. Umm… so be it.

I had a need to look at what CFWheels did with its "plugins" feature the other day, and was unsurprised these have taken the same route as all its other function libraries: they can't really be nicely encapsulated classes, because all their methods need to be public and the objects can have no state because of how CFWheels has been architected. I'm not going to play the game by these particular rules, so I've looked at how I can write a plugin, but still actually implement it using proper coding practices at the same time.

It turns out it's pretty easy, just leveraging a bit of closure. Here's a simple example:

import cfmlInDocker.dao.MyDao
import cfmlInDocker.models.Thing

component {

    function init() {
        this.version = "2.2"

        variables.dao = new MyDao()

        return this
    }

    this.selectThingById = function (id) {
        record = variables.dao.selectThingById(id)
        if (record.recordCount) {
            return new Thing(record.id, record.name)
        }
        throw(type="ThingNotFoundException", message="Thing not found", detail="Data for Thing with ID #id# not found")
    }

    this.getVersion = function () {
        return this.version
    }

    this.getSecret = function () {
        return variables.getSecret()
    }

    private function selectThingById(id) {
        return variables.dao.selectThingById(id)
    }

    private function getSecret() {
        return "sssshhhh"
    }
}

It all comes down to how I'm publicly exposing the private functions via a this-scoped wrapper function expression, and that uses closure to preserve variable references to the context of this object.

I'm not going to dwell on it here because I cover all this in those other three articles (specifically the second one), but the way CFWheels is injecting plugin functions into everything, the context of the functions is not preserved, so references to variables and this refer to where they've been injected into, not the object they started in. Using closure ensures the context is coupled to the function, and this persists when the function reference is injected into some other object.

I have tests for all this in /test/functional/plugins/MyPluginTest.cfc. I won't repeat them here cos all they're doing is verifying the plugin methods do what I expect them to do. Have a look though.


But I'm still not happy about this.

It might be fine in CFWheels's mind that polluting every object with seemingly every function in the entire application, but I'm a bit more conserative when it comes to the single responsibility principle, and I really don't like the fact that my plugin is polluting everything with three unnamespaced functions: selectThingById, getVersion and getSecret. A sensible application of the concept of a plugin would be to inject an object perhaps. And the object then exposes its methods. All crazy and OOP I know. I didn't quite nail this (although as I type this paragraph, I think I have worked out a better way than what I'm about to show you), but I think this is an improvement of sorts:

import cfmlInDocker.dao.MyDao
import cfmlInDocker.models.Thing
import cfmlInDocker.plugins.myPlugin.MyPlugin

component extends=MyPlugin {

    function init() {
        this.version = "2.2"

        variables.dao = new MyDao()

        return this
    }

    this.MyOtherPlugin = function () {
        return {
             selectThingById = (id) => super.selectThingById(id),
             getVersion = () => super.getVersion(),
             getSecret = () => super.getSecret()
        }
    }
}

In this example, just one unnamespaced function is exposed, and that returns a struct that then exposes methods. I'm just borrowing the actual code from MyPlugin for the functionality here. Note that the references to the actual functions are still wrapped in a function expressions so that closure binds the super references to this object's super. Same as with the earlier example.

Usage of this can be seen in my test controller (/src/controllers/TestRoute.cfc):

function testMyOtherPlugin() {
    myOtherPlugin = MyOtherPlugin()
    viewVariables = {
        thing = myOtherPlugin.selectThingById(params.id),
        version = myOtherPlugin.getVersion(),
        secret = myOtherPlugin.getSecret()
    }

    renderView(template="plugin", values=viewVariables)
}

MyOtherPlugin is the only thing that is exposed from this plugin. After that, its return value behaves like an object. Like, you know, how the rest of us have been writing our CFML code for the last decade and a half.

Righto.

--
Adam

Sunday, 20 June 2021

Making the code from the previous article work on both ColdFusion and Lucee

G'day:

A quick one today. I showed just a ColdFusion-only solution to some code in my previous article the other day: "CFML: emulating query-of-query group-by with higher-order functions". I figured when coming up with the completed code in a runnable example, I should make it work on both platforms. Here it is.

<cfscript>
function getUngroupedRecords(required numeric rows) {
    createRows = (number) => repeatString(",", rows).listToArray(",", true)
    date = () => now().add("d", randRange(-365, 365))
    amount = () => randRange(1,10000) / 100

    fakedDbData = queryNew(
        "settlementDate,LongTermGainLoss,ShortTermGainLoss",
        "Date,Double,Double",
        createRows(rows).map((_) => [date(), amount(), amount()])
    )
    return fakedDbData
}
ungroupedRecords = getUngroupedRecords(20)
writeDump(ungroupedRecords)

groupedRecords = ungroupedRecords.reduce((grouped={}, row) => {
    y = row.settlementDate.year()
    m = row.settlementDate.month()
    key = "#y#-#m#"
    grouped[key] = grouped[key] ?: {stgl = 0, ltgl = 0}
    grouped[key].stgl = grouped[key].stgl + row.ShortTermGainLoss
    grouped[key].ltgl = grouped[key].ltgl + row.LongTermGainLoss

    return grouped
}).reduce(
    (records, key, values) => {
        records.addRow({
            month = key.listLast("-"),
            year = key.listFirst("-"),
            ltgl = values.ltgl,
            stgl = values.stgl
        })
        return records
    },
    queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")
).sort((r1, r2) => {
    yearDiff = r1.year - r2.year
    if (yearDiff != 0) {
        return yearDiff
    }
    return r1.month - r2.month
})

writeDump(groupedRecords)
</cfscript>

Some notes on that:

  • I was using ColdFusion-only syntax for arrow functions with only one parameter: CF - correctly - does not require parentheses for the function signature here. Lucee however does, so I added them.
  • As previously-mentioned, Lucee didn't like arrayNew(1).resize(number) here, so I've changed it to being a bit shit, but at least it works on Lucee.
  • Lucee still doesn't return the updated query from Query.addRow, it returns the number of rows added (as per queryAddRow). ColdFusion changed this back in CF2018, so Lucee has some catch-up to do here. Anyway I needed to split this into two statements to make it work on Lucee.
  • Originally I had the sort callback as one expression to demonstrate a "trick" with variable assignment expressions, but whilst this worked in Lucee, ColdFusion choked on it. The callback was: (r1, r2) => (yearDiff = r1.year - r2.year) ? yearDiff : r1.month - r2.month. This pushed well past the bounds of what is clearly understandable, and I think the long-hand version I used is better code. But it was a bug in ColdFusion that the short version didn't work.

Anyway… this version of the code works on both ColdFusion and Lucee.

Righto.

--
Adam

Friday, 18 June 2021

CFML: emulating query-of-query group-by with higher-order functions

G'day:

A week or so ago this question came up on the CFML slack channel:

Does QoQ support month() and year() such as in the following?

queryExecute("
      SELECT MONTH(settlementdate) AS month, year(settlementdate) AS year,
      SUM(LongTermGainLoss) AS ltgl, SUM(ShortTermGainLoss) AS stgl
      FROM data
      group by MONTH(settlementdate), year(settlementdate) 
    ",{},{
      dbtype: 'query'
})

The answer to the question is: Lucee supports MONTH and YEAR SQL functions in their QoQ SQL implementation, but ColdFusion does not. Here's a quick repro:

dates = queryNew(
    "date",
    "date",
    [
        [createDate(2011, 3, 24)],
        [createDate(2016, 8, 17)]
    ]
)

dateParts = queryExecute(
    "SELECT MONTH(date) AS month, YEAR(date) AS year FROM dates",
    {},
    {dbtype="query"}
)

writeDump(dateParts)

This works as expected on Lucee, but on ColdFusion errors with "Encountered "MONTH. Incorrect Select List, Incorrect select column".

The question intrigued me, and I thought… I reckon one could do something interesting with higher order functions to do what John needed here.

Firstly I needed some test data, so I banged out a quick function to do so:

function getUngroupedRecords(required numeric rows) {
    createRows = number => arrayNew(1).resize(number) // doesn't work in Lucee https://luceeserver.atlassian.net/browse/LDEV-2417
    date = () => now().add("d", randRange(-365, 365))
    amount = () => randRange(1,10000) / 100

    fakedDbData = queryNew(
        "settlementDate,LongTermGainLoss,ShortTermGainLoss",
        "Date,Double,Double",
        createRows(rows).map(_ => [date(), amount(), amount()])
    )
    return fakedDbData
}
ungroupedRecords = getUngroupedRecords(20)

NB: Lucee requires parentheses around even a single parameter in an arrow function, which is shouldn't so this breaks where indicated. This is a ColdFusion-only exercise here, so I'm running with a ColdFusion-only solution as the code is a bit nicer than if I make it work with Lucee too.

I thought there must be some better way of creating an array of a specific length just natively in CFML but I couldn't think of one, hence the arrayNew(1).resize(number) hack. It doesn't matter what's in the array, I just need that number of elements to re-map with my data.

Right so I have some data that possibly resembles the sort of thing John had.

From that I reduce that query to a struct keyed on "#year#-#month#", and for each row of the query I accumulate the gain/loss figures into their appropriate struct key entry:

ungroupedRecords.reduce((grouped={}, row) => {
    y = row.settlementDate.year()
    m = row.settlementDate.month()
    key = "#y#-#m#"
    grouped[key] = grouped[key] ?: {stgl = 0, ltgl = 0}
    grouped[key].stgl = grouped[key].stgl + row.ShortTermGainLoss
    grouped[key].ltgl = grouped[key].ltgl + row.LongTermGainLoss

    return grouped
})

At this point we have a "grouped-by" struct, but we need a query. So we reduce the struct back the other way now, to a query with the rows John spec'ed out in his original question:

.reduce( // breaks in Lucee https://luceeserver.atlassian.net/browse/LDEV-2523
    (records, key, values) => records.addRow({
        month = key.listLast("-"),
        year = key.listFirst("-"),
        ltgl = values.ltgl,
        stgl = values.stgl
    }),
    queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")
)

The only complexity there is we need to expand-out the struct key to be individual values again, cos they year and month have their own columns in the query.

Oh and as mentioned Lucee breaks on this. It has a bug in that when an array entry is null, it skips it when performing reduce (or map, etc) operation. So the end result is that it considers the result of the first reduce to be null, so this second one breaks cos one cannot reduce null.

Finally for good measure I sort the results by date:

.sort((r1, r2) => {
    yearDiff = r1.year - r2.year
    if (yearDiff != 0) {
        return yearDiff
    }
    return r1.month - r2.month
})

Initially I tried to be a smart-arse and do all that in one expression:

.sort((r1, r2) => (yearDiff = r1.year - r2.year) ? yearDiff : r1.month - r2.month)

And whilst Lucee was OK with that, ColdFusion could not make sense of the inline assignment expression there. I was just doing that so I only wanted r1.year - r2.year evaluated once, but I needed it in two places.

All of that is just the one expression in the end, when taken together:

groupedRecords = ungroupedRecords.reduce((grouped={}, row) => {
    y = row.settlementDate.year()
    m = row.settlementDate.month()
    key = "#y#-#m#"
    grouped[key] = grouped[key] ?: {stgl = 0, ltgl = 0}
    grouped[key].stgl = grouped[key].stgl + row.ShortTermGainLoss
    grouped[key].ltgl = grouped[key].ltgl + row.LongTermGainLoss

    return grouped
}).reduce( // breaks in Lucee https://luceeserver.atlassian.net/browse/LDEV-2523
    (records, key, values) => records.addRow({
        month = key.listLast("-"),
        year = key.listFirst("-"),
        ltgl = values.ltgl,
        stgl = values.stgl
    }),
    queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")
).sort((r1, r2) => {
    yearDiff = r1.year - r2.year
    if (yearDiff != 0) {
        return yearDiff
    }
    return r1.month - r2.month
})

The final output is along these lines:

I have to admit I did not TDD this work as I just wrote it in trycf.com, so I can't use TestBox, and can only write scripts on that anyhow. I did benchcheck the results by hand, and they're correct as far as I can tell.

I'm not sure how useful it is to have gone through this experience, but it was kinda fun I guess.

Righto.

--
Adam

Monday, 14 June 2021

Using TDD when adding new code to existing untestable code

G'day:

Just a note before I start. This article is based on a real-world situation for a company I used to work at. Because I'm basing this on proprietary information, I am not prepared (or permitted!) to share that. As such I have obfuscated some elements of the narrative, and have changed sample code around to be less specific to the industry concerned. The code is still line-for-line equivalent to the original code though with just names being changed. Nothing has been enhanced or simplified to make it artificially easier to solve the problem we really did face.

This challenge is best contextualised with a real-world situation I encountered. We had a function that was kind of a perversion of the concept of the Builder Pattern, and is pretty much a model example of how not to do it, and an example of what the Builder Pattern is designed to address.

Here's the method, shrunk down so you can get a feel for the scale of the problem, but without needing to see the reallyreally code. The code is too blurry to read, but with the syntax highlighting, you can get an idea of what's going on. And the method is… 387 lines long. Ridiculous:

I have indicated where we need to make out code change. This method (buildResponse) is building a reasonably complex Account object and wrapping it up in a Response object. The Account class itself is poorly designed and has far too many properties. And we need to add one more (don't @ me, it was not my call, this was another team's work, and I was only helping the salvage operation).

None of the existing code in the method had any tests, and the entire class had not been written with testing in mind: it had a raft of inline static method calls to external dependencies, and created other dependency objects inline. The team looking at the work had decided the code was untestable, and were about to continue adding to it, still without tests.

In a better world, even if the method had grown to be close to 400 lines long, as it evolved its tests would have evolved as well, and we'd be starting with all its current logic covered, and it would be a simple matter of adding in one new trivial case, and then adding in our one new bit of trivial code. But… that is not the reality we are living in.

Let's look at the code more closely. The code at the bottom of the method is along these lines:

    // ...
    $account->partnerInfo = $infoModel->jsonSerialize();
    $response = (object) ['account' => $account];
    return new Response($response);
}

Right before we create and return the $response object, we needed to add an apiToken property into the $account object.

The problem is that for a test to get to that last line of code, our test is going to have to execute the previous 400-odd lines of code too. And that code hits the database and a coupla different web services to source its data. Plus a couple of the DB accesses were writes to the database. We can't be doing that in a unit test, and we can't mock-out the DB and web service calls either due to how the code was written. Untestable, right? That was the conclusion the dev team had reached.

Is this code really untestable though? Can we really not use TDD to develop this change? Not at all. Deriving the test case is super easy. Barely even an inconvenience. We even already know what it is: "it must have an apiToken property sourced from the API service". So we know that. On that basis, we can even write the test for it.

/**
* @testDox it must have an apiToken property sourced from the API service
* @covers ::setApiTokenInAccount
*/
public function testBuiltAccountContainsAnApiToken()
{
    $anyUuid = $this->matchesRegularExpression($self::PATTERN_FOR_UUID);
    $anyTime = $this->matchesRegularExpression($self::PATTERN_FOR_TIMESTAMP);
    $apiConnector = $this->createMock(ApiConnector::class);
    $apiConnector
        ->expects($this->once())
        ->method('createApiToken')
        ->with($anyUuid, $anyTime)
        ->willReturn('A_VALID_TOKEN');
    ReflectionHelper::setProperty($this->accountService, 'apiConnector', apiConnector);
    $account = $this->createMock(Account::class);
    ReflectionHelper::invokeMethod($this->accountService, 'setApiTokenInAccount', [$account]);
    $this->assertSame('A_VALID_TOKEN', $account->apiToken);
}

Here it's $this->accountService that is our system under test, and it's where that huge buildResponse method is. ReflectionHelper is just a utility class that implements that ubiquitous code everyone has that uses reflection to set non-public properties, call non-public methods etc. Fortunately apiConnector is not created inline within buildResponse, it's been set into a property earlier, so we can replace it with a mock. This makes our life easier here.

We're going to extract the logic that gets the token out into its own private method - so we're only adding the minimum one line of new code into buildResponse - and we test that method directly, rather than calling buildResponse which we absolutely cannot do. Normally testing private methods is a nono. I also realise there's a strong whiff of "testing the implementation not the behaviour" going on here, which is also a TDD anti-pattern, and in the perfect world we would not do this. But we are about 370 lines of code away from the perfect world here, so we need to make do.

We have a test case that defines and tests the functionality we're expecting, and we will have an implementation to test:

private function setApiTokenInAccount(&$account)
{
    // some logic around the UUID and time argument values elided because it's not relevant here
    $apiToken = $this->apiConnector->createApiToken($uuid, $timestamp);
    $account->apiToken = $apiToken;
}

We then change buildResponse to call this:

    // ...
    $account->partnerInfo = $infoModel->jsonSerialize();
    $this->setApiTokenInAccount($account); // <----- new code
    $response = (object) ['account' => $account];

    return new Response($response);
}

We've still TDDed our solution, and we now have a bit of testing on the requirements of buildResponse even if we're not calling that method. We have to accept that we have added a line of code to buildResponse and it's now 388 lines long, and that addition isn't tested, but we cannot help that. This is a good case of "perfect is the enemy of good".

Now that we have discovered this mess, we can line-up a technical debt ticket to tidy-up buildResponse. It's a critical method, and it should not have been left to rot like this.


As it happens, we were needing to do more work on that method anyhow, and when planning out the story we built in effort to refactor it.

I need to blur the code because I can't think of suitable renamings for all the activity going on here and I can't use the original code, but you will be able to tell how much it's improved:

From 388 lines of code to 25, with most of the 25 just being one in a sequence of steps, which makes for really easy understanding of what's going on for any dev looking at the code in future. This is how a method should read.

All we did to get to here is to identify the 14 different sections of functionality in the original function and use PHPStorm's "Refactor > Extract Method" facility. We could then write tests for some of those bits of functionality, where there was no issues with inline dependencies being created. Some of them still didn't have tests because there were still untestable, but we undertook to deal with that later when we come to need to maintain them. We could also test buildResponse, because now it's basically just a recipe of steps (private methods) to follow, we could mock-out the steps, and just test that the data flow through the recipe was what was intended.

The testing for the overall method also overcomes the "testing the implementation not the behaviour" smell I mentioned before. The tests for it were testing the outer surface of the unit, and checking the results - what the response object ended up being given each of the variation of inputs - rather than how they were derived. There are still the individual tests on the private methods as well, and that's a bit "testing the implementation", but none of the tests will be impacted by changes elsewhere in the codebase, and as those bits of implementation change, then the overall tests can be updated to shift the cases to there, and retire the implementation-specific tests. When dealing with legacy code and limited time resources, backfilling test coverage and confidence is always a journey and not a destination. So as long was we've moving in the correct direction, I believe it's OK to not do things to the letter of the dogma, sometimes.

It was easy (and pretty fulfilling, I will say) work, and it's made the code a helluva lot better for the next person who comes along and needs to work on it.

In conclusion we don't need to think "oh god this is untestable", and continue making things worse by adding more untested code. The code probably is testable, just maybe in a partial fashion, and maybe also not in an ideal fashion. But we can at least start.

Righto.

--
Adam

Sunday, 13 June 2021

CFML: messing around with mixins (part 3)

G'day:

In the preceding two articles in this series (1 and 2) I looked at how to implement mixins in CFML, both at compile time and at runtime, with an increasing amount of functionality / complexity. My closing conclusion was that the runtime approach I was taking was a wee bit too magic and opaque to be something one should do; but at least it's interesting to look at the issues & techniques involved. Today I'm going to try to come up with a less opaque solution.

Before I go any further I will repeat what I've said in the previous two articles: I would never actually recommend using mixins at all. They just seem like a hack to me: an anti-pattern and poor design. As a concept they're just either a poor-person's inheritance, or a poor-person's dependency-injection technique. Just use the real things.

What do I mean by my previous work was a bit opaque? I'll repeat the example from the last article.

component {

    function getObjects(orderBy="id") {
        return getObjectsFromStorage(orderBy)
    }
}

Externally, I am mixing in that getObjectsFromStorage method at runtime. But how does the person looking at that code in isolation know this? They don't. They need to just already know that there might (or might not!) be some external shenanigans going on to avail that method at runtime. This is rubbish. The code in any given class should be self-contained, and be capable of being executed provided one fulfils the contract the code specifies. One should not need to know what context it's being used in to understand how it works: that's an anti-pattern. Contrast with this:

component {

    function init(required Repository repository) {
        variables.repository = repository
    }

    function getObjects(orderBy="id") {
        return variables.repository.getObjectsFromStorage(orderBy)
    }
}

Someone looking at that code is not surprised at all: they can tell what's going on. One should never write code that seems like the former example; go for the latter. Don't. Put. Surprises. In. Your. Code.

Today I'm gonna try to effect some hybrid of using clear dependency injection to avail my object of external resources, but then using my earlier mixing-in strategy to expose them to the code within the class as "local" functions. So basically still being able to use getObjectsFromStorage rather than qualify it with variables.repository.getObjectsFromStorage.

But… but … why, Adam?

Good question. Note I would never ever do any of this in my own code. This exercise cropped up from a conversation I was having about some framework code, and how it seems to revel in having an impenetrable (and pretty naive) implementation. It just injects methods all over the place - with no consideration as to whether they belong in the object they are being injected into, I might add - and it's all held together with some level-one-quality magic. I do not mean that in a good way, in case you can't tell. It's currently being overhauled a bit for its next version, and I'm going through this exercise to see if I can offer some suggestions as to how to make it seem less magical, and more coherent. This is part of that.

Did I mention I would never ever ever do any of this sort of thing in my own code? Good. Glad we cleared that one up.

The "interesting" thing about this article is that I have not got a single line of code yet - I mean as I type this sentence - and only a vague idea of how I'm going to solve this (and I'll be copy and pasting code from the previous article). I have some goals in place though, by way of the test cases from the previous articles. I've cannibalised those to a subset of cases I need to address for this exercise.

So that's a place to start. The cases might change as I go, but I'll start with the first one and work from there.

It tests unmodified setup

As some context, for this first case we are going to already assume some existing code. This part is not breaking TDD, but to wire things into objects, I need to have those objects' classes first. So I've got this lot:

//MyModel.cfc
import cfmlInDocker.miscellaneous.mixins.runtime.advanced.MyRepository

component {

    public MyModel function init(required MyRepository repository) {
        variables.repository = repository
        
        return this
    }

    public Colour[] function getObjectsViaInjectedRepository() {
        return variables.repository.getAllObjects()
    }
}
// MyRepository.cfc
component {

    public MyRepository function init(MyDao dao) {
        variables.dao = dao
        
        return this
    }

    public Colour[] function getAllObjects() {
        return dao.getRecords("id", -1).reduce((array=[], record) => array.append(new Colour(record.mi)))
    }
}
//MyDao.cfc
component {

    public query function getRecords(required string orderBy, required numeric maxrows) {
        records = queryNew("id,en,mi", "integer,varchar,varchar", [
            [1,"red","whero"],
            [2,"green","kakariki"],
            [3,"blue","kikorangi"]
        ])
        return queryExecute(
            "SELECT * FROM records ORDER BY #orderBy#",
            {},
            {dbtype="query", maxrows=maxrows}
        )
    }
}

And I couldn't help but pop a test in for this as I was tweaking some of the stuff I was porting over from the previous article's examples:

//FacadeTest.cfc

import testbox.system.BaseSpec
import cfmlInDocker.miscellaneous.mixins.runtime.facade.*

component extends=BaseSpec {

    function run() {
        describe("Testing facade proof of concept", () => {
            it("tests unmodified setup", () => {
                model = new MyModel(new MyRepository(new MyDao()))

                results = model.getObjectsViaInjectedRepository()

                expect(results).toBe([
                    new Colour("whero"),
                    new Colour("kakariki"),
                    new Colour("kikorangi")
                ])
            })
        })
    }
}

Basically we have a model representing Colour collections, where the colours are the Maori terms for various colours. All somewhat contrived, I know.

It creates facades for methods

From a TDD perspective, my intended approach here "interests" me. I already know the end result I want, and I pretty much already have the code I need to use. I'll just be using it in a different way from before. My inclination here is to write (read: copy and paste) a lot of code first up to get the mechanics in place to do any of the mixing-in, and at that point each case is just reimplementing each step from the last article as I support more and more features in the mixinMap. Writing a lot of code for a single case is a smell to me. But is this new code? Is it kind of a refactoring instead, so I just need to make sure my existing cases are all still accounted for? I think I'm doing TDD wrong here, but will will just crack on with it, and pay attention to what I'm doing in case I can work out a better strategy as I go. And as always I'm open to advice as to how I should have been doing something.

Cracking on with it, here's my test:

it("creates facades for methods", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getObjectsViaFacade()

    expect(results).toBe([
        new Colour("whero"),
        new Colour("kakariki"),
        new Colour("kikorangi")
    ])
})

I'm passing-in an optional dependency here: a FacadeMapper. I'm also calling a new method I've added: MyModel.getObjectsViaFacade: each test is going to need a method to call to test the case in question).

FacadeMapper's functionality is lifted largely from DependencyInjectionImplementor in the previous article:

component {

    public void function setFacadesFromMap(required Component target, required Component source, required struct map) {
        target.__getVariables = getVariables
        scopes = {
            public = target,
            private = target.__getVariables()
        }
        structDelete(target, "__getVariables")

        map.each((sourceMethod, mapping) => {
            targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod

            requestedAccess = mapping.keyExists("access") ? mapping.access : "private"
            targetAccess = requestedAccess == "public" ? "public" : "private"

            bind = mapping.keyExists("bind") && isBoolean(mapping.bind) ? mapping.bind : true

            proxy = bind
                ? () => source[sourceMethod](argumentCollection=arguments)
                : source[sourceMethod]

            scopes[targetAccess][targetMethod] = proxy
        })
    }

    private struct function getVariables(){
        return variables
    }
}

I explain it all throughly there, so go have a read. But basically it grabs references to a source object's methods, wraps them in a proxy, and injects them into the target.

And the relevant new code in MyModel to call this is here:

component {

    public MyModel function init(required MyRepository repository, Component facadeMapper) {
        variables.repository = repository

        if (arguments.keyExists("facadeMapper")) {
            mapFacades(facadeMapper)
        }

        return this
    }

    private void function mapFacades(facadeMapper) {
        facadeMapper.setFacadesFromMap(
            this,
            variables.repository,
            {
                "getAllObjects" = {}
            }
        )
    }
    
    // ...    
}

Unlike in the previous article's approach, where the wiring of the mixins was done completely outside of the target object, so no-one would ever have any idea what's going on with the code and why it magically works; here it's explicit. We list the methods we are creating facades for (only one just for now). So when we come to use the facaded (?is that a word?) method, we can tell where it came from:

public Colour[] function getObjectsViaFacade() {
    return getAllObjects()
}

It remaps function names if the map specifies it

It handles public/private method access on the mixed-in function

It ignores access modifiers other than public / private (falling back to private)

It doesn't bind facade function to its original context if bind option is false

I'm gonna cover the cases for these in one fell swoop because it's all repetition from the previous article anyhow. This is why I'm kinda considering this more of a refactoring exercise than the red/green part of a TDD exercise. The tests are thus:

it("remaps function names if the map specifies it", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getSomeObjectsViaFacade(rows=2)

    expect(results).toBe([
        new Colour("whero"),
        new Colour("kakariki")
    ])
})

it("handles public/private method access on the mixed-in function", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getOrderedObjects(orderBy="English")

    expect(results).toBe([
        new Colour("kikorangi"),
        new Colour("kakariki"),
        new Colour("whero")
    ])
})

it("ignores access modifiers other than public / private (falling back to private)" , () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getEnglishObjectsViaFacade()

    expect(results).toBe([
        new Colour("red"),
        new Colour("green"),
        new Colour("blue")
    ])

    expect(() => model.getEnglishObjects()).toThrow(type="Expression", regex="^.*MyModel.*has no\s+function with name.*getEnglishObjects.*$")
})

it("doesn't bind facade function to its original context if bind option is false", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())
    modelVariables = model.getVariables()

    expect(getMetadata(modelVariables.this)).toBe(getMetadata(model))
})

The mapping made to facilitate these tests has grown:

private void function mapFacades(facadeMapper) {
    facadeMapper.setFacadesFromMap(
        this,
        variables.repository,
        {
            "getAllObjects" = {},
            "getSomeObjects" = {"target" = "getSubsetOfObjects"},
            "getOrderedObjects" = {"access" = "public"},
            "getEnglishObjects" = {"access" = "package"}
        }
    )
    facadeMapper.setFacadesFromMap(
        this,
        new VariablesAccessor(),
        {
            "getVariables" = {"access" = "public", "bind" = false}
        }
    )
}

But all this was covered in depth in the previous article as well. VariablesAccessor is just this:

component {

    public struct function getVariables(){
        return variables
    }
}

This is used in the test case that doesn't bind to the original source's context, instead letting it bind to the target's, so it acts on the target's context (we can use it to access the target's variables scope). Just… look at the relevant test case (such is the beautify of BDD-oriented test cases, they kinda document what's going on in and of themselves.

And in MyModel I've need to add some more methods for each test to call to access the relevant facade-mapping variation. There's no need to pore over this code, it's boilterplate for the tests.

public Colour[] function getSomeObjectsViaFacade(string rows="all") {
    return getSubsetOfObjects(rows)
}

public Colour[] function getEnglishObjectsViaFacade() {
    return getEnglishObjects()
}

And in the repository:

public Colour[] function getSomeObjects(string rows="all") {
    rows = (isValid("integer", rows) && rows > -1) ? rows : -1

    return dao.getRecords("id", rows).reduce((array=[], record) => array.append(new Colour(record.mi)))
}

public Colour[] function getOrderedObjects(string orderby="id") {
    orderBy = mapOrderBy(orderBy)

    return dao.getRecords(orderBy, -1).reduce((array=[], record) => array.append(new Colour(record.mi)))
}

public Colour[] function getEnglishObjects() {
    return dao.getRecords("id", -1).reduce((array=[], record) => array.append(new Colour(record.en)))
}

private string function mapOrderBy(required string property) {
    propertyColumnMap = {
        "id" = "id",
        "english" = "en",
        "maori" = "mi"
    }
    return propertyColumnMap.keyExists(property) ? propertyColumnMap[property] : "id"
}

The DAO already had all the necessary functionality baked in: there were no changes to that.


That might seem like a lot of code changes that I glossed-over there, but the objective of this article is all around the explicit use of that mapping in the class receiving the facades, and how that's clearer than magic mixins.

I'll also re-re-iterate that I think doing this sort of thing is rubbish. It's a lot of effort to provide non-name-spaced functions, and all yer gaining is a lack of clarity, and a few keystrokes. It's just a daft thing to do. But it was an interesting progression of exercises for me, anyhow. And keeps me off the streets on a Sunday (it' too hot to go outdoors, that said).

Righto.

--
Adam