Adam Cameron's Dev Blog: Higher-order functions

Showing posts with label Higher-order functions. Show all posts

Friday, 9 September 2022

Kotlin: another Friday afternoon, another round of random investigation

G'day:

Because no-one has specifically screams "FFS stop it, Cameron", I'm gonna continue with another random Kotlin noobie investigation / brain dump. Previous similar articles here.

The Nothing and Unit types

Nothing

This came up in the next koans exercise. I read the docs, and looked at an answer on Stack Overflow ("Kotlin - Void vs. Unit vs. Nothing). The key bit seems to be the word cannot in this sentence: "If a function has return type Nothing, then it cannot return normally. It either has to throw an exception, or enter an infinite loop". And the next sentence is an interesting one: "Code that follows a call to a function with return type Nothing will be marked as unreachable by the Kotlin compiler". I wrote a test (obviously ;-)):

class NothingTest : DescribeSpec( {
    describe("Tests of the Nothing type") {
        val EOL = System.lineSeparator()

        val output = SystemLambda.tapSystemOut {
            try {
                outputUntilDie(3)
            } catch (e: Exception) {
                println(e.message)
            }
        }
        output.trim() shouldBe "Iteration 1${EOL}Iteration 2${EOL}Iteration 3${EOL}It died"
    }
})

fun outputUntilDie(max :Int) :Nothing {
    var current = 0
    while (true) {
        current++
        println("Iteration $current")
        if (current >= max) {
            throw Exception("It died")
        }
    }
}

The code below is on Github @ NothingTest.kt

Note in outputUntilDie there is never a normal return condition. The loop either goes forever (assuming I could somehow pass it a max of infinity), or the code bombs out. So whatever comes after the call to outputUntilDie can never be run. If I try to add some, IntelliJ lets me know:

And the compiler gives a warning:

> Task :compileTestKotlin
w: C:\src\kotlin\scratch\src\test\kotlin\language\types\NothingTest.kt: (15, 17): Unreachable code

If I was to change the return type to be Unit (returns no value, as opposed to returns nothing), then I don't get the visual cue or warning.

So I guess it's handy in that regard.

OK, the koans exercise demonstrates a more realistic situation where the code being considered unreachable is important.

import java.lang.IllegalArgumentException

fun failWithWrongAge(age: Int?) {
    throw IllegalArgumentException("Wrong age: $age")
}

fun checkAge(age: Int?) {
    if (age == null || age !in 0..150) failWithWrongAge(age)
    println("Congrats! Next year you'll be ${age + 1}.")
}

fun main() {
    checkAge(10)
}

Here the code won't even compile, because it's possible to get to ${age + 1} with age being null:

C:\src\kotlin\Kotlin Koans\Introduction\Nothing type\src\Task.kt:9:50
Kotlin: Operator call corresponds to a dot-qualified call 'age.plus(1)' which is not allowed on a nullable receiver 'age'.

Telling it that failWithWrongAge has a return type of Nothing mitigates this as processing could have halted already by then.

Unit

And the Unit type is a type that is equivalent to void, it's just a more OO way of doing things, cos it is actually a type, and there is a Unit object. This means everything returns an object, and there's no "special" handling like in Java with void. It's also the default return type in Kotlin according to the docs, so one doesn't need to specify it.

Lambdas

Here I've just messed around with some common higher-order functions available on collections. The code below is on Github @ FunctionsTest.kt

it("is a basic PoC test") {
    val strings = listOf("whero", "kākāriki", "kikorangi")
    val lengths = strings.map {it.length}

    lengths.joinToString(",") shouldBe "5,8,9"
}

map just takes a single param that is the mapping function, so no need for parentheses. And because the callback only receives the one argument, one can just use the default it name. So {it.length} is a lambda that returns the length property value of its it argument (ie: String.length).

class Number(var value: Int, var en: String, var mi: String) {}

it("chains some together") {
    val jumbledNumbers = listOf(
        Number(2, "two", "rua"),
        Number(4, "four", "wha"),
        Number(3, "three", "toru"),
        Number(1, "one", "tahi")
    )
    val maoriNumbers = jumbledNumbers.sortedBy{it.value}.map{it.mi}

    maoriNumbers.joinToString(",") shouldBe "tahi,rua,toru,wha"
}

More of the same except I'm chaining some methods together here.

it("shows using _ to ignore a param in a lambda") {
    val numbers = mapOf(
        Pair("one", "tahi"),
        Pair("two", "rua"),
        Pair("three", "toru"),
        Pair("four", "wha")
    )
    val output = SystemLambda.tapSystemOut {
        numbers.forEach{(_, maori) -> print("$maori")}
    }

    output.trim() shouldBe "tahiruatoruwha"
}

The docs made a point of mentioning this, so I might as well. It's right in the Kotlin idiom to use _ as a parameter name when your lambda receives some arguments it doesn't need. Here the key is the first parameter (so the English version of the number), and I don't need that so this is made explicit by just using _. It's only the second argument I want: the Māori version

class Number(var value: Int, var en: String, var mi: String) {}

it("uses an anonymous function") {
    val jumbledNumbers = listOf(
        Number(2, "two", "rua"),
        Number(4, "four", "wha"),
        Number(3, "three", "toru"),
        Number(1, "one", "tahi")
    )

    val maoriNumbers = jumbledNumbers.sortedWith(fun(e1, e2) = e1.value - e2.value).map{it.mi}

    maoriNumbers.joinToString(",") shouldBe "tahi,rua,toru,wha"
}

For the hell of it here I use the anonymous function syntax (so with the fun keyword) for sortedWith comparator lambda. As the comparator method is just one expression, I don't need the parentheses or the return statement.

it("is an example of a fold") {
    val rgb = listOf("whero", "kākāriki", "kikorangi")

    val colours = rgb.fold("") {
        colours, colour -> if (colours.isEmpty()) colour else "$colours,$colour"
    }

    colours shouldBe "whero,kākāriki,kikorangi"
}

And with the block syntax I had no idea how to define my own params (instead of just the situation where there's only one, and the default it would do), but it's pretty straight forward as it turns out. The bit I didn't guess right was the parameter definitions are inside the block, not before it like I kinda wanted to do.

I also found out that there is no ternary operator (? :) in Kotlin. if is an expression, so it does the same thing. Slightly more verbose, but it's OK. I don't like using else in my code though (but I don't might the "else" operand when using ? :. Hrm…?

it("is an example of a reduce") {
    val redWhiteAndBlue = listOf("whero", "mā", "kikorangi")

    val colours = redWhiteAndBlue.reduce {colours, colour -> "$colours,$colour"}

    colours shouldBe "whero,mā,kikorangi"
}

This shows ths difference between fold and reduce:

fold takes a specific initial value as a first argument.
reduce uses the first element of the collection as the initial value.

`groupBy` higher-order function

Ooh I just spotted this in the docs. Kotlin has a gropupBy higher-order function. It does what it says on the tin. here's a contrived example:

it("groups the words by their first character, capitalising each word in the result") {
    val words = listOf(
        "tahi","rua","toru","wha","rima","ono","whitu","waru","iwa","tekau", // 1-10
        "whero","karaka","kōwhai","kākāriki","kikorangi","poropango","papura", // ROYGBIV
        "rāhina","rātū","rāapa","rāpare","rāmere","rāhoroi","rātapu" // Mon-Sun
    ).sortedWith(Collator.getInstance(Locale("mi_NZ")))

    val groupedByLetter = words.groupBy(
        { word -> word.first()},
        {word ->  word.replaceFirstChar(Char::titlecase)}
    ).toSortedMap()

    groupedByLetter shouldBe mapOf(
        Pair('i', listOf("Iwa")),
        Pair('k', listOf("Kākāriki", "Karaka", "Kikorangi", "Kōwhai")),
        Pair('o', listOf("Ono")),
        Pair('p', listOf("Papura", "Poropango")),
        Pair('r', listOf("Rāapa", "Rāhina", "Rāhoroi", "Rāmere", "Rāpare", "Rātapu", "Rātū", "Rima", "Rua")),
        Pair('t', listOf("Tahi", "Tekau", "Toru")),
        Pair('w', listOf("Waru", "Wha", "Whero", "Whitu"))
    )
}

I like that.

Primary and secondary constructors

You saw I had that wee Number class in my tests:

 class Number(var value: Int, var en: String, var mi: String) {}

Initially I had written that long-form like this:

class Number {
    var value = 0
    var en = ""
    var mi = ""

    constructor(value: Int, en: String, mi: String) {
        this.value = value
        this.en = en
        this.mi = mi
    }
}

But IntelliJ admonished me with this:

And I went "um… primary and secondary constructors are a thing, are they? OK. Go for it pal". And it squizhed everything up for me.

I RTFMed and right there on the class page it says this:

A class in Kotlin can have a primary constructor and one or more secondary constructors. The primary constructor is a part of the class header, and it goes after the class name and optional type parameters.

[…]

The primary constructor cannot contain any code. …

Ah OK this makes sense if one just wants simple class with some property values (like I do here), there's no need for so much boiler plate to achieve this. Nice. I'm gonna need to read the rest of that section about initializer blocks and the like. But: tomorrow. I've messed around enough for one afternoon.

That was quite short, but it reflects the forward progress I made in a few hours exploration. The good thing is: the more I find out about Kotlin, the more I want to find out about it.

Righto.

--
Adam

Wednesday, 7 July 2021

TIL: two things about reduce operations

G'day:

When working through yesterday's article ("TDD, code puzzle, recursing reductions"), I learned two new things about how reduce operations work.

The first thing was that I was having problems with defaulting the initial value of my callback. I had something like this:

array = ["tahi", "rua", "toru", "wha"]

list = array.reduce((result="", value) => {
    return result.listAppend(value.ucase())
})
writeDump(list)

When I was running this on Lucee I was getting an error: can't call method [listAppend] on object, object is null. The callback was not receiving its default value.

After a while I clocked that I had Lucee's "Complete Null Support" switched on, and this was ruining things for everyone. Why?

Because the method signature for Lucee's Array.reduce has two forms:

Array.reduce(callback)

Array.reduce(callback, initialValue)

I was using the first form, but Lucee still needs to cater for the second form. What's happening is that when Lucee encounters the first form, it's saying "OK so no initialValue. But… null support is switched on, so I'll just pass null for that". So when the callback is called, the initial value is null, so the default in the function signature for the callback is "not needed". Harrumph. I don't like this much, but I understand why it is the way it is I s'pose.

The easy fix is to just use the second form all the time.

Brad Wood pointed out another pitfall when using the callback's default value for that first argument. Consider this variant of the code from above:

array = []

list = array.reduce((result="", value) => {
    return result.listAppend(value.ucase())
})
writeDump(list)

There's no elements in the array, so the callback is never called, so the initial value for the result is never set. So on Lucee with null support set, the result is null. On ColdFusion, and on Lucee with null support switched off, it errors along the lines of variable [list] doesn't exist. It's worthwhile remembering that the default value on the callback parameter is only for the callback, and it simply "coincidentally" works as a default value for the entire operation if the callback is used. To give the operation a default starting value, you need to set it on the reduce call, not the callback signature.

Cheers for pointing this out to me, Brad. It seems obvious once you mentioned it, but it had never occurred to me.

That's it. Two things I learned. Cool.

Righto.

--
Adam

Tuesday, 6 July 2021

TDD, code puzzle, recursing reductions

G'day:

I couldn't think of a title for this one, so: that'll do.

A few days ago Tom King posted this on the CFML Slack channel:

I'm sure I've asked this before, but has anyone got a good function to take complex values and parse out to a curl style string? i.e, https://trycf.com/gist/0c2672ac4a02d4c52d1e3ee1c9a408e7/lucee5?theme=monokai

The code in the gist is thus:

original = {
    'payment_method_types' = ['card'],
    'line_items' = [
        {
            "price_data" = {
                "product_data" = {
                    "name" = "Something"
                },
                "currency" = 'gbp',
                "unit_amount" = 123
            },
            "quantity" = 1
        }
    ]
};

writeDump(var=original, label="Original");

wantedResult["payment_method_types[0]"] = 'card';
wantedResult["line_items[0][price_data][product_data][name]"] ='Something';
wantedResult["line_items[0][price_data][currency]"] = 'gbp';
wantedResult["line_items[0][price_data][unit_amount]"] = 123;
wantedResult["line_items[0][quantity]"] = 1;

writeDump(var=wantedResult, label="Want this:");

So he wants to flatten out the key-path to the values. Cool. I'll give that a bash. Note: we've actually already solved this, but I'm now going back over the evolution of my solution, following my TDD steps. This is more a demo of using TDD than it is solving the problem: don't worry about how much code their is in this article, it's mostly incremental repetition, or just largely copy-and-pasted-and-tweaked tests. I will admit that for my first take on this I just wrote the one test that compared my result to his final expectation, and it took me far more time to sort it out than it ought to. I did the whole thing again last night to prep the code for this article, and using more focused TDD I solved it faster. Obviously I was benefiting from already kinda remembering how I did it the first time around, but even the first time around I could see how to do it before I even wrote any code, I just pantsed around making mistakes the first time due to not focusing. TDD on my second attempt helped me focus.

Last things first: the final test

I've got the "client's" final requirement already, so I'll create a test for that now, and watch my code error because it doesn't exist (my code, that is). I'll link the test case description through to the current state of the code in Github. I'll tag each step.

it("fulfils the requirement of the puzzle", () => {
    input = {
        "payment_method_types" = ["card"],
        "line_items" = [
            {
                "price_data" = {
                    "product_data" = {
                        "name" = "Something"
                    },
                    "currency" = "gbp",
                    "unit_amount" = 123
                },
                "quantity" = 1
            }
        ]
    }
    expected = {
        "payment_method_types[0]" = "card",
        "line_items[0][price_data][product_data][name]" = "Something",
        "line_items[0][price_data][currency]" = "gbp",
        "line_items[0][price_data][unit_amount]" = 123,
        "line_items[0][quantity]" = 1
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

This errors with No matching function [flattenStruct] found, because I ain't even started yet! Note: for this exercise, I'm just building the function in the test class itself, given it's just one function, and it only needs to exist to fulfil this blogging/TDD exercise.

This test will keep erroring until I've finished. This is fine. I am not going to look at that test code again until I have finished the task though.

It handles simple value key/value pairs

I'm not going to be crazily pendantic with the TDD here. For example in this case I'm rolling in "creating the function and solving the first step of the challenge" into one step. I've already got a test failing due to the lack of the function existing, so I can watch it's progress for some of the micro-iterations I am missing out. So for this step I'm gonna iterate over the struct, and deal with the really easy cases: top level key/value pairs that are simple values and do not require any recursion to work. This is also the "bottom" level of the recursion I'll move on to, so know I already need it anyhow. Maybe making the decision that I need to use recursion to sort this out is pre-emptive, but I'm buggered if I know any other way of doing it. And taking recursion as a given, I always like to start with the lowest level solution: the bit that doesn't actually recurse.

Here's my test for this:

it("handles top-level key/values with simple values", () => {
    input = {
        "one" = "tahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        }
    }
    expected = {
        "[one]" = "tahi"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And the implementation to make it pass:

function flattenStruct(required struct struct) {
    return struct.reduce((flattened, key, value) => {
        if (isSimpleValue(value)) {
            return {"[#key#]" = value}
        }
        return flattened
    }, {})
}

That's lovely, and the test passes, but there's a bug in it which I didn't notice the first time around.

It handles multiple top-level key/values with simple values

it("handles multiple top-level key/values with simple values", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        }
    }
    expected = {
        "[one]" = "tahi",
        "[first]" = "tuatahi"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

That fails with Expected [{[one]={tahi}, [first]={tuatahi}}] but received [{[one]={tahi}}]. It's cos I'm a dropkick and not appending each iteration to the previous:

return {"[#key#]" = value}

// should be

return flattened.append({"[#key#]" = value})

Once I fixed that: the two TDD tests are green. Obviously the initial one is still broken cos we ain't done.

It handles complex values

Now all I needed was to add some structs to the sample data for the test:

it("handles complex values", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        }
    }
    expected = {
        "[one]" = "tahi",
        "[first]" = "tuatahi",
        "[two][second]" = "rua",
        "[three][third][thrice]" = "toru"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And write a quick implementation:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, _, prefix="") => {
        var prefixedKey = "#prefix#[#key#]"
        if (isSimpleValue(value)) {
            return flattened.append({"#prefixedKey#" = value})
        }

        return flattened.append(value.reduce(
            function (flattened, key, value, _, prefix=prefixedKey) {
                return flattened.append(flatten(flattened, key, value, _, prefix))
            },
            {}
        ))
    }

    return struct.reduce(flatten, {})
}

This warrants some explanation:

We need to extract all the logic we currently have within the outer function into its own function. We need to do this because when we recurse, we need it to call itself, so it needs a name.
The recursion works by saying to each level of the recursion "the previous level had this key, so append your keys to this", so we implement this, and default the prefix to be an empty string to start with.
We're leveraging a nice feature of CFML here in that one can pass additional arguments to a function, and "it'll just work", so we pass that after all the other arguments. We're not using the fourth parameter to the reduce callback which contains the original collection being iterated over, so we reflect that by calling the parameter _ (this is a notional standard for this sort of thing).
When we get to the "bottom" of the recursion we will always have a simple value for the value (otherwise we're not done recursing), so we prepend its key with the parent key (which will be all the ancestor keys combined).

I also now needed to update my previous test case expectations to include the structs coming out in the result too. In hindsight, I messed up there: I should only have had test data for the case at hand, rather than having some more data that I knew was specifically going to collapse in a heap.

But it also has a bug. The first test completely broke now: Can't cast Complex Object Type Struct to StringUse Built-In-Function "serialize(Struct):String" to create a String from Struct

I find the situation here a bit frustrating. The signature for the callback function for Struct.reduce is this:

function(previous, key, value, collection)

And that's what I have in the code above.

However. I had overlooked that the signature for the callback function for Array.reduce is this:

function(previous, value, key, collection)

It had never occurred to me that key, value and value, key were reversed between the two. Sigh. So my single handling of complex values there doesn't work because I'm passing Array.reduce the key and the value in the wrong order.

It's reasonably easy to fix though.

It handles arrays

I've added an array into the test data:

it("handles arrays values", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        },
        "four" = ["fourth", "quaternary", "wha", "tuawha"]
    }
    expected = {
        "[one]" = "tahi",
        "[first]" = "tuatahi",
        "[two][second]" = "rua",
        "[three][third][thrice]" = "toru",
        "[four][1]" = "fourth",
        "[four][2]" = "quaternary",
        "[four][3]" = "wha",
        "[four][4]" = "tuawha"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And added logic to check if the value is a struct or an array before recursing. The variation just being to expect the callback parameters to be in a different order:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, _, prefix="") => {
        var prefixedKey = "#prefix#[#key#]"
        if (isSimpleValue(value)) {
            return flattened.append({"#prefixedKey#" = value})
        }

        if (isStruct(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, key, value, _, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, key, value, _, prefix))
                },
                {}
            ))
        }
        if (isArray(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, value, index, _, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, index, value, _, prefix))
                },
                {}
            ))
        }
    }

    return struct.reduce(flatten, {})
}

With this amendment all the TDD tests pass, but the original test - which I was kinda expecting to also pass now - wasn't. I had a coupla bugs in my implementation still, basically from me "not reading the requirements"

It qualifies the keys correctly and zero-indexes the arrays

Yes yes, very bad to fix two things at once, I know. I didn't notice that Tom's requirements were to present the flattened keys like this (taken from the previous test): four[0], whereas I'm doing the same one like this: [four][1]. So I'm not supposed to be putting the brackets around the top-level item, plus I need to present the arrays with a zero index. This is easily done, so I did both at once. I figure I'm kinda OK doing this cos I am starting with a failing test afterall. And I also added a more complex test case in (mixing arrays, structs, simple values some more) to make sure I had nailed it:

it("qualifies the keys correctly and zero-indexes the arrays", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        },
        "four" = ["fourth", "quaternary", "wha", "tuawha"],
        "five" = [
            {"fifth" = "rima"},
            ["tuarima"],
            "quinary"
        ]
    }
    expected = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two[second]" = "rua",
        "three[third][thrice]" = "toru",
        "four[0]" = "fourth",
        "four[1]" = "quaternary",
        "four[2]" = "wha",
        "four[3]" = "tuawha",
        "five[0][fifth]" = "rima",
        "five[1][0]" = "tuarima",
        "five[2]" = "quinary"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And the fix:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, actual, prefix="") => {
        var offsetKey = isArray(actual) ? key - 1 : key
        var qualifiedKey = prefix.len() > 0 ? "[#offsetKey#]" : offsetKey
        var prefixedKey = "#prefix##qualifiedKey#"

        if (isSimpleValue(value)) {
            return flattened.append({"#prefixedKey#" = value})
        }

        if (isStruct(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, key, value, actual, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, key, value, actual, prefix))
                },
                {}
            ))
        }
        if (isArray(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, value, index, actual, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, index, value, actual, prefix))
                },
                {}
            ))
        }
    }

    return struct.reduce(flatten, {})
}

Quite simply:

If the data we are reducing is an array, we reduce the index by one;
and we only put the brackets on if we've already got a prefix value.
We're using that _ parameter we were not using before, so I've given it a name.

Now everything works:

Or does it?

It handles the collection being an arguments scope object

I handed this implementation (or something very similar) to Tom with a flourish, and a coupla hours later he was puzzled about what sort of object the arguments scope is. It behaves like both a struct and an array. For some things. And it's ambiguous in others. When Tom was passing an arguments scope to this function, it was breaking again:

The arguments scope passes an isArray check, but its reduce method handles it like a struct, so they key being passed is the parameter name, not the positional index like we'd be expecting with an array that it's just claimed to us that it is.

Let's have a test for this:

it("handles arguments objects", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        },
        "four" = ["fourth", "quaternary", "wha", "tuawha"],
        "five" = [
            {"fifth" = "rima"},
            ["tuarima"],
            "quinary"
        ],
        "six" = ((sixth, senary) => arguments)("ono", "tuaono", [6])
    }
    expected = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two[second]" = "rua",
        "three[third][thrice]" = "toru",
        "four[0]" = "fourth",
        "four[1]" = "quaternary",
        "four[2]" = "wha",
        "four[3]" = "tuawha",
        "five[0][fifth]" = "rima",
        "five[1][0]" = "tuarima",
        "five[2]" = "quinary",
        "six[sixth]" = "ono",
        "six[senary]" = "tuaono",
        "six[2][0]" = 6
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

Note I'm using CFML's IIFE feature to call an inline function expression there.

I am not entirely happy with my fix for this, but I've gaffer-taped it up:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, actual, prefix="") => {
        var offsetKey = isArray(actual) && isNumeric(key) ? key - 1 : key
        var qualifiedKey = prefix.len() > 0 ? "[#offsetKey#]" : offsetKey

I'm just checking if the "key" is numeric before I decrement it.

There's perhaps one more case I should put in there. This will silently ignore anything other than simple values, arrays or structs. I had a mind to throw an exception at the end if we'd not already returned, but I didn't get around to it. This solved the need we had, so that's all good.

Hopefully you found another incremental TDD exercise useful, and also found the problem we were trying to solve an interesting one as well. I did.

Righto.

--
Adam

Thursday, 1 July 2021

One last one! CFML higher-order functions compared to tag-based code: reduceRight

G'day:

I forgot one!

I've already discussed map, reduce, filter, sort, some, every and each, operations; but recently reduceRight was added to CFML (well: at least in ColdFusion it was; it's not in Lucee yet) as well.

I have to start my day job in 16min, so this will be quick.

reduceRight is the same as reduce, except it starts from the end of the collection, not the beginning:

colours = ["Whero","Karaka","Kowhai","Kakariki","Kikorangi","Poropango","Papura"]

coloursAsList = colours.reduce((all="", colour) => all.listAppend(colour))
coloursAsReversedList = colours.reduceRight((all="", colour) => all.listAppend(colour))

writeOutput("coloursAsList: #coloursAsList#<br>coloursAsReversedList: #coloursAsReversedList#<br>")


coloursAsList: Whero,Karaka,Kowhai,Kakariki,Kikorangi,Poropango,Papura
coloursAsReversedList: Papura,Poropango,Kikorangi,Kakariki,Kowhai,Karaka,Whero

Yes yes Mingo; one would not use reduce to convert an array of strings to a list. That is beside the point. But thanks for letting me know Lucee (but not ColdFusion) has an Array.reverse method, which would be a better way to reverse the list order here: colours.reverse().toList().

And the tags version, just a reversed counting loop does the trick here:

<cfset coloursAsReversedList = "">
<cfloop index="i" from="#arrayLen(colours)#" to="1" step="-1">
    <cfset coloursAsReversedList = listAppend(coloursAsReversedList, colours[i])>
</cfloop>

That's it. four minutes to get to work. Fortunately that's just a matter of switching desktops…

Righto.

--
Adam

CFML higher-order functions compared to tag-based code: some, every and each functions

G'day:

I'm gonna try to round out this short series today: there's not much to say about the some, every and each methods in the context of comparing their functionality to old-school tag-based code. As a reminder, I've already covered map, reduce, filter and sort operations.

some

some iterates over the collection, calling a callback on each element, and will exit as soon as the callback returns true for an element. An example might be checking if at least some class members passed (or failed) their test:

examResults = [
    {person="Alex", mark=75},
    {person="Billie", mark=52},
    {person="Charlie", mark=41},
    {person="Daryl", mark=29},
    {person="Evan", mark=53}
]

somePassed = examResults.some((result) => result.mark >= 50)

writeOutput("Some of the class passed the test? #somePassed#<br><hr>")


someFailed = examResults.some((result) => {
    writeOutput("Called for #result.person#, #result.mark#<br>")
    return result.mark < 50
})

In the second example there I show a difference between this iteration function and the others we've encountered so far. All the others always iterate through the entire collection, however some and every do not. They exit as soon as they can answer the question. So as soon as some gets a true it exits; as soon as every gets a false it exits. The output of this is:


  Some of the class passed the test? true
Called for Alex, 75
Called for Billie, 52
Called for Charlie, 41

In this case it only got as far as the first false result from the callback (because, sadly, Charlie did not make the cut)

The tag-based version of this would be:

<cfset somePassed = false>
<cfloop array="#examResults#" item="result">
    <cfif result.mark GTE 50>
        <cfset somePassed = true>
        <cfbreak>
    </cfif>
</cfloop>
<cfoutput>Some of the class passed the test? #somePassed#<br><hr></cfoutput>

<cfset someFailed = false>
<cfloop array="#examResults#" item="result">
    <cfoutput>Called for #result.person#, #result.mark#<br></cfoutput>
    <cfif result.mark LT 50>
        <cfset someFailed = true>
        <cfbreak>
    </cfif>
</cfloop>

Again with the boilerplate code (ref from previous articles).

BTW, don't get carried away with these higher-order functions if there's another built-in function to do the job. Recently I checked if something was in an array by doing this:

colours = ["Whero","Karaka","Kowhai","Kakariki","Kikorangi","Poropango","Papura"]

containsGreen = colours.some((colour) => colour == "Kakariki")
writeOutput("It contains green: #containsGreen#<br>")

My boss gently pointed out I could just do this:

containsGreen = !!colours.find("Kakariki")

Use the simpler option where possible ;-)

every

every is the opposite of some: it exits as soon as the callback returns false. Our example here would be to check if everyone passed the exam:

everyonePassed = examResults.every((result) => {
    writeOutput("Called for #result.person#, #result.mark#<br>")
    return result.mark >= 50
})
writeOutput("Everyone passed the test? #everyonePassed#<br><hr>")


Called for Alex, 75
Called for Billie, 52
Called for Charlie, 41
Everyone passed the test? false

The tag-based equivalent is the usual "mostly boilerplate" thing:

<cfset everyonePassed = true>
<cfloop array="#examResults#" item="result">
    <cfoutput>Called for #result.person#, #result.mark#<br></cfoutput>
    <cfset personPassed =  result.mark GTE 50>
    <cfif NOT personPassed>
        <cfset everyonePassed = false>
        <cfbreak>
    </cfif>
</cfloop>
<cfoutput>Everyone passed the test? #everyonePassed#<br><hr></cfoutput>

each

Sometimes it's not a data transformation that one needs when iterating over a collection. If none of the other options do the trick, there's the generic each method:

examResults.each((result) => {
    writeOutput("Name: #result.person#, mark: #result.mark#<br>")
})

As a general rule never start solving an iteration task with each. Consider if one of the other more situation-specific methods are a better fit. It's seldom that each is the right answer.

And the tag equivalent is pretty much the same, because - really - all the tag version does is "each"; it's down to the inner code block to distinguish between the various iteration possibilities:

<cfloop array="#examResults#" item="result">
    <cfoutput>Name: #result.person#, mark: #result.mark#<br></cfoutput>
</cfloop>

OK that's it. Tag-based CFML versions of the more situation-descriptive and less boilerplate iteration higher-order functions. If you need anything else about them explained, let me know.

Righto.

--
Adam

Wednesday, 30 June 2021

CFML higher-order functions compared to tag-based code: sort function

G'day:

OK so you've probably got the gist of things with these articles, with my previous treatments of comparing "modern" to "old school" with map, reduce, filter operations. On to sorting now.

I think this is going to involve some awful code.

I don't think I need to explain why we might need to sort a collection, or what "sorting" is. It's really easy using higher-order functions. The need to write the sorting algorithm has been removed, and only a function to compare to elements needs to be provided:

months = [
    {id=1, miSequence=8, mi="Kohi-tātea", anglicised="Hānuere", en="January"}, 
    {id=2, miSequence=9, mi="Hui-tanguru", anglicised="Pēpuere", en="February"}, 
    {id=3, miSequence=10, mi="Poutū-te-rangi", anglicised="Maehe", en="March"}, 
    {id=4, miSequence=11, mi="Paenga-whāwhā", anglicised="Āperira", en="April"}, 
    {id=5, miSequence=12, mi="Haratua", anglicised="Mei", en="May"}, 
    {id=6, miSequence=1, mi="Pipiri", anglicised="Hune", en="June"}, 
    {id=7, miSequence=2, mi="Hōngongoi", anglicised="Hūrae", en="July"}, 
    {id=8, miSequence=3, mi="Here-turi-kōkā", anglicised="Akuhata", en="August"}, 
    {id=9, miSequence=4, mi="Mahuru", anglicised="Hepetema", en="September"}, 
    {id=10, miSequence=5, mi="Whiringa-ā-nuku", anglicised="Oketopa", en="October"}, 
    {id=11, miSequence=6, mi="Whiringa-ā-rangi", anglicised="Noema", en="November"}, 
    {id=12, miSequence=7, mi="Hakihea", anglicised="Tihema", en="December"}
]
monthsInMaoriOrder = duplicate(months).sort((e1, e2) => e1.miSequence - e2.miSequence)

writeDump(monthsInMaoriOrder)

Here I have a list of the months of the year, ordered according to the Gregorian calendar. The Maori calendar has the same ordering, but the year starts around when the Gregorian calendar considers June. So the exercise here is to re-order the array to respect that ordering. The code for the sorting is just the comparator function.

One thing to note here is that despite appearances given we're assigning the return value of the sorting operation to a new variable, the original array is modified when you call sort on it. I think this is less than ideal, but it's the way it works on both ColdFusion and Lucee. If you want you're original array left alone, then duplicate it first like I have here.

If we're going old school procedural: it's a bit of a nightmare. We need to write our own sorting implementation. Well: we grab one from cflib.org anyhow. But even then, the original leverages a callback function, so I've modified this to be truly procedural and have that embedded in the implementation.

<cffunction name="monthsSortedByMaoriSequence" returntype="array" output="false">
    <cfargument name="arrayToCompare" type="array" required="true">

    <cfset var lesserArray = arrayNew(1)>
    <cfset var greaterArray = arrayNew(1)>
    <cfset var pivotArray = arrayNew(1)>
    <cfset var examine = 2>
    <cfset var comparison = 0>
    <cfset pivotArray[1] = arrayToCompare[1]>

    <cfif  arrayLen(arrayToCompare) LT 2>
        <cfreturn arrayToCompare>
    </cfif>

    <cfset arrayDeleteAt(arrayToCompare, 1)>
    <cfloop array="#arrayToCompare#" item="element">
        <cfset comparison = element.miSequence - pivotArray[1].miSequence>

        <cfswitch expression="#sgn(comparison)#">
            <cfcase value="-1">
                <cfset arrayAppend(lesserArray, element)>
            </cfcase>
            <cfcase value="0">
                <cfset arrayAppend(pivotArray, element)>
            </cfcase>
            <cfcase value="1">
                <cfset arrayAppend(greaterArray, element)>
            </cfcase>
        </cfswitch>
    </cfloop>

    <cfif arrayLen(lesserArray)>
        <cfset lesserArray = monthsSortedByMaoriSequence(lesserArray)>
    <cfelse>
        <cfset lesserArray = arrayNew(1)>
    </cfif>

    <cfif arrayLen(greaterArray)>
        <cfset greaterArray = monthsSortedByMaoriSequence(greaterArray)>
    <cfelse>
        <cfset greaterArray = arrayNew(1)>
    </cfif>

    <cfset arrayAppend(lesserArray, pivotArray, true)>
    <cfset arrayAppend(lesserArray, greaterArray, true)>

    <cfreturn lesserArray>
</cffunction>
<cfset sorted = monthsSortedByMaoriSequence(months)>

It's hard to see the bit that the modern implementation needs, but it's buried here.

Note: to an clever clogs who spot the odd shortcoming in that implementation of quicksort: you're missing the point of the article, and also yer talking to the wrong person because I didn't write it. But - yes yes - you're very clever.

The point is: that's awful. Writing old-school tag-based procedural code one needs to re-implement (and re-test!) the sorting function every time you need one. This is an extreme example and only a lunatic would not use the callback approach even with tag based code:

<cffunction name="comparator">
    <cfargument name="e1">
    <cfargument name="e2">
    <cfreturn e1.miSequence - e2.miSequence>
</cffunction>

<cfset sorted = duplicate(months)>
<cfset arraySort(sorted, comparator)>

But still: it's just better to get with the programme (or the decade) and use the modern version for this.

Righto.

--
Adam

Tuesday, 29 June 2021

CFML higher-order functions compared to tag-based code: filter function

G'day

This one will be pretty short I think. It's the next effort in going over how these higher-order functions work compared to writing procedural code in CFML tags. I've previous covered map and reduce. There's less intricacy to filter, so I won't have so much to say.

Yesterday I showed an example of how not to remove records from a collection using reduce

numbers = [1,2,3,4,5,6,7,8,9,10]
evens = numbers.reduce((evens=[], number) => number MOD 2 ? evens : evens.append(number))

This works, but it's not how one ought to do it. It's putting a square peg in a round hole, and it's gonna cause a small amount of FUD when someone comes back to review the code later ("why are they using reduce here? What am I missing?"). So… use the correct tool for the job. The idiomatic way to filter our elements from a collection is with a filter operation. Here's the equivalent operation using filter:

evens = numbers.filter((number) => number MOD 2 == 0)

Filter's callback receive the value of the collection element (and additionally its index/key, as well as the whole collection as additional parameters, if you need to use those too). If the logic in the callback returns true? The element is preserved in the result collection. if it's false? It's filtered out. That's it. The callback logic can be a one-liner like it is here, or as convoluted as it needs to be. As long as it boils down to a true or a false, you'll get your filtered collection. As with the other collection higher-order functions: it does not change the original collection; it returns a new one.

The tag-based equivalent is simple:

<cfset evens = []>
<cfloop array="#numbers#" item="number">
    <cfif number MOD 2 EQ 0>
        <cfset arrayAppend(evens, number)>
    </cfif>
</cfloop>

Just slightly more verbose, and it's mostly boilerplate.

The concept here is simple, and the object of the exercise for these articles is to just show the difference between using the higher-order functions and using a procedural approach with tags, and that's pretty much it.

Righto.

--
Adam

Monday, 28 June 2021

CFML higher-order functions compared to tag-based code: reduce function

G'day:

Here's the next effort in going over how these higher-order functions work compared to writing procedural code in CFML tags. The previous one was "CFML higher-order functions compared to tag-based code: map function". Today I'm looking at the reduce method. As per yesterday, I've discussed this before in ColdFusion 11: .map() and .reduce().

So what does reduce to? It helps if we compare it to map. Remember how I said this yesterday:

A mapping operation takes one collection and remaps the values for each key into a different value. The keys and the overall size and order (if it has a sense of order) of the collection is preserved. Also the original collection is not altered; an entirely new collection is returned.

A reduce operation is used to return a different data structure. It doesn't mean "reduce" in the sense of "make smaller"; the resultant data structure might be "bigger" (for some definition of bigger). Or it might be the same length, but a different type.

An example of returning the same length but different type would be similar to yesterday's example of mapping an array of records to an array of objects:

records = [
    {id=1, mi="whero", en="red"},
    {id=2, mi="kakariki", en="green"},
    {id=3, mi="kikorangi", en="blue"}
]
objects = records.map((record) => new Colour(record.id, record.mi, record.en))

A more likely scenario in CFML is for the records to be a query. But one still wants to pass an array of objects back from the storage tier to the application, so we use reduce to make the type conversion:

records = queryNew(
    "id,mi,en",
    "integer,varchar,varchar",
    [
        [1, "whero", "red"],
        [2, "kakariki", "green"],
        [3, "kikorangi", "blue"]
    ]
)
objects = records.reduce((objects=[], record) => objects.append(new Colour(record.id, record.mi, record.en)))

Note the way reduce works. The first argument is an "accumulator" that is passed into every iteration, and is ultimately returned to the calling code. One builds the return value iteration at a time into that. Here I'm appending to the array of objects each iteration. Whatever is returned from each iteration is the first argument of the next iteration. So as I iterate over the query, I start with an empty array. I append the first object to it, and that one-element array is then passed into the accumulator of the second call to the callback in the next iteration; and so on for all iterations so ultimately I have an array that I've appended three objects to. Some pseudo-code might make this more clear. Let's consider the iterations as they progress:

1: objects argument=[]; append Red; return value=[Red]
2: objects argument=[Red]; append Green; return value=[Red, Green]
3: objects argument=[Red, Green]; append Blue; return value=[Red, Green, Blue]
result: [Red, Green, Blue]

We start empty, we append red, we append green, we append blue.

After that first argument, the subsequent arguments follow the same pattern as with map: the second argument is a row of the query (passed as a struct). The callback can also receive the current index / key (or currentRow equivalent to a query loop in this case), and the last argument is the entire query. I don't need these here, so do not mention them in the callback's function signature.

The tag version of this is actually round about the same amount of code (109 bytes vs 112 bytes it seems):

<cfset objects = []>
<cfloop query="records">
    <cfset arrayAppend(objects, new Colour(id, mi, en))>
</cfloop>

Another case is shown here:

transactions = [
    {id=1, amount=.1},
    {id=2, amount=2.2},
    {id=3, amount=33.3},
    {id=4, amount=44.44}
]

sum = transactions.reduce((sum=0, transaction) => sum += transaction.amount)

We're summing the transactions. We are reducing the collection to a single value, I guess.

Oh one thing maybe work making very clear: it's complete coincidence that the final variable is called sum, and the accumulator parameter is called sum. They don't need to be, it just makes sense to me to match them up given we're kinda building the end result in that accumulator argument, and accordingly it's going to be the same sort of values, so makes sense it's called the same thing.

The tag-based version for this is simple again:

<cfset sum = 0>
<cfloop array="#transactions#" item="transaction">
    <cfset sum = sum += transaction.amount>
</cfloop>

Another more complicated example of script-vs-tags when reducing is in yesterday's article "CFML: tag-based versions of some script-based code". There I am reducing a query to a struct, then reducing that struct into another query. Both CFScript and tag versions of the code are in that.

One thing to not use reduce for is to actually reduce the size of a collection by removing records from it, eg:

numbers = [1,2,3,4,5,6,7,8,9,10]
evens = numbers.reduce((evens=[], number) => number MOD 2 ? evens : evens.append(number))

One would not use reduce for that. One would use filter. I guess I'll get to that tomorrow.

Righto.

--
Adam

CFML higher-order functions compared to tag-based code: map function

G'day:

As I mentioned yesterday ("CFML: tag-based versions of some script-based code") I've been asked by a couple of people to show the tag-based version of the script-based CFML code. This has ben particularly in reference to my typical approach of using higher-order functions to perform data transformation operations on iterable objects (eg: arrays, structs, lists, etc). Here I will briefly do that for some examples of using mapping functions. The process is the same each time, so I'll not dwell on it too much.

I have already written about the nuts and bolts of mapping higher-order functions in CFML back in 2014 in "ColdFusion 11: .map() and .reduce()". I also looked at how to implement arrayMap in older versions of CFML: "arrayMap(): a reverse CFML history".

In short, these collection-iteration higher order functions work on the premise that most looping operations exist solely to perform data transformation, and it makes sense to encapsulate that into a function, rather than having to hand-crank it. Obviously every data transformation is specific to its circumstance, so the collection-iteration functions take a callback as an argument (thus making them higher-order functions), where the callback defines the data transformation operation. Taking this approach makes the code clearer as to what the intent of the transformation is, and also encapsuates the implementation in its own functions, so its variables are all well encapsulated and don't impact the rest of the calling code. It's just a tider way of doing data transformation.

A mapping operation takes one collection and remaps the values for each key into a different value. The keys and the overall size and order (if it has a sense of order) of the collection is preserved. Also the original collection is not altered; an entirely new collection is returned.

That's enough of an explanation. This article is about comparing code styles. Here goes.

keys = ["ONE", "TWO", "THREE", "FOUR"]

translationLookup = {
    "ONE" = {mi = "tahi", jp = "一"},
    "TWO" = {mi = "rua", jp = "二"},
    "THREE" = {mi = "toru", jp = "三"},
    "FOUR" = {mi = "wha", jp = "四"}
}


maori = keys.map((key) => translationLookup[key].mi)

writeDump(maori)

Here we have a one-liner that takes an array of translation keys and maps them to their actual translations.

Equivalent tag-based code is a bit more effort. We need to hand-crank our array construction:

<cfset japanese = []>
<cfloop array="#keys#" item="key">
    <cfset arrayAppend(japanese, translationLookup[key].jp)>
</cfloop>
<cfdump var="#japanese#">

In the next example I am being less literal about the "key mapping" idea, in case one got a sense that that sort of thing is inate to a mapping operation. I'm doubling each element in the array:

values = [1, 22, 333, 4444]
doubled = values.map((n) => n*2)
writeDump(doubled)

And the tags version (although here I'm halvig the values, for the hell of it). Same as the previous exercise really: just a wee bit clunkier than using the dedicated mapping function:

<cfset halved = []>
<cfloop array="#values#" item="value">
    <cfset arrayAppend(halved, value / 2)>
</cfloop>
<cfdump var="#halved#">

A more real-world example would be when yer getting an array of raw data values back from some sort of data-retrieval operation, and you want to properly model those as objects before returning them to your business logic:

records = [
    {id=1, mi="whero", en="red"},
    {id=2, mi="kakariki", en="green"},
    {id=3, mi="kikorangi", en="blue"}
]
objects = records.map((record) => new Colour(record.id, record.mi, record.en))

vs:

<cfset objects = []>
<cfloop array="#records#" item="record">
    <cfset arrayAppend(objects, new Colour(record.id, record.mi, record.en))>
</cfloop>
<cfdump var="#objects#">

You get the idea.

To show how strings can be remapped too, I knocked-together a quick example of String.map, but then remembered Lucee does not support String.map yet, so needed to use a list instead:

s = "The Quick Brown Fox Jumps Over The Lazy Dog"

a = asc("a")
z = asc("z")

rot13 = s.listToArray("").map((c) => {
    var checkCode = asc(lcase(c))

    if (checkCode < a || checkCode > z) {
        return c
    }
    var offset = (checkCode + 13) <= z ? 13 : -13

    return chr(asc(c) + offset)
}).toList("")
writeOutput(rot13)

And I tested this by feeding the result back into a tag-based version of the operation, to make sure it returned to the original string:

<cfset a = asc("a")>
<cfset z = asc("z")>

<cfset s2 = "">
<cfloop array="#listToArray(rot13, "")#" item="c">
    <cfset checkCode = asc(lcase(c))>

    <cfif checkCode LT a OR checkCode GT z>
        <cfset s2 &= c>
        <cfcontinue>
    </cfif>
    <cfset offset = 13>
    <cfif checkCode + 13 GT z>
        <cfset offset = -13>
    </cfif>
    <cfset s2 &= chr(asc(c) + offset)>
</cfloop>
<cfoutput>#s2#</cfoutput>

All in all using the specific iteration function is slightly clearer as to what sort of transformation is taking place, plus it saves you from having to write the looping and assignment scaffolding that a tags-based / hand-cranked version might. Often remappings are one-liners, and it's just more readable to do it as a simple assignment epression than having to hand-crank the boilerplate looping code.

The code for this article is all munged together in public/nonWheelsTests/higherOrderFunctionsDemonstration.

I'll have a look at how reduce operations work, tomorrow.

Righto.

--
Adam

About

I've been a web developer since 2001: 13yrs as a CFML developer; 6yrs as a PHP dev; and now back adjacent to the CFML community but focusing on code quality, design and testing. As of 2023, I am migrating my dev focus back to PHP again.

The code I write and discuss here is pretty much just looking at random conundrums I encounter in my day job.

I tend to be a bit "forthright" in my opinions, I am indelicate, and I tend to swear too much. This will come out occasionally here: I make no apology for it.

Everything said here is my own opinion. Feel free to disagree with me :-)

Friday, 9 September 2022

The Nothing and Unit types

Nothing

Unit

groupBy higher-order function

Wednesday, 7 July 2021

Tuesday, 6 July 2021

Thursday, 1 July 2021

some

every

each

Wednesday, 30 June 2021

Tuesday, 29 June 2021

Monday, 28 June 2021

`groupBy` higher-order function