Friday, 18 June 2021

CFML: emulating query-of-query group-by with higher-order functions

G'day:

A week or so ago this question came up on the CFML slack channel:

Does QoQ support month() and year() such as in the following?

queryExecute("
      SELECT MONTH(settlementdate) AS month, year(settlementdate) AS year,
      SUM(LongTermGainLoss) AS ltgl, SUM(ShortTermGainLoss) AS stgl
      FROM data
      group by MONTH(settlementdate), year(settlementdate) 
    ",{},{
      dbtype: 'query'
})

The answer to the question is: Lucee supports MONTH and YEAR SQL functions in their QoQ SQL implementation, but ColdFusion does not. Here's a quick repro:

dates = queryNew(
    "date",
    "date",
    [
        [createDate(2011, 3, 24)],
        [createDate(2016, 8, 17)]
    ]
)

dateParts = queryExecute(
    "SELECT MONTH(date) AS month, YEAR(date) AS year FROM dates",
    {},
    {dbtype="query"}
)

writeDump(dateParts)

This works as expected on Lucee, but on ColdFusion errors with "Encountered "MONTH. Incorrect Select List, Incorrect select column".

The question intrigued me, and I thought… I reckon one could do something interesting with higher order functions to do what John needed here.

Firstly I needed some test data, so I banged out a quick function to do so:

function getUngroupedRecords(required numeric rows) {
    createRows = number => arrayNew(1).resize(number) // doesn't work in Lucee https://luceeserver.atlassian.net/browse/LDEV-2417
    date = () => now().add("d", randRange(-365, 365))
    amount = () => randRange(1,10000) / 100

    fakedDbData = queryNew(
        "settlementDate,LongTermGainLoss,ShortTermGainLoss",
        "Date,Double,Double",
        createRows(rows).map(_ => [date(), amount(), amount()])
    )
    return fakedDbData
}
ungroupedRecords = getUngroupedRecords(20)

NB: Lucee requires parentheses around even a single parameter in an arrow function, which is shouldn't so this breaks where indicated. This is a ColdFusion-only exercise here, so I'm running with a ColdFusion-only solution as the code is a bit nicer than if I make it work with Lucee too.

I thought there must be some better way of creating an array of a specific length just natively in CFML but I couldn't think of one, hence the arrayNew(1).resize(number) hack. It doesn't matter what's in the array, I just need that number of elements to re-map with my data.

Right so I have some data that possibly resembles the sort of thing John had.

From that I reduce that query to a struct keyed on "#year#-#month#", and for each row of the query I accumulate the gain/loss figures into their appropriate struct key entry:

ungroupedRecords.reduce((grouped={}, row) => {
    y = row.settlementDate.year()
    m = row.settlementDate.month()
    key = "#y#-#m#"
    grouped[key] = grouped[key] ?: {stgl = 0, ltgl = 0}
    grouped[key].stgl = grouped[key].stgl + row.ShortTermGainLoss
    grouped[key].ltgl = grouped[key].ltgl + row.LongTermGainLoss

    return grouped
})

At this point we have a "grouped-by" struct, but we need a query. So we reduce the struct back the other way now, to a query with the rows John spec'ed out in his original question:

.reduce( // breaks in Lucee https://luceeserver.atlassian.net/browse/LDEV-2523
    (records, key, values) => records.addRow({
        month = key.listLast("-"),
        year = key.listFirst("-"),
        ltgl = values.ltgl,
        stgl = values.stgl
    }),
    queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")
)

The only complexity there is we need to expand-out the struct key to be individual values again, cos they year and month have their own columns in the query.

Oh and as mentioned Lucee breaks on this. It has a bug in that when an array entry is null, it skips it when performing reduce (or map, etc) operation. So the end result is that it considers the result of the first reduce to be null, so this second one breaks cos one cannot reduce null.

Finally for good measure I sort the results by date:

.sort((r1, r2) => {
    yearDiff = r1.year - r2.year
    if (yearDiff != 0) {
        return yearDiff
    }
    return r1.month - r2.month
})

Initially I tried to be a smart-arse and do all that in one expression:

.sort((r1, r2) => (yearDiff = r1.year - r2.year) ? yearDiff : r1.month - r2.month)

And whilst Lucee was OK with that, ColdFusion could not make sense of the inline assignment expression there. I was just doing that so I only wanted r1.year - r2.year evaluated once, but I needed it in two places.

All of that is just the one expression in the end, when taken together:

groupedRecords = ungroupedRecords.reduce((grouped={}, row) => {
    y = row.settlementDate.year()
    m = row.settlementDate.month()
    key = "#y#-#m#"
    grouped[key] = grouped[key] ?: {stgl = 0, ltgl = 0}
    grouped[key].stgl = grouped[key].stgl + row.ShortTermGainLoss
    grouped[key].ltgl = grouped[key].ltgl + row.LongTermGainLoss

    return grouped
}).reduce( // breaks in Lucee https://luceeserver.atlassian.net/browse/LDEV-2523
    (records, key, values) => records.addRow({
        month = key.listLast("-"),
        year = key.listFirst("-"),
        ltgl = values.ltgl,
        stgl = values.stgl
    }),
    queryNew("month,year,ltgl,stgl", "Integer,Integer,Double,Double")
).sort((r1, r2) => {
    yearDiff = r1.year - r2.year
    if (yearDiff != 0) {
        return yearDiff
    }
    return r1.month - r2.month
})

The final output is along these lines:

I have to admit I did not TDD this work as I just wrote it in trycf.com, so I can't use TestBox, and can only write scripts on that anyhow. I did benchcheck the results by hand, and they're correct as far as I can tell.

I'm not sure how useful it is to have gone through this experience, but it was kinda fun I guess.

Righto.

--
Adam

Monday, 14 June 2021

Using TDD when adding new code to existing untestable code

G'day:

Just a note before I start. This article is based on a real-world situation for a company I used to work at. Because I'm basing this on proprietary information, I am not prepared (or permitted!) to share that. As such I have obfuscated some elements of the narrative, and have changed sample code around to be less specific to the industry concerned. The code is still line-for-line equivalent to the original code though with just names being changed. Nothing has been enhanced or simplified to make it artificially easier to solve the problem we really did face.

This challenge is best contextualised with a real-world situation I encountered. We had a function that was kind of a perversion of the concept of the Builder Pattern, and is pretty much a model example of how not to do it, and an example of what the Builder Pattern is designed to address.

Here's the method, shrunk down so you can get a feel for the scale of the problem, but without needing to see the reallyreally code. The code is too blurry to read, but with the syntax highlighting, you can get an idea of what's going on. And the method is… 387 lines long. Ridiculous:

I have indicated where we need to make out code change. This method (buildResponse) is building a reasonably complex Account object and wrapping it up in a Response object. The Account class itself is poorly designed and has far too many properties. And we need to add one more (don't @ me, it was not my call, this was another team's work, and I was only helping the salvage operation).

None of the existing code in the method had any tests, and the entire class had not been written with testing in mind: it had a raft of inline static method calls to external dependencies, and created other dependency objects inline. The team looking at the work had decided the code was untestable, and were about to continue adding to it, still without tests.

In a better world, even if the method had grown to be close to 400 lines long, as it evolved its tests would have evolved as well, and we'd be starting with all its current logic covered, and it would be a simple matter of adding in one new trivial case, and then adding in our one new bit of trivial code. But… that is not the reality we are living in.

Let's look at the code more closely. The code at the bottom of the method is along these lines:

    // ...
    $account->partnerInfo = $infoModel->jsonSerialize();
    $response = (object) ['account' => $account];
    return new Response($response);
}

Right before we create and return the $response object, we needed to add an apiToken property into the $account object.

The problem is that for a test to get to that last line of code, our test is going to have to execute the previous 400-odd lines of code too. And that code hits the database and a coupla different web services to source its data. Plus a couple of the DB accesses were writes to the database. We can't be doing that in a unit test, and we can't mock-out the DB and web service calls either due to how the code was written. Untestable, right? That was the conclusion the dev team had reached.

Is this code really untestable though? Can we really not use TDD to develop this change? Not at all. Deriving the test case is super easy. Barely even an inconvenience. We even already know what it is: "it must have an apiToken property sourced from the API service". So we know that. On that basis, we can even write the test for it.

/**
* @testDox it must have an apiToken property sourced from the API service
* @covers ::setApiTokenInAccount
*/
public function testBuiltAccountContainsAnApiToken()
{
    $anyUuid = $this->matchesRegularExpression($self::PATTERN_FOR_UUID);
    $anyTime = $this->matchesRegularExpression($self::PATTERN_FOR_TIMESTAMP);
    $apiConnector = $this->createMock(ApiConnector::class);
    $apiConnector
        ->expects($this->once())
        ->method('createApiToken')
        ->with($anyUuid, $anyTime)
        ->willReturn('A_VALID_TOKEN');
    ReflectionHelper::setProperty($this->accountService, 'apiConnector', apiConnector);
    $account = $this->createMock(Account::class);
    ReflectionHelper::invokeMethod($this->accountService, 'setApiTokenInAccount', [$account]);
    $this->assertSame('A_VALID_TOKEN', $account->apiToken);
}

Here it's $this->accountService that is our system under test, and it's where that huge buildResponse method is. ReflectionHelper is just a utility class that implements that ubiquitous code everyone has that uses reflection to set non-public properties, call non-public methods etc. Fortunately apiConnector is not created inline within buildResponse, it's been set into a property earlier, so we can replace it with a mock. This makes our life easier here.

We're going to extract the logic that gets the token out into its own private method - so we're only adding the minimum one line of new code into buildResponse - and we test that method directly, rather than calling buildResponse which we absolutely cannot do. Normally testing private methods is a nono. I also realise there's a strong whiff of "testing the implementation not the behaviour" going on here, which is also a TDD anti-pattern, and in the perfect world we would not do this. But we are about 370 lines of code away from the perfect world here, so we need to make do.

We have a test case that defines and tests the functionality we're expecting, and we will have an implementation to test:

private function setApiTokenInAccount(&$account)
{
    // some logic around the UUID and time argument values elided because it's not relevant here
    $apiToken = $this->apiConnector->createApiToken($uuid, $timestamp);
    $account->apiToken = $apiToken;
}

We then change buildResponse to call this:

    // ...
    $account->partnerInfo = $infoModel->jsonSerialize();
    $this->setApiTokenInAccount($account); // <----- new code
    $response = (object) ['account' => $account];

    return new Response($response);
}

We've still TDDed our solution, and we now have a bit of testing on the requirements of buildResponse even if we're not calling that method. We have to accept that we have added a line of code to buildResponse and it's now 388 lines long, and that addition isn't tested, but we cannot help that. This is a good case of "perfect is the enemy of good".

Now that we have discovered this mess, we can line-up a technical debt ticket to tidy-up buildResponse. It's a critical method, and it should not have been left to rot like this.


As it happens, we were needing to do more work on that method anyhow, and when planning out the story we built in effort to refactor it.

I need to blur the code because I can't think of suitable renamings for all the activity going on here and I can't use the original code, but you will be able to tell how much it's improved:

From 388 lines of code to 25, with most of the 25 just being one in a sequence of steps, which makes for really easy understanding of what's going on for any dev looking at the code in future. This is how a method should read.

All we did to get to here is to identify the 14 different sections of functionality in the original function and use PHPStorm's "Refactor > Extract Method" facility. We could then write tests for some of those bits of functionality, where there was no issues with inline dependencies being created. Some of them still didn't have tests because there were still untestable, but we undertook to deal with that later when we come to need to maintain them. We could also test buildResponse, because now it's basically just a recipe of steps (private methods) to follow, we could mock-out the steps, and just test that the data flow through the recipe was what was intended.

The testing for the overall method also overcomes the "testing the implementation not the behaviour" smell I mentioned before. The tests for it were testing the outer surface of the unit, and checking the results - what the response object ended up being given each of the variation of inputs - rather than how they were derived. There are still the individual tests on the private methods as well, and that's a bit "testing the implementation", but none of the tests will be impacted by changes elsewhere in the codebase, and as those bits of implementation change, then the overall tests can be updated to shift the cases to there, and retire the implementation-specific tests. When dealing with legacy code and limited time resources, backfilling test coverage and confidence is always a journey and not a destination. So as long was we've moving in the correct direction, I believe it's OK to not do things to the letter of the dogma, sometimes.

It was easy (and pretty fulfilling, I will say) work, and it's made the code a helluva lot better for the next person who comes along and needs to work on it.

In conclusion we don't need to think "oh god this is untestable", and continue making things worse by adding more untested code. The code probably is testable, just maybe in a partial fashion, and maybe also not in an ideal fashion. But we can at least start.

Righto.

--
Adam

Sunday, 13 June 2021

CFML: messing around with mixins (part 3)

G'day:

In the preceding two articles in this series (1 and 2) I looked at how to implement mixins in CFML, both at compile time and at runtime, with an increasing amount of functionality / complexity. My closing conclusion was that the runtime approach I was taking was a wee bit too magic and opaque to be something one should do; but at least it's interesting to look at the issues & techniques involved. Today I'm going to try to come up with a less opaque solution.

Before I go any further I will repeat what I've said in the previous two articles: I would never actually recommend using mixins at all. They just seem like a hack to me: an anti-pattern and poor design. As a concept they're just either a poor-person's inheritance, or a poor-person's dependency-injection technique. Just use the real things.

What do I mean by my previous work was a bit opaque? I'll repeat the example from the last article.

component {

    function getObjects(orderBy="id") {
        return getObjectsFromStorage(orderBy)
    }
}

Externally, I am mixing in that getObjectsFromStorage method at runtime. But how does the person looking at that code in isolation know this? They don't. They need to just already know that there might (or might not!) be some external shenanigans going on to avail that method at runtime. This is rubbish. The code in any given class should be self-contained, and be capable of being executed provided one fulfils the contract the code specifies. One should not need to know what context it's being used in to understand how it works: that's an anti-pattern. Contrast with this:

component {

    function init(required Repository repository) {
        variables.repository = repository
    }

    function getObjects(orderBy="id") {
        return variables.repository.getObjectsFromStorage(orderBy)
    }
}

Someone looking at that code is not surprised at all: they can tell what's going on. One should never write code that seems like the former example; go for the latter. Don't. Put. Surprises. In. Your. Code.

Today I'm gonna try to effect some hybrid of using clear dependency injection to avail my object of external resources, but then using my earlier mixing-in strategy to expose them to the code within the class as "local" functions. So basically still being able to use getObjectsFromStorage rather than qualify it with variables.repository.getObjectsFromStorage.

But… but … why, Adam?

Good question. Note I would never ever do any of this in my own code. This exercise cropped up from a conversation I was having about some framework code, and how it seems to revel in having an impenetrable (and pretty naive) implementation. It just injects methods all over the place - with no consideration as to whether they belong in the object they are being injected into, I might add - and it's all held together with some level-one-quality magic. I do not mean that in a good way, in case you can't tell. It's currently being overhauled a bit for its next version, and I'm going through this exercise to see if I can offer some suggestions as to how to make it seem less magical, and more coherent. This is part of that.

Did I mention I would never ever ever do any of this sort of thing in my own code? Good. Glad we cleared that one up.

The "interesting" thing about this article is that I have not got a single line of code yet - I mean as I type this sentence - and only a vague idea of how I'm going to solve this (and I'll be copy and pasting code from the previous article). I have some goals in place though, by way of the test cases from the previous articles. I've cannibalised those to a subset of cases I need to address for this exercise.

So that's a place to start. The cases might change as I go, but I'll start with the first one and work from there.

It tests unmodified setup

As some context, for this first case we are going to already assume some existing code. This part is not breaking TDD, but to wire things into objects, I need to have those objects' classes first. So I've got this lot:

//MyModel.cfc
import cfmlInDocker.miscellaneous.mixins.runtime.advanced.MyRepository

component {

    public MyModel function init(required MyRepository repository) {
        variables.repository = repository
        
        return this
    }

    public Colour[] function getObjectsViaInjectedRepository() {
        return variables.repository.getAllObjects()
    }
}
// MyRepository.cfc
component {

    public MyRepository function init(MyDao dao) {
        variables.dao = dao
        
        return this
    }

    public Colour[] function getAllObjects() {
        return dao.getRecords("id", -1).reduce((array=[], record) => array.append(new Colour(record.mi)))
    }
}
//MyDao.cfc
component {

    public query function getRecords(required string orderBy, required numeric maxrows) {
        records = queryNew("id,en,mi", "integer,varchar,varchar", [
            [1,"red","whero"],
            [2,"green","kakariki"],
            [3,"blue","kikorangi"]
        ])
        return queryExecute(
            "SELECT * FROM records ORDER BY #orderBy#",
            {},
            {dbtype="query", maxrows=maxrows}
        )
    }
}

And I couldn't help but pop a test in for this as I was tweaking some of the stuff I was porting over from the previous article's examples:

//FacadeTest.cfc

import testbox.system.BaseSpec
import cfmlInDocker.miscellaneous.mixins.runtime.facade.*

component extends=BaseSpec {

    function run() {
        describe("Testing facade proof of concept", () => {
            it("tests unmodified setup", () => {
                model = new MyModel(new MyRepository(new MyDao()))

                results = model.getObjectsViaInjectedRepository()

                expect(results).toBe([
                    new Colour("whero"),
                    new Colour("kakariki"),
                    new Colour("kikorangi")
                ])
            })
        })
    }
}

Basically we have a model representing Colour collections, where the colours are the Maori terms for various colours. All somewhat contrived, I know.

It creates facades for methods

From a TDD perspective, my intended approach here "interests" me. I already know the end result I want, and I pretty much already have the code I need to use. I'll just be using it in a different way from before. My inclination here is to write (read: copy and paste) a lot of code first up to get the mechanics in place to do any of the mixing-in, and at that point each case is just reimplementing each step from the last article as I support more and more features in the mixinMap. Writing a lot of code for a single case is a smell to me. But is this new code? Is it kind of a refactoring instead, so I just need to make sure my existing cases are all still accounted for? I think I'm doing TDD wrong here, but will will just crack on with it, and pay attention to what I'm doing in case I can work out a better strategy as I go. And as always I'm open to advice as to how I should have been doing something.

Cracking on with it, here's my test:

it("creates facades for methods", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getObjectsViaFacade()

    expect(results).toBe([
        new Colour("whero"),
        new Colour("kakariki"),
        new Colour("kikorangi")
    ])
})

I'm passing-in an optional dependency here: a FacadeMapper. I'm also calling a new method I've added: MyModel.getObjectsViaFacade: each test is going to need a method to call to test the case in question).

FacadeMapper's functionality is lifted largely from DependencyInjectionImplementor in the previous article:

component {

    public void function setFacadesFromMap(required Component target, required Component source, required struct map) {
        target.__getVariables = getVariables
        scopes = {
            public = target,
            private = target.__getVariables()
        }
        structDelete(target, "__getVariables")

        map.each((sourceMethod, mapping) => {
            targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod

            requestedAccess = mapping.keyExists("access") ? mapping.access : "private"
            targetAccess = requestedAccess == "public" ? "public" : "private"

            bind = mapping.keyExists("bind") && isBoolean(mapping.bind) ? mapping.bind : true

            proxy = bind
                ? () => source[sourceMethod](argumentCollection=arguments)
                : source[sourceMethod]

            scopes[targetAccess][targetMethod] = proxy
        })
    }

    private struct function getVariables(){
        return variables
    }
}

I explain it all throughly there, so go have a read. But basically it grabs references to a source object's methods, wraps them in a proxy, and injects them into the target.

And the relevant new code in MyModel to call this is here:

component {

    public MyModel function init(required MyRepository repository, Component facadeMapper) {
        variables.repository = repository

        if (arguments.keyExists("facadeMapper")) {
            mapFacades(facadeMapper)
        }

        return this
    }

    private void function mapFacades(facadeMapper) {
        facadeMapper.setFacadesFromMap(
            this,
            variables.repository,
            {
                "getAllObjects" = {}
            }
        )
    }
    
    // ...    
}

Unlike in the previous article's approach, where the wiring of the mixins was done completely outside of the target object, so no-one would ever have any idea what's going on with the code and why it magically works; here it's explicit. We list the methods we are creating facades for (only one just for now). So when we come to use the facaded (?is that a word?) method, we can tell where it came from:

public Colour[] function getObjectsViaFacade() {
    return getAllObjects()
}

It remaps function names if the map specifies it

It handles public/private method access on the mixed-in function

It ignores access modifiers other than public / private (falling back to private)

It doesn't bind facade function to its original context if bind option is false

I'm gonna cover the cases for these in one fell swoop because it's all repetition from the previous article anyhow. This is why I'm kinda considering this more of a refactoring exercise than the red/green part of a TDD exercise. The tests are thus:

it("remaps function names if the map specifies it", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getSomeObjectsViaFacade(rows=2)

    expect(results).toBe([
        new Colour("whero"),
        new Colour("kakariki")
    ])
})

it("handles public/private method access on the mixed-in function", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getOrderedObjects(orderBy="English")

    expect(results).toBe([
        new Colour("kikorangi"),
        new Colour("kakariki"),
        new Colour("whero")
    ])
})

it("ignores access modifiers other than public / private (falling back to private)" , () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())

    results = model.getEnglishObjectsViaFacade()

    expect(results).toBe([
        new Colour("red"),
        new Colour("green"),
        new Colour("blue")
    ])

    expect(() => model.getEnglishObjects()).toThrow(type="Expression", regex="^.*MyModel.*has no\s+function with name.*getEnglishObjects.*$")
})

it("doesn't bind facade function to its original context if bind option is false", () => {
    model = new MyModel(new MyRepository(new MyDao()), new FacadeMapper())
    modelVariables = model.getVariables()

    expect(getMetadata(modelVariables.this)).toBe(getMetadata(model))
})

The mapping made to facilitate these tests has grown:

private void function mapFacades(facadeMapper) {
    facadeMapper.setFacadesFromMap(
        this,
        variables.repository,
        {
            "getAllObjects" = {},
            "getSomeObjects" = {"target" = "getSubsetOfObjects"},
            "getOrderedObjects" = {"access" = "public"},
            "getEnglishObjects" = {"access" = "package"}
        }
    )
    facadeMapper.setFacadesFromMap(
        this,
        new VariablesAccessor(),
        {
            "getVariables" = {"access" = "public", "bind" = false}
        }
    )
}

But all this was covered in depth in the previous article as well. VariablesAccessor is just this:

component {

    public struct function getVariables(){
        return variables
    }
}

This is used in the test case that doesn't bind to the original source's context, instead letting it bind to the target's, so it acts on the target's context (we can use it to access the target's variables scope). Just… look at the relevant test case (such is the beautify of BDD-oriented test cases, they kinda document what's going on in and of themselves.

And in MyModel I've need to add some more methods for each test to call to access the relevant facade-mapping variation. There's no need to pore over this code, it's boilterplate for the tests.

public Colour[] function getSomeObjectsViaFacade(string rows="all") {
    return getSubsetOfObjects(rows)
}

public Colour[] function getEnglishObjectsViaFacade() {
    return getEnglishObjects()
}

And in the repository:

public Colour[] function getSomeObjects(string rows="all") {
    rows = (isValid("integer", rows) && rows > -1) ? rows : -1

    return dao.getRecords("id", rows).reduce((array=[], record) => array.append(new Colour(record.mi)))
}

public Colour[] function getOrderedObjects(string orderby="id") {
    orderBy = mapOrderBy(orderBy)

    return dao.getRecords(orderBy, -1).reduce((array=[], record) => array.append(new Colour(record.mi)))
}

public Colour[] function getEnglishObjects() {
    return dao.getRecords("id", -1).reduce((array=[], record) => array.append(new Colour(record.en)))
}

private string function mapOrderBy(required string property) {
    propertyColumnMap = {
        "id" = "id",
        "english" = "en",
        "maori" = "mi"
    }
    return propertyColumnMap.keyExists(property) ? propertyColumnMap[property] : "id"
}

The DAO already had all the necessary functionality baked in: there were no changes to that.


That might seem like a lot of code changes that I glossed-over there, but the objective of this article is all around the explicit use of that mapping in the class receiving the facades, and how that's clearer than magic mixins.

I'll also re-re-iterate that I think doing this sort of thing is rubbish. It's a lot of effort to provide non-name-spaced functions, and all yer gaining is a lack of clarity, and a few keystrokes. It's just a daft thing to do. But it was an interesting progression of exercises for me, anyhow. And keeps me off the streets on a Sunday (it' too hot to go outdoors, that said).

Righto.

--
Adam

Wednesday, 9 June 2021

Repro for code that breaks in ColdFusion but works in Lucee

G'day:

In the last article (CFML: messing around with mixins (part 2)) I used some code that worked in Lucee, but didn't work in ColdFusion. It was not my code's fault, it's ColdFusion's fault for not being able to parse it. I didn't think I could be arsed looking at it this evening, but someone's asked me about it already so I had a look. I can reproduce the issue, and work around it.

Here's the code wot works on Lucee but breaks on ColdFusion (on trycf.com):

obj = {
    myMethod = function () {return arguments}
}    
methodName = "myMethod"
    
useProxy = true

callMe = useProxy
    ? () => obj[methodName](argumentCollection=arguments)
    : obj[methodName]

result = callMe("someArg")
writeDump(result)

This errors with: Invalid CFML construct found on line 10 at column 8. on line 10

It's pointing to the arrow function shorthand, but if I change it back to a normal function expression using the function keyword, it still errors.

It's also not the dynamic method call either. In the past that would have caused problems on ColdFusion, but they've fixed it at some point.

It's also not the lack of semi-colons ;-)

This adjustment works (on trycf.com):

obj = {
    myMethod = function () {return arguments}
}
methodName = "myMethod"
    
useProxy = true

proxy = () => obj[methodName](argumentCollection=arguments)
callMe = useProxy
    ? proxy
    : obj[methodName]

result = callMe("someArg")
writeDump(result)

It seems like ColdFusion really didn't like that arrow function / function expression in the ?: expression.

I'd say this is definitely a bug, and will point this repro case to Adobe for them to… do whatever they like with it.

Righto

--
Adam

CFML: messing around with mixins (part 2)

G'day:

In the previous article - CFML: messing around with mixins (part 1) - I had a look at some code to implement both compile-time and runtime mixins into target objects. Basically one takes a library object (a CFC just full of functions, no statefulness or anything), and stick references to them into a target object so that object can call those methods as if they were its own native method. It was all very proof-of-concept, and I would never recommend anyone actually using the code I wrote, and indeed would never recommend using mixins at all. They just seem like a hack to me: an anti-pattern and poor design.

One drawback of the approach I had taken thus far is that the way of injecting the functions was very naïve. It would only work as a tactic if the functions were very simple, and didn't refer to anything else within their library: no state (which isn't a problem in a library situation), but also no references to other private functions within the library. This is more of a problem. The reason it won't work is that a function statement doesn't bind any context to itself, it just adopts the context of wherever it's run from. Consider this:

// Library.cfc
component {
    function doSomething() {
        // some code
        
        someValue = runPrivateFunction()
        
        // more code
    }
    
    private function runPrivateFunction() {
        // code
    }
}

And using our injector we inject doSomething in to this:

// Object.cfc
component {


    myMethod() {
        // some code
        
        doSomething()
        
        // some more code
    }
}

This won't work. We only mixin a reference to doSomething, and when it's run in the context of Object, it will try to find runPrivateFunction in that object's context, and it won't be there.

We could inject all the private methods into the target as well: that's easy enough. But what if the object we're mixing-in from isn't a simple library, but it's a stateful object, or has dependencies injected into it that its methods then expect to be able to use… we'd have to chuck everything into the target. And by this point one should be questioning what the hell one is doing. It's just rubbish.

Instead of that, I'm just going to bind it to its original context instead. So we just inject the method, but it will still know about its context from the original object it came from. And it seems to be surprisingly easy.

More test cases…

It binds mixed-in methods to their original calling context when using a mixinMap

In this example we have a model class:

// MyModel.cfc
component {

    function getObjects(orderBy="id") {
        return getObjectsFromStorage(orderBy)
    }
}

And you can see it's calling a repository-ish method to getObjectsFromStorage. It just returns those in this example, but it's the sort of thing where you have a request to return some objects, the model tier calls a repository to get the objects, then the model performs business logic on the to round out the requisite business logic, and returns them to whatever asked for them - probably a controller.

You can also se that MyModel doesn't implement that behaviour, we need to inject it as a mixin. The repo is like this:

component {

    function init(MyDao dao) {
        variables.dao = dao
    }

    function getObjectsFromStorage(orderby="id") {
        orderBy = mapOrderBy(orderBy)
        return dao.getRecords(orderBy).reduce((array=[], record) => array.append(new Number(record.mi)))
    }

    private function mapOrderBy(property) {
        propertyColumnMap = {
            "id" = "id",
            "english" = "en",
            "maori" = "mi"
        }
        return propertyColumnMap.keyExists(property) ? propertyColumnMap[property] : "id"
    }
}

This is a proper dependency-injected repository. The repo method knows how to get the DAO to fetch the raw storage data, and it knows how to convert those records into domain objects. It simply does that, it doesn't perform any business logic; just conversion from storage data to domain data.

You can also see how our current mixin injector would break this code: it relies on other objects in its variables scope: the DAO, plus a private method to offload the task of mapping domain property names ot DB columns.

For completeness here's the DAO, although it's not relevant to this exercise other than needing to exist:

component {

    function getRecords(required orderBy) {
        records = queryNew("id,mi,en", "int,varchar,varchar", [
            [1, "tahi", "one"],
            [2, "rua", "two"],
            [3, "toru", "three"],
            [4, "wha", "four"]
        ])

        return queryExecute("SELECT * FROM records ORDER BY #orderBy#", {}, {dbtype="query"})
    }
}

And the test to make sure our mixing-in will all work is as follows:

it("binds mixed-in methods to their original calling context when using a mixinMap", () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(
        model,
        new advanced.MyRepository(new advanced.MyDao()),
        {
            getObjectsFromStorage : {}
        }
    )

    result = model.getObjects(orderBy="maori")

    expect(result).toBeArray()
    expect(result).toHaveLength(4)
    expect(result).toBe([
        new Number("rua"),
        new Number("tahi"),
        new Number("toru"),
        new Number("wha")
    ])
})

When I run the test, you see what I mean about all that context malarcky:


We've injected getObjectsFromStorage, but not the rest of the stuff it needs to work, and they're not magically there in the next context it's being run from. Solving this problem is super easy. Currently we're injecting the method like this (from DependencyInjectionImplementor.cfc):

mixinMap.each((sourceMethod, mapping) => {
    targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod
    requestedAccess = mapping.keyExists("access") ? mapping.access : "private"

    targetAccess = requestedAccess == "public" ? "public" : "private"

    scopes[targetAccess][targetMethod] = someMixin[sourceMethod]
})

We just need to change it to this:

mixinMap.each((sourceMethod, mapping) => {
    targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod
    requestedAccess = mapping.keyExists("access") ? mapping.access : "private"

    targetAccess = requestedAccess == "public" ? "public" : "private"

    proxy = () => someMixin[sourceMethod](argumentCollection=arguments)

    scopes[targetAccess][targetMethod] = proxy
})

We wrap the call to the original method in a proxy function, and inject the proxy. This means the method is being called in the context of its original object!

Now the test passes.

It binds mixed-in methods to their original calling context by default

It stands to reason that this is the default behaviour to use, so we'll cross-implement that back into the rest of the injection code, even if we're not using the mixinMap:

it("binds mixed-in methods to their original calling context by default", () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(
        model,
        new advanced.MyRepository(new advanced.MyDao())
    )

    result = model.getObjects(orderBy="english")

    expect(result).toBeArray()
    expect(result).toHaveLength(4)
    expect(result).toBe([
        new Number("wha"),
        new Number("tahi"),
        new Number("toru"),
        new Number("rua")
    ])
})

And the implementation:

private function mixinUsingMixin(someObject, someMixin) {
    someObject.__putVariable = putIntoVariables
    structKeyArray(someMixin).each((methodName) => {
    
        proxy = () => someMixin[methodName](argumentCollection=arguments)

        someObject.__putVariable(someMixin[methodName]proxy, methodName)
    })
    structDelete(someObject, "__putVariable")
}

It doesn't bind mixed-in function to their original calling context if bind option is false

One last thing. Maybe one might want to still use the old-school approach without the proxy for some reason. In this super contrived example we actually want the mixed-in method to bind to the target location's context. Here's a model:

component {

    function init() {
        variables.status = true
    }
}

And the method we want to mix-in:

component {

    function getStatus() {
        return variables.status
    }
}

It's just providing a way to return the status of the target object to calling code. Daft example, but you get the idea. here's the test:

it("doesn't bind mixed-in function to their original calling context if bind option is false", () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(
        model,
        new advanced.MyStatusLib(),
        {getStatus = {access="public", bind=false}}
    )

    result = model.getStatus()

    expect(result).toBeTrue()
})

And the implementation:

private function mixinUsingMap(someObject, someMixin, mixinMap) {
    someObject.__getVariables = getVariables
    scopes = {
        public = someObject,
        private = someObject.__getVariables()
    }
    structDelete(someObject, "__getVariables")

    mixinMap.each((sourceMethod, mapping) => {
        targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod
        requestedAccess = mapping.keyExists("access") ? mapping.access : "private"

        targetAccess = requestedAccess == "public" ? "public" : "private"
        
        bind = mapping.keyExists("bind") && isBoolean(mapping.bind) ? mapping.bind : true

        proxy = bind
            ? () => someMixin[sourceMethod](argumentCollection=arguments)
            : someMixin[sourceMethod]

        scopes[targetAccess][targetMethod] = proxy
    })
}

(Note that this code does not work on ColdFusion, but works fine on Lucee. I have not worked out what's up with it yet, and will report back once I've looked at it. ColdFusion's parser is getting confused and it gives a compile error. The code is syntactically correct though).

That deals with it, and that's it for this exercise. As I said before don't actually do this. It'll make your code impenetrable to follow, with everyone wondering how you're calling methods that aren't defined in the class yer looking at. It's a shit way of going about things. I'm thinking of a third part to this when the mixing-in is at runtime still, but it's handled in a more explicit fashion in the target class, so the code stays clear. I've not worked out how to do this yet, but I have some ideas and some embryonic code somewhere.

TDD is your friend

I've been adding these test cases iteratively, and making sure all the previous test cases still pass with every update I make to the implementation. This directs how I design the code, and makes sure I take small steps. This is gold. And you are - IMO - a fool if it's not how you work on a daily basis.

 

Righto.

--
Adam

PS: I just noticed I missed a test case in all this. It's related to the last test case I created. Can you spot what I've missed? Hint: it's not some "nice to have" sort of test case; I've actually written some code that I'm not testing, and was not actually related to the case I was trying to fulfil.

Sunday, 6 June 2021

CFML: messing around with mixins (part 1)

G'day:

There was a conversation on the CFML Slack channel the other day about mixing-in functions into objects in CFML. This stemmed from some of the way CFWheels has been architected, such as how the main Controller class is composed:

component output="false" displayName="Controller" {

    include "controller/functions.cfm";
    include "global/functions.cfm";
    include "view/functions.cfm";
    include "plugins/standalone/injection.cfm";
    if (
        IsDefined("application") && StructKeyExists(application, "wheels") && StructKeyExists(
            application.wheels,
            "viewPath"
        )
    ) {
        include "../#application.wheels.viewPath#/helpers.cfm";
    }

}

I hasten to add that I think this is a questionable design approach - and one I would never personally use or advocate. The only library of functions that belongs in this class is the controller/functions.cfm ones; clearly the ones in the vaguely-named global/functions.cfm don't belong in a class called Controller, and it's even worse that view functions are being injected into a controller class. I can't even speak to what functionality is in global/functions.cfm, but I suspect it's just a bunch of stuff that was left over once everything else found its way in a properly named/designed library. But anyhow, it should be its own class, and injected into the Controller object compositionally, via dependency injection or something. Similarly a controller should not be polluted with view methods, but if a controller needs to call methods on a view object - entirely reasonable - then that view object should also be a dependency of the controller. Ugh.

But anyway. CFWheels is where it is, and this is how it does things, and hence the topic of mixins.

To back up a step, mixins are a strategy of code reuse; basically if used cautiously they can be used to effect a poor-person's implementation of multiple inheritance in languages that don't support it, or a kind of shonky half-baked implementation of the dependency inversion principle. You can probably tell I don't like the idea. I much prefer the stategy of eschewing inheritance (multiple, single or otherwise) wherever possible in favour of a composition strategy - as implemented by dependency injection.

Some languages have formal language constructs for effecting "mixing in". I've seen traits being used in PHP before. Here's a quick example:

trait MyExcellentLib {
  function doExcellentThing() {
    echo "excellent";
  }
}

trait MyCoolLib {
  function applyCoolness() {
    echo "cool";
  }
}

class MyModel {
  use MyExcellentLib;
  use MyCoolLib;

  function executeSomeStuff() {
    $this->doExcellentThing();
    $this->applyCoolness();
  }
}

$model = new MyModel();

$model->executeSomeStuff();

And Ruby does similar with modules. Here's an analogous example of the PHP one in Ruby:

module MyExcellentLib
  def doExcellentThing
    puts "excellent"
  end
end

module MyCoolLib
  def applyCoolness
    puts "cool"
  end
end

class MyModel
  include MyExcellentLib
  include MyCoolLib

  def executeSomeStuff
    doExcellentThing
    applyCoolness
  end
end

model = MyModel.new

model.executeSomeStuff

And CFML's version would be much the same as Ruby's; including .cfm files, as per the CFWheels example above. The problem is one can only include CFML script files (I don't mean "CFScript", I mean scripts (.cfm) as opposed to components (.cfc)). One cannot go include path.to.MyComponent or include "/path/to/MyComponent.cfc". And one certainly cannot include an object. To mixin an object one would need to do it at runtime, not compile time.

This got me thinking, and I decided to write some experimental code to see what I could do with this. If anything. And as-always, I'm TDDIng the exercise.

Compile-time implementation of mix-ins: it mixes-in the required functions

As a baseline, I'm just gonna test the include approach to doing this. Here's the test:


import cfmlInDocker.miscellaneous.mixins.compileTime.MyModel;

component extends=testbox.system.BaseSpec {

    function run() {
        describe("Testing mixin proofs of concept", () => {
            describe("Tests compile-time implementation of mix-ins", () => {
                it("mixes-in the required functions", () => {
                    model = new MyModel()
                    result = model.executeSomeStuff()

                    expect(result).toBe("ExcellentCool")
                })
            })
        })
    }
}

And the code that fulfils the test:

// MyModel.cfc

component {
    include "./myCoolLib.cfm";
    include "./myExcellentLib.cfm";

    function executeSomeStuff() {
        return doExcellentThing() & applyCoolness()
    }
}
<cfscript> // myCoolLib.cfm
    function applyCoolness() {
        return "Cool"
    }
</cfscript>
<cfscript>// myExcellentLib.cfm
    function doExcellentThing() {
        return "Excellent"
    }
</cfscript>

(In all cases here, the test passes with the given code, so I'll spare you saying "and that works" each time)

Baseline done. What I want to do now is to do the equivalent with an object, not an include. And at runtime.

Run-time simple implementation of mix-ins: it mixes-in the required functions

it("mixes-in the required functions", () => {
    di = new simple.DependencyInjectionImplementor()
    model = new simple.MyModel()
    di.wireStuffIn(model, new simple.MyExcellentLib())
    di.wireStuffIn(model, new simple.MyCoolLib())

    result = model.executeSomeStuff()

    expect(result).toBe("ExcellentCool")
})

Now I have a DependencyInjectionImplementor class that handles all the wiring-up of things:

// DependencyInjectionImplementor.cfc
component  {

    function wireStuffIn(someObject, someMixin) {
        someObject.__putVariable = putIntoVariables
        structKeyArray(someMixin).each((methodName) => {
            someObject.__putVariable(someMixin[methodName], methodName)
        })
        structDelete(someObject, "__putVariable")
    }

    function putIntoVariables(value, key){
        variables[key] = value
    }
}
// MyModel.cfc
component {
    function executeSomeStuff() {
        return doExcellentThing() & applyCoolness()
    }
}
// MyExcellentLib.cfc
component {
    function doExcellentThing() {
        return "Excellent"
    }
}
// MyCoolLib.cfc
component {
    function applyCoolness() {
        return "Cool"
    }
}

It just takes and object, loops over its exposed methods, and pops them into the variables scope of the target object.

All those source code files are in /miscellaneous/mixins/runtime/simple.

Job done, but it's a bit basic. Let's keep going…

It mixes-in a subset of functions from a lib

What say one only wants to mixin a subset of the methods from the object? Pretty straight forward:

it("mixes-in a subset of functions from a lib", () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(model, new advanced.MyBrilliantLib(), ["radiateBrilliance"])

    result = model.executeSomethingBrilliant()
    expect(result).toBe("brilliance")

    expect(() => model.failAtDoingItBrilliantly()).toThrow(type="expression")
})
// DependencyInjectionImplementor.cfc
component  {

    function wireStuffIn(someObject, someMixin, mixinMap) {
        someObject.__putVariable = putIntoVariables

        functionsToMixin = arguments.keyExists("mixinMap")
            ? mixinMap
            : structKeyArray(someMixin)

        functionsToMixin.each((methodName) => {
            someObject.__putVariable(someMixin[methodName], methodName)
        })

        structDelete(someObject, "__putVariable")
    }

    function putIntoVariables(value, key){
        variables[key] = value
    }
}
// MyModel.cfc
component {

    // ...

    function executeSomethingBrilliant() {
        return radiateBrilliance()
    }

    function failAtDoingItBrilliantly() {
        return doItBrilliantly()
    }
}
// MyBrilliantLib.cfc
component {
    function radiateBrilliance() {
        return "brilliance"
    }
    function doItBrilliantly() {
        return "done brilliantly"
    }
}

(That source code is at /src/miscellaneous/mixins/runtime/advanced).

We can pass in an array of method names, and loop over that if so. Otherwise just continue to loop over the whole object as per before.

Note that because I've not mixed-in doItBrilliantly, we should expect it to error if we then call it.

It should go without saying that any earlier tests I have written continue to pass as I build new functionality into this. Such is the benefit of TDDing… I have that safety net.

It remaps function names if the map specifies it

The next task I set myself is to allow the mixinMap to specify a new name for the mixed-in method, should for some reason one want to do that (avoid naming collisions or something).

it("remaps function names if the map specifies it", () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(
        model,
        new advanced.MyBrilliantLib(),
        {
            "radiateBrilliance" = {target="shine"},
            "doItBrilliantly" = {}
        }
    )

    result = model.performBrilliantThings()
    expect(result).toBe("brilliance done brilliantly")
})
// DependencyInjectionImplementor.cfc
component  {

    function wireStuffIn(someObject, someMixin, mixinMap) {
        someObject.__putVariable = putIntoVariables

        arguments.keyExists("mixinMap")
            ? mixinUsingMap(someObject, someMixin, mixinMap)
            : mixinUsingMixin(someObject, someMixin)

        structDelete(someObject, "__putVariable")
    }

    private function mixinUsingMixin(someObject, someMixin) {
        structKeyArray(someMixin).each((methodName) => {
            someObject.__putVariable(someMixin[methodName], methodName)
        })
    }

    private function mixinUsingMap(someObject, someMixin, mixinMap) {
        mixinMap.each((sourceMethod, mapping) => {
            targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod
            someObject.__putVariable(someMixin[sourceMethod], targetMethod)
        })
    }

    function putIntoVariables(value, key){
        variables[key] = value
    }
}
// MyModel.cfc
component {

    // ...

    function performBrilliantThings() {
        return shine() & " " & doItBrilliantly()
    }
}

This looks a bit more complicated, but I've just separated-out the two mixing-in methods into their own functions now, to make things more clear.

It handles public/private method access on the mixed-in function

This is more interesting. Now I'm allowing the map to specify whether to mix-in the method as being private or public in the target. Previously they were always private.

it("handles public/private method access on the mixed-in function", () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(
        model,
        new advanced.MyBestLib(),
        {
            improveGoodness = {target="makeItBetter"},
            makeBest = {access="public"}
        }
    )

    result = model.executeSomeOtherStuff()
    expect(result).toBe("better best")

    expect(() => model.makeItBetter()).toThrow(type="expression")
})
// DependencyInjectionImplementor.cfc
component  {

    function wireStuffIn(someObject, someMixin, mixinMap) {
        arguments.keyExists("mixinMap")
            ? mixinUsingMap(someObject, someMixin, mixinMap)
            : mixinUsingMixin(someObject, someMixin)
    }

    private function mixinUsingMixin(someObject, someMixin) {
        someObject.__putVariable = putIntoVariables
        structKeyArray(someMixin).each((methodName) => {
            someObject.__putVariable(someMixin[methodName], methodName)
        })
        structDelete(someObject, "__putVariable")
    }

    private function mixinUsingMap(someObject, someMixin, mixinMap) {
        someObject.__getVariables = getVariables
        scopes = {
            public = someObject,
            private = someObject.__getVariables()
        }
        structDelete(someObject, "__getVariables")

        mixinMap.each((sourceMethod, mapping) => {
            targetMethod = mapping.keyExists("target") ? mapping.target : sourceMethod
            targetAccess = mapping.keyExists("access") ? mapping.access : "private"

            scopes[targetAccess][targetMethod] = someMixin[sourceMethod]
        })
    }

    function getVariables(){
        return variables
    }

    function putIntoVariables(value, key){
        variables[key] = value
    }
}
// MyModel.cfc
component {

    // ...

    function executeSomeOtherStuff(){
        return variables.makeItBetter() & " " & this.makeBest()
    }
}
// MyBestLib.cfc
component {

    function improveGoodness() {
        return "better"
    }

    function makeBest() {
        return "best"
    }
}

It ignores access modifiers other than public / private (falling back to private)

I introduced a small bug into that last implementation. I could specify any access level I wanted, whereas the only valid access levels here are public or private. I need to fix this.

it("ignores access modifiers other than public / private (falling back to private)" , () => {
    di = new advanced.DependencyInjectionImplementor()
    model = new advanced.MyModel()
    di.wireStuffIn(
        model,
        new advanced.MyBestLib(),
        {
            makeBest = {access="INVALID"}
        }
    )
    result = model.checkIfItsTheBest()
    expect(result).toBe("best")

    expect(() => model.makeBest()).toThrow(type="expression")
})
private function mixinUsingMap(someObject, someMixin, mixinMap) {
    // ...

    mixinMap.each((sourceMethod, mapping) => {
        // ...
        requestedAccess = mapping.keyExists("access") ? mapping.access : "private"

        targetAccess = requestedAccess == "public" ? "public" : "private"

        scopes[targetAccess][targetMethod] = someMixin[sourceMethod]
    })
}

I could have rolled that into one line, but it was getting a bit awkward-looking, so I figured two lines - each with clear labels - were clearer in intent.

Oh! And the model file:

component {

    // ...

    function checkIfItsTheBest(){
        return variables.makeBest()
    }
}

Back up in the test I just make sure that it's only available in the private scope.

End of round 1

That's where I got to yesterday when writing the code for all this. I was discussing this with Tom King (lead contributor on CFWheels), and he observed one flaw in the way I'm doing these mixins is that they lose the context of the object they were originally homed in. Because of how CFML handles injecting methods by reference like how I am doing here, once the method is in the target object, any references that method makes to variables or this is a reference to the target object, not the source object that they came from. My initial reaction to this was "well: yes", but I'm only thinking of mixing functions from function libraries here. There's also a usecase for a mix-in library to also require its own context, so I need to address this too. I worked out how to do this pretty quickly (which made me pleased with myself, I have to admit), but I figured I need to verify the behaviour some more before I am happy with it, plus also I'm now thinking that how these contexts are bound should be perhaps optional. And I already have a lot of content in this article already, so I'll work on the next wodge of ideas some more before writing them up. For now I'm gonna call it quits, have a beer, and play dumb-arse games for the balance of the evening. Ooh: and eat! It's eating time.

(I've just written Part 2 of this exercise).

Righto.

--
Adam