Sunday, 17 April 2022

A day in the life of trying to write a blog article in the CFML ecosystem

G'day:

This is not the article I intended to write today. That article was gonna be titled "CFML: Adding a LogBox logger to a CFWheels app via dependency injection", but I'll need to get to that another day now.

Here's how far that article got before the wheels fell off:

And that was it.

Why? Well I started by writing an integration test just to check that box install logbox did what I expected:

import test.BaseSpec
import logbox.system.logging.LogBox

component extends=BaseSpec {

    function run() {
        describe("Tests the LogBox install", () => {
            it("is instantiable", () => {
                expect(() => getComponentMetadata("LogBox")).notToThrow()
            })
        })
    }
}

Simple enough. It'll throws an exception if LogBox ain't there, and I'm expecting that. It's a dumb test but it's a reasonable first step to build on.

I run the test:

Err… come again? I ain't installed it yet. I lifted the code from the expect callback out and run it "raw" in the body ofthe test case: predictable exception. I put it back in the callback. Test passes. I change the matcher to be toThrow. Test still passed. So this code both throws and exception and doesn't throw an exception. This is pleasingly Schrödingeresque, but not helpful.

The weird thing is I know this is not a bug in TestBox, cos we use notToThrow in our tests at work. I port the test over to my work codebase: test fails (remember: this is what I want ATM, we're still at the "red" of "red-green-refactor").

I noticed that we were running a slightly different version of Testbox in the work codebase: 4.4.0-snapshot compared to my 4.5.0+5. Maybe there's been a regression. I changed my TestBox version in box.json and - without thinking things through - went box install again (not just box install testbox which is all I really needed to do), and was greeted with this:

That's reasonably bemusing as I had just used box install fw1 to install it in the first place, and that went fine. And I have not touched it since. I checked what version I already had installed (in framework/box.json), and it claims 4.3.0. So… ForgeBox… I beg to differ pal. You found this version y/day, why can't you find it today? I check on ForgeBox, and for 4.x I see versions 4.0.0, 4.1.0, 4.2.0, 4.5.0-SNAPSHOT. OK, so granted: no 4.3.0. Except that's what it installed for me yesterday. Maybe 4.3.0 has issues and got taken down in the last 24h (doubtful, but hey), so I blow away my /framework directory, and remove the entry from box.json, and box install fw1 again. This is interesting:

root@280d80cf28c6:/var/www# box install fw1
√ | Installing package [forgebox:fw1]
|------------------------------------------------
| Verifying package 'fw1' in forgebox, please wait...
| Installing version [4.2.0].

4.2.0. But its entry in its own box.json is 4.3.0, and the constraint it put in my box.json is ^4.3.0.

I do not have time or inclination for any of this, so I just stick a constraint of ~4.2.0 in my box.json, and that seems to have solved it. I mean the error went away: it's still installing 4.3.0. Even with a hard-coded 4.2.0 it's still installing 4.3.0.

Brad Wood from Ortus/CommandBox had a look at this, nutted-out that there was something wrong with the way the FW/1 package on ForgeBox was configured, and he in turn pinged Steve Neiland who looks after FW/1 these days, and he got this sorted. I'm now on 4.3.0, and it says it's 4.2.0. And box install no longer complains at me. Cheers fellas.

Then I noticed that because of the stupid way CFWheels "organises" itself in the file system, I have inadvertantly overwritten a bunch of my own CFWheels files. Sigh. CFWheels doesn't bother to package itself up as "app" (its stuff) and "implementation" (my code that uses their app), it just has "here's some files: some you should change (anything outside the wheels subdirectory), some you probably shouldn't (the stuff in the wheels subdirectory)", but there's no differentiation when it comes to installation: all the files are deployed. Overwriting all the user-space files with their original defaults. Sorry but this is just dumbarsey. Hurrah for source control and small commit iterations is all I can say, as I could just revert some files and I was all good.

Right so now I have the same version of TestBox installed here as in our app at work (remember how this was all I was tring to do? Update testbox. Nothing to do with FW/1, and nothing to do with CFWheels. But there's an hour gone cocking around with that lot).

And the test still doesn't work. Ballocks.

I notice the Lucee version is also different. We're locked into an older version of Lucee at work due to bugs and incompats in newer versions that we're still waiting on to be fixed, so the work app is running 5.3.7.47, and I am on 5.3.8.206. Surely it's not that? I rolled my app's Lucee version back to 5.3.7.47 and the test started failing (correctly). OK, so it's a Lucee issue.

I spent about an hour messing around doing a binary search of Lucee container versions until I identified the last version that wasn't broken (5.3.8.3) and the next version - a big jump here - 5.3.8.42 that was broken. I looked at a diff of the code but nothing leapt out at me. This was slightly daft as I had no idea what I was looking for, so that was probably half an hour of time looking at Lucee's source code in an aimless fashion. I actually saw the change that was the problem, but didn't clock that that is what caused it at the time.

Having drawn a blank, I slapped my forehead, called myself a dick, and went back to the code in TestBox that was behaving differently. That would obviously tell me where to look for the issue.

I tracked the problem down to here, in system/Assertion.cfc:

    // Message+Detail regex must not match
    if (
        len( arguments.regex ) AND
        (
            !arrayLen( reMatchNoCase( arguments.regex, e.message ) ) OR !arrayLen(
                reMatchNoCase( arguments.regex, e.detail )
            )
        )
    ) {
        return this;
    }

    fail( arguments.message );
}

I distilled that down into a portable repro case:

s = ""
pattern = ".*"
writeDump([
    reMatch = s.reMatchNoCase(pattern),
    matches = s.matches(pattern),
    split = s.split(pattern)
])

There are some Java method calls there to act as controls, but on Lucee's current version, we get this:

And on earlier versions it's this:

(Full disclosure: I'm using Lucee 4.5 on trycf.com for that second dump, but it's the same results in earlier versions of Lucee 5, up to the point where it starts going wrong)

Note how previously a regex match of .* matches an empty string? This is correct. It does. In all regex engines I know of. Yet in Lucee's current versions, it returns a completely empty array. This indicates no match, and it's wrong. Simple as that. So there's the bug.

I was pointed in the direction of an existing issue for this: LDEV-3703. Depsite being a regression they know they caused, Lucee have decided to only fix it in 6.x. Not the version they actually broke. Less than ideal, but so be it.

There were a coupla of Regex issues dealt with between those Lucee versions I mentioned before. Here's a Jira search for "fixversion >= 5.3.8.4 and fixversion < 5.3.8.42 and text ~ regex". I couldn't be arsed tracking back through the code, but I did find something in LDEV-3009 mentioning a new Application.cfc setting:

this.useJavaAsRegexEngine = true

This is documented for ColdFusion in Application variables, and… absolutely frickin' nowhere in the Lucee docs, as far as I can see.

On a whim I stuck that setting in my Application.cfc and re-ran the test. If the setting was false: the test doesn't work. If it was true: the test does work. That's something, but Lucee is not off the hook here. The behaviour of that regex match does not change between the old and new regex engines! .* always matches an empty string! So there's still a bug.

However, being pragmatic, I figured "problem solved" (for now), and moved on. For some reason I restarted my container, and re-hit my tests:

I switched the setting to this.useJavaAsRegexEngine = false and the tests ran again (failed incorrectly, but ran). So… let me get this straight. For TestBox to work, I need to set that setting to true. To get CFWheels to work, I need to set it to false.

For pete's sake.

As I said on the Lucee subchannel on the CFML Slack:

Do ppl recall how I've always said these stupid flags to change behaviour of the language were a fuckin dumb idea, and are basically unusable in a day and age where we all rely on third-party libs to do our jobs?

Exhibit. Fucking. A.

Every single one of these stupid, toxic, setting doubles the overhead for library providers to make their code work. I do not fault TestBox or CFWheels one bit here. They can't be expected to support the exponential number of variations each one of those settings accumulates. I can firmly say that no CFML code should ever be written that depends on any of these settings. And no library or third-party code should ever be tested with the setting variation on. Just ignore them. The settings should not exist. Anyway: this is an editorial digression. "We are where we are" as the over-used saying goes.


Screw all this. Seriously. All I wanted to do is to do a blog article about perhaps 50-odd new lines of code in my example app. Instead I spent four hours untangling this shite. And my blog article has not progressed.

Here's what I needed to do to my app to work around these various issues:

This is all committed to GitHub as 0.5.1

Writing this sure was cathartic. I think I was my own audience for this one. Ah well. Good on you if you got to here.

Righto.

--
Adam