Sunday 19 February 2023

PHP: looking at spatie/async some more

G'day:

I'm getting back to this spatie/async library today (see "PHP: looking at spatie/async" for the first part of this).

Previously I have just been passing a callback to the add function when adding a task to the pool:

$pool->add(function () use ($connection, $i, $startTime) {
    $result = $connection->executeQuery("CALL sleep_and_return(?)", [2]);
    return sprintf(
        "%d:%d:%d",
        $i,
        $result->fetchOne(),
        microtime(true) - $startTime
    );
});

This requires the task to be tightly coupled to the pool-handling, which is probably not what one wants to do. Instead, one can put the task logic into a Task class:

namespace adamcameron\php8\Task;

use adamcameron\php8\tests\Integration\Fixtures\Database as DB;
use Doctrine\DBAL\Connection;
use Spatie\Async\Task;

class SlowDbCallTask extends Task
{
    readonly private Connection $connection;

    public function __construct(
        readonly private int $i,
        readonly private float $startTime
    ) {
    }

    public function configure()
    {
        $this->connection = DB::getDbalConnection();
    }

    public function run()
    {
        $result = $this->connection->executeQuery("CALL sleep_and_return(?)", [2]);
        return sprintf(
            "%d:%d:%d",
            $this->i,
            $result->fetchOne(),
            microtime(true) - $this->startTime
        );
    }
}

So that's a bit nicer, especially when you see how the calling code in the test looks now:

$pool->add(new SlowDbCallTask($i, $startTime));

One thing that took me a while to straighten out in my head was the way the configure method works / needs to be used. Initially, my implementation of the SlowDbCallTask class was a bit literal, and my constructor was thus:

public function __construct(
    readonly private Connection $connection,
    readonly private int $i,
    readonly private float $startTime
) {
}

This errored-out when I ran the test:

Exception: Serialization of 'Closure' is not allowed
/var/www/vendor/spatie/async/src/Runtime/ParentRuntime.php:87
/var/www/vendor/spatie/async/src/Runtime/ParentRuntime.php:69
/var/www/vendor/spatie/async/src/Pool.php:140
/var/www/tests/Functional/SpatieAsync/TaskTest.php:20

I've run into this before: the object being used as the task handler needs to be serialised to get it into the PHP process that's running it, and not everything can be serialised. Less than ideal, but so be it. Then it occurred to me that I was being a div anyhow: I don't want to pass the Task's DB connection into it - there's no reason to - I can just initialise it in the Task's constructor, eg:

public function __construct(
    readonly private int $i,
    readonly private float $startTime
) {
    $this->connection = DB::getDbalConnection();
}

But same error, and same problem. When I handle this in the constructor, it's all being done in the calling code, so the same serialisation issue exists. I have to concede the penny did not drop without some googling, and I found this helpful comment on an issue on GitHub, which led me to this other comment:

It's important to remember that parallel processes are really separate processes. By using closures and a smart bootstrap, we're simulating the parent process as best as possible. There are however cases which won't work with that simple approach: not everything is serialisable.

That's where a Task comes in. Tasks allow more control over the bootstrap in the child process.

Start by taking a look at what the README says about tasks: https://github.com/spatie/async#working-with-tasks

The less objects and application things are passed to the child process, the better. […]

Whilst it doesn't directly say it, this made me click that that config method is for (I was previously going "not sure why I need that"); the docs - linked to above - are of the sort that state what thing are, but not why they are they way they are, so don't directly explain this either. Basically if there's anything the Task will need that won't be serialisable: sling it in the config method. Makes complete sense to me now.

If one looks at the base Task class, we can see how it works:

namespace Spatie\Async;

abstract class Task
{
    abstract public function configure();

    abstract public function run();

    public function __invoke()
    {
        $this->configure();

        return $this->run();
    }
}

There is a simplified way of implementing tasks too (not sure why this is needed: once one knows what's going on, it's not like the other approach is "complicated".

To prove I'm actually TDDing all this still, here's the test for this one:

/** @TestDox It supports a simplified version of a task */
public function testSimpleTask()
{
    $pool = Pool::create();

    $pool->add(new SimpleTask());

    $results = $pool->wait();

    $this->assertCount(1, $results);
    $this->assertEquals("G'day world from an async call", $results[0]);
}

And the implementation simply needs to be this:

class SimpleTask
{
    public function __invoke()
    {
        return "G'day world from an async call";
    }
}

(It's a bit mindless, fine. It's late).

As per the docs: provided one has an __invoke method: that's it.


Speaking of TDD: in the first part of this - despite not starting with the test code listing - I did TDD it. Although I took it as kind of a refactoring, so I duplicated the test for the inline-task-via-callback version, and changed the code to instantiate my SlowDbCallTask. I did not directly test the configure and run implementations (with their own tests I mean), because that literally is implementation detail. The feature here is "the slow DB call [blah blah, whatever the feature actually is]", and whether it's done by inline callback or a Task class is neither here nor there. And that test does actually exercise all the code in the task class anyhow, so: job done.

Interestingly, the test coverage report disagrees with me:

This is because the PHP process running that code is not the same one running the tests, I guess. Hrm. Maybe I should have separate tests. But: I am not of the disposition that I need to chase 100% LOC coverage … but (again) I discuss why ensuring 100% is something to aim for anyway in an article I wrote ages ago: "Yeah, you do want 100% test coverage" (which should then be measured against this other article on the topic: "Test coverage: it's not about lines of code"). I think in this case - as the class is so simple - I'd put a @codeCoverageIgnore annotation on it. If the task class ended up having a bunch of moving parts in it that became trickier to test via its calling context, then I might consider testing those discretely. "It depends" [etc].

OK. That's the end of that section of the docs (I have written more about it than there was actual documentation that said. Ha. So I'm gonna finish up here. I think I have another article or so to write on this stuff yet. Let's seem.

Righto.

--
Adam

Thursday 16 February 2023

ColdFusion: bug in runAsync

G'day:

In my article yesterday ("CFML: runAsync doesn't seem very asynchronous on Lucee") I mentioned in passing that I'd found a bug in ColdFusion's runAsync implementation. I wasn't focusing on ColdFusion at the time, so I didn't try to repro it; however I have now.

Don't worry about the wall of code, it's the same thing three times to increase the odds of the bug situation to surface:

// /vendor/ColdFusion/2021/runAsyncBug/repro.cfm
telemetry = []
recordTelemetry = (label) => {
    telemetry.append(label)
}

recordTelemetry("Begin")

recordTelemetry("f1 before define")
f1 = runAsync(() => {
    recordTelemetry("f1 main")
}).then(() => {
    recordTelemetry("f1 then1")
}).then(() => {
    recordTelemetry("f1 then2")
})
recordTelemetry("f1 after define")

recordTelemetry("f2 before define")
f2 = runAsync(() => {
    recordTelemetry("f2 main")
}).then(() => {
    recordTelemetry("f2 then1")
}).then(() => {
    recordTelemetry("f2 then2")
})
recordTelemetry("f2 after define")

recordTelemetry("f3 before define")
f3 = runAsync(() => {
    recordTelemetry("f3 main")
}).then(() => {
    recordTelemetry("f3 then1")
}).then(() => {
    recordTelemetry("f3 then2")
})
recordTelemetry("f3 after define")

recordTelemetry("f1 before get")
f1.get()
recordTelemetry("f1 after get")

recordTelemetry("f2 before get")
f2.get()
recordTelemetry("f2 after get")

recordTelemetry("f3 before get")
f3.get()
recordTelemetry("f3 after get")

recordTelemetry("End")

writeDump(telemetry)

tallies = [
    "Begin" = 0,
    "f1 before define" = 0,
    "f1 main" = 0,
    "f1 then1" = 0,
    "f1 then2" = 0,
    "f1 after define" = 0,
    "f2 before define" = 0,
    "f2 main" = 0,
    "f2 then1" = 0,
    "f2 then2" = 0,
    "f2 after define" = 0,
    "f3 before define" = 0,
    "f3 main" = 0,
    "f3 then1" = 0,
    "f3 then2" = 0,
    "f3 after define" = 0,
    "f1 before get" = 0,
    "f1 after get" = 0,
    "f2 before get" = 0,
    "f2 after get" = 0,
    "f3 before get" = 0,
    "f3 after get" = 0,
    "End" = 0
]
errors = tallies
    .map((k) => telemetry.findAll(k).len())
    .filter((_, v) => v != 1)

writeDump(errors)

This code is very similar to the cod ein the earlier article I linked to. I create three futures: f1, f2, f3; and each of those has a main task and two then handlers. All the task and then handlers do is append to a telemetry array. This is pared down from some code that actually did more stuff that I wanted to be async.

All going well (and 75% of the time, this is the case), we should end up with an array of 23 telemetry entries, one each for the entries in the code. We can't guarantee the order cos most of it is running asynchronously.

What I was seeing though, is sometimes entries were missing, but some others were doubled up. Here's an example of that:

See how the telemetry array has no entries for "f2 then1" and "f3 before get"; but instead has two each of "f3 then1" and "f2 before get". Odd.

Also about 5% of the time I get this:

Variable LABEL is undefined.

 
The error occurred in /app/cfml/vendor/ColdFusion/2021/runAsyncBug/repro.cfm: line 4
2 : telemetry = []
3 : recordTelemetry = (label) => {
4 :     telemetry.append(label)
5 : }
6 : 


There are always 23 entries; just some are wrong. There seems to be any number of wrong ones, but usually only a couple. The example above is a fairly typical run.

I tried a variant of the test with a lock around the write to the telemetry array:

recordTelemetry = (label) => {
    lock type="exclusive" name="telemetry" timeout=1 throwontimeout=true {
        telemetry.append(label)
    }
}

But this did not make a difference.

Also the fact I am using a function expression using closure over the telemetry array is significant. If I just have a function statement:

function recordTelemetry(label) {
    telemetry.append(label)
}

There is no problem.

I dunno what to make of this, but it'll give the Adobe ColdFusion Team something to work with anyhow.

Oh: this is not an issue with Lucee.

Righto.

--
Adam

Lucee bug surfaced by CFWheels's "architecture"

G'day:

Well here's a fine use of my frickin time.

Lucee's released a new "stable" version and we're being encouraged to use it. Today I upgraded my dev environment to Lucee 5.3.10.120 (from 5.3.9.166 which seems quite stable and gave us no issues), ran our tests to see if anything obvious went wrong, and a whole lot of them went splat. All of them were variations of this (this is from my repro case, not our codebase):

Lucee 5.3.10.120 Error (expression)
MessageCannot update key [MAORI_NUMBERS] in static scope from component [cfml.vendor.lucee.staticFinal.C], that member is set to final
StacktraceThe Error Occurred in
/var/www/cfml/vendor/lucee/staticFinal/C.cfc: line 5
3: static {
4: final static.MAORI_NUMBERS = ["tahi", "rua", "toru", "wha"]
5: }
6: }

called from /var/www/cfml/vendor/lucee/staticFinal/C.cfc: line 1
called from /var/www/cfml/vendor/lucee/staticFinal/test.cfm: line 2

The code in question is this:

component extends=Base {

    static {
        final static.MAORI_NUMBERS = ["tahi", "rua", "toru", "wha"]
    }
}

I am not trying to "update key [MAORI_NUMBERS] [etc]", I am simply trying to create the object.

Roll back to 5.3.9.166: code works.

My initial attempt to reproduce the error was just this:

component {

    static {
        final static.MAORI_NUMBERS = ["tahi", "rua", "toru", "wha"]
    }
}

But that worked fine, no errors.

Notice how I am not extending anything there? This is significant.

What's in Base.cfc? I'm kind of embarrassed to show you this.

component {

    include "include.cfm";
}

"WTF?" you might legitimately ask. Well: quite. The thing is I found this issue in a CFWheels application, and it's down to CFWheels's "architecture" that this bug in Lucee surfaces.

Look at this… stuff:

component output="false" displayName="Model" {

    include "model/functions.cfm";
    include "global/functions.cfm";
    include "plugins/standalone/injection.cfm";

}

That's how CFWheels implements its base model class. They are pretending CFML has mixins (it doesn't) by using includes. All their classes seem to be defined as a series of include files. I just… just… I just… … aaaaaaaah!!! Just do me a favour and don't use CFWheels. Trust me.

Anyway, so this is why I'm writing shit code to reproduce this Lucee bug.

In include.cfm I have this:

public function build() {
    return this
}

(This is also how CFWheels initialises its model class objects: using a sort of factory method. So I'm replicating a pared-down version of that).

And CFWheels creates its objects like this (in test.cfm):

<cfinvoke component="C" method="build" returnVariable="o">

Yeah. <cfinvoke>. Ain't seen one of those in about a decade (and it was old code then…).

And if I run that code, I get the error concerned.

I've tried to pare it back further, but it seems I need the sub class, base class, include combo. Lucee is - it seems - doing something dodgy when it instantiates an object like this. No surprise they didn't catch it during their own tests, because this is a very - um - "edge case" approach to designing code.

Another weird thing is that if I restart my Docker container: the problem goes away. However if I then do something like change the name of include.cfm to includex.cfm or something, the problem comes back. Adding code to either CFC does not bring the issue back (if you see what I mean). It's def down to something about how a base class that has an include in it is first "created" (I hesitate to say "compiled" here, cos then I don't think the issue would magically go away between container restarts: the compiled code seems fine. Just the in-memory code after the compilation the first time round is crocked. Anyhow, I won't try to guess what's wrong, I'll leave that to the Lucee bods. They can have my repro case to help them (here on GitHub: /vendor/lucee/staticFinal).

I encountered a ColdFusion bug during assessing this: time to write that one up. (update: this one has already been reported, as it turns out: CF-4213214).

Righto.

--
Adam

Wednesday 15 February 2023

CFML: runAsync doesn't seem very asynchronous on Lucee (5.3.10.120)

G'day:

I was wanting to compare how other languages dealt with the promise-style async processing, as I was not completely convinced by what I was seeing when investigating spatie/async in the previous article: "PHP: looking at spatie/async". It seems solid, just doesn't behave how I'd expect in some situations. More on that in another article. I checked what JS does, and it was fine, and then I recalled in the back of my mind that CFML added something new to do with async processing a while ago. CF2018 as it turns out. runAsync. And this was also added to Lucee at some point too (their docs don't say when something was added).

I knocked together a quick repro of what I was looking at in spatie/async and JS as a baseline; and then wtote an equivalent in CFML. But when I ran it on Lucee it didn't seem to be behaving how I'd expect, but in a different way from what I was trying to compare to in the other languages. When I ran the same code on ColdFusion though, it seemed to behave the way I expected. It seemed I have found a Lucee rabbit hole to go down.

I ended up with this repro that demonstrates what I'm seeing. There's a chunk of code, but it's a bunch of largely the same operations repeated a few times so as to provide usable telemetry. I'll go through it piecemeal though.

This is all being run on Lucee 5.3.10.120.

start = getTickCount()
writeLog(file="runAsync", text="Begin @ #getTickCount() - start#")

I'll time things as I go.

writeLog(file="runAsync", text="f1 before define @ #getTickCount() - start#")
f1 = runAsync(() => {
    writeLog(file="runAsync", text="f1 main before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f1 main after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f1 then1 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f1 then1 after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f1 then2 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f1 then2 after query @ #getTickCount() - start#")
})
writeLog(file="runAsync", text="f1 after define @ #getTickCount() - start#")
  • I create a future
  • And it has a main task, and two then handlers.
  • Each of these do exactly the same thing:
    • Call a proc in the DB. See below for the implementation code, but you can probably tell it's a proc specifically designed to take time to execute. It's basically a DB-borne sleep call.
    • And log the current run time before and after the proc call.
  • Oh and I log the timing either side of the whole oepration too.

Here's the proc:

DROP PROCEDURE IF EXISTS sleep_and_return;
DELIMITER //
CREATE PROCEDURE sleep_and_return(IN seconds INT)
BEGIN
    DO SLEEP(seconds);
    SELECT seconds;
END //
DELIMITER ;

It just sleeps for the specified time and then returns the value it was passed.

Then I do exactly the same thing I did with f1 for f2 and f3 as well. Three identical sets of runAsync / then / then with the logging and the DB call.

After that, I sit back and get each of them, timing around each get:

writeLog(file="runAsync", text="f1 before get @ #getTickCount() - start#")
f1.get()
writeLog(file="runAsync", text="f1 after get @ #getTickCount() - start#")

writeLog(file="runAsync", text="f2 before get @ #getTickCount() - start#")
f2.get()
writeLog(file="runAsync", text="f2 after get @ #getTickCount() - start#")

writeLog(file="runAsync", text="f3 before get @ #getTickCount() - start#")
f3.get()
writeLog(file="runAsync", text="f3 after get @ #getTickCount() - start#")

And one final entry in the log at the end:

writeLog(file="runAsync", text="End @ #getTickCount() - start#")

All the code together looks like this:

start = getTickCount()
writeLog(file="runAsync", text="Begin @ #getTickCount() - start#")


writeLog(file="runAsync", text="f1 before define @ #getTickCount() - start#")
f1 = runAsync(() => {
    writeLog(file="runAsync", text="f1 main before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f1 main after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f1 then1 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f1 then1 after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f1 then2 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f1 then2 after query @ #getTickCount() - start#")
})
writeLog(file="runAsync", text="f1 after define @ #getTickCount() - start#")


writeLog(file="runAsync", text="f2 before define @ #getTickCount() - start#")
f2 = runAsync(() => {
    writeLog(file="runAsync", text="f2 main before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f2 main after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f2 then1 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f2 then1 after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f2 then2 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f2 then2 after query @ #getTickCount() - start#")
})
writeLog(file="runAsync", text="f2 after define @ #getTickCount() - start#")


writeLog(file="runAsync", text="f3 before define @ #getTickCount() - start#")
f3 = runAsync(() => {
    writeLog(file="runAsync", text="f3 main before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f3 main after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f3 then1 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f3 then1 after query @ #getTickCount() - start#")
}).then(() => {
    writeLog(file="runAsync", text="f3 then2 before query @ #getTickCount() - start#")
    queryExecute("CALL sleep_and_return(1)", [], {datasource="test"});
    writeLog(file="runAsync", text="f3 then2 after query @ #getTickCount() - start#")
})
writeLog(file="runAsync", text="f3 after define @ #getTickCount() - start#")


writeLog(file="runAsync", text="f1 before get @ #getTickCount() - start#")
f1.get()
writeLog(file="runAsync", text="f1 after get @ #getTickCount() - start#")

writeLog(file="runAsync", text="f2 before get @ #getTickCount() - start#")
f2.get()
writeLog(file="runAsync", text="f2 after get @ #getTickCount() - start#")

writeLog(file="runAsync", text="f3 before get @ #getTickCount() - start#")
f3.get()
writeLog(file="runAsync", text="f3 after get @ #getTickCount() - start#")


writeLog(file="runAsync", text="End @ #getTickCount() - start#")

When I run this on CF2021, I get exactly what I'd expect:

Begin @ 0
f1 before define @ 1
f1 main before query @ 2
f1 after define @ 3
f2 before define @ 4
f2 main before query @ 6
f2 after define @ 8
f3 before define @ 10
f3 after define @ 12
f3 main before query @ 12
f1 before get @ 13
f1 main after query @ 1006
f1 then1 before query @ 1008
f2 main after query @ 1011
f2 then1 before query @ 1012
f3 main after query @ 1015
f3 then1 before query @ 1016
f1 then1 after query @ 2010
f1 then2 before query @ 2012
f2 then1 after query @ 2014
f2 then2 before query @ 2016
f3 then1 after query @ 2023
f3 then2 before query @ 2024
f1 then2 after query @ 3014
f1 after get @ 3016
f2 before get @ 3017
f2 then2 after query @ 3018
f2 after get @ 3020
f3 before get @ 3021
f3 then2 after query @ 3026
f3 after get @ 3028
End @ 3029

Each of the f1, f3, f3 operations are being executed in a "first come, first served", and independently of each other (ie: asynchronously). And it's all over and done with at about the time it takes the slowest one to finish (allowing for intrinsic overhead).

It's a different story when I run this on Lucee:

Begin @ 1
f1 before define @ 19
f1 main before query @ 24
f1 main after query @ 1244
f1 then1 before query @ 1247
f1 then1 after query @ 2252
f1 after define @ 2255
f1 then2 before query @ 2255
f2 before define @ 2255
f2 main before query @ 2258
f1 then2 after query @ 3258
f2 main after query @ 3263
f2 then1 before query @ 3266
f2 then1 after query @ 4270
f2 after define @ 4272
f3 before define @ 4273
f2 then2 before query @ 4274
f3 main before query @ 4276
f2 then2 after query @ 5277
f3 main after query @ 5278
f3 then1 before query @ 5281
f3 then1 after query @ 6284
f3 after define @ 6286
f3 then2 before query @ 6286
f1 before get @ 6286
f1 after get @ 6287
f2 before get @ 6288
f2 after get @ 6289
f3 before get @ 6290
f3 then2 after query @ 7288
f3 after get @ 7289
End @ 7290

Here f1 gets underway, but it really seems to mostly block until [f1 then2 before query] until f2 even gets a look-in. This is two seconds of synchronous execution before Lucee even finds out there's something else asynchronous to be cracking on with. Then f2 blocks for second before f3 gets a chance to be looked at. Then there's a mess of stopping and starting, sucking up even more time, for an end result of ~7sec for an "asynchronous" operation that would take ~9sec if it was synchronous. Not great.

I also hasten to add that this was not a case of bad luck how the "async-ness" happened to run this time. I've run this code dozens of times, and it's always around the same. I also have run it with just CFML's own sleep instead of the DB call, and it's still the same. It's not the work the tasks are doing, it's how Lucee is implementing the asynchronous thread management, I think.

Also if ColdFusion's handling of the same code was similar to Lucee's, I'd just chalk it up to "sometimes asynchronous code doesn't even up seeming that asynchronous". But given ColdFusion nails it, I really think there's something up with Lucee. Pity: I had a good use case for this stuff in our app. But now I'm thinking I need to go back to our simple <cfthread> implementations and check exactly how asynchronous that code is…

Full disclosure: I also found what seems to be a bug with how ColdFusion handles enclosed variable references asynchronously, but I can't work out what's going on with that yet, and I was half way through this bit of code, so I wanted to finish this first. I might come back to the ColdFusion thing "tomorrow".

The code here is reasonably portable (one just needs to create that proc, and a DSN), so I'd be dead keen to hear from other people if they can run it in their own Lucee environments, if you can be arsed, that is.

Or am I missing something? Entirely possible. Let me know.

Righto.

--
Adam

PHP: I need another two devs

G'day:

I'm revisiting an article I posted in December, fishing for a coupla devs: I need another two devs. I parked that because we re-evaluated the direction we were going to take our replatforming exercise. In short we were going to be shift the CFML to Kotlin, but for various reasons that I won't go into here we've revised that plan to shift to PHP instead.

So. Now I'm looking for a couple of PHP devs to join my team.

We are growing our team some more, this time looking for two devs. Previously we'd only been hiring senior / very experienced engineers; this time I'm happy to talk to more mid-level devs who might like to join a team of seniors and take an opportunity to improve their dev skills, and their general engineering skills.

We have a B2B app running on a CFML-based framework. We are retiring that over the next couple of years, re-designing and re-implementing it in Symfony, on PHP8.2. The devs I am looking for will be working on the PHP side of the migration process. It's a green-fields project: currently the PHP app is at "G'day world" stage, and is not even deployed to the production environment yet, although it will be in the next couple of weeks.

There will be no requirement to do any dev work on the legacy CFML platform, however realistically one is gonna need to be reading / running the code whilst identifying / understanding the business logic, before re-implementing it in the PHP application. So CFML is not a necessary skill at all, but preparedness to understand the legacy code will be something one will need to do here.

You'd be joining a team of four devs (a mix of CFML and PHP), and a QA; this is from a total head count at Easy Direct Debits of 20-ish.

The details of the role - which I've lifted from our ad: "Application Developer (UK)" - are:

Competencies:

  • Experience with PHP.
  • Experience developing new web applications/web services.
  • Experience with Symfony.

Highly desirable:

  • Knowledge of design principles such as MVC, SOLID and other common design patterns.
  • Some familiarity with Agile principles, and experience delivering value in an Agile fashion.
  • Some exposure to test automation (eg: unit testing), and TDD practices.
  • Experience with client-side application technology (eg React.js or similar).
  • Some familiarity with Dockerised development environments.

The full job spec has details of the company and all the usual HR malarky.

Logistics-wise this is a permanent (ie: we are not offering it to contractors) remote-first position within the UK. I will not consider any other "unofficial" attempts to work from an offshore location. You must be resident in the UK, and be able to work here, fulltime. We cannot offer visa sponsorship. We have a physical office but I've never set foot in it, and everyone else in the dev team are 100% remote as well; spread throughout England. But if you want to work in Bournemouth, there's a desk for you there if that's your thing. The people in that office are all nice :-).

As I said, we have two openings for this role at the moment.

If you want to have a chat about this, you can send yer CV through to the email address on the job spec page linked-to above. I'm not interested in talking to recruiters for now, just in case you are one (and reading this?): I'm only wanting to talk to people in the dev community directly.

Righto.

--
Adam

Thursday 9 February 2023

PHP: looking at spatie/async

G'day:

For no good reason at all, other than it piquing my interest, I've decided to mess around with spatie/async, which is - in its own words:

[A] small and easy wrapper around PHP's PCNTL extension. It allows running of different processes in parallel, with an easy-to-use API.

I got onto it because after messing around with Guzzle the other day ("PHP: looking at ways of making HTTP requests"), I got to wondering how PHP is doing async stuff. It occurs to me now I didn't think to just check which lib(s) Guzzle was using; I just googled, and found a few options, and spatie/async seemed to be the simplest / most-no-nonsense of the lot, so thought would start here.

I'm gonna "experiment via test", as is my wont.


Set-up

First of all(*) though, I need to install PCNTL, which is not installed by default in PHP's stock Docker image. It's easy though, just add this to my Dockerfile and rebuild my container:

RUN docker-php-ext-configure pcntl && docker-php-ext-install pcntl

(*) I say "first of all", but - full disclosure - I missed the note in the first sentence of the docs (ie: that sentence I quoted above) that it was required, and wrote a whole bunch of code, and was left wondering "why is this not running async?". But only for like an hour or two. Ahem. Cough. Anyway. Moving right along.

First I have set up a proc that I can run that just takes ages to run. I'm gonna use this to tie-up PHP processing. Pardon my pathetic MySQL skillz:

USE db1;

DROP PROCEDURE IF EXISTS sleep_and_return;
DELIMITER //
CREATE PROCEDURE sleep_and_return(IN seconds INT)
BEGIN
    DO SLEEP(seconds);
    SELECT seconds;
END //
DELIMITER ;

Baseline: it runs stuff asynchronously

And now my first test:

/** @testdox It can call a slow proc multiple times async */
public function testSlowProcAsync()
{
    $connection = DB::getDbalConnection();

    $pool = Pool::create();

    $startTime = microtime(true);
    for ($i = 1; $i <= 3; $i++) {
        $pool->add(function () use ($connection, $i, $startTime) {
            $result = $connection->executeQuery("CALL sleep_and_return(?)", [2]);
            return sprintf(
                 "%d:%d:%d",
                 $i,
                 $result->fetchOne(),
                 microtime(true) - $startTime
             );
        });
    }
    $poolPopulationTime = microtime(true) - $startTime;
    $this->assertLessThanOrEqual(1, $poolPopulationTime);

    $startTime = microtime(true);
    $results = $pool->wait();
    $endTime = microtime(true);
    $executionTime = $endTime - $startTime;

    $this->assertLessThan(3, $executionTime);
    $this->assertCount(3, $results);
    $this->assertContains("1:2:2", $results, "1:2:2 not found in " . implode(",", $results));
    $this->assertContains("2:2:2", $results, "2:2:2 not found in " . implode(",", $results));
    $this->assertContains("3:2:2", $results, "3:2:2 not found in " . implode(",", $results));
}

There's a bit going on here (it's not my first iteration of this test, I have to admit).

  • The premise is that each task I need to process async is a call to that proc which I am making sleep for two seconds. Three tasks doing that is six seconds of synchronous work; or only around two seconds or so if each is running asynchronously.
  • This thing seems to work by having a pool, then adding processes to run asynchronously.
  • I take note of the time before I start adding processes to the pool, so I can take note of how long each takes to run (should be ~2sec).
  • I return those metrics, and they end up being what's returned from the wait call at the end. Easy.
  • I also note how long it took to populate the pool. This should be pretty quick (less than even one of the processes running).
  • I then time how long the pool takes to run. Each process is ~2sec, and if they're running asynchronously the whole thing should finish in less time than it would take two of them to run.
  • Lastly I verify the results are what I'd expect. Note that I'm not testing an exact array, but just looking for each element: as each process is async, I can't really assume that they'll complete (and populate that array) in an specific sequence.

The metrics I'm grabbing are not as clever as they could be. I'm assuming each process will finish around 2sec after I start the timer before adding them to the pool. There's overhead in the adding, and calling the code as well as the 2sec pause at the DB, and sometimes the latter 1-2 assertions fail. I've added the failure message to see what actually gets returned, but sadly I have not been able to see it fail since I added those. I imagine it's a case that ~3sec has passed between one of the latter process callbacks being added, and them actually finishing running. So like the metrics returned for that process might be 2:2:3, not 2:2:2: the last number is the duration since we started adding the processes to the pool.

Update

I'm writing this the day after I wrote the paragraph above.

I got the test failure, and this bears out my theory from before, pretty miuch:

tests of spatie/async (https://github.com/spatie/async)
  It can call a slow proc multiple times async
   
    1:2:2 not found in 3:2:3,1:2:3,2:2:3            
    Failed asserting that an array contains '1:2:2'.
   
    /var/www/tests/Functional/Async/SpatieAsyncTest.php:41
     
  

Note the array returned in the results: 3:2:3,1:2:3,2:2:3. In each of these the process took three seconds to complete. Probably because I had only just started my containers, and the PHP code was not compiled and cached, and the DB had only just started. So, yeah, not a great test, but it helped show this issue, which is something.


Adding a then handler

The next test is very similar, except I'm chaining a then handler onto the process:

/** @testdox It uses a then handler which acts on the result */
public function testSlowProcAsyncThen()
{
    $connection = DB::getDbalConnection();

    $pool = Pool::create();

    $startTime = microtime(true);
    for ($i = 1; $i <= 3; $i++) {
        $pool
            ->add(function () use ($connection) {
                return $connection->executeQuery("CALL sleep_and_return(?)", [2]);
            })
            ->then(function ($result) use ($i, $startTime) {
                return sprintf(
                    "%d:%d:%d",
                    $i,
                    $result->fetchOne(),
                    microtime(true) - $startTime
                );
            });
    }

    $startTime = microtime(true);
    $results = $pool->wait();
    $endTime = microtime(true);
    $executionTime = $endTime - $startTime;

    $this->assertLessThan(3, $executionTime);
    $this->assertCount(3, $executionTimes);
    $this->assertContains("1:2:2", $results, "1:2:2 not found in " . implode(",", $results));
    $this->assertContains("2:2:2", $results, "2:2:2 not found in " . implode(",", $results));
    $this->assertContains("3:2:2", $results, "3:2:2 not found in " . implode(",", $results));
}
  • The task returns the DB statement this time.
  • And the then handler receives that, and now it returns the metrics.

Everything else is the same.

And this test goes splat:

Exception : Serialization of 'PDOStatement' is not allowed

OK, fair enough. Not everything is serialisable, and I'm guessing this is how objects are being passed to different PHP processes. Good to know. I will update my test to pass the result, not the whole statement object:

$pool
    ->add(function () use ($connection) {
        return$result = $connection->executeQuery("CALL sleep_and_return(?)", [2]);
        return $result->fetchOne();
    })
    ->then(function ($result) use ($i, $startTime) {
        return sprintf(
            "%d:%d:%d",
            $i,
            $result,
            microtime(true) - $startTime
        );
    });

But this is wrong too:

1:2:2 not found in 2,2,2
Failed asserting that an array contains '1:2:2'.

What's going on here? Ah. OK, so the final result that is returned from $pool->wait() is the return values from the initial process callbacks, not the end result of the then handler. My process callbacks are returning the result of the proc call, which is just the number originally passed into it (2), hence the result array being [2,2,2]. If I want to access the result of the processing done in the then I'm gonna need to grab it separately:

public function testSlowProcAsyncThen()
{
    $connection = DB::getDbalConnection();

    $pool = Pool::create();

    $metrics = [];
    $startTime = microtime(true);
    for ($i = 1; $i <= 3; $i++) {
        $pool
            ->add(function () use ($connection) {
                $result = $connection->executeQuery("CALL sleep_and_return(?)", [2]);
                return $result->fetchOne();
            })
            ->then(function ($result) use (&$metrics, $i, $startTime) {
                $metrics[] = sprintf(
                    "%d:%d:%d",
                    $i,
                    $result,
                    microtime(true) - $startTime
                );
            });
    }

    $startTime = microtime(true);
    $pool->wait();
    $endTime = microtime(true);
    $executionTime = $endTime - $startTime;

    $this->assertLessThan(3, $executionTime);
    $this-->assertCount(3, $results);
    $this->assertCount(3, $metrics);
    $this->assertContains("1:2:2", $metrics, "1:2:2 not found in " . implode(",", $metrics));
    $this->assertContains("2:2:2", $metrics, "2:2:2 not found in " . implode(",", $metrics));
    $this->assertContains("3:2:2", $metrics, "3:2:2 not found in " . implode(",", $metrics));
}

And now this gives expected results.


Handling timeouts

Next up, I'm having a look at the timeout one can put on a pool. Similar code again:

/** @testdox It supports a timeout */
public function testSlowProcAsyncTimeout()
{
    $connection = DB::getDbalConnection();

    $pool = Pool::create();
    $pool->timeout(1);

    $timeOuts = [];

    $startTime = microtime(true);
    for ($i = 1; $i <= 3; $i++) {
        $pool
            ->add(function () use ($connection) {
                $result = $connection->executeQuery("CALL sleep_and_return(?)", [2]);
                return $result->fetchOne();
            })
            ->timeout(function () use (&$timeOuts, $i, $startTime) {
                $timeOuts[] = sprintf(
                    "TIMED OUT ON ITERATION %d after %d seconds",
                    $i,
                    microtime(true) - $startTime
                );
                return false;
            });
    }

    $results = $pool->wait();
    $endTime = microtime(true);
    $executionTime = $endTime - $startTime;

    $this->assertEquals(
        [
            "TIMED OUT ON ITERATION 1 after 1 seconds",
            "TIMED OUT ON ITERATION 2 after 1 seconds",
            "TIMED OUT ON ITERATION 3 after 1 seconds"
        ],
        $timeOuts
    );

    $this->assertLessThan(2, $executionTime);

    $this->assertEquals([], $results);
}
  • I set a timeout of 1sec.
  • I have a timeout handler that returns some metrics.
  • Note that the value returned from wait is an empty array this time.

Handling exceptions

Next a much more simplified test, looking at exception handling:

/** @testdox it supports exception handling */
public function testAsyncException()
{
    $pool = Pool::create();
    $pool
        ->add(function () {
            throw new \Exception("This is an exception");
        })
        ->catch(function (\Exception $exception) {
            $this->assertEquals("This is an exception", $exception->getMessage());
        });

    $pool->wait();
}

There's no surprises there, except the assertion fails, because $exception->getMessage() is not "This is an exception", it's this mess:

'This is an exception\n
\n
#0 [internal function]: adamcameron\php8\tests\Functional\Async\SpatieAsyncTest::{closure}()\n
#1 /var/www/vendor/laravel/serializable-closure/src/Serializers/Native.php(91): call_user_func_array(Object(Closure), Array)\n
#2 [internal function]: Laravel\SerializableClosure\Serializers\Native->__invoke()\n
#3 /var/www/vendor/laravel/serializable-closure/src/SerializableClosure.php(48): call_user_func_array(Object(Laravel\SerializableClosure\Serializers\Native), Array)\n
#4 /var/www/vendor/spatie/async/src/Runtime/ChildRuntime.php(26): Laravel\SerializableClosure\SerializableClosure->__invoke()\n
#5 {main}'

Exactly that. The exception message has somehow been polluted with a stack trace.

Initially I thought it might be because the library was wrapping my exception in another one (kinda fair enough), but it's not: it's just an exception. If I wasn't a complete n00b with this (and how PHP handles async stuff), and also wasn't incredibly rusty with PHP, I'd probably raise an issue with the library maintainers. But: I'll mull it over a bit first, to see if I can work out what dumbarsery I might be engaging in.


I decided to push the exception down a bit, and throw it in a then handler. Disappointing results here:

/** @testdox it does not support exception handling from a then handler */
public function testAsyncExceptionFromThen()
{
    $this->expectException(\Exception::class);
    $this->expectExceptionMessage("This is an exception");

    $pool = Pool::create();
    $pool
        ->add(function () {
            // do nothing
        })
        ->then(function () {
            throw new \Exception("This is an exception");
        })
        ->catch(function (\Exception $exception) {
            $this->assertStringStartsWith("This is an exception", $exception->getMessage());
        });

    $pool->wait();
}

The exception wasn't handled by the catch handler. It seems it'll only catch exceptions from the main process.


Stopping the pool in its tracks

I'm changing the test approach now, indeed I'm lifting their sample code from the docs and turning it into a test:

/**  @testdox a pool can be stopped */
public function testPoolStop()
{
    $pool = Pool::create();

    for ($i = 0; $i < 10000; $i++) {
        $pool->add(function () {
            return rand(0, 100);
        })->then(function ($output) use ($pool) {
            // If one of them randomly picks 100, end the pool early.
            if ($output === 100) {
                $pool->stop();
            }
        });
    }

    $results = $pool->wait();

    $this->assertLessThan(10000, count($results));
    $this->assertContains(100, $results);
}

This is obviously really contrived, but it's a way to see that the stop method does the trick.

I decided to see if it would also work in a catch handler:

/**  @testdox a pool can be stopped in a catch */
public function testPoolStopInCatch()
{
    $pool = Pool::create();

    for ($i = 0; $i < 10000; $i++) {
        $pool->add(function () {
            $result = rand(0, 100);
            if ($result === 100) {
                throw new \Exception("Something went wrong");
            }
            return $result;
        })->catch(function (\Exception $_) use ($pool) {
            $pool->stop();
        });
    }

    $results = $pool->wait();

    $this->assertLessThan(10000, count($results));
    $this->assertNotContains(100, $results);
}

Note that in this case, unlike the previous one, the results won't include the 100, because it never gets returned.


Stopping the Cameron in its tracks

There's more to look at, but this article is already quite long, and this is a reasonable stopping point because the next bit deals with alterntive syntaxes. I'll do that tomorrow (-ish).

This seems like pretty handy stuff so far, although there's that exception weirdness to mull over, and also I need to reason why I can't catch exceptions thrown in the then handlers.

All the methods are linked-to inline, but for the record they're all here: /tests/Functional/Async/SpatieAsyncTest.php.

Righto.

--
Adam

Sunday 5 February 2023

Ugh. PHPUnit and dealing with deprecation notices

G'day:

Today's plan

I sat down to write an article and some code using the spatie/async library. Not for any particular requirement I have, just that it sounds interesting. But I've just shelved that idea for today.

Today's revised plan

I added the package as a dependency and ran composer update. My first exercise after doing a composer update is always to run my automated tests to see what library updates have broken things. I'm glad I did because today it's PHPUnit itself that's breaking shit. Great.

So I backed-out spatie/async as a dependency, re-updated, and ran my tests again: yeah I still have problems. I checked and PHPUnit has updated from 9.5.28 to 9.6.2. I hasten to add that the breakages don't seem to be bugs or problems, just that they've added a bunch of deprecation notices in for upcoming changes to PHPUnit 10. This is a good thing. But I feel that now that I know about them, I should look into what I need to do to fix them. And I also suspect there's a way to run PHP with deprecation warnings switched off, and I should know how to do this so I am going to find out.

The symptoms I am seeing are this sort of thing:

Warning:       Abstract test case classes with "Test" suffix are deprecated
   
   ┐
   ├ assertObjectHasAttribute() is deprecated and will be removed in PHPUnit 10.
   ┴
   
   ┐
   ├ Expecting E_ERROR and E_USER_ERROR is deprecated and will no longer be possible in PHPUnit 10.
   ┴
   

Those seem like the only three actually. Let's find out how to work around them.


Not PHPUnit's finest hour

First: how to suppress the warnings. This is not helped by the phpunit.xml page in the docs currently 404ing (this is a temporary thing I think: it was fine last week, I have it in my history). But I found a cached version. Having done some googling, there is no way of turning off the deprecation reporting it seems (it was requested, and actively rejected). I also found this article written by Sebastian Bergmann, PHP's lead: Help! My tests stopped working ("Deprecations" section). Relevant extract:

PHPUnit behaves differently compared to PHP and how it reports the usage of deprecated functionality. When a deprecated feature of PHP is used then a message of type E_DEPRECATED is created. This message, however, is only shown when the developer has configured PHP's error_reporting setting accordingly. In my experience, developers either do not have error_reporting configured correctly (it should be set to -1 for development so that you see all error messages) or never bother to look at error messages that get buried in some log file. This is why quite a few developers are surprised that their software stops working when a major version of PHP is released that removes previously deprecated functionality.

PHPUnit reports a test that uses deprecated functionality with a warning because I know how developers use, or rather not use, PHP's E_DEPRECATED messages. You cannot opt out of getting this information from PHPUnit.

IMO this is a bit of a dick move (even if the intent is sound). PHPUnit: your job? Being a unit test framework. Stick to that. Leave it to me and PHP how deprecations ought to be dealt with. That's not part of your remit. If there was no prescribed way of handling (or not ~) in PHP then fair enough for PHPUnit to make its own call. And if PHPUnit was unaware of the prescribed PHP approach: also fair enough. But to be aware of it and pronounce "well ackchyually… I know better" is letting the side down a bit. IMO, obvs.

Still: it's given me a blog article topic I guess. So… erm…"cheers"…?

Firstly: the article also says this:

By default, PHPUnit's command-line test runner exits with shell exit code 0 when the use of a deprecated feature is reported.

I'm staring at an exit code of 1, currently, but I am guessing this is because I have failOnWarning="true" set in phpunit.xml.dist.

WARNINGS!
Tests: 79, Assertions: 208, Warnings: 8, Skipped: 1.

Generating code coverage report in HTML format ... done [00:06.012]
root@19f76723847c:/var/www# echo $?
1

Yeah, this returns zero if I switch that setting off. It still sux though, cos I don't want to turn warnings off, I only want to not be pestered by PHPUnit's own deprecations interfering with the reporting of the tests of my own code, which - actually - are all fine. So this is a case of misreporting on PHPUnit's part. To me that's a bug (one implemented by design, but subpar-decision-making is as much a bug as shonky implementation is).

NB: the quickest fix for me here would be to roll back to the previous version of PHPUnit whilst I take time to scheduling in the fixing of the tests. I hasten to add I am completely fine making the code changes, that's not the issue here. It's the crap way it's being handled by PHPUnit is the only issue.

Anyway, we get it: I'm annoyed. Let's move on and do something useful.


Abstract test case classes with "Test" suffix are deprecated

My initial reaction to this is about as positive as my reaction to PHPUnit's deprecation handling. It's completely legit for test suites to have common notionality in an abstract sense. And the best description of this could very legitimately end in the word Test. An obvious example might be a polypmorhic sitution with the classes under test, which could have different imlpementations each with the same custom assertion needs. Or even the implementation of dependencies to the object under test might be the same. On the other hand, from a position of not having looked into why PHPUnit has had this change make, there seems no good reason to prevent this: this seems like meddling for the sake of it to me, but let's see if I can find out why they've done it. It would be great, btw, if these deprecation notices also linked to whatever issue caused the deprecation to be made. But I'll hunt through the code.

The first thing I have found is the Github issue that represents the work: no explanation. No text at all, which is less than ideal: no work should be undertaken with a description of what/why, and also like some acceptance criteria so we know what the "definition of done" is for the given case. The only other thing I found is another Github issue of someone being bitten on the bum by it, but there was only confirmation it's "expected" behaviour, but no explanation why. Oh well.

Ha, OK, there's a 9.6 change log file which also links through to #5132, which I found whilst fossicking around in my vendor/phpunit dir. Still no explanation of "why" though.

This is at least easy to fix. I can simply rename the class in question.

In the case in question it's just this:

abstract class HttpTest extends TestCase
{

    protected function getUserAgentForCurl(): string
    {
        return sprintf("curl/%s", curl_version()['version']);
    }

    protected function assertGitInfoIsCorrect(string $response): void
    {
        $myGitMetadata = json_decode($response);
        $this->assertEquals('adamcameron', $myGitMetadata->login);
        $this->assertEquals('Adam Cameron', $myGitMetadata->name);
    }
}

TBH hoisting the getUserAgentForCurl helper method does not really contribute to a sense that there's an is-a relationship between this class and the subclasses (CurTest, GuzzleTest, PhPStreamTest, SymfonyHttpClientTest): this would better be in a dependency class if that's all that was in play here. But I def think the common need for the same custom assertion implementation fits the bill, so the abstractness of this class and its subclass hierarchy is legit. I also think HttpTest is the most fitting name for it, but… oh well. I'm just gonna rename it to HttpTestBase (runner up: HttpTests (plural)) I think. Not great - and not as fitting as HttpTest as it is now - but so be it.


assertObjectHasAttribute() is deprecated and will be removed in PHPUnit 10.

I found the issue for this one straight away: Remove assertions that operate on class/object properties #4601. This situation is legit. Quoting the ticket:

PHPUnit currently refers to "fields" (see above) as "attributes". This is (or will become) confusing considering the introduction of attributes in PHP 8 and their support in PHPUnit.

PHPUnit will be changed to use the term "property" instead of "attribute" where "field" is meant.

I'm onboard with this: it makes sense.

But. The replacement methods have not been implemented yet. So this is a good example where this thing oh PHPUnit's of raising deprecations as warnings is shit. I can't switch this one off "for now", knowing I have some work to do in future. I also don't want to turn off the PHPUnit setting that promotes warnings to exceptions because I want to know about legitimate warnings (from other sources). This is why PHP handles deprecations the way that it does. Sigh.

I want to get rid of the warnings - well I have to! - So I'm gonna implement my own assertObjectHasProperty assertion, using the same implementation as the current assertObjectHasAttribute. It's only used in one test class, so it's easy to integrate it into my codebase as well.

[beat]

OK so not exactly the same implementation as the current one uses private methods from its current home /src/Framework/Assert.php. But I'll knock together as close an approximation as possible I guess.

[beat]

OK screw that: it's using other deprecated methods on other objects, and it's not a good use of my time reimplementing all this crap. I'm just gonna use a property_exists check inline where I am currently using assertObjectHasAttribute. There's only two occurrences.


Expecting E_ERROR and E_USER_ERROR is deprecated and will no longer be possible in PHPUnit 10.

This is another case of the deprecation being put in with no real explanation. It's issue #5062, and the "explanation" is:

There are no replacements. PHPUnit 10 no longer converts E_* to exceptions, therefore E_* can no longer be expected.

Fine, but throw us a bone, pal: why does "PHPUnit 10 no longer converts E_* to exceptions, therefore E_* can no longer be expected". There must be some rationale? Let's go see if I can find anything about that… … … Nope. No issue I can find relating to that. All right (shakes head, and mutters "fuck sake, PHPUnit").

Right so I've addressed this one thus:

Before:

/** @testdox It cannot be type-coerced */
public function testTypeCoercion()
{
    $this->expectError(); // NB: not an exception; an error
    $this->expectErrorMessageMatches("/.*MaoriNumbers could not be converted to int.*/");
    $this->assertEquals(sprintf("ono: %d", MI::ONO), "ono: 6");
}

After:

/** @testdox It cannot be type-coerced */
public function testTypeCoercion()
{
    $this->assertError(
        function () {
            $this->assertEquals(sprintf("ono: %d", MI::ONO), "ono: 6");
        },
        "MaoriNumbers could not be converted to int"
    );
}

public function assertError(callable $callback, string $message)
{
    try {
        $callback();
        $this->fail("Expected an error, didn't get one");
    } catch (AssertionFailedError $e) {
        throw $e;
    } catch (\Throwable $e) {
        $this->assertStringContainsString($message, $e->getMessage());
    }
}

This is not perfect, but it'll do for what I'm needing it for. I have a coupla instances of this I need to refactor, and both in the same package. So I'll chuck this custom assertion in a trait I guess. One of the few times I don't feel icky using traits.


Conclusion

This was not a very edifying exercise, and I don't really think anything's been improved by two out of three of these PHPUnit changes I've needed to work around. Still: it's not the end of the world, and I hasten to add for my gripes here, I still think PHPUnit is brilliant. Mostly ;-) (kidding, it's great).

I guess I'll finally get to write the article I wanted to write tomorrow evening.

The current state of this project is tagged as 1.14.

Righto.

--
Adam

Thursday 2 February 2023

PHP / Doctrine/DBAL: adding a logger to an connection

G'day:

There's plenty of stuff online about how to do this the old deprecated way, but I couldn't find any example of how to do it the "current" way. So. Here's an example.

NB: the old way is some variant of this sort of thing:

$logger = new \Doctrine\DBAL\Logging\DebugStack();
$config = new \Doctrine\DBAL\Configuration();
$config->setSQLLogger($logger);

$connection = DriverManager::getConnection(
    $connectionParams,
    $config
);

However both DebugStack and setSQLLogger are deprecated. The DebugStack thing is just a logger instance and I didn't want to use that anyhow, but being able to set the logger I did want to use would be nice.

I dug into setSQLLogger, and it says this:

@deprecated Use {@see setMiddlewares()} and {@see \Doctrine\DBAL\Logging\Middleware} instead.

OK we'll do that then.

I'll spare you the code, but initially I was trying to set the middlewares (I'm not sure(*) that's the plural of "middleware", guys) on an existing connection object, and that was not working at all. Nothing was being logged. Then I discovered someone else with the same woes as me @ https://github.com/doctrine/dbal/issues/5366, and apparently one cannot actually do this, as the middlewares are only paid-attention-to when the connection is first created, so I needed to pass the config in during connection-creation. This is not well-documented either, but looking at the source code answered that.

Here's what I ended up with in my test:

namespace adamcameron\php8\tests\Functional\Database;

use adamcameron\php8\tests\Integration\Fixtures\Database as DB;
use Doctrine\DBAL\Configuration;
use Doctrine\DBAL\Logging\Middleware;
use Monolog\Handler\TestHandler;
use Monolog\Logger;
use PHPUnit\Framework\TestCase;

/** @testdox Tests Database objects */
class DbLoggingTest extends TestCase
{
    /** @testdox It can log SQL traffic */
    public function testLogging()
    {
        $testLogger = new Logger("test");
        $testHandler = new TestHandler();
        $testLogger->setHandlers([$testHandler]);
        $middleware = new Middleware($testLogger);

        $config = new Configuration();
        $config->setMiddlewares([$middleware]);

        $connection = DB::getDbalConnection($config);

        $result = $connection->executeQuery("SELECT * FROM numbers WHERE id = ?", [5]);

        $this->assertEquals([5, 'five', 'rima'], $result->fetchNumeric());

        $withSql = array_filter($testHandler->getRecords(), function ($record) {
            return in_array('SELECT * FROM numbers WHERE id = ?', $record->context);
        });
        $this->assertCount(1, $withSql);
    }
}

Given Middleware only supports Loggers, I'm not sure this is a great architectural win for DBAL: moving from a clear and on-point description of the functionality to a nebulous, vague, and actually inaccurate term like "Middleware". Oh well.

For the sake of completeness (and an explanation of why I needed to dick around filtering the log to get the one I was looking for, this is what is logged for that call:

array(2) {
  [0] =>
  class Monolog\LogRecord#363 (7) {
    public readonly DateTimeImmutable $datetime =>
    class Monolog\DateTimeImmutable#357 (4) {
      private $useMicroseconds =>
      bool(true)
      public $date =>
      string(26) "2023-02-02 23:27:14.519171"
      public $timezone_type =>
      int(3)
      public $timezone =>
      string(13) "Europe/London"
    }
    public readonly string $channel =>
    string(4) "test"
    public readonly Monolog\Level $level =>
    enum Monolog\Level::Info : int(200);
    public readonly string $message =>
    string(35) "Connecting with parameters {params}"
    public readonly array $context =>
    array(1) {
      'params' =>
      array(6) {
        ...
      }
    }
    public array $extra =>
    array(0) {
    }
    public mixed $formatted =>
    string(204) "
        [2023-02-02T23:27:14.519171+00:00]
        test.INFO:
        Connecting with parameters {params}
        {
            "params":{
                "dbname":"db1",
                "user":"user1",
                "password":"<redacted>",
                "host":"mariadb",
                "port":"3306",
                "driver":"pdo_mysql"
            }
        }
        []
    "
  }
  [1] =>
  class Monolog\LogRecord#349 (7) {
    public readonly DateTimeImmutable $datetime =>
    class Monolog\DateTimeImmutable#348 (4) {
      private $useMicroseconds =>
      bool(true)
      public $date =>
      string(26) "2023-02-02 23:27:14.780537"
      public $timezone_type =>
      int(3)
      public $timezone =>
      string(13) "Europe/London"
    }
    public readonly string $channel =>
    string(4) "test"
    public readonly Monolog\Level $level =>
    enum Monolog\Level::Debug : int(100);
    public readonly string $message =>
    string(65) "Executing statement: {sql} (parameters: {params}, types: {types})"
    public readonly array $context =>
    array(3) {
      'sql' =>
      string(34) "SELECT * FROM numbers WHERE id = ?"
      'params' =>
      array(1) {
        ...
      }
      'types' =>
      array(1) {
        ...
      }
    }
    public array $extra =>
    array(0) {
    }
    public mixed $formatted =>
    string(194) "
        [2023-02-02T23:27:14.780537+00:00]
        test.DEBUG:
        Executing statement:
            {sql}
            (parameters: {params}, types: {types})
            {
                "sql":"SELECT * FROM numbers WHERE id = ?",
                "params":{
                    "1":5
                },
                "types":{
                    "1":2
                }
            }
        []
    "
  }
}

(Well that pads the word count…)

I like how it's had the presence of mind to redact the password from the connection params.

Anyway, that's it. I just wanted that code written down somewhere were it can be found.

Righto.

--
Adam


(*) In fact I've very bloody sure it oughtn't be (and, yesyesyes, I know it's been adopted as such, but this is LCD thinking at work. "-ware" already makes it a mass noun, so… plural).

PHP 8: a quick look at enums

G'day:

Whilst working on my recent code implemneting a postcode look-up web service, I stumbled across Monolog/Level:

enum Level: int
{
    // …
}

I was unaware of PHP having enums, and it turns out it's a recent addition: arriving in PHP 8.1 (I had moved on from PHP when 7.2 was brand new). See PHP: Enumerations for the docs.

My usage of it was I wanted to get the string "CRITICAL" back from the log level I was testing a log entry for:

public function assertLogEntryIsCorrect(
    TestHandler $testHandler,
    Level $expectedLogLevel,
    int $statusCode,
    string $expectedMessage
): void {
    $logRecords = $testHandler->getRecords();
    $this->assertCount(1, $logRecords);
    $this->assertEquals($expectedLogLevel->getName(), $logRecords[0]["level_name"]);

Monolog/Logger has this:

public const CRITICAL = 500;

But nothing associating that const to the string "CRITICAL", yet in the log entry the level is "CRITICAL". Yeah I could have just used strtoupper, but I had suspected there was a better way, so had a look at how Monolog actually writes the log entry. I traced this back to here:

public static function getLevelName(int|Level $level): string
{
    return self::toMonologLevel($level)->getName();
}

And from there found the Level enum

I wanted to see what enums in PHP can do, so I worked through the docs, and wrote myself an example of each. I did this via TDD, so I have some tests describing the functionality (see the @testdox annotations for the descriptions):

namespace adamcameron\php8\tests\Unit\System;

use adamcameron\php8\tests\Unit\System\Fixtures\MaoriNumbers as MI;
use PHPUnit\Framework\TestCase;

/** @testdox Testing enums */
class EnumTest extends TestCase
{

    /** @testdox It can have static methods */
    public function testStaticMethods()
    {
        $this->assertEquals("one", MI::asEnglish(1));
    }

The underlying method in the enum is:

public static function asEnglish(int $i) : string
{
    return self::EN[$i - 1];
}
    /** @testdox It can have instance methods */
    public function testInstanceMethods()
    {
        $this->assertEquals("two", MI::RUA->toEnglish());
    }

The underlying method in the enum is:

public function toEnglish() : string
{
    return self::EN[$this->value - 1];
}
    /** @testdox Its name can be returned */
    public function testGetName()
    {
        $this->assertEquals("TORU", MI::TORU->name);
    }

    /** @testdox Its value can be returned */
    public function testGetValue()
    {
        $this->assertEquals(4, MI::WHĀ->value);
    }

I purposedly used the correct accented unicode charactcer in WHÄ€ to check if PHP supported non-ASCII charatcers there. Yes.


    /** @testdox It encodes to JSON OK */
    public function testJsonEncode()
    {
        $this->assertEquals('{"rima":5}', json_encode(["rima" => MI::RIMA]));
    }

    /** @testdox It cannot be type-coerced */
    public function testTypeCoercion()
    {
        $this->expectError(); // NB: not an exception; an error
        $this->expectErrorMessageMatches("/.*MaoriNumbers could not be converted to int.*/");
        $this->assertEquals(sprintf("ono: %d", MI::ONO), "ono: 6");
    }

    /** @testdox It has a from method */
    public function testFrom()
    {
        $this->assertEquals(MI::from(7), MI::WHITU);
    }

    /** @testdox It has a from method that throws an exception */
    public function testFromException()
    {
        $this->expectError(\ValueError ::class);
        $this->expectErrorMessage();
        MI::from(0);
    }

I like how IntelliJ warns me I'm going wrong trying to use 0 there:

    /** @testdox It has a tryFrom method */
    public function testTryFrom()
    {
        $this->assertEquals(MI::tryFrom(8), MI::WARU);
    }

    /** @testdox It has a tryFrom method that returns null */
    public function testTryFromNull()
    {
        $this->assertNull(MI::tryFrom(-1));
    }

    /** @testdox It will type-coerce the argument for from and tryFrom */
    public function testTryFromTypeCoercion()
    {
        $this->assertEquals(MI::from("9"), MI::IWA);
        $this->assertEquals(MI::tryFrom(10.0), MI::TEKAU);
    }

    /** @testdox It can be used in a match expression */
    public function testMatch()
    {
        $this->assertEquals("odd", MI::TAHI->getParity());
        $this->assertEquals("even", MI::RUA->getParity());
    }

The underlying method in the enum is:

public function getParity() : string
{
    return match($this) {
        self::TAHI, self::TORU, self::RIMA, self::WHITU, self::IWA => "odd",
        self::RUA, self::WHĀ, self::ONO, self::WARU, self::TEKAU => "even"
    };
}

Match expressions are new to PHP 8.0, and are an expression version of switch statements. I might need to look at these some more later on too.


    /** @testdox It can have consts, and supply the values for same */
    public function testConsts()
    {
        $this->assertEquals(MI::THREE, MI::TORU);
    }

    /** @testdox It can use traits */
    public function testTraits()
    {
        $this->assertEquals(MI::FOUR, MI::WHĀ);
    }

    /** @testdox It is an object and has a class const */
    public function testClassConst()
    {
        $this->assertEquals(get_class(MI::RIMA), MI::CLASS);
    }

    /** @testdox It supports the __invoke magic method */
    public function testInvoke()
    {
        $this->assertEquals("six", (MI::ONO)());
    }

The underlying method in the enum is:

public function __invoke() : string
{
    return $this->toEnglish();
}

    /** @testdox It has a cases method which returns a listing */
    public function testCases()
    {
        $someCases = array_slice(MI::cases(), 6, 4);
        $this->assertEquals(
            [MI::WHITU, MI::WARU, MI::IWA, MI::TEKAU],
            $someCases
        );

    }

    /** @testdox It can be serialised */
    public function testSerialize()
    {
        $this->assertMatchesRegularExpression(
            sprintf('/^E:\d+:"%s:%s";$/', preg_quote(MI::TAHI::CLASS), MI::TAHI->name ),
            serialize(MI::TAHI)
        );
    }
}

And the whole enum:

namespace adamcameron\php8\tests\Unit\System\Fixtures;

enum MaoriNumbers : int {
    case TAHI = 1;
    case RUA = 2;
    case TORU = 3;
    case WHĀ = 4;
    case RIMA = 5;
    case ONO = 6;
    case WHITU = 7;
    case WARU = 8;
    case IWA = 9;
    case TEKAU = 10;

    use MaoriNumbersConstsTrait;

    private CONST EN = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"];

    public function toEnglish() : string
    {
        return self::EN[$this->value - 1];
    }

    public static function asEnglish(int $i) : string
    {
        return self::EN[$i - 1];
    }

    public function getParity() : string
    {
        return match($this) {
            self::TAHI, self::TORU, self::RIMA, self::WHITU, self::IWA => "odd",
            self::RUA, self::WHĀ, self::ONO, self::WARU, self::TEKAU => "even"
        };
    }

    public function __invoke() : string
    {
        return $this->toEnglish();
    }
}

This also uses a trait:

namespace adamcameron\php8\tests\Unit\System\Fixtures;

use adamcameron\php8\tests\Unit\System\Fixtures\MaoriNumbers as MI;

trait MaoriNumbersConstsTrait {
    public const ONE = MI::TAHI;
    public const TWO = MI::RUA;
    public const THREE = MI::TORU;
    public const FOUR = MI::WHĀ;
    public const FIVE = MI::RIMA;
    public const SIX = MI::ONO;
    public const SEVEN = MI::WHITU;
    public const EIGHT = MI::WARU;
    public const NINE = MI::IWA;
    public const TEN = MI::TEKAU;
}

I would not normally use a trait (in pretty much any situation), but I created this for the purposes of the testing.

That's it. I just wanted to share my code of my investigations into a language feature I was previously unaware of. I find this a helpful way of learning new stuff: putting it through its paces and formally observing the results.

Righto.

--
Adam