Saturday, 13 May 2017

PHPUnit: using a mocked-method call log to track its activity

G'day:
This is just a quick PHPUnit one, inspired by a trick my team lead suggested to me. I think it's quite handy and works around what seems to be a shortcoming of how PHPUnit mocks stuff. I'm sure there's strong opinions held on this sort of thing somewhere as to how I'm not supposed to want to do what my requirement has me needing to do here. I do so look forward to being told about it.

Anyway, I'm writing a script to repair some bung data we have. The details are irrelevant, but it's fairly "important" data (well: not to me... but to someone, apparently...), and I want to make sure we can track when things go wrong. And they will go wrong. Firstly I'm talking to an external third-party service, and it's foolhardy to trust those; secondly I'm relying on some of our own internal services which... ah... well yeah they were written in a different era by people with an unusual understanding of how to write managable, testable, and reliable code. And in the course of doing this work I turned-up a coupla bugs and had to fix those, so I don't back our own code to be stable here. I don't mean that judgementally: all code bases have better and not so good code in them. Anyway...

The upshot of this is the script logs a bunch of stuff - said script will be run by a cron job - and part of my testing is to verify the right things gets logged at the right time. A lot of it is fairly perfunctory "belt and braces" sort of logging, say "Records 1000-1999 processed OK", and I don't really care about that; but there's other stuff where the script deals with unexpected circumstances and I want to make sure that's done right.

For the purposes of this test I am mocking out all the dependencies so the script doesn't actually do anything, but the mocks return "workable" mocked data sufficient to walk through the stages of the script. And the logger is one of these dependencies, and is also mocked.

The test I was stuck on was one where an unhandled exception was thrown, and the script basically shuts itself down; having first logged what went wrong.

Here's a very pared-back version of the script:

public function myMethod($times){
    try {
        $this->logger->logMessage("Starting off");
        for ($i=1; $i <= $times; $i++) {
            $this->logger->logMessage("Starting processing iteration $i");
            $this->helper->doThing();
            $this->logger->logMessage("Finished processing iteration $i");
        }
        $this->logger->logMessage("Finishing off");
    }
    catch (\Exception $e) {
        $this->logger->logMessage(sprintf("Something went wrong: %s", $e->getMessage()));
        throw $e;
    }
    finally {
        $this->logger->logMessage("All done");
    }
}

Here I've just got the one doThing dependency call in the loop, but it makes the point. And around that, and before and after the the loop I log a line to track progress. And I log a line if there's an unhandled exception. As I said the real script is somewhat more complex than this, but this is an adequate analogy.

I have another test which tests the ongoing logging of a successful run with 0, 1, >1 records, but this test is to make sure the script a) reports the exception; b) still shuts down OK.

PHPUnit's approach to checking a specific argument was passed to a particular mock call is to use the at matcher method when setting an expectation. My first pass at the test used this matcher:

function testMyMethodUsingAt()
{
    $exceptionMessage = "EXPECT_THIS";
    $this->expectException(\RuntimeException::class);
    $this->expectExceptionMessage($exceptionMessage);

    $testIterations = 6;
    $errorAt = $testIterations / 2;

    $this->helper
        ->expects($this->at($errorAt))
        ->method("doThing")
        ->willThrowException(new \RuntimeException($exceptionMessage));

    $this->logger
         ->expects($this->at(7))
         ->method("logMessage")
         ->with("Starting processing iteration 4");
    $this->logger
         ->expects($this->at(8))
         ->method("logMessage")
         ->with("Something went wrong: $exceptionMessage");
    $this->logger
         ->expects($this->at(9))
         ->method("logMessage")
         ->with("All done");

    $this->sut->myMethod($testIterations);
}

Here I worked out the the call history for info would be nine entries long, and the entries I wanted to confirm the exception-tracking logic was correct was the 7th, 8th and 9th ones.

But this is problematic for a number of ways:
  • it's not immediately clear what hat I pulled 7th, 8th and 9th from;
  • it's fragile because it relies on there being precisely seven earlier log entries for it to work. EG: if we were to add another log line into the method, that would break this test, which is rubbish;
  • semantically it's incorrect, as it's not the 7th, 8th and 9th entries I'm interested in; it's the last three. It'd be better to code it accordingly so the code was more readable to maintainers, later on.
  • Lastly the at approach to things in PHPUnit is just a bit crap because the index is based on the mock object, not the mock method, despite almost all use-cases I've had for this sort of thing have been wanting to check sequential calls to the mocked method, not just to any method in the mocked object. So I try to steer clear of at at the best of times.
Other mocking frameworks I've used expose the call-log for a mocked method, but I cannot find any way to get it from PHPUnit. I can see where the data is stored, but to get it out would be deeply "Heath Robinson", and would be even worse than the approach above.

I was stumped.

Then me boss said "just use a callback in the will expectation, and build the log yerself". Initially I just looked at him like he was a loony (he's used to this), but then it dawned on me what he was suggesting. And it was easy, and not altogether awful, either:

function testMyMethodUsingCallLog()
{
    $exceptionMessage = "EXPECT_THIS";
    $this->expectException(\RuntimeException::class);
    $this->expectExceptionMessage($exceptionMessage);

    $testIterations = 6;
    $errorAt = $testIterations / 2;

    $this->helper
        ->expects($this->at($errorAt))
        ->method("doThing")
        ->willThrowException(new \RuntimeException($exceptionMessage));

    $callLog = [];
    $this->logger
        ->method("logMessage")
        ->will($this->returnCallback(function ($message) use (&$callLog) {
            $callLog[] = $message;
        }));

    try {
        $this->sut->myMethod($testIterations);
    }
    catch (\Exception $e) {
        $this->assertSame("All done", array_pop($callLog));
        $this->assertSame("Something went wrong: $exceptionMessage", array_pop($callLog));
        $this->assertNotSame("Finishing off", array_pop($callLog));

        throw $e;
    }
}

The first key bit is capturing the call-log. This creates an array, and with each call to the mocked logging method, we capture what was logged. Note that these expectations don't enforce any rules here. They just capture what's going on.

Because this method call throws an exception, I have to try/catch it so that I can regain control once it's errored, and inspect the call log to see it's what I expect. My approach here is far more semantic: I check the last entry is the "Finishing off" one, the one before that is logging the correct exception, and - this is slightly different now - as a failsafe I check that the one before that is anything other than "All Done". Because if it was that value, it'd mean the logic flow did not branch at the right stage of things. Lastly I throw the exception I caught so PHPUnit's own exception-expectation-checking also validates that too.

One of the best things about this is that this test doesn't care about what goes on in the log prior to the stuff it's testing, it is able to focus on precisely what it cares about. This is much clearer code. An example of this would be to change this line in each test:

$testIterations = 6;

Change it to be any other number, and the first test breaks, whereas the second test keeps working. Win.

That's pretty cool.

I hasten to add I would probably not usually be focusing quite this closely on this sort of sequencing, but in this case we need to know things are working properly.

It's a shame PHPUnit doesn't just expose the call-log like this, as I can't be the first person who's been in this situation. But I guess the tools are there to DIY like I have here, so no harm done.

Righto.

--
Adam

Saturday, 29 April 2017

CFML: Lucee 13 v 414 v 6 ColdFusion. We have a "winner"

G'day:
Argh. CFML again. I'm doing this because I was a bit of a meanie to Brad on the CFML Slack Channel, and promised I'd try to play nice with one of our current exercises set out by Ryan:

Todays challenge: post something you can do in CFML that you don’t think the majority of people know. Maybe its a little-known function, or some java integration, or a technique from another language that isn’t common in CFML - or even just something that you’ve learned from this slack group - if you didn’t know it before someone else probably doesn’t know it now. Doesn’t even have to be a good idea or something you’ve ever actually used but bonus points for useful stuff. Can apply to ACF [ed: Adobe ColdFusion, or could just  be ColdFusion or CF. Dunno why ppl say "ACF"] or Lucee but bonus points for something universal.
This is a great idea. And whilst I am trying to not use CFML any more due to it not being a good use of my time, I still try to help other people with it on the Slack channel, and CFML does do some good stuff.

I couldn't think of anything interesting that no-one else would know about, but I figured I'd show people a different approach to something. Just as food for thought.

In CFML, when hitting the DB one receives the data back not as an array of objects or an array of structs, but as a single "Query Object", which internally contains the data, and exposes access to said data by various functions, methods, and statements. This is fine which restricted to CFML code, but increasingly code these days needs to interchange with other systems, and sometimes it's a bit of a pain converting from a Query object to something more interchangeable like... an array of objects (or structs in CFML). There's no native method to do this, but it's easy enough to do with a reduction:

// fake a DB call
numbers = queryNew("id,en,mi", "integer,varchar,varchar", [
    [1,"one","tahi"],
    [2,"two","rua"],
    [3,"three","toru"],
    [4,"four","wha"]
]);

numbersAsArray = numbers.reduce(function(elements, row){
    return elements.append(row);
}, []);

writeDump({
    numbers=numbers,
    numbersAsArray=numbersAsArray
});

(run this yerself on trycf.com)

Here I'm faking the DB hit, but that can be chained together too:

numbersAsArray = queryExecute("SELECT * FROM numbers")
    .reduce(function(rows, row){
        return rows.append(row);
    }, []);

All good, and nice and compact. Not very inspiring though.

So as a next step I decided to add some more method calls in there. Let's say I only wanted numbers greater than 5, I wanted the keys in the structs to be different from the ones in the DB, and I wanted it sorted in reverse order. Obviously this is all easily done in the DB:

numbers = queryExecute("
    SELECT id AS value, en AS english, mi AS maori
    FROM numbers
    WHERE id > 5
    ORDER BY id DESC
");

But sometimes we have to play the hand we've been dealt, and we cannot change the recordset we're getting back.

Of course we could also do this with a query-of-query too, but CFML's internal SQL implementation offers... "challenges" of its own, so let's forget about that.

Anyway, I ended up with this:

numbersAsArray = queryExecute("SELECT * FROM numbers")
    .filter(function(row){
        return row.id > 5;
    })
    .map(function(row){
        return {value=row.id, english=row.en, maori=row.mi};
    }, queryNew("value,english,maori"))
    .reduce(function(rows=[], row){
        return rows.append(row);
    })
    .sort(function(e1,e2){
        return e2.value - e1.value;
    })
;

That's all straight forward. We retain only the rows we want with filter, we change the column names with map, and again convert the result to be an array of structs with reduce, then finally we re-order them with sort.

That's cool. And the result is...

YES

Huh? I mean literally... the value of numbersAsArray was "YES". Groan. For the uninitiated, the string "YES" is a boolean truthy value in CFML (CFML also has true and false, but it favours "YES" and "NO" for some common-sense-defying reason). And indeed some old-school CFML functions which should have been void functions instead return "YES". But the method versions should not: methods - where sensible - should return a result so they can be chained to the next method call. Like I'm trying to do here: the end of the chain should be the answer.

I could see how this would continue, so I decided to start keeping score of the bugs I found whilst undertaking this exercise.

ColdFusion: 1.

I pared my code back to get rid of any irrelevant bits for the sake of a SSCCE, and ran it on Lucee for comparison. I just got an error, but not one I expected.

I have an Application.cfc set up her to define my datasource, and I had this:

component {
    this.name = getCurrentTemplatePath().hash();
    this.datasource = "scratch_mysql";
}

Now I didn't need the DSN for this SSCCE, but the Application.cfc was still running obviously. And it seems Lucee does not implement the hash method:

Lucee 5.1.3.18 Error (expression)
MessageNo matching Method/Function for String.hash() found

Lucee joins the scoring:

ColdFusion 1 - 1 Lucee

I use the hash function instead of the method to name my application, and at least now Lucee gets to the code I want to run.

numbers = queryNew("id,en,mi", "integer,varchar,varchar", [
    [1,"one","tahi"],
    [2,"two","rua"],
    [3,"three","toru"],
    [4,"four","wha"]
]);

reversed = numbers.sort(function(n1,n2){
    return n2.id - n1.id;
});

writeDump([reversed,numbers]);

And the result:

Array
1
Query
Execution Time: 0 ms
Record Count: 4
Cached: No
Lazy: No 
idenmi
14fourwha
23threetoru
32tworua
41onetahi
2
Query
Execution Time: 0 ms
Record Count: 4
Cached: No
Lazy: No 
idenmi
14fourwha
23threetoru
32tworua
41onetahi

Hang on. Lucee's changed the initial query as well. If it's returning the result, then it should not also be changing the initial value. But I'm gonna say this is due to a sort of sideways compatibility with ColdFusion:

array
1YES
2
query
enidmi
1four4wha
2three3toru
3two2rua
4one1tahi

As it doesn't return the value from the method, it makes sense to act on the initial value itself.

But if Lucee's gonna copy ColdFusion (which it should be) then it should be copying it properly.

ColdFusion 1 - 2 Lucee

To mitigate this, I decide to duplicate the initial query first:

reversed = numbers.duplicate();
reversed.sort(function(n1,n2){
    return n2.id - n1.id;
});

This works fine on ColdFusion:

array
1
query
ENIDMI
1four4wha
2three3toru
3two2rua
4one1tahi
2
query
enidmi
1one1tahi
2two2rua
3three3toru
4four4wha

But breaks on Lucee:

 Error:
No matching Method/Function for Query.duplicate() found on line 9

Hmmm. Well I s'pose the duplicate method doesn't seem to be documented, but it was added in CF2016. This is getting in my way, so I'm still chalking it up to an incompat in Lucee:

ColdFusion 1 - 3 Lucee

(I probably should add a documentation bug with ColdFusion too, but that's a separate matter).

Anyway, that's mostly an aside. In my example what I am sorting is an intermediary value anyhow, so it doesn't matter that it gets sorted as well as being returned. For my purposes I am not using ColdFusion any more, just Lucee, as I'm specifically showing the method chaining thing, and we already know ColdFusion messes this up with how sort works.

So here we go, all done:

numbers = queryExecute("SELECT * FROM numbers")
    .map(function(row){
        return {value=row.id, english=row.en, maori=row.mi};
    }, queryNew("value,english,maori"))
    .filter(function(row){
        return row.value > 5;
    })
    .sort(function(e1,e2){
        return e2.value - e1.value;
    })
    .reduce(function(rows, row){
        return rows.append(row);
    }, [])
;

writeDump(numbers);

And the output:

Array
1
Struct
en
Empty:null
english
stringten
id
Empty:null
maori
stringtekau
mi
Empty:null
value
number10
2
Struct
en
Empty:null
english
stringnine
id
Empty:null
maori
stringiwa
mi
Empty:null
value
number9
3
Struct
en
Empty:null
english
stringeight
id
Empty:null
maori
stringwaru
mi
Empty:null
value
number8
4
Struct
en
Empty:null
english
stringseven
id
Empty:null
maori
stringwhitu
mi
Empty:null
value
number7
5
Struct
en
Empty:null
english
stringsix
id
Empty:null
maori
stringono
mi
Empty:null
value
number6

OK, now WTF is going on? Lucee hasn't remapped the columns properly. it's added the new ones, but it's also included the old ones. It ain't supposed to do that. Contrast ColdFusion & Lucee with some more simple code:

numbers = queryNew("id,en,mi", "integer,varchar,varchar", [
    [1,"one","tahi"],
    [2,"two","rua"],
    [3,"three","toru"],
    [4,"four","wha"]
]);
remapTemplate = queryNew("value,english,maori"); 

reMapped = numbers.map(function(row){
    return {value=row.id, english=row.en, maori=row.mi};
}, remapTemplate);

writeDump(reMapped);

ColdFusion:

query
ENGLISHMAORIVALUE
1onetahi1
2tworua2
3threetoru3
4fourwha4

Lucee:

Query
Execution Time: 0 ms
Record Count: 4
Cached: No
Lazy: No 
valueenglishmaoriidenmi
11onetahi
Empty:null
Empty:null
Empty:null
22tworua
Empty:null
Empty:null
Empty:null
33threetoru
Empty:null
Empty:null
Empty:null
44fourwha
Empty:null
Empty:null
Empty:null

Sigh. What's supposed to be returned by a map operation on a query is a new query with only the columns from that remapTemplate query. That's what it's for.

ColdFusion 1 - 4 Lucee

On a whim I decided to check what Lucee did to the remapTemplate:

Query
Execution Time: 0 ms
Record Count: 4
Cached: No
Lazy: No 
valueenglishmaoriidenmi
11onetahi
Empty:null
Empty:null
Empty:null
22tworua
Empty:null
Empty:null
Empty:null
33threetoru
Empty:null
Empty:null
Empty:null
44fourwha
Empty:null
Empty:null
Empty:null

ColdFusion 1 - 5 Lucee

This situation is slightly contrived as I don't care about that query anyhow. But what if I was using an extant query which had important data in it?

remapTemplate = queryNew("value,english,maori", "integer,varchar,varchar", [
    [5, "five", "rima"]
]);

So here I have a query with some data in it, and for whatever reason I want to remap the numbers query to have the same columns as this one. But obviously I don't want it otherwise messed with. Lucee mungs it though:

Query
Execution Time: 0 ms
Record Count: 5
Cached: No
Lazy: No 
valueenglishmaoriidenmi
15fiverima
21onetahi
Empty:null
Empty:null
Empty:null
32tworua
Empty:null
Empty:null
Empty:null
43threetoru
Empty:null
Empty:null
Empty:null
54fourwha
Empty:null
Empty:null
Empty:null

Not cool.

But wait. We're not done yet. Let's go back to some of my original code I was only running on ColdFusion:

numbersAsArray = queryExecute("SELECT id,mi FROM numbers LIMIT 4")
    .reduce(function(rows=[], row){
        return rows.append(row);
    })
;
writeDump(numbersAsArray);

This was part of the first example, I've just ditched the filter, map and sort: focusing on the reduce. On ColdFusion I get what I'd expect:

array
1
struct
ID1
MItahi
2
struct
ID2
MIrua
3
struct
ID3
MItoru
4
struct
ID4
MIwha

On Lucee I get this:

Lucee 5.1.3.18 Error (expression)
Messagecan't call method [append] on object, object is null
StacktraceThe Error Occurred in
queryReduceSimple.cfm: line 4 
2: numbersAsArray = queryExecute("SELECT id,mi FROM numbers LIMIT 4")
3: .reduce(function(rows=[], row){
4: return rows.append(row);
5: })
6: ;

Hmmm. What's wrong now? Oh. It's this:

reduce(function(rows=[], row)

Notice how I am giving a default value to the first argument there. This doesn't work in Lucee.

ColdFusion 1 - 6 Lucee

This is easy to work around, because reduce functions take an optional last argument which is the initial value for that first argument to the callback, so I can just re-adjust the code like this:

.reduce(function(rows , row){
    return rows.append(row);
}, [])

OK, at this point I give up. Neither implementation of CFML here - either ColdFusion's or Lucee's - is good enough to do what I want to do. Oddly: Lucee is far worse on this occasion than ColdFusion is. That's disappointing.

So currently the score is 1-6 to Lucee. How did I get to 4-13?

I decided to write some test cases with TestBox to demonstrate what ought to be happening. And with the case of duplicate, I tested all native data-types I can think of:

  • struct
  • array
  • query
  • string
  • double (I guess "numeric" in CFML)
  • datetime
  • boolean
  • XML

Lucee failed the whole lot, and ColdFusion failed on numerics and booleans. As this is undocumented behaviour this might seem a bit harsh, but I'm not counting documentation errors against ColdFusion in this case. Also there's no way I'd actually expect numerics and booleans to have a duplicate method... except for the fact that strings do. Now this isn't a method bubbling through from java.lang.String, nor is it some Java method of ColdFusion's string implementation (they're just java.lang.Strings). This is an actively-created CFML member function. So it seems to me that - I guess for the sake of completeness - they implemented for "every" data type... I mean it doesn't make a great deal of sense on a datetime either, really, does it? So the omission of it from numerics and booleans is a bug to me.

This leaves the score:

ColdFusion 3 - 13 Lucee

The last ColdFusion point was cos despite the fact that with the sort operation it makes sense to alter the initial object if the method doesn't return the sorted one... it just doesn't make sense that the sort method has been implemented that way. It should leave the original object alone and return a new sorted object.

ColdFusion 4 - 13 Lucee

My test cases are too long to reproduce here, but you can see 'em on Github: Tests.cfc.

Right so...

Ah FFS.

... I was about to say "right, so that's that: not a great experience coming up with something cool to show to the CFMLers about CFML. Cos shit just didn't work. I found 17 bugs instead".

But I just had a thought about how sort methods work in ColdFusion, trying to find examples of where sort methods return the sorted object, rather than doing an inline support. And I I've found more bugs with both ColdFusion and Lucee.

Here are the cases:

component extends="testbox.system.BaseSpec" {
    function run() {
        describe("Other sort tests", function(){
            it("is a baseline showing using BIFs as a callback", function(){
                var testString = "AbCd";
                var applyTo = function(object, operation){
                    return operation(object);
                };

                var result = applyTo(testString, ucase);
                
                expect(result).toBeWithCase("ABCD");
            });
            describe("using arrays", function(){
                it("can use a function expression calling compareNoCase as a string comparator when sorting", function(){
                    var arrayToSort = ["d","C","b","A"];
                    
                    arrayToSort.sort(function(e1,e2){
                        return compareNoCase(e1, e2);
                    });
                    
                    expect(arrayToSort).toBe(["A","b","C","d"]);
                });
                it("can use the compareNoCase BIF as a string comparator when sorting", function(){
                    var arrayToSort = ["d","C","b","A"];
                    
                    arrayToSort.sort(compareNoCase);
                    
                    expect(arrayToSort).toBe(["A","b","C","d"]);
                });
            });
            describe("using lists", function(){
                it("can use a function expression calling compareNoCase as a string comparator when sorting", function(){
                    var listToSort = "d,C,b,A";
                    
                    var sortedList = listToSort.listSort(function(e1,e2){
                        return compareNoCase(e1, e2);
                    });
                    
                    expect(sortedList).toBe("A,b,C,d");
                    expect(listToSort).toBe("d,C,b,A");
                });
                it("can use the compareNoCase BIF as a string comparator when sorting", function(){
                    var listToSort = "d,C,b,A";
                    
                    var sortedList = listToSort.listSort(compareNoCase);
                    
                    expect(sortedList).toBe("A,b,C,d");
                    expect(listToSort).toBe("d,C,b,A");
                });
            });
        });
    }
}

What I'm doing here is using CFML built-in functions as the callbacks for a sort operation. This should work, because the sort operation needs a comparator function which works exactly like compare / compareNoCase: returns <0, 0, >0 depending on whether the first argument is "less than", "equal to" or "greater than" the second object according to the sort rules. As far as strings go, the built-in functions compare and compareNoCase do this. So they should be usable as callbacks. Since I think CF2016 built-in-functions have been first-class functions, so should be usable wherever something expects a function as an argument.

The first test demonstrates this in action. I have a very contrived situation where I have a function applyTo, which takes an object and a function to apply to it. In the test I pass-in the built-in function ucase as the operation. This test passes fine on ColdFusion; fails on Lucee.

ColdFusion 4 - 14 Lucee

So after I've demonstrated the technique should work, I try to use compareNoCase as the comparator for an array sort. it just doesn't work: it does nothing on ColdFusion, and on Lucee it still just errors (not gonna count that against Lucee, as it's the same bug as in the baseline test).

ColdFusion 5 - 14 Lucee

Next I try to use it on a listSort. This time ColdFusion errors as well. So this is a different bug than the doesn't-do-anything one for arrays.

ColdFusion 6 - 14 Lucee

Here are the results for just this latter tranche of test cases:

ColdFusion:



Lucee:



Fuck me, I've giving up.

This has been the most shit experience I've ever had trying to get CFML to do something. I don't think any of this code is edge-case stuff. Those higher-order functions are perhaps not as commonly used as they ought to be by the CFMLers out there, but I'm just... trying to use them.

So... sorry Brad & Ryan... I tried to come up with something worth showing to the mob that's useful in CFML, but I've failed. And my gut reaction to this exercise is that CFML can go fuck itself, basically.

Righto.

--
Adam