
Wednesday, 5 October 2022

Data-driven tests in JUnit and Kotest (and starting with TestBox & PHPUnit)

G'day:

One thing I did not look at in any of my examinations of Kotest, and then JUnit 5, was how to do data-driven tests on each platform. I'm going to start with how I'd've historically approached this task in a coupla frameworks I've used in the past.

TestBox

This is so easy to do in CFML I have not bothered to find out if TestBox has a native / idiomatic way of doing this.

describe("some tests", () => {

    numbers = {
        "one" = "tahi",
        "two" = "rua",
        "three" = "toru",
        "four" = "wha"
    }

    testCases = [
        {input="one", expected="tahi"},
        {input="two", expected="rua"},
        {input="three", expected="toru"},
        {input="four", expected="wha"}
    ]

    testCases.each((testCase) => {
        it("should return #testCase.expected# when passed #testCase.input#", () => {
            expect(numbers[testCase.input]).toBe(testCase.expected)
        })
    })
})

I loop over an array of cases, calling it with each variant.


PHPUnit

PHPUnit has a slightly clunkier approach, but gets there:

class DataProviderTest extends TestCase
{
    private $numbers;

    public function setUp() : void
    {
        $this->numbers = [
            "one" => "tahi",
            "two" => "rua",
            "three" => "toru",
            "four" => "wha"
        ];
    }

    /** @dataProvider provideCasesForNumberMapperTests */
    public function testNumberMapper($input, $expected)
    {
        $this->assertEquals($expected, $this->numbers[$input]);
    }

    public function provideCasesForNumberMapperTests()
    {
        return [
            ["input" => "one", "expected" => "tahi"],
            ["input" => "two", "expected" => "rua"],
            ["input" => "three", "expected" => "toru"],
            ["input" => "four", "expected" => "wha"]
        ];
    }
}

Same principle, except the iteration over the test cases specified in the data provider is handled internally by PHPUnit.

As an aside, I am pretty pleased with a small addition to the test output that PHPUnit has at the moment:

adam@DESKTOP-QV1A45U:/mnt/c/temp/phpunit_test$ vendor/bin/phpunit
PHPUnit 9.5.25 #StandWithUkraine

....                                                                4 / 4 (100%)

Time: 00:00.100, Memory: 6.00 MB

OK (4 tests, 4 assertions)

Kotest (Data Driven Testing)

Kotest is better than PHPUnit here, but isn't as straightforward as TestBox:

class DataDrivenTest : DescribeSpec({
    describe("Data-driven tests") {
        val numbers = mapOf(
            Pair("one", "tahi"),
            Pair("two", "rua"),
            Pair("three", "toru"),
            Pair("four", "wha")
        )

        data class TestCase(val input: String, val expected: String)
        withData(
            TestCase("one", "tahi"),
            TestCase("two", "rua"),
            TestCase("three", "toru"),
            TestCase("four", "wha")
        ) { (input, expected) -> numbers[input] shouldBe expected }
    }
})

It's pretty compact though. Here we need to add that data class (I have not looked at the difference between a "data class" and a "class that just has properties" yet: I had better). The iteration over the test data is intrinsic to the withData function, which takes a lambda that receives each test case (destructured here into its separate values), and is the actual test.

When these are run, they show as individual cases in the runner output (ie: within IntelliJ):

And in the HTML test report:

That's pretty clear.


JUnit (JUnit 5 User Guide › 2.18. Dynamic Tests)

This is pretty easy too (I was expecting some clunky Java-esque monster here, but no):

class DataDrivenTest {

    private val numbers = mapOf(
        Pair("one", "tahi"),
        Pair("two", "rua"),
        Pair("three", "toru"),
        Pair("four", "wha")
    )

    @TestFactory
    fun `Data-driven tests`() = listOf(
        "one" to "tahi",
        "two" to "rua",
        "three" to "toru",
        "four" to "wha"
    ).map { (input, expected) ->
        DynamicTest.dynamicTest("numbers[$input] should be $expected") {
            numbers[input] shouldBe expected
        }
    }
}

This is pretty similar to TestBox really. One needs that @TestFactory annotation to identify the function as - pretty much - a data provider; then one maps the data to dynamicTest calls, which take a label and the lambda for the test (both of which have the data availed to them).

The test output is a bit clearer in this case, as we get to specify the specific test case label.

In IntelliJ:

And HTML test report:


All in all I'm pretty happy with both approaches here - Kotest's and JUnit's. I have to say I think I prefer the JUnit approach in this case. There's not much in it, that said.

The code from this article is at /src/test/kotlin/kotest/system/kotest/DataDrivenTest.kt and /src/test/kotlin/junit/system/junit/DataDrivenTest.kt. I have to concede I did not bother to save the CFML or PHP code. Ooops.

Righto.

--
Adam

Thursday, 2 December 2021

A question about DAO testing, abstraction, and mocking

G'day:

This is another "low-hanging fruit" kinda exercise (see also previous article: "A question about the overhead of OOP in CFML"). I'm extracting this one from the conversation in the CFML Slack channel so that it doesn't get lost. I have a growing concern about how Slack is a bit of a content-black-hole, and am cooling to the notion of helping people there. Especially when my answer is a fairly generic one. The question is a CFML-based one; the answer is just a general testing-strategy one. I say this so people are less likely to stop reading because it's about CFML ;-) Anyhoo: time for some copy and paste.

The question is a simple one:

Just dropping it in here. Unit testing DAO object using qb. How would you guys go about it or why not? Go!
Kevin L. @ CFML Slack Channel

I say "simple". I mean short and on-point.

Here's the rest of the thread. I've tidied up the English in some places, but have not changed any detail of what was said. I will prefix this saying I don't know "qb" from a bar of soap; it's some sort of DBAL. For the purposes of this, how it works doesn't really matter.


The participants here are myself, Kevin and Samuel Knowlton.

It depends on how you design this part of your application.

For me the interface between the application and the storage tier is via a Repository class, which talks to the higher part of the domain in objects and collections thereof: no trace of any persistence DSL at all (no SQLish language, for example).

For the purposes of testing, I abstract the actual storage-tier communications: in CFML this would be queryExecute calls, or calls to an external library like qb. I fashion these as DAOs if they contain actual SQL statements, or as Adapters if it's something like qb.

So the repo has all the testable logic in it, the DAO or Adapter simply takes values, calls the persistence implementation, and returns whatever the persistence call returns. The repo converts that lot to/from domain-ready stuff for the application to use.

I'll just add an example of this for the blog version of this conversation:

function selectFooById(id) {
    return queryExecute("
        SELECT col1, col2, col3, etc
        FROM some_table
        WHERE id = :id
    ", {id=id})
}

It's important to note that this method does nothing other than abstract the queryExecute call away from any application logic in the method that calls this. No manipulation of inputs; no remapping of outputs. Both of those are the job of the repository method that calls this DAO method. This DAO method only exists to remove the external call from the repo code. That's it.

To that end, there is no unit-testable logic in the DAO. But I will do an integration test on it. For a CFML example: if I call a method on the DAO, then I do get back the kind of query I'd expect (correct columns).

I'd unit test the repo logic along the lines of the mapping to/from the domain/storage is correct, and any transformation logic is correct (converting datetime objects to just date-only strings and the like). I would test that if a DAO call returned a three-row query, then the repo returns a three-object collection, with the expected values.
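Continuing the example for the blog version: the repo method being described might look something like this (Foo and its properties are hypothetical here):

function getFoos() {
    records = dao.selectFoos() // hypothetical sibling of the selectFooById example above

    return records.reduce((foos, record) => {
        return foos.append(new Foo(
            id = record.col1,
            name = record.col2,
            asAt = dateTimeFormat(record.col3, "yyyy-mm-dd") // transformation logic lives here in the repo, not in the DAO
        ))
    }, [])
}

The three-row / three-object test would then mock selectFoos to return a known three-row query, and compare the result to the three Foo objects those rows represent.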

NB: Repos don't have business logic; they only have mapping / transformation logic.

An initial task here might be to look at the DAO code and, even if it currently rolls-in transformation/mapping logic and persistence-tier calls, make sure that at the very least it doesn't do both in the same method.

One thing to be careful to not do is to re-test qb. Assume it's doing its job correctly, and just test what your code does with the data qb gives you.

In that case my current tests are not correct at all… They felt wrong from the start, but I was struggling with how to test the DAO layer (which is abstracted through an interface). My DAO layer at this point does have a hard reference to / dependency on qb, which in that case violates some OO principles… This is a brainbreaker

Ah yes, everything I mentioned above is predicated on using DI and an IoC container, which should really be the baseline for any application designed with testing in mind (which should be every application).

(NB: DI is the baseline; using an IoC container to manage the DI is a very-nice-to-have, but is not essential).

You can "fake it til you make it" with the DI, by extracting hard references to qb object creation into methods that solely do that. You can then mock-out those methods in your tests so you can return a testable object from it.

So instead of this:

// MyRepository
component {

    function getObjectCollection() {
        qb = new QB()
        
        raw = qb.getStuff()
        
        collection = raw.map(() => {
            // convert to domain model objects. this is what you want to test
        })
        
        return collection
    }
}

One has this:

// MyRepository
component {

    function getObjectCollection() {
        qb = getQb()
        
        raw = qb.getStuff()
        
        collection = raw.map(() => {
            // convert to domain model objects. this is what you want to test
        })
        
        return collection
    }
    
    private function getQb() {
        return new QB()
    }
}

And can test like this:

it("maps the data correctly", () => {
    sut = new MyRepository()
    mockbox.prepareMock(sut) // sut is still a MyRepository, but can selectively mock methods
    
    rawData = [/* known values that exercise the logic in the mapper */]
    sut.$("getQb").$results(rawData)
    
    collection = sut.getObjectCollection()
    
    expectedCollection = [/* expected values based on known values */]
    
    expect(collection).toBe(expectedCollection)
})

You could also extract the mapping logic from the fetching logic, and test the mapping logic directly. That's getting a bit bitty for my liking though. However do what you need to do to be able to separate the logic into something testable, away from code that isn't testable. It's always doable with a bit of refactoring, even if the refactoring isn't perfect.

[…] I already have a getQueryBuilder method in the DAO object. So that's a good starting point. How would you test update / create / deletion?

Two sets of tests:

Integration tests

Test that your app integrates with QB and the DB. Call the actual live method, hit the DB, and test the results.

So…

  • create a record using your app. Then use queryExecute to get the values back out, and test it's what you expect to have been saved (there's a sketch of this after this list).
  • Insert a row into the DB (either via your create mechanism, or via queryExecute). Update it via your app. Use queryExecute to get the values back out, and test it's what you expect to have been updated.
  • Insert a row into the DB (either via your create mechanism, or via queryExecute). Delete it via your app. Use queryExecute to try to get the values back out, and make sure they ain't there.
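
To illustrate that first one for the blog version of this conversation, here's a minimal sketch (UserRepository, createUser, and the users table are all hypothetical):

it("saves a new record to the db", () => {
    sut = new UserRepository()

    id = sut.createUser(name="Zachary", email="zachary@example.com")

    saved = queryExecute(
        "SELECT name, email FROM users WHERE id = :id",
        {id = id}
    )

    expect(saved.recordCount).toBe(1)
    expect(saved.name[1]).toBe("Zachary")
    expect(saved.email[1]).toBe("zachary@example.com")
})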

Unit tests

  • When you call your app's create method with [given inputs] then those inputs (or equivalents, if you need to transform them) are the ones passed to the expected method of the (mocked) object returned from your mocked getQueryBuilder call.
  • Same with update.
  • Same with delete.
it("deletes the data from storage", () => {
    testId = 17
    
    sut = new MyRepository()
    mockbox.prepareMock(sut)
    
    mockedQb = createMock("QB")
    expectedReturnFromDelete = "I dunno what QB returns; but use some predictable value"
    mockedQb.$("delete").$args(testId).$results(expectedReturnFromDelete)
    
    sut.$("getQueryBuilder").$results(mockedQb)
    
    result = sut.delete(testId)
    
    expect(result).toBe(expectedReturnFromDelete) // it won't have been returned unless testId was passed to QB.delete
})

That's a very simplistic test, obvs.

Oh: run all the integration tests in a transaction that you rollback once done.
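
In TestBox, one way of doing that is with an aroundEach decorator; something like this sketch (exactly where the transaction demarcation lives will depend on your setup):

aroundEach((spec) => {
    transaction {
        try {
            spec.body() // run the actual test
        } finally {
            transaction action="rollback"; // nothing the test wrote sticks around
        }
    }
})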

Here the unit tests don't hit the actual storage, so are easier to write, maintain, and quick to run in your red/green/refactor cycle. You run these all the time when doing development work. The integration tests are fiddly and slow, so you only run those when you make changes to that tier of the app (DB schema changes that impact the app, generally; and then when putting in a pull request, merging it, and doing a build). It is vital the code being integration tested does not have any logic in it. Because then you need to unit test it, which defeats the purpose of separating-out yer concerns.

[…]To add our two cents: we use QB and Quick all the time in integration tests to make sure that objects are created, updated, or deleted the way we want. Sometimes we will use cfmigrations to pre-populate things so we are not doing a whole create/update/destroy cycle on each test run (which can be awesome, but also slow)

[Agreed]

A lot of information to take in. I'll take my time to read this through. Thank you @Adam Cameron and @sknowlton for taking your time to answer my questions!

NP. And ping away with any questions / need for clarification.


[a few days pass]

[…]Thanks to your input I was able to refactor the "dao" layer through test-driven design!

I do have a general testing question though. Let's assume I have method foo in a service with parameters x, y, z with the following pseudocode:

function foo( x, y, z ) {
  /* does something */
  var args = ...
  
  /* validate, throw, etc... */
 
  return bar( 
    argumentCollection = args 
  );
}

Should the unit test mock out the bar function and assert the args parameter(s) and / or throw scenarios, or actually assert the return value of the foo function? Or is the latter case more of an integration test rather than a unit test?

Sorry if this is a noob question, been trying to wrap my head around this…

A handy thing to do here is to frame the tests as testing the requirement, not testing the code.

Based on your sample pseudo code, then you have a requirement which is "it does the foothing so that barthing can happen". I have purposely not used the method names there directly; those are implementation detail, but should describe the action taking place. It could be "it converts the scores into roman numerals": in your example "it converts the scores" is the call to foo, and "into roman numerals" is the call to bar.

If you frame it like that, your tests will be testing all the variations of foothing and barthing - one variation per test, eg:

  • x, y, z are valid and you get the expected result (happy path)
  • x is bung, expected exception is thrown
  • y is bung [etc]
  • z is bung [etc]
  • say there's a conditional within bar… one test for each true / false part of that. The true variant might be part of that initial happy path test though.

This is how I read the pseudo code, given that bar looks like another method in the same class - probably a private method that is part of some refactoring to keep foo streamlined.

If your pseudocode was return dao.bar(argumentCollection = args), then I'd say that the DAO is a boundary of your application (eg, it is an adapter to a queryExecute call), and in this case you would mock-out dao.bar, and use spying to check that it received the correct arguments, based on the logic of foo. For example foo might multiply x by two… make sure bar receives 2x, etc.
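
Again, adding an example for the blog version of this. The spying might look something like the following with MockBox (SomeService and its doubling rule are hypothetical):

it("doubles x before passing it to the dao", () => {
    dao = mockbox.createStub()
    dao.$("bar").$results(queryNew(""))

    sut = new SomeService(dao)

    sut.foo(x=2, y="y", z="z")

    callLog = dao.$callLog().bar
    expect(callLog).toHaveLength(1)
    expect(callLog[1].x).toBe(4) // foo's logic doubled the x it was given
})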

Does that cover it / make sense? Am off to a(~nother) meeting now, so had to be quick there. I'll check back in later on.

It covers my question, Adam, thanks!

One thing I didn't have time to say... I used a DAO in that example specifically to keep my answer short. A DAO is by definition a boundary.

If it was something else like a RomanNumeralWriterService and it was part of your own application, just not internal to the class the method you're testing is in... it's not so clear cut as to whether to mock or use a live object.

If RomanNumeralWriterService is easy to create (say its constructor takes no, or only very simple, arguments), and doesn't itself reach outside your application… then there's no reason to mock it: let it do its job. If however it's got a bunch of its own dependencies and config and will take a chunk of your test just to create it, or if it internally does its own logging or DB writing or reaching out to another service, or it's sloooow, then you mock it out. You don't want your tests messing around too much with dependencies that are not part of what yer testing, but it's not black or white whether to mock; it's a judgement call.

Yeah I was struggling with the thing you mention in your last paragraph. Service A has Service B as a dependency which itself has a lot of other dependencies (logging, dao, …). The setup for the test takes ages while the test method is 1 line of code :/.

Yeah, mock it out.

Make sure whatever you mock-out is well-test-covered as well in its own right, but it's not the job of these tests to faff around with that stuff.


That was it. I think that was a reasonably handy discussion, so was worth preserving here. YMMV of course: lemme know.

Righto.

--
Adam


Thursday, 21 October 2021

Unit testing back-fill question: how thorough to be?

G'day:

I'm extracting this from a thread I started on the CFML Slack channel (it's not a CFML-specific question, before you browse away from here ;-), because it's a chunk of text/content that possibly warrants preservation beyond Slack's content lifetime; plus I think the audience here (such as it is, these days) is probably more experienced than the bods on the Slack channel.

I've found myself questioning my approach to a test case I'm having to backfill in our application. I'm keen to hear what people think.

I am testing a method like this:

function testMe() {
    // irrelevant lines of code here
    
    if (someService.isItSafe()) {
        someOtherService.doThingWeDontWantToHappen()

        things = yetAnotherService.getThings()
        for (stuff in things) {
            stuff.doWhenSafe()
            if (someCondition) {
                return false
            }
        }
        moarService.doThisOnlyWhenSafe()
        
        return true
    }
    
    return false
}

That implementation is fixed for now, so I can't refactor it to make it more testable just yet. I need to play the specific hand I've been dealt.

The case I am currently implementing is for when it's not safe (someService.isItSafe() returns false to indicate this). When it's not safe, then a subprocess doesn't run (all the stuff in the if block), and the method returns false.

I've thrown-together this sort of test:

it("doesn't do [a description of the overall subprocess] if it's not safe to do so", () => {
    someService = mockbox.createMock("someService")
    someService.$("isItSafe").$results(false)

    someOtherService = mockbox.createMock("SomeOtherService")
    someOtherService.$(method="doThingWeDontWantToHappen", callback=() => fail("this should not be called if it's not safe"))
    
    yetAnotherService = mockbox.createMock("YetAnotherService")
    moarService = mockbox.createMock("MoarService")
    
    serviceToTest = new ServiceToTest(someService, someOtherService, yetAnotherService, moarService)
    
    result = serviceToTest.testMe()
    
    expect(result).toBeFalse()
})

In summary, I'm forcing isItSafe to return false, and I'm mocking the first part of the subprocess to fail if it's called. And obviously I'm also testing the return value.

That works. Exactly as written, the test passes. If I tweak it so that isItSafe returns true, then the test fails with "this should not be called if it's not safe".

However I'm left wondering if I should also mock yetAnotherService.getThings, Stuff.doWhenSafe and moarService.doThisOnlyWhenSafe to also fail if they are called. They are part of the sub-process that must not run as well (well I guess it's safe for getThings to run, but not the other two).
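
(For the blog version: the belt-and-braces variant would just be more of the same sort of mocking as in the test above, eg:)

yetAnotherService.$("getThings").$results([])

moarService.$(
    method = "doThisOnlyWhenSafe",
    callback = () => fail("this should not be called if it's not safe")
)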

In a better world the subprocess code could all actually be in a method called doSubprocess, rather than inline in testMe and the answer would be to simply make sure that wasn't called. We will get there, but for now we need the testing in before we do the refactor to do so.

So the things I'm wondering about are:

  • Is it "OK" to only fail on the first part of the subprocess, or should I belt and braces the rest of it too?
  • Am I being too completist to think to fail each step, especially given they will be unreachable after the first fail?
  • I do not know the timeframe for the refactor. Ideally it's "soon", but it's not scheduled. We have just identified that this code is too critical to not have tests, so I'm backfilling the test cases. If it was "soon", I'd keep the test brief (as it is now); but as it might not be "soon" I wonder if I should go the extra mile.
  • I guess once this test is in place, if we do any more maintenance on testMe, we will look at the tests first (TDD...), and adjust them accordingly if merely checking doThingWeDontWantToHappen doesn't get called is no longer sufficient to test this case. But that's in an ideal world that we do not reside in yet, so I can't even be certain of that.

What do you think?

Righto.

--
Adam

Sunday, 5 September 2021

Testing: reasoning over "testing implementation detail" vs "testing features"

G'day:

I can't see how this article will be a very long one, as it only really dwells on one concept I had to reason through before I decided I wasn't talking horseshit.

I'm helping some community members with improving their understanding of testing, and I was reviewing some code that was in need of some test coverage. The code below is not the actual code - for obvious "plausible deniability" reasons - but it's pretty much the same situation and logic flow we were looking at, just with different business entities, and the basis for the rule has been "uncomplicated" for the sake of this article:

function validateEmail() {
    if ((isInstanceOf(this.orderItems, "ServicesCollection") || isInstanceOf(this.orderItems, "ProductsCollection")) && !isValid("email", this?.email) ) {
        var hasItemWithCost = this.orderItems.some((item) => item.cost > 0)
        if (hasItemWithCost) { // (*)
            this.addError("Valid email is required")
        }
    }
}

// (*) see AdamT's comment below. I was erroneously calling .len() on hasItemWithCost due to me not having an operational brain. Fixed.

This function is called as part of an object's validation before it's processed. It's also slightly contrived, and the only real bit is that it's a validation function. For this example the business rule is that if any items in the order have a cost associated with them, then the order also needs an email address. Say the receipt needs to be emailed or something. The real world code was different, but this is a fairly common-knowledge sort of parallel to the situation.

All the rest of the code contributing to the data being validated already existed; we've just got this new validation rule around the email not always being optional now.

We're not doing TDD here (yet!), so it's a matter of backfilling tests. The exercise was to identify what test need to be written.

We picked two obvious ones:

  • make an orderItems collection with no costs in it, and the email shouldn't be checked;
  • make an orderItems collection with at least one item having a cost in it, and the email should be present and valid;

But then I figured that we've skipped a big chunk of the conditional logic: we're ignoring the check of whether the collection is one of a ServicesCollection or a ProductsCollection. NB: unfortunately the order can comprise collections of two disparate types, and there's no common interface we can check against instead. There's possibly some tech debt there.

I'm thinking we have three more cases here:

  • the collection is a ServicesCollection;
  • the collection is a ProductsCollection;
  • the collection is neither of those, somehow. This seems unlikely, but it could conceivably happen.

Then I stopped and thought about whether by dissecting that if statement so closely, I am just chasing 100% line-of-code coverage. And worse: aren't the specifics of that part of the statement just implementation detail, rather than part of the actual requirement? The feature was "update the order system so that orders requiring a financial transaction require a valid email address". It doesn't say anything about what sort of collections we use to store the order.

Initially I concluded it was just implementation detail and there was no (test) case to answer to here. But it stuck in my craw that we were not testing that part of the conditional logic. I chalked that up to me being pedantic about making sure every line of code has coverage. Remember I generally use TDD when writing my code, so it's just weird to me to have conditional code with no specific tests, because the condition would only ever be written because a feature called for it.

But it kept gnawing at me.

Then the penny dropped (this was a few hours later, I have to admit).

It's not just testing implementation detail. The issue / confusion has arisen because we're not doing TDD.

If I rewind to when we were writing the function, and take a TDD approach, I'd be going "right, what's the first test case here?". The first test case - the reason why we want to type in isInstanceOf(this.orderItems, "ServicesCollection") - is because part of the stated requirement is implicit. When the client said "if any items in the order have a cost associated with them…" they were not explicit about "this applies to both orders for services as well as orders for products", because in the business's vernacular the distinction between types of orders isn't one that's generally considered. For a lot of requirements there's just the concept of "orders" (this is why there should be an interface for orders!). However the fully-stated requirement could more accurately be stated as "if any items in an order for services have a cost associated with them…", followed by "likewise, if any items in an order for products have a cost associated with them…". It's this business information that informs our test cases, which would be better enumerated as:

  • it requires an email address if any items in a services order have an associated cost;
  • it requires an email address if any items in a products order have an associated cost;
  • it does not require an email address if the order is empty;

I had to chase up what the story was if the collection was neither of those types, and it turns out there wasn't a third type that was immune to the email address requirement, it's just - as per the nice clear test case! - the order might be empty, but the object as a whole still needs validation.

There is still "implementation detail" here - the code checking the collection type is obviously the implementation of that part of the business requirement: all code is implementation detail after all. The muddiness here arose because we were backfilling tests, rather than using TDD. Had we used TDD I'd not have been questioning myself when I was deciding on my test cases. The test cases are driven by business requirements; the implementation is driven by the test cases. And, indeed, the test implementation has to busy itself with implementation detail too, because to implement those tests we need to pass-in collections of the type the function is checking for.


As an aside, I was not happy with how busy that if condition was. It seemed to be doing too much to me, and mixing a coupla different things: type checks and a validity check. I re-reasoned the situation and concluded that the validation function has absolutely nothing to do if the email is there and is valid: this passes all cases right from the outset. So I will be recommending we separate that out into its own initial guard clause:

function validateEmail() {
    if (isValid("email", this?.email)) {
        return
    }
    if ((isInstanceOf(this.orderItems, "ServicesCollection") || isInstanceOf(this.orderItems, "ProductsCollection"))) {
        var hasItemsWithCost = this.orderItems.some((item) => item.cost > 0)
        if (hasItemsWithCost) {
            this.addError("Valid email is required")
        }
    }
}

I think this refactor makes things a bit easier to reason through what's going on. And note that this is a refactor, so we'll do it after we've got the tests in place and green.

I now wonder if this is something I should be looking out for more often? Is a compound if condition that is evaluating "asymmetrical" sub-conditions indicative of a code smell? Hrm… will need to keep an eye on this notion.

Now if only I could get that OrderCollection interface created, the code would be tidier-still, and closer to the business requirements.

Righto.

--
Adam

 

PS: I actually doubted myself again whilst writing this, but by the time I had written it all out, I'm happy that the line of thinking and the test cases are legit.

Tuesday, 24 August 2021

Test coverage: it's not about lines of code

G'day

A coupla days ago someone shared this Twitter status with me:

I'll just repeat the text here for google & copy-n-paste-ability:

My personal opinion: Having 100% test coverage doesn't make your code more effective. Tests should add value but at some point being overly redundant and testing absolutely every line of code is ineffective. Ex) Testing that a component renders doesn't IN MY OPINION add value.

Emma's position here is spot on, IMO. There were also a lot of "interesting" replies to it. Quelle surprise. Go read some of them, but then give up when they get too repetitive / unhelpful.

In a similar - but slightly contrary - context, I was reminded of my erstwhile article on this topic: "Yeah, you do want 100% test coverage".

But hang on: on one hand I say Emma is spot on. On the other hand I cross-post an old article that seems to contradict this. What gives?

The point of differentiation is "testing absolutely every line of code" vs "100% test coverage". Emma is talking about lines of code. I'm talking about test cases.

Don't worry about testing lines of code. That's intrinsically testing "implementation". However do care about your test cases: the variations that define the feature you are working on. You've been asked to develop a feature and its variations, and you don't know you've delivered the feature (or, in the case of automated testing: that it continues to be delivered in a working state) unless you test it.

Now, yeah, sure: features are made of lines of code, but don't worry about those. A trite example I posited to a mate yesterday is this: consider this to be the implementation of your feature:

doMyFeature() {
    // line 1
    // line 2
    // line 3
    // line 4
    // line 5
    // line 6
    // line 7
    // line 8
    // line 9
    // line 10
}

I challenge you to write a test for "My Feature", and not by implication happen to test all those lines of code. But it's the feature we care about, not the lines of code.

On the other hand, let's consider a variant:

doMyFeature() {
    // line 1
    // line 2
    // line 3
    // line 4
    // line 5
    if (someVariantOfMyFeature) {
        // line 6
    }
    // line 7
    // line 8
    // line 9
    // line 10
}

If you run your test coverage analysis and see that line 6 ain't tested, this is not a case of wringing one's hands about line 6 of the code not being tested; it's that yer not testing the variation of the feature. It's the feature coverage that's missing. Not the lines-of-code coverage.

Intrinsically your code coverage analysis tooling probably marks individual lines of code as green or red or whatever, but only if you look that closely. If you zoom out a bit, you'll see that the method the lines of code are in is either green or orange or red; and out further the class is likewise green / orange / red, and probably says something like "76% coverage". The tooling necessarily needs to work in lines-of-code because it's a dumb machine, and those are the units it can work on. You, the programmer, don't need to focus on the lines-of-code. You have a brain, and what the report is saying is "you've not tested part of your feature", and it's saying "the bit of the feature that is represented by these lines of code". That bit. You're not testing it. Go test yer feature properly.

There's parts of the implementation of a feature I won't test maybe. Your DAOs and your other adapters for external services? Maybe not something I'd test during the TDD cycle of things, but it might still be prudent to chuck in an integration test for those. I mean they do need to work and continue to work. But I see this as possibly outside of feature testing. Maybe? Is it?

Also - now I won't repeat too much of that other article - there's a difference between "coverage" and "actually tested". Responsible devs can mark some code as "won't test" (eg in PHPUnit with a @codeCoverageIgnore), and if the reasons are legit then they're "covered" there.

Why I'm writing this fairly short article when it's kinda treading on ground I've already covered is because of some of the comments I saw in reply to Emma. Some devs are a bit shit, and a bit lazy, and they will see what Emma has written, and their takeaway will be "basically I don't need to test x because x is some lines of code, and we should not be focusing on lines of code for testing". That sounds daft, but I know people that have rationalised their way out of testing perfectly testable (and test-necessary) code due to "oh we should not test implementation details", or "you're just focusing on lines of code, that's wrong", and that sort of shit. Sorry pal, but a feature necessarily has an implementation, and that's made out of lines of code, so you will need to address that implementation in yer testing, which will innately report that those lines of code are "covered".

Also this notion of "100% coverage is too high, so maybe 80% is fine" (possibly using the Pareto Principle, poss just pulling a number out of their arses). Serious question: which 80%? If you set that goal, then a lot of devs will go "aah, it's in that 20% we don't test". Or they'll test the easy 80%, and leave the hard 20% - the bit that needs testing - to be the 20% they don't test (as I said in the other article: cos they basically can't be arsed).

Sorry to be a bit of a downer when it comes to devs' level of responsibility / professionalism / what-have-you. There are stacks of devs out there who rock and just "get" what Emma (et al) is saying, and crack on with writing excellently-designed and well-tested features. But they probably didn't need Emma's messaging in the first place. I will just always think some more about advice like this, and wonder how it can be interpreted, and then be a contrarian and say "well actually…" (cringe, OK I hope I'm not actually doing that here?), and offer a different informed observation.

So my position remains: set your sights on 100% feature coverage. Use TDD, and your code will be better and leaner and you'll also likely end up with 100% LOC coverage too (but that doesn't matter; it's just a test-implementation detail ;-).

Righto.

--
Adam

Sunday, 16 May 2021

CFWheels: running TestBox instead of RocketUnit

G'day:

CFWheels ships with its own inbuilt (albeit third-party) testing framework. I discuss its merits in an earlier article: "Testing: A Horror Story". You can probably work out my opinion of the inbuilt testing framework - RocketUnit - from the title. That's really all you need to know to contextualise why I am now going to get TestBox working in a CFWheels context. One would expect that this would simply be a matter of installing TestBox and then using the CFWheels API to call methods on its classes to… um… use it. Not so fast there chief.

The CFWheels application has been implemented via a few opaque (to me) architectural decisions, which means it's not straightforward to run the code outside the context of an application solely designed to respond to HTTP requests in an MVC fashion. You've read along already perhaps with my experiences of hacking it about to even be installed in a sensible place ("Short version: getting CFWheels working outside the context of a web-browsable directory"). Code-wise, CFWheels is not an object-oriented application, instead being a more procedural affair, relying on include to share files full of functions. Lack of OO design means it's difficult to interact with in the isolated fashion one might expect: leveraging/relying on encapsulation, abstraction, inheritance and various other programming techniques familiar to modern CFML. There's no API really, there's just a bunch of independent functions that are simultaneously disconnected from each other, but at the same time strongly-coupled: using a function homed in one .cfm file might require a function homed in another .cfm file, and one just has to know that. All functions are public (given they're not in CFCs, this can't be helped), but not all of them are designed to be used outside of the CFWheels app (those ones are prefixed with $). Also the functions might not necessarily keep their variables encapsulated within their own scope; and also fairly frequently rely on other functions having done the same: exposing data into the calling context for it to be there for another function to use. There's no CFCs to instantiate, one just needs to include one or more of these .cfm files full of functions. This makes for a fairly fragile and confusing coding experience, IMO.

I'm not going to be daunted by this. I'm also not going to let this architectural approach bleed into my own code: I'm not writing procedural code when I'm using an OO language in 2021. Somehow I'm gonna wrap it up in an adapter component that my tests can leverage to test my code.

Controllers

My first task is to work out how to execute the code for a route programmatically. I could simply make an external HTTP request to the route and inspect its response, but this approach won't help me with lower level testing (like calling model functionality, in the next step). So I'm going to work out how I can write code along these lines: myResponseObject = cfwheelsAdapter.someMethod(), so I can then poke and prod the response object to see if it's done the expected thing.

Looking through the CFWheels docs ("Testing Your Application"), I find this test example:

function testRedirectAndFlashStatus() {
  // define the controller, action and user params we are testing
  local.params = {
    controller="users",
    action="create",
    user={
      firstName="Hugh",
      lastName="Dunnit",
      email="hugh@somedomain.com",
      username="myusername",
      password="foobar",
      passwordConfirmation="foobar"
    }
    };

  // process the create action of the controller
  result = processRequest(params=local.params, method="post", returnAs="struct");
  
  // make sure that the expected redirect happened
  assert("result.status eq 302");
  assert("result.flash.success eq 'Hugh was created'");
  assert("result.redirect eq '/users/show/1'");
  
}

OK so I don't use a route, I just call the controller. That's cool, although I'd rather be doing this by providing a route, a method, and the various params the controller needs - query params, body content, headers, etc - and make like a request, but I might look into that later. For now the object of the exercise is to be able to call the CFWheels code and check the response. And the function I want to call is this processRequest. This is an example of what I was saying before: it's not a method of a controller object or anything, it's just a headless function. And looking at that code I have no idea where it's implemented or how it came to be available to call in my test method. CFWheels just assumes any code you write will already be in the right context for this to work. I'm going to create my adapter object and include whatever file that processRequest function is in.

Well. Actually first I'm gonna create a test for this operation (/test/functional/WheelsRequestRunnerTest.cfc):

component extends=testbox.system.BaseSpec {

    function run() {
        describe("Test WheelsRequestRunner functionality", () => {
            it("can execute a request and receive the expected response", () => {
                runner = new test.WheelsRequestRunner()

                result = runner.processRequest(
                    params = {
                        controller = "testRoute",
                        action = "httpStatusOkTest"
                    },
                    method = "get",
                    returnAs = "struct"
                )

                expect(result).toHaveKey("status")
                expect(result.status).toBe(200)
                expect(result).toHaveKey("body")
                expect(result.body).toInclude("[200-OK]")
            })
        })
    }
}

This is kinda eating its own dogfood. My test of the runner class is that it can be used to test a controller. The controller in question simply renders a view (indeed: there's no controller, it's just the view in this case), and the view does nothing other than emit [200-OK].

The WheelsRequestRunner class is gonna encapsulate any CFWheels files I need to include to make stuff work (and contain any variables-scope bleeding that the included files leak). First up I need to expose this processRequest function, and before I do that, I need to find where CFWheels implements it. The only way to do this is just doing a find in the codebase for function processRequest. It turns out it's in /wheels/global/misc.cfm. Not a hugely helpful filename that, and I wonder why a very controller-specific function is in a file called global/misc: it'd be nice if a controller function was in a file called controller.cfm or something. Anyway, I sling misc.cfm into my runner class:

component {
    include template="/wheels/global/misc.cfm"
}

When I run my test now, I get an error:

No matching function [$ARGS] found
global.misc_cfm$cf.udfCall2(/wheels/global/misc.cfm:284)

I track-down $args (not an excellent function name: what is "$args" supposed to do?) in /wheels/global/internal.cfm, so I include that too.

component {
    include template="/wheels/global/misc.cfm"
    include template="/wheels/global/internal.cfm"
}

At this point my WheelsRequestRunner is exposing 72 public methods, when in reality I only need one of those. Less than ideal.

But we're not done: we have a second error:

No matching function [$GET] found
global.misc_cfm$cf.udfCall2(/wheels/global/misc.cfm:307)

I found $get in /wheels/global/util.cfm. I wonder what the perceived difference is between global/misc, and global/util? All three components of those file paths are just meaningless and generic filler terms. But anyway… I'll include this too, and get another error, suggesting probably a fourth file I'm going to need to include:

No matching function [$DOUBLECHECKEDLOCK] found
global.misc_cfm$cf.udfCall2(/wheels/global/misc.cfm:482)

I found $doubleCheckedLock in /wheels/global/cfml.cfm. I'll include this too, and…

 

It works!

Note: WheelsRequestRunner has needed to expose 99 functions, spread across four disparate files, to avail itself of the one function it needed to call:

component {
    include template="/wheels/global/misc.cfm"
    include template="/wheels/global/internal.cfm"
    include template="/wheels/global/util.cfm"
    include template="/wheels/global/cfml.cfm"
}

I just want to do another controller test as well: an unhappy-path one:

it("can test a 400 response", () => {
    try{
        result = runner.processRequest(
            name = "testRoute",
            params = {
                controller = "testRoute",
                action = "badRequest"
            },
            method = "get",
            returnAs = "struct"
        )

        expect(result).toHaveKey("status")
        expect(result.status).toBe(400)
        expect(result).toHaveKey("body")
        expect(result.body).toInclude("[400-BAD-REQUEST]")
    } finally {
        header statuscode=200;
    }
})

This worked too without any more adjustment to WheelsRequestRunner. I've needed to put that try/finally around the test code because of the way CFWheels handles responses. It doesn't create a response object or anything like that that I could set my response status code in: the CFWheels approach is that if I want to change the response status, I just change it directly in the header of the current request (in this case the request being made by the test runner). This is less than ideal. I'm currently using CFML's inbuilt header statement to do this; later, when I work out how to do dependency injection with CFWheels, I'll write my own wrapper to do this, and mock it out for the test. I do need to actively address this in my test though, because returning a 400 also means the test run reports as a failure, because it doesn't complete with a 200-OK. Easily fixed, but I really oughtn't have to do this sort of thing.
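
For the record, the sort of wrapper I have in mind would be trivial; something like this sketch (the class and method names here are hypothetical, and the DI side of things is yet to be worked out):

// ResponseHelper.cfc
component {

    function setStatusCode(required numeric statusCode) {
        header statuscode=arguments.statusCode; // the only place the actual response gets touched
    }
}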

For now, I'm satisfied I can test basic controllers. I want to move on to test a basic model, because I expect I'll need to do some more reconfiguration to make that work.

Models

I've created a dead-simple model class:

component extends=models.Model {

    function init()
    {
        table(false)
        property(name="name", label="Name")
        property(name="email", label="Email address")
        property(name="password", label="Password")
    }
}

I'm just going to check that I can create an instance of one of these, and the property values get used:

it("can test a model", () => {
    properties = {name="TEST_NAME", email="TEST_EMAIL", password="TEST_PASSWORD"}

    model = runner.model("SomeModel")
    testModel = model.new(properties)
    expect(testModel).toBeInstanceOf("models.SomeModel")
    expect(testModel.properties()).toBe(properties)
})

When I run this, I get another instance of CFWheels not being able to find something:

invalid component definition, can't find component [...src.models.SomeModel]

This is a different situation though: this is - I am betting - because I am running the request from the /test/ directory, not the site's root. I'm going to need to dig around to see how CFWheels concludes that it should be looking for SomeModel in ...src.models.SomeModel. To misquote Captain Oates: "I am just going [into the CFWheels codebase] and may be some time."…

Right so I was close, but it was more my fault than anything. Kinda. Firstly I had a setting slightly wrong in my config/settings.cfm, due to a misunderstanding as to what CFWheels was expecting with some settings. Previously when configuring my CFWheels site to work with the source code "above" the web root, I had to make a coupla settings changes for CFWheels to be able to find my controllers and views, and those settings took file system paths:

set(viewPath = "../src/views")
set(controllerPath = "/app/controllers")

Without thinking I had just assumed that the equivalently-named modelPath would also be a file-system path, and had set it thus:

set(modelPath = "../src/models")

However it's actually a component path reference, so just needs to be this:

set(modelPath = "/models")

And this is used in conjunction with a mapping I had missing in Application.cfc:

component {

    // ...    

    thisDirectory = getDirectoryFromPath(getCurrentTemplatePath())
    // ... 
    this.mappings["/models"] = getCanonicalPath("#thisDirectory#models")
    // ... 
}

If I didn't have both of those, CFWheels was resolving the CFC path to be ...src.models.SomeModel, and even with that fixed, it was looking for the correct path (models.SomeModel) from a base context of the public directory, not the src directory. Once I sorted that out, the test passed.

Views

I wasn't actively going to test this because I've been wittering on enough already in this article, but last week I'd actually already written some tests that checked how the linkTo function worked under different URLRewriting settings, and I've just noticed the one test I had that was working satisfactorily regarding this is currently broken, and it looks to be down to what I've been doing with TestBox recently. Here's the test:

it("uses the URI when URLRewriting is on", () => {
    runner = new test.WheelsRequestRunner()
    runner.set(URLRewriting="On")

    result = runner.processRequest(
        params = {
            controller = "testRoute",
            action = "linktotest"
        },
        method = "get",
        returnAs = "struct"
    )

    expect(result).toHaveKey("body")
    expect(result.body).toMatch('<a\b[^>]*href="/testRoute/linkToTestTarget"[^>]*>')
})

It's pretty simple: it hits an endpoint that returns mark-up with a link generated by linkTo:

<cfoutput>#linkTo(route="testRouteLinkToTestTarget",text="Click Me")#</cfoutput>

This test had been working, but it's erroring because the URL for the link it's returning is runTests.cfm/testroute/linktotesttarget. It should be just /testRoute/linkToTestTarget. "Coincidentally" the name of the file I'm running the tests from is… runTests.cfm. Hazarding a guess, I am assuming linkTo is doing some shenanigans with a find and replace, expecting the CGI.script_name to be just /index.cfm. The script name shouldn't come into it with a fully re-written URL, given it doesn't figure in it, but… well…. I'll track down the code in question…

… yeah it was as I suspected. In /wheels/global/misc.cfm we have this:

// When URL rewriting is on we remove the rewrite file name (e.g. rewrite.cfm) from the URL so it doesn't show.
// Also get rid of the double "/" that this removal typically causes.
if (arguments.$URLRewriting == "On") {
    local.rv = Replace(local.rv, application.wheels.rewriteFile, "");
    local.rv = Replace(local.rv, "//", "/");
}

(For future reference, CFWheels Team, if you feel the need to block-out code like that with an explanatory comment, it means a) your function is doing too much (this function is close to 200 lines long!!!); b) the code that's been blocked-out should be in its own function)

So yeah, CFWheels is expecting one of two possibilities for the file part of the script name here: index.cfm or rewrite.cfm. My runTests.cfm doesn't cut the mustard. However this is easy enough to solve: I can override the setting in test/Application.cfc:

component extends=cfmlInDocker.Application {

    this.name = "testApplication"

    // ...

    function onApplicationStart() {
        super.onApplicationStart()
        set(rewriteFile="runTests.cfm")
    }

    // ...

}

I've also given the test app a new name so its application scope settings don't interfere with the application scope for the front of the site.

Summary

To get at least controllers and models testable wasn't much effort. I had to do this:

  • create a WheelsRequestRunner class that includes the four /wheels/global/ files (misc.cfm, internal.cfm, util.cfm, cfml.cfm) needed to expose processRequest;
  • change the modelPath setting to be a component path (/models) rather than a file-system path;
  • add a /models mapping to Application.cfc;
  • override the rewriteFile setting in the test app's Application.cfc so CFWheels strips runTests.cfm from rewritten URLs.

That's not so bad. It was more hassle finding out where the relevant CFWheels functions were, and following the logic to see what I had to override/fix and why, but the end result was all right. I'm just glad I can write decent tests now.

Righto.

--
Adam

Monday, 3 May 2021

Testing: A Horror Story

G'day:

My adventures in Lucee / CFWheels / Docker continues. This time I'm looking at what CFWheels offers by way of testing.

2021-05-04 - editorial update

In this article I am very - but I feel justifiably - harsh about RocketUnit (you need to read down a bit before you get to this). I'm also harsh towards the CFWheels team's historical decision to include it in CFWheels. I really do think that was a poor decision. But so be it, it's done: other than mentioning it, there's no point me dwelling on it beyond that.

Some of the tone and wording of this article has be read as an indictment of CFWheel and/or its team. Beyond that initial poor decision, all my opprobrium here is directed at RocketUnit, not CFWheels, and not the team. I can see how people would think otherwise though.

To be clear: this article is largely about RocketUnit. CFWheels was just the path via which I came to know about it.

I'm going to be brief, because it's so appalling I don't want to waste too much time on it. It can be summarised thus:

Are you f***ing joking?

Firstly I'm gonna quote the CFWheels docs (Testing Your Application › The Test Framework):

Like everything else in CFWheels, the testing framework is very simple, yet powerful. You don't need to remember a hundred different functions because CFWheels' testing framework contains only a handful.

(my emphasis there)

Sorry but I see wording like this implying something about CFML developers' capabilities all the time, and it really annoys me. To me that statement is patronising as well as disingenuous. It plays on this perception that somehow CFML devs aren't capable of doing anything unless it's reeeeally easy, and that that is just how things should be. Can we please not? Can we please not normalise CFML devs as somehow not being able to cope with a small amount of complexity if it will ultimately assist them growing as developers? It's also a bit disingenuous because no testing framework requires one to learn any more than a handful of methods, and learning how to write code is our job. It suggests testing tooling is somehow usually hard (which it just isn't).

Next I read further into the docs:

Evaluation

assert(): This is the main method that you will be using when developing tests. To use, all you have to do is provide a quoted expression. The power of this is that ANY 'truthy' expression can be used.

An example test that checks that two values equal each other:

function testActualEqualsExpected() {
    actual = true;
    expected = true;
    assert("actual eq expected");
}

I'm sorry, come again? You. Give. It. A. Quoted. Expression. It's using evaluate(), isn't it? Now I'm not actually fundamentally against evaluate as a thing. Not like some of the CFML community groupthink. It has its place, just that that place is seldom "in your code": one hardly ever needs to use it. I can't wait to see why it's being used here. Other than, like, cos for some reason the assert function expects a string, so it needs to be evaluated to even work.

And "[the] power of this is that ANY 'truthy' expression can be used"? So: just like any other implementation of assert that any other testing framework ever written. Except none of those need to pass a string to the assert function. This is not a MSP of this testing framework, and the implementation we're being shown is inferior to any other testing framework I've seen. I won't look at the actual code for this just yet, as there's still more horror to see in the guidance docs first.

I'm going to wind back up the docs a bit now:

Do not var-scope [any] variables used in your tests. In order for the testing framework to access the variables within the tests that you're writing, all variables need to be within the component's variables scope.

When I first read that I was like "oh yer having a laugh, right?" and it was not until I got to the "boolean expression as a string" and "evaluate" that it suddenly made "sense". Because the string expression needs to be evaluated within the implementation of assert, of course the variables in the "expression" can't be function-local to your test: assert's code won't be able to see those. So now because the implementation has chosen to pass the test expression as a string, it's forcing us to write bad, flaky, fragile test code, with variables bleeding all over the place by design. I can't really see how this framework could ever be used seriously outside a proof of concept or other very superficial environment. It'd drive the devs batty.

Now, I hasten to add, the rest of the CFWheels docs on testing are actually pretty helpful when they focus away from the test framework and back on CFWheels stuff.

So what's with this test code? What is going on with this assert function? Oh yeah, btw: assert is the only assertion this framework offers. So there goes your elegant, expressive, easy-to-understand test code. It's just a bunch of vanilla assert calls now. This is in stark contrast to every other test framework out there, which have all decided that implementing a lot of different assertions is a good thing. I mean, obviously, at the end of the day they are all asserting whether something is true; if you look at the internal implementations, generally all the facade assertions end up doing just that: calling a central "assert-true" method. The situation-specific assertions are there to make your test code easier and simpler to understand, which is vital in testing. I cannot understand why there was a perception that it was a good thing for this framework to have only one assertion.
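Just to illustrate what I mean, here's a rough sketch of my own (not lifted from any particular framework's codebase): situation-specific assertions are generally thin facades over one central truth check:

// Illustrative sketch only: how facade assertions typically delegate to one
// central truth check, whilst still giving a useful failure message

public void function assertTrue(required boolean actual, string message = "Expected expression to be true") {
    if (!arguments.actual) {
        throw(type = "AssertionFailedError", message = arguments.message);
    }
}

// a facade assertion: simple values only, for the sake of the example
public void function assertEquals(required any expected, required any actual) {
    assertTrue(
        arguments.expected eq arguments.actual,
        "Expected [#arguments.expected#], but actual value was [#arguments.actual#]"
    );
}

No strings, no evaluate, and no requirement for the values being tested to be globally visible.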

Right. The code. Which is all in wheels/test/functions.cfm (nice file name there, CFWheels team :-|). And I also have yet to work out why CFWheels is designed such that all the code for its .cfc files is implemented in .cfm files: this is a question for another day.

This is interesting:

Copyright 2007 RocketBoots Pty Limited - http://www.rocketboots.com.au

OK, so here's where I clock that CFWheels has just bundled this "RocketUnit" thing into the framework. So, in defence of the CFWheels team, this is possibly just a really poor decision to include it, rather than the team actively writing this… this… thing. I note that CFWheels (in 2021) still bundles only v1.0 of RocketUnit (from something like 2007).

And here we go:

public void function assert(required string expression) {
    // other stuff snipped
    if (!Evaluate(arguments.expression)) {
        // failure-message handling snipped: see below
    }
}

And why is it doing this? This is hilarious. The way this code has been implemented, and the reason one needs to pass a string to assert, is that if the assertion fails, the code basically picks through the string, does a primitive tokenisation effort to find dynamic expressions, and then evaluates them again to get values to report in the assertion-failed message. Eg: if your expression is "x eq 3" and variables.x is 4, it'll re-evaluate each element of the string so it can say something like "uh-oh: x was 4, not 3". And the entire reason it needs to do this with an equality check is that it's shot itself in the foot by only having assert, instead of taking the obvious route of having an equality assertion that takes two values to compare. Neither of which would need to be in a global scope; neither of which would need to be re-evaluated from strings, because the assertion was passed their values. It could be called, I dunno, assertEquals or something.
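The mechanism amounts to something like this (a loose sketch of the idea, in my own code; RocketUnit's actual implementation is more involved):

// Loose sketch of the described mechanism; not RocketUnit's actual code
expression = "x eq 3"; // the string the test passed to assert()

if (!evaluate(expression)) {
    // crudely tokenise the expression, then re-evaluate each variable-ish
    // token to build the failure message
    tokens = reMatch("[a-zA-Z_][a-zA-Z0-9_.]*", expression);
    details = tokens
        .filter((token) => token != "eq")
        .map((token) => "#token# = [#evaluate(token)#]")
        .toList("; ");

    throw(type = "AssertionFailedError", message = "Assertion failed: #expression# (#details#)");
}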

It actually gets better (read: "worse"). In v2.x of RocketUnit, there's no need for the quoted string any more, because what the assertion implementation does when the assertion fails is it throws a fake exception to generate a call stack, and then crawls its way up that to try to find the line of code that called assert, and extract the expression from that. Seriously, have a look:

try {
    throw(errorCode=DUMMY_ERRCODE);
} catch(any) {
    // assert is one stack frame up from this function [1], therefore [2]
    source = cfcatch.tagContext[2].codePrintPlain;
    startLineNumber = lineNumber = cfcatch.tagContext[2].line;
}

If you look further down, you can see that the implementation itself knows how flaky its own approach is, given the various places where the code that tries to extract the expression needs to bail out.

Argh!

When discussing this with another CFML community member, they sent me this:

* see footnote regarding usage of this image

I think this is something that should have set off alarm bells when the RocketUnit project first started… evolving.


Here's how it would seem that one needs to safely write tests with this shambles:

component extends="app.tests.Test" {

    function testBasicRoutingWorks() {
        variables.response = processRequest(
            params = {
                controller = "testroute",
                action = "debug"
            },
            returnAs = "struct"
        )
        try {
            assert("variables.response.status eq 200")
            assert("variables.response.body contains 'EXPECTED_CONTENT'")
        } finally {
            structDelete(variables, "response")
        }
    }
}

One can't just leave variables-scoped variables lying around the place in test code, so you need to get rid of them. Test code needs to be well encapsulated and robust, and not prone to interference from unexpected avenues. Or one could just have one test per CFC. Or one could painstakingly make sure that tests don't accidentally share variables. Or hey, just suck it and see (I suspect this is what RocketUnit expects). It's just too easy to accidentally not reinitialise a variable and end up using an uncontrolled value from another test in a subsequent one.

Do not var-scope [any] variables

I have to wonder why - at the point one realises one needed to document something that contravenes standard good coding practice that languages go out of their way to accommodate - the person didn't go "ah now lads we've f***ed this one", and rethink things.


Do me a favour if you're using CFWheels? Don't use its in-built testing framework. It's shite. Use TestBox instead. It's dead easy to use, still supported, still being actively developed on a daily basis, and makes for really nice easy to read, easy to develop test code. There's a strong and willing community out there to help you if you get stuck with anything. It'll also be a cinch to test yer CFWheels work with, as it's completely situation-agnostic.
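For contrast, here's roughly how that earlier routing test might look in TestBox (a sketch I've not actually run against CFWheels; processRequest is CFWheels' own test helper, and I'm assuming it's available to the spec):

component extends="testbox.system.BaseSpec" {

    function run() {
        describe("Basic routing", () => {
            it("returns a 200 status and the expected content", () => {
                var response = processRequest(
                    params = {controller = "testroute", action = "debug"},
                    returnAs = "struct"
                );

                // expressive, situation-specific assertions, and local variables are just fine
                expect(response.status).toBe(200);
                expect(response.body).toInclude("EXPECTED_CONTENT");
            });
        });
    }
}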

And to the CFWheels team: please consider just ripping this out of CFWheels wholesale. This is really letting your framework down, I think.

Righto.

--
Adam

* I am not the copyright holder of this image. I'm hoping I'm covered by "fair use" here, but if you are the copyright holder and disagree: let me know, and I'll replace it with something else.

Thursday, 8 April 2021

TDD and external services

G'day

You might have noticed I spend a bit of my time encouraging people to use TDD, or at the very least making sure yer code is tested somehow. But use TDD ;-)

As an interesting aside, I recently failed a technical interview because the interviewer didn't feel I was strong enough at the testing side of things. Given what I see around the industry… that seems to be a moderately high bar yer setting for yerselves there, peeps. Or perhaps I'm just shit at articulating myself. Hrm. But anyway.

OK, so I rattled out a quick article a few days ago - "TDD & professionalism: a brief follow-up to Thoughts on Working Code podcast's Testing episode" - which revisits some existing ground and by-and-large is not relevant to what I'm going to say here, other than the "TDD & professionalism" being why I bang on about it so much. And you might think I bang on about it here, but I also bang on about it at work (when I have work I mean), and in my background conversations too. I try to limit it to only my technical associates, that said.

Right so Mingo hit me up in a comment on that article, asking this question:

Something I ran into was needing to access the external API for the tests and I understand that one usually uses mocking for that, right? But, my question is then: how do you then **know** that you're actually calling the API correctly? Should I build the error handling they have in their API into my mocked up API as well (so I can test my handling of invalid inputs)? This feels like way too much work. I chose to just call the API and use a test account on there, which has it's own issues, because that test account could be setup differently than the multiple different live ones we have. I guess I should just verify my side of things, it's just that it's nice when it's testing everything together.

Yep, good question. With new code, my approach to TDD is based on the public interface doing what's been asked of it. One can see me working through this process in my earlier article "Symfony & TDD: adding endpoints to provide data for front-end workshop / registration requirements". Here I'm building a web service endpoint - by definition the public interface to some code - and I am always hitting the controller (via the routing). And whatever I start testing, I just "fake it until I make it". My first test case here is "It needs to return a 200-OK status for GET requests on the /workshops endpoint", and the test is this:

/**
 * @testdox it needs to return a 200-OK status for successful GET requests
 * @covers \adamCameron\fullStackExercise\Controller\WorkshopsController
 */
public function testDoGetReturns200()
{
    $this->client->request('GET', '/workshops/');

    $this->assertEquals(Response::HTTP_OK, $this->client->getResponse()->getStatusCode());
}

To get this to pass, the first iteration of the implementation code is just this:

public function doGet() : JsonResponse
{
    return new JsonResponse(null);
}

The next case is "It returns a collection of workshop objects, as JSON", implemented thus:

/**
 * @testdox it returns a collection of workshop objects, as JSON
 * @covers \adamCameron\fullStackExercise\Controller\WorkshopsController
 */
public function testDoGetReturnsJson()
{
    $workshops = [
        new Workshop(1, 'Workshop 1'),
        new Workshop(2, 'Workshop 2')
    ];

    $this->client->request('GET', '/workshops/');

    $resultJson = $this->client->getResponse()->getContent();
    $result = json_decode($resultJson, false);

    $this->assertCount(count($workshops), $result);
    array_walk($result, function ($workshopValues, $i) use ($workshops) {
        $workshop = new Workshop($workshopValues->id, $workshopValues->name);
        $this->assertEquals($workshops[$i], $workshop);
    });
}

And the code to make it work shows I've pushed the mocking one level back into the application:

class WorkshopsController extends AbstractController
{

    private WorkshopCollection $workshops;

    public function __construct(WorkshopCollection $workshops)
    {
        $this->workshops = $workshops;
    }

    public function doGet() : JsonResponse
    {
        $this->workshops->loadAll();

        return new JsonResponse($this->workshops);
    }
}

class WorkshopCollection implements \JsonSerializable
{
    /** @var Workshop[] */
    private $workshops;

    public function loadAll()
    {
        $this->workshops = [
            new Workshop(1, 'Workshop 1'),
            new Workshop(2, 'Workshop 2')
        ];
    }

    public function jsonSerialize()
    {
        return $this->workshops;
    }
}

(I've skipped a step here… the first iteration could/should be to mock the data right there in the controller, and then refactor it into the model, but this isn't about refactoring, it's about mocking).

From here I refactor further, so that instead of having the data itself in loadAll, the WorkshopCollection calls a repository, and the repository calls a DAO, which for now ends up being:

class WorkshopsDAO
{
    public function selectAll() : array
    {
        return [
            ['id' => 1, 'name' => 'Workshop 1'],
            ['id' => 2, 'name' => 'Workshop 2']
        ];
    }
}

The next step is where Mingo's question comes in. The next refactor is to swap out the mocked data for a DB call. We'll end up with this:

class WorkshopsDAO
{
    private Connection $connection;

    public function __construct(Connection $connection)
    {
        $this->connection = $connection;
    }

    public function selectAll() : array
    {
        $sql = "
            SELECT
                id, name
            FROM
                workshops
            ORDER BY
                id ASC
        ";
        $statement = $this->connection->executeQuery($sql);

        return $statement->fetchAllAssociative();
    }
}

But wait: if we do that, our unit tests will be hitting the DB. Which we are not gonna do. We've run out of things to directly mock as we're at the lower boundary of our application, and the connection object is "someone else's code" (Doctrine/DBAL in this case). We can't mock that, but fortunately this is why I have the DAO tier. It acts as the frontier between our app and the external service provider, and we can still mock that:

public function testDoGetReturnsJson()
{
    $workshopDbValues = [
        ['id' => 1, 'name' => 'Workshop 1'],
        ['id' => 2, 'name' => 'Workshop 2']
    ];

    $this->mockWorkshopDaoInServiceContainer($workshopDbValues);

    // ... unchanged ...

    array_walk($result, function ($workshopValues, $i) use ($workshopDbValues) {
        $this->assertEquals($workshopDbValues[$i], $workshopValues);
    });
}

private function mockWorkshopDaoInServiceContainer($returnValue = []): void
{
    $mockedDao = $this->createMock(WorkshopsDAO::class);
    $mockedDao->method('selectAll')->willReturn($returnValue);

    $container = $this->client->getContainer();
    $workshopRepository = $container->get('test.WorkshopsRepository');

    $reflection = new \ReflectionClass($workshopRepository);
    $property = $reflection->getProperty('dao');
    $property->setAccessible(true);
    $property->setValue($workshopRepository, $mockedDao);
}

We just use a mocking library (baked into PHPUnit in this case) to create a runtime mock, and we put that into our repository.

The tests pass, the DB is left alone, and the code is "complete" so we can push it to production perhaps. But we are not - as Mingo observed - actually testing that what we are asking the DB to do is being done. Because all our tests mock the DB part of things out.

The solution is easy, but it's not done via a unit test. It's done via an integration test (or end-to-end test, or acceptance test, or whatever you wanna call it), which hits the real endpoint, which queries the real database and gets the real data. Adjacent to that, in the test we hit the DB directly to fetch the records we're expecting, and then we check that the JSON the endpoint returns represents the same data we manually fetched from the DB. This tests the SQL statement in the DAO, that the data fetched models OK in the repo, and that the model (WorkshopCollection here) applies whatever business logic is necessary to the data from the repo before passing it back to the controller to return with the response, which was requested via the external URL. Ie: it tests end-to-end.

public function testDoGetExternally()
{
    $client = new Client([
        'base_uri' => 'http://fullstackexercise.backend/'
    ]);

    $response = $client->get('workshops/');
    $this->assertEquals(Response::HTTP_OK, $response->getStatusCode());
    $workshops = json_decode($response->getBody(), false);

    /** @var Connection */
    $connection = static::$container->get('database_connection');
    $expectedRecords = $connection->query("SELECT id, name FROM workshops ORDER BY id ASC")->fetchAll();

    $this->assertCount(count($expectedRecords), $workshops);
    array_walk($expectedRecords, function ($record, $i) use ($workshops) {
        $this->assertEquals($record['id'], $workshops[$i]->id);
        $this->assertSame($record['name'], $workshops[$i]->name);
    });
}

Note that even though I'm saying "it's not a unit test, it's an integration test", I'm still implementing it via PHPUnit. The testing framework should just provide testing functionality: it should not dictate what kind of testing you implement with it. And similarly, not all tests written with PHPUnit are unit tests. They are xUnit-style tests, eg: in a class called SomethingTest, where the methods are prefixed with test and use assertion methods to implement the test constraints.

Also: why don't I just use end-to-end tests then? They seem more reliable. Yep, they are. However they are also more fiddly to write as they have more set-up / tear-down overhead, so they take longer to write. Also they generally take longer to run, and given TDD is supposed to be a very quick cadence of test / run / code / run / refactor / run, the less overhead the better. The slower your tests are, the more likely you are to switch to writing the code first and testing later, once you need to clear your head. In the meantime your code design has gone out the window. Also unit tests are more focused, addressing only a small part of the codebase overall, and that has merit in itself. Also, I used a really, really trivial example here, but some end-to-end tests are very tricky to write, given the overall complexity of the functionality being tested. I've been in the lucky place that at my last gig we had a dedicated QA development team, and they wrote the end-to-end tests for us; but this also meant that those tests were executed after the dev considered the tasks "code complete", and QA ran the tests to verify this. There is no definitive way of doing this stuff, that said.

To round this out, I'm gonna digress into another chat I had with Mingo yesterday:

Normally I'd say this:

Unit tests
Test the logic of one small part of the code (maybe a public method in one class). If you were doing TDD and about to add a condition into your logic, you'd write a unit test to cover the new expectations that the condition brings to the mix.
Functional tests
These are a subset of unit tests which might test a broader section of the application, eg: from the public frontier of the application (so, like, an endpoint) down to where the code leaves the system (to a logger, or a DB, or whatever). The difference between unit tests and functional tests - to me - is just how distributed throughout the system the logic being tested is.
Integration tests
Test that the external connections all work fine. So if you use the app's DB configuration, the correct database is usable. I'd personally consider a test an integration test if it only focused on a single integration.
Acceptance tests(or end-to-end tests)
Are to integration tests what functional tests are to unit tests: a broader subset. That test above is an end-to-end test, it tests the web server, the application and the DB.

And yes I know the usages of these terms vary a bit.

Furthermore, considering the distinction between BDD and TDD:

  • The BDD part is the nicely-worded case labels, which in theory (but seldom in practice, I find) are written in direct collaboration with the client user.
  • The TDD part is when in the design-phase they are created: with TDD it's before the implementation is written; I am not sure whether in BDD it matters or is stipulated.
  • But both of them are design / development strategies, not testing strategies.
  • The tests can be implemented as any sort of test, not specifically unit tests or functional tests or end-to-end tests. The point is the test defines the design of the piece of code being written: it codifies the expectations of the behaviour of the code.
  • BDD and TDD tests are generally implemented via some unit testing framework, be it xUnit-style (testMyMethodDoesSomethingRight), or Jasmine-esque (it("does something right", function (){})).

One can also do testing that is not TDD or BDD, but it's a less-than-ideal way of going about things, and I would imagine it results in subpar tests, fragmented test coverage, and tests that don't really help one understand the application, so are harder to maintain in a meaningful way. But they are still better than no tests at all.

When I am designing my code, I use TDD, and I consider my test cases in a BDD-ish fashion (except I do it on the client's behalf, generally, and sadly), and I use PHPUnit (xUnit) to do so in PHP, and Mocha (Jasmine-esque) to do so in Javascript.

Hopefully that clarifies some things for people. Or people will leap at me and tell me where I'm wrong, and I can learn the error in my ways.

Righto.

--
Adam