Saturday, 8 October 2022

TDD: are tests that just "measure twice, cut once" legit?


There's a bit of archaeology going on here: I started writing this in Jan 2022, but never got past the first para. I have a few articles like this, and I've decided to either complete them if they have merit, or get rid if they don't.

Part of my problem previously on this blog is that if I didn't think I could spin the article out to be say ~1500 words, I didn't think it was worth writing about. I think that was wrong-headed, and there's nothing wrong with taking a quick look at something I've been thinking about. Let's see if this works.

The situation I describe herein actually happened after I had wondered about the subject matter. It bore out/validated my hypothesis quite nicely.

Today I had an interesting TDD case. We have features that rely on finding the "next working day" in various countries. The code behind these features ultimately calls a function isBankHoliday, and that function maintains a list of bank holidays within its implementation. This is perhaps not the ideal approach, but it's the approach we have.

I was loading in the holidays for 2023, and me being me, the first thing I did was to look for the unit tests for the function. We didn't have any. That was odd because we had a job a few months prior to add the 2022 holidays, and I can recall discussing the need for tests with the dev concerned. Apparently this never happened, and I can't follow up because said dev is no longer with us. Harrumph.

I can't reproduce any code relating to this here because it's my employer's code, but I can describe what I did. I opened the official govt sites that list public holidays (the UK has one, for example), and wrote a test that iterated through the list of 2023(*) holidays there and called isBankHoliday on each of them, expecting it to be true. I then picked some edge-case days around bank holidays and tested those, expecting the result to be false. This gives both a control group (the tests returning false), and the TDD tests for the work I was about to do. I got the expected passes/failures: all good.

(*) why did I not do all of them, and only test the 2023 ones I was adding? We have a policy of not backfilling missing unit tests, because we'd be there all day/month/year if we did. We only have maybe 25% code coverage of our codebase for… "historical reasons".
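This is not the real code, but the shape of the test described above can be sketched in Python. Everything here is an assumption for illustration: the name is_bank_holiday, the stand-in implementation, and the dates, which are only a sample of the 2023 England and Wales holidays rather than a full list for any country. The stand-in has not had the 2023 dates loaded yet, so the holiday checks fail and the control checks pass, which is the "expected passes/failures" state before doing the actual work.

```python
from datetime import date

# Hypothetical stand-in for the real function. The holiday list lives inside
# the implementation as a comma-delimited string, and the 2023 dates have
# not been loaded yet.
_HOLIDAYS = "2022-12-26,2022-12-27"


def is_bank_holiday(day: date) -> bool:
    """Return True if the given day is in the hard-coded holiday list."""
    return day.isoformat() in _HOLIDAYS.split(",")


# Test data keyed by hand from the official list (a sample only).
EXPECTED_HOLIDAYS_2023 = [
    date(2023, 1, 2),    # New Year's Day (substitute day)
    date(2023, 4, 7),    # Good Friday
    date(2023, 4, 10),   # Easter Monday
    date(2023, 12, 25),  # Christmas Day
    date(2023, 12, 26),  # Boxing Day
]

# Control group: ordinary working days right next to the holidays above.
CONTROL_DAYS_2023 = [
    date(2023, 1, 3),
    date(2023, 4, 6),
    date(2023, 4, 11),
    date(2023, 12, 27),
]


def failing_days(days, expected):
    """Return the days for which is_bank_holiday disagrees with `expected`."""
    return [d for d in days if is_bank_holiday(d) != expected]


if __name__ == "__main__":
    # Before the 2023 dates are loaded: every holiday check fails...
    print("holiday failures:", failing_days(EXPECTED_HOLIDAYS_2023, True))
    # ...and every control check passes (empty list).
    print("control failures:", failing_days(CONTROL_DAYS_2023, False))
```

Once the 2023 dates are added to the implementation, both lists should come back empty.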

OK, so the tests are in place. I then closed the test file, went back to the official sites, and re-copied all the holiday dates again. I did not pull the prior work out of the test and refactor to suit the method implementation. I even made sure to specifically use a different data structure in the tests than I knew we had in the method implementation: the tests used an array because I'm not a muppet; the method implementation used a comma-delimited string (ugh).

Why did I do this? Well: measure twice, cut once. If I wrote the tests, then lifted the test data out of the tests and put them in the implementation, all I'd be doing is testing my ability to copy and paste. I would not be testing that I had keyed the source data I was testing against correctly. A typo in the test data would translate to a typo in the implementation, but the test would still pass.
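To make that concrete, here is a small Python sketch of the difference. The names and dates are made up for the illustration and are not from the real codebase. The test data contains a deliberate typo (2023-04-01 where 2023-04-10 was intended). If the implementation is built by lifting the test data, the two always agree and the typo goes unnoticed. If the implementation is keyed a second time from the source, the two disagree, and the test flags the date that needs checking.

```python
from datetime import date


def parse_holidays(csv: str) -> set:
    """Parse a comma-delimited string of ISO dates into a set of dates."""
    return {date.fromisoformat(part) for part in csv.split(",")}


def mismatches(expected, impl_csv):
    """Return the expected holidays that the implementation data lacks."""
    holidays = parse_holidays(impl_csv)
    return [d for d in expected if d not in holidays]


# Test data, keyed once from the source, with a deliberate typo:
# 2023-04-01 was meant to be 2023-04-10.
TEST_DATES = [date(2023, 4, 7), date(2023, 4, 1)]

# Approach 1: lift the test data straight into the implementation.
# The typo is copied along with everything else.
copied_impl = ",".join(d.isoformat() for d in TEST_DATES)

# Approach 2: key the dates a second time from the source, independently.
rekeyed_impl = "2023-04-07,2023-04-10"

if __name__ == "__main__":
    # No mismatches: the test passes even though the data is wrong.
    print("copied:", mismatches(TEST_DATES, copied_impl))
    # One mismatch: the two keyings disagree, so something needs checking.
    print("rekeyed:", mismatches(TEST_DATES, rekeyed_impl))
```

The mismatch does not say which of the two keyings is wrong, only that they differ. That is enough to send you back to the source to check.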

That code went into production and has not caused us any problems.

Back in May this year we had a system glitch where a bunch of processing didn't run. It was 2022-05-03. We looked into it and found that the dev who had loaded in the 2022 holidays had miskeyed, and had entered the Queen's Platinum Jubilee holiday as 2022-05-03 not 2022-06-03. And with no tests: we did not catch it. That said, it did go through code review, and we didn't catch it there either.

This situation caused a bunch of work for us because not only did we have to remediate the unprocessed work from 2022-05-03, but the system had also started to queue up work for 2022-06-03 when it shouldn't have, and we had to unpick that lot too.

All because we didn't measure twice, and cut once.