
Friday 19 February 2021

Part 12: unit testing Vue.js components

G'day

OK OK, another article in that bloody series I've been doing. Same caveats as usual: go have a breeze through the earlier articles if you feel so inclined. That said, this is reasonably stand-alone.

  1. Intro / Nginx
  2. PHP
  3. PHPUnit
  4. Tweaks I made to my Bash environment in my Docker containers
  5. MariaDB
  6. Installing Symfony
  7. Using Symfony
  8. Testing a simple web page built with Vue.js using Mocha, Chai and Puppeteer
  9. I mess up how I configure my Docker containers
  10. An article about moving files and changing configuration
  11. Setting up a Vue.js project and integrating some existing code into it
  12. Unit testing Vue.js components (this article)

This is really very frustrating. You might recall I ended my previous article with this:

Now I just need to work out how to implement a test of just the GreetingMessage.vue component discretely, as opposed to the way I'm doing it now: curling a page it's within and checking the page's content. […]

Excuse me whilst I do some reading.

[Adam runs npm install of many combinations of NPM libs]

[Adam downgrades his version of Vue.js and does the npm install crap some more]

OK screw that. Seems to me - on initial examination - that getting all the libs together to make stand-alone components testable is going to take me longer to work out than I have patience for. I'll do it later. […] Sigh.

I have returned to this situation, and have got everything working fine. Along the way I had an issue with webpack that I eventually worked around, but once I circled back to replicate the work and write this article, I could no longer reproduce the issue - even after rolling back to the previous version of the application code and repeating my steps one by one to get to where the problem had been. This is very frustrating. However other people have had similar issues in the past, so I'm going to include the steps to solve the problem here, even if I have to fake the issue to get error messages, etc.

Right so I'm back with the Vue.js docs regarding testing: "Vue.JS > Testing > Component Testing". The docs are incredibly content-lite, and pretty much just fob you off onto other people to work out what to do. Not that cool, and kinda suggests Vue.js considers the notion of unit testing pretty superficially. I did glean I needed to install a coupla Node modules.

First, @testing-library/vue, for which I ran into a glitch immediately:

root@00e2ea0a3109:/usr/share/fullstackExercise# npm install --save-dev @testing-library/vue
npm ERR! code ERESOLVE
npm ERR! ERESOLVE unable to resolve dependency tree
npm ERR!
npm ERR! While resolving: full-stack-exercise@2.13.0
npm ERR! Found: vue@3.0.5
npm ERR! node_modules/vue
npm ERR!   vue@"^3.0.0" from the root project
npm ERR!
npm ERR! Could not resolve dependency:
npm ERR! peer vue@"^2.6.10" from @testing-library/vue@5.6.1
npm ERR! node_modules/@testing-library/vue
npm ERR!   dev @testing-library/vue@"*" from the root project
npm ERR!
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force, or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.
npm ERR!
npm ERR! See /var/cache/npm/eresolve-report.txt for a full report.

npm ERR! A complete log of this run can be found in:
npm ERR!     /var/cache/npm/_logs/2021-02-19T11_29_15_901Z-debug.log
root@00e2ea0a3109:/usr/share/fullstackExercise#

The current version of @testing-library/vue doesn't work with the current version of Vue.js. Neato. Nice one Vue team. After some googling of "wtf?", I landed on an issue someone else had raised already: "Support for Vue 3 #176": I need to use the next branch (npm install --save-dev @testing-library/vue@next). This worked OK.

The other module I needed was @vue/cli-plugin-unit-mocha. That installed with no problem. This all gives me the ability to run vue-cli-service test:unit, which will call up Mocha and run some tests. The guidance is to set this up in package.json, thus:

  "scripts": {
    "test": "mocha test/**/*Test.js",
    "serve": "vue-cli-service serve --watch",
    "build": "vue-cli-service build",
    "lint": "vue-cli-service lint",
    "test:unit": "vue-cli-service test:unit test/unit/**/*.spec.js"
  },

Then one can just run it as npm run test:unit.

I looked at the docs for how to test a component ("Unit Testing Vue Components" - for Vue v2, but there's no equivalent page in the v3 docs), and knocked together an initial test which would just load the component and do nothing else: just to check everything was running (frontend/test/unit/GreetingMessage.spec.js (final version) on Github):

import GreetingMessage from "../../src/gdayWorld/components/GreetingMessage";

import { shallowMount } from "@vue/test-utils";

import {expect} from "chai";

describe("Tests of GreetingMessage component", () => {
    it("should successfully load the component", () => {
        let greetingMessage = shallowMount(GreetingMessage, {propsData: {message: "G'day world"}});

        expect(true).to.be.true;
    });
});

Here's where I got to the problem I now can't replicate. When I ran this, I got something like:

root@eba0490b453d:/usr/share/fullstackExercise# npm run test:unit

> full-stack-exercise@2.13.0 test:unit
> vue-cli-service test:unit test/unit/**/*.spec.js

 WEBPACK  Compiling...

  [=========================] 98% (after emitting)

 ERROR  Failed to compile with 1 error

error  in ./node_modules/mocha/lib/cli/cli.js

Module parse failed: Unexpected character '#' (1:0)
File was processed with these loaders:
* ./node_modules/cache-loader/dist/cjs.js
* ./node_modules/babel-loader/lib/index.js
* ./node_modules/eslint-loader/index.js
You may need an additional loader to handle the result of these loaders.
> #!/usr/bin/env node
| 'use strict';
|
[…etc…]

There's a JS file that has a shell-script shebang thing at the start of it, and the Babel transpiler doesn't like that. Fair enough, but I really didn't understand why it was trying to transpile stuff in the node_modules directory in the first place. Still, at that point I just thought "Hey-ho, it knows what it's doing, so I'll take its word for it".

Googling about I found a bunch of other people having a similar issue with Webpack compilation, and the solution seemed to be to use a shebang-loader in the compilation process (see "How to keep my shebang in place using webpack?", "How to Configure Webpack with Shebang Loader to Ignore Hashbang…", "Webpack report an error about Unexpected character '#'"). All the guidance for this solution was oriented around sticking stuff in the webpack.config.js file, but of course Vue.js hides that away from you, and you need to do things in a special Vue.js way, by adding stuff with a special syntax to the vue.config.js file. The docs for this are at "Vue.js > Working with Webpack". The docs there showed how to do it using chainWebpack, but I never got this approach to actually solve the problem, so I mention it only because it's "something I tried".
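
For the record, the sort of chainWebpack config I was fiddling with looked roughly like this. It's a sketch only - it assumes shebang-loader has been npm-installed - and as I said, it never actually sorted the problem out for me:

// vue.config.js
module.exports = {
    chainWebpack: config => {
        // run shebang-loader over the offending Mocha CLI file, so the
        // leading #!/usr/bin/env node line is stripped before Babel sees it
        config.module
            .rule("shebang")
            .test(/node_modules[\\/]mocha[\\/]lib[\\/]cli[\\/]cli\.js$/)
            .use("shebang-loader")
            .loader("shebang-loader");
    }
};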

From there I started thinking, "no, seriously, why is it trying to transpile stuff in the node_modules directory?" This did not seem right. I changed my googling tactic to try to find out what was going on there, and came across "Webpack not excluding node_modules", and that led me to update my vue.config.js file to actively exclude node_modules (copied from that answer):

var nodeExternals = require('webpack-node-externals');
...
module.exports = {
    ...
    target: 'node', // in order to ignore built-in modules like path, fs, etc. 
    externals: [nodeExternals()], // in order to ignore all modules in node_modules folder 
    ...
};
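
To be clear about where that goes: Vue CLI doesn't give you a webpack.config.js to edit, so - as I understand the merging works, and this is a sketch rather than a copy of my exact file - the snippet sits under the configureWebpack key of vue.config.js, which Vue CLI merges into its generated webpack config. It assumes webpack-node-externals has been npm-installed:

// vue.config.js
const nodeExternals = require("webpack-node-externals");

module.exports = {
    configureWebpack: {
        target: "node", // in order to ignore built-in modules like path, fs, etc
        externals: [nodeExternals()] // in order to ignore all modules in node_modules folder
    }
};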

And that worked. Now when I ran the test I had made progress:

root@eba0490b453d:/usr/share/fullstackExercise# npm run test:unit

> full-stack-exercise@2.13.0 test:unit
> vue-cli-service test:unit test/unit/**/*.spec.js

 WEBPACK   Compiling...

  [=========================] 98% (after emitting)

 DONE   Compiled successfully in 1161ms

  [=========================] 100% (completed)

WEBPACK  Compiled successfully in 1161ms

MOCHA  Testing...



  Tests of GreetingMessage component
    ✓ should successfully load the component


  1 passing (17ms)

MOCHA  Tests completed successfully

root@eba0490b453d:/usr/share/fullstackExercise#

From there I rounded out the tests properly (frontend/test/unit/GreetingMessage.spec.js):

import GreetingMessage from "../../src/gdayWorld/components/GreetingMessage";

import { shallowMount } from "@vue/test-utils";

import {expect} from "chai";

describe("Tests of GreetingMessage component", () => {

    let greetingMessage;
    let expectedText = "TEST_MESSAGE";

    before("Load test GreetingMessage component", () => {
        greetingMessage = shallowMount(GreetingMessage, {propsData: {message: expectedText}});
    });

    it("should return the correct heading", () => {
        let heading = greetingMessage.find("h1");
        expect(heading.exists()).to.be.true;

        let headingText = heading.text();
        expect(headingText).to.equal(expectedText);
    });

    it("should return the correct content", () => {
        let contentParagraph = greetingMessage.find("h1+p");
        expect(contentParagraph.exists()).to.be.true;

        let contentParagraphText = contentParagraph.text();
        expect(contentParagraphText).to.equal(expectedText);
    });
});

Oh! Just a reminder of what the component is (frontend/src/gdayWorld/components/GreetingMessage.vue)! Very simple stuff, as the tests indicate:

<template>
    <h1>{{ message }}</h1>
    <p>{{ message }}</p>
</template>

<script>
export default {
  name: 'GreetingMessage',
  props : {
    message : {
      type: String,
      required: true
    }
  }
}
</script>

One thing I found was that every time I touched the test file, I was getting this compilation error:

> full-stack-exercise@2.13.0 test:unit
> vue-cli-service test:unit test/unit/**/*.spec.js

WEBPACK  Compiling...

  [=========================] 98% (after emitting)

ERROR  Failed to compile with 1 error

error  in ./test/unit/GreetingMessage.spec.js

Module Error (from ./node_modules/eslint-loader/index.js):

/usr/share/fullstackExercise/test/unit/GreetingMessage.spec.js
   7:1  error  'describe' is not defined  no-undef
  12:5  error  'before' is not defined    no-undef
  16:5  error  'it' is not defined        no-undef
  24:5  error  'it' is not defined        no-undef

✖ 4 problems (4 errors, 0 warnings)

But if I ran it again, the problem went away. Somehow ESLint was getting confused; it only lints files when they've changed, so on the second - and subsequent - runs it doesn't run, and the problem doesn't occur. More googling, and I found this: "JavaScript Standard Style does not recognize Mocha". The guidance here is to let the linter know I'm running Mocha, the implication being that there are some global functions it can then just assume are legit. This is just an entry in package.json:

  "eslintConfig": {
    "root": true,
    "env": {
      "node": true,
      "mocha": true
    },
    // etc

Having done that, everything works perfectly, and despite the fact that it is a very very simple unit test… I'm quite pleased with myself that I got it running OK.

After sorting it all out, I reverted everything in source control back to how it was before I started the exercise, so as to replicate it and write it up here. This is when I was completely unable to reproduce that shebang issue at all. I cannot - for the life of me - work out why not. Weird. Anyway, I decided not to waste too much time trying to reproduce a bug I had already solved, and just to mention it here as a possible "thing" that could happen, but otherwise move on with my life.

I have now got to where I wanted to get at the beginning of this series of articles, so I am gonna stop now. My next task with Vue.js testing is to work out how to mock things, because my next piece of work will require me to make remote calls and/or hit a DB. And I don't want to be doing that when I'm running tests. But that was never an aim of this article series, so I'll revert to stand-alone articles now.

Righto.

--
Adam

Thursday 11 February 2021

Thoughts on Working Code podcast's Testing episode

G'day:

Working Code (@WorkingCodePod on Twitter) is a podcast by some friends and industry colleagues: Tim Cunningham, Carol Hamilton, Ben Nadel and Adam Tuttle.


(apologies for swiping your image without permission there, team)

It's an interesting take on a techo podcast, in their own words from their strapline:

Working Code is a technology podcast unlike all others. Instead of diving deep into specific technologies to learn them better, or focusing on soft-skills, this one is like hanging out together at the water cooler or in the hallway at a technical conference. Working Code celebrates the triumphs and fails of working as a developer, and aims to make your career in coding more enjoyable.

I think they achieve this, and it makes for a good listen.

So that's that, I just wanted to say they've done good work, and go listen.

Oh, just one more thing.

Yesterday they released their episode "Testing". I have to admit my reaction to a lot of what was said was… "poor", so I pinged my namesake and said "I have some feedback". After a brief discussion on Signal, Adam & I concluded that I might try to do a "reaction blog article" on the topic, and they might see if they can respond to the feedback, if warranted, at a later date. They are recording tonight apparently, and I'm gonna try to get this across to them for their morning coffee.

Firstly as a reminder: I'm pretty keen on testing, and I am also keen on TDD as a development practice. I've written a fair bit on unit testing and TDD both. I'm making a distinction between unit testing and TDD very deliberately. I'll come back to this. But anyway this is why I was very very interested to see what the team had to say about testing. Especially as I already knew one of them doesn't do automated testing (Ben), and another (Carol) I believe has only recently got into it (I think that's what she said, a coupla episodes ago). I did not know Adam or Tim's position on it.

And just before I get under way, I'll stick a coupla Twitter messages here I saw recently. At the time I saw them I was thinking about Ben's claim to a lack of testing, and they struck a particular chord with me.




I do not know Maaret or Mathias, but I think they're on the money here.

OK. So I'm listening to the podcast now. I'm going to pull quotes from it, and comment on them where I think it's "necessary".

[NB I'm exercising my right to reproduce small parts of the Working Code Podcast transcript here, as I'm doing so for the purposes of commentary / criticism]


Ahem.

Adam @ 11:27:

…up front we should acknowledge you know we're not testing experts. None of us [...] have been to like 'testing college'. […]There's a good chance we're going to get something wrong.

I think, in hindsight, this podcast needed this caveat to be made louder and clearer. To be blunt - and I don't think any of them would disagree with me here - all four are very far from being testing experts. Indeed one is even a testing naysayer. I think there are some dangerously ill-informed opinions being expressed as the podcast progresses, and as these are all people who are looked-up-to in their community, I think there's a risk people will take on board what they say as "advice". Even if there's this caveat at the beginning. This might seem like a very picky thing to draw on, but perhaps I should have put it at the end of the article, after there's more context.

Carol @ 12:07:

Somebody find that monster already.

Hi guys.

Ben @ 12:41

I test nothing. And it's not like a philosophical approach to life, it's more just I'm not good at testing

Adam @ 13:09:

Clearly that's working out pretty well for you, you've got a good career going.

I hear this a bit. "I don't test and I get by just fine". This is pretty woolly thinking, and it's false logic people will jump on to justify why they don't do things. And Adam is just perpetuating the myth here. The problem with this rationalisation is demonstrated with an analogy of going on a journey without a map and just wandering aimlessly but you still (largely accidentally) arrive at your intended destination. In contrast had you used a map, you'd've been more efficient with your time and effort, and been able to progress even further on the next leg of your journey sooner. Ben's built a good career for himself. Undoubtedly. Who knows how much better it would be had he… used a map.

Also going back to Ben's comment about kind of explaining away why he doesn't test because he's not good at it. Everyone starts off not being good at testing mate. The rest of us do something about it. This is a disappointing attitude from someone as clued-up as Ben. Also if you don't know about something… don't talk about it mate. Inform yourself first and then talk about it.

Ben @ 13:21:

[…] some additional context. So - one - I work on a very small team. Two: all the people who work on my team are very very familiar with the software. Three: we will never ever hire a new engineer specifically for my team. Cos I work on the legacy codebase. The legacy codebase is in the process of being phased out. […] I am definitely in a context where I don't have to worry about hiring a new person and training them up on a system and then thinking they'll touch something in the code that they don't understand how it works. That's like the farthest possible thing from my day-to-day operations currently.

Um… so? I'm not being glib. You're still writing new logic, or altering existing logic. If you do that, it intrinsically needs testing. I mean you admitted you do manual testing, but it beggars belief that a person in the computer industry will favour manually performing a pre-defined repetitive task as a (prone-to-error) human, rather than automating this. We're in the business of automating repetitive tasks!

I'd also add that this would be a brilliant, low-risk, environment for you to get yourself up to speed with TDD and unit testing, and work towards the point where it's just (brain) muscle memory to work that way. And then you'll be all ready once you progress to more mission-critical / contemporary codebases.

Ben @ 14:42:

I can wrap my head around testing when it comes to testing a data workflow that is completely pure, meaning you have a function or you have a component that has functions and you give it some inputs and it generates some outputs. I can 100% wrap my head around testing that. And sometimes actually when I'm writing code that deals with something like that, even though I'm not writing tests per se, I might write a scratch file that instantiates that component and sends data to it and checks the output just during the development process that I don't have to load-up the whole application.

Ben. That's a unit test. You have written a unit test there. So why don't you put it in a test class instead of a scratch file, and - hey presto - you have a persistent test that will guard against that code's behaviour somehow changing to break those rules later on. You are doing the work here, you're just not doing it in a sensible fashion.
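
To labour the point with a sketch (a completely hypothetical function and values here, not Ben's actual code): the scratch-file version and the test version are the same work; only one of them sticks around and gets run on every build.

// scratch.js: run by hand, eyeballed, then thrown away
const { priceWithTax } = require("./src/pricing");
console.log(priceWithTax(100, 0.15)); // eyeball that it prints 115

// test/pricing.spec.js: the same check, but it's kept and re-run forever
const { expect } = require("chai");
const { priceWithTax } = require("../src/pricing");

describe("priceWithTax", () => {
    it("adds the tax rate to the net price", () => {
        expect(priceWithTax(100, 0.15)).to.equal(115);
    });
});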

Ben @ 15:18:

Where it breaks down immediately for me is when I have to either a) involve a database, or b) involve a user interface. And I know that there's all kinds of stuff that the industry has brought to cater to those problems. I've just never taken the time to learn.

There's a bit of a blur here between the previous train of thought - which was definitely talking about unit tests of a unit of code - and now we're talking about end-to-end testing. These are two different things. I am not saying Ben doesn't realise this, but they're jammed up next to each other in the podcast so the distinction is not being made. These two kinds of testing are separate ideas, only really coupled by the fact they are both types of testing. Ben's right, the tooling is there, and - in my experience at least with the browser emulation stuff - it's pretty easy and almost fun to use. Ben already tests his stuff manually every time he does a release, so it would seem sensible to me to take the small amount of time it takes to get up to speed with these things, and then instead of testing something manually, take the time to automate the same testing, then it's taken care of thenceforth. It's just a matter of being a bit wiser with one's time usage.

Adam @ 16:39

The reason that we don't have a whole lot of automated tests for our CFML code is simply performance. So when we started our product I tried really hard to do TDD. If I was writing a new module or a new section of that module I would work on tests along with that code, and would try to stay ahead of the game there. And what ended up happening was I had for me - let's say - 500 functions that could run, I had 400 tests. And I don't want to point a finger at any particular direction, but when you take the stack as a whole and you say "OK now run my test suite" and it takes ten minutes to run those tests and [my product, the project I was working on] is still in its infancy, and you can see this long road of so much more work that has to be done, and it takes ten minutes to run the tests - you know, early on - there was no way that that was going to be sustainable. So we kind of abandoned hope there. [...] I have, in more recent years, on a more recent stack seen way better performance of tests. [...] So we are starting to get more into automated testing and finding it actually really helpful. [...] I guess what I wanted to say there is that a perfectly valid reason to have fewer or no tests is if it doesn't work well on your platform.

Adam starts off well here, both in what he's saying and his historical efforts with tests, but he then goes on to pretty much blame ColdFusion for not being very good at running tests. This is just untrue, sorry mate. We had thousands upon thousands of tests running on ColdFusion, and they ran in an amount of time best measured in seconds. And when we ported that codebase to PHP, we had a similar number of test cases, and they ran in round about the same amount of time (PHP was faster, that said, but also the tests were better). I think the issue here - and Adam confirms this about 30sec after the quote above, and again about 15min later when he comes back to it - is that your tests weren't written so well, and they were not focused on the logic (as TDD-oriented tests ought to be); they were basically full integration tests. Full integration tests are excellent, but you don't want your tight red / green / refactor testing cycle to be slowed down by external services like databases. That's the wrong sort of testing there. My reaction to you saying your test runs are slow is not to say "ColdFusion's fault", it's to say "your tests' fault". And that's not a reason to not test. It's a reason to check what you've been doing, and fix it. I'm applying hindsight here for you obviously, but this ain't the conclusion/message you should be delivering here.

Carol @ 20:03:

I also want to say that if you are starting out and you're starting to add test, don't let slowness stop you from doing it.

Spot on. I hope Ben was listening there. When you start to learn something new, it is going to take more time. I think this is sometimes why people conclude that testing takes a lot of time: the people arriving at that conclusion are basing it on their time spent on the learning curve. Accept that things go slow when you are learning, but also accept that things will become second nature. And especially with writing automated tests it's not exactly rocket-science, the initial learning time is not that long. Just… decide to learn how to test stuff, start automating your testing, and stick at it.

Ben @ 21:24:

I was thinking about debugging incidents and getting a page in the middle of the night and having to jump on a call and you seeing the problem, and now you have to do a hotfix, and push a deployment in the middle of the night […]. And imagine having to sit there for 30 minutes for your tests to run just so you can push out a hotfix. Which I thought to myself: that would drive me crazy.

At this point I think Ben is just trying to invent excuses to justify to himself why he's right to eschew testing. I'm reminded of Maaret's Twitter message I included above. The subtext of Ben's here is that if one tests manually, then you're more flexible in what you can choose to re-test when you are hotfixing. Well obviously if you can make that call re manual tests, then you can make the same call with automated tests! So his position here is just specious. Doubly so because automated tests are intrinsically going to be faster than the equivalent manual tests to start with. Another thing I'll note that with this entire analogy: you've already got yourself in a shit situation by needing to hotfix stuff in the middle of the night. Are you really sure you want to be less diligent in how you implement that fix? In my experience that approach can lead to a second / third / fourth hotfix being needed in rapid succession. Hotfix situations are definitely ones of "work smarter, not faster".

Ben @ 21:56:

I'm wondering if there should be a test budget that you can have for your team where you like have "here is the largest amount of time we're willing to let testing block a deployment". And anything above that have to be tests that sit in an optional bucket where it's up to the developer to run them as they see fit, but isn't necessarily tests that would block deployment. I don't know if that's totally crazy.

Adam continues @ 22:36:

You have to figure out which tests are critical path, which ones are "must pass", and these ones are like "low risk areas" […] are the things I would look for to make optional.

Yep, OK there's some sense here, but I can't help thinking that we are talking about testing in this podcast, and we're spending our time inventing situations in which we're not gonna test. It all seems a bit inverted to me. How about instead you just do your testing, and then if/when a situation arises you deal with it then - instead of deciding up front that there will be situations, and justifying to yourselves why you oughtn't test in the first place.

But I also have to wonder: why the perceived rush here? What's wrong with spending 30min testing stuff if it "proves" that your work has maintained stability, and means you'll be less likely to need that midnight hotfix? What percentage of the whole cycle time from feature request to delivery is that 30 minutes? Especially as taking the effort to write the tests in the first place will innately improve the stability of your code, and then help to keep it stable. It's a false economy.

Tim @ 23:15:

When we have contractors do work for us. I require unit tests. I require so much testing just because it's a way for me to validate the truth of what they're saying they've done. So that everything that we have that's done by third parties is very well tested, and it's fantastic because I have a high level of confidence.

Well: precisely. Why do you not want that same level of confidence in your in-house work? Like you say: confidence is fantastic. Be fantastic, Tim. Also: any leader should eat their own dogfood I think. If there's sense in you making the contractors work like this, clearly you ought to be working that way yourself.

Tim @ 23:36:

Any time I start a new project, if I have a greenfields project, I always start with some level of unit tests, and then I get so involved in the actual architecture of the system that I put it off, and like "well I don't really need a test for this", "I'm not really sure where I'm going with this, so I'm not going to write a test first" because I'm kinda experimenting. Then my experiment becomes reality, then my reality becomes the released version. And then it's like "well what's the point of writing a test now?"

I think we've all been there. I think what Tim needs here is just a bit more self-discipline in identifying what is "architectural spike" and what's "now doing the work". If one is doing TDD, then the spike can be used to identify the test cases (eg "it's going to need to capture their phone number") without necessarily writing the test to prove the phone number has been captured. So you write this:

describe("my new thing", function () {
    it ("needs to capture the phone number", function () {
        // @todo need to test this
    });
});

And then when you detect you are not spiking any more, you write the test, and then introduce the code to make the test pass. I also think Tim is overlooking that the tests are not simply there for that first iteration, they are then there proving that code is stable for the rest of the life of the code. This… builds confidence.
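
To round that out with a sketch: once the spiking is done, the placeholder above grows into a real test (the SignupForm and its methods here are entirely hypothetical, just to show the shape of it), and then the code gets written to make it pass:

describe("my new thing", function () {
    it("needs to capture the phone number", function () {
        // hypothetical component under test
        let form = new SignupForm();

        form.capture({phoneNumber: "021 555 0123"});

        expect(form.getCapturedPhoneNumber()).to.equal("021 555 0123");
    });
});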

Adam @ 24:22:

That's what testing is all about, right? It's increasing confidence that you can deploy this code and nothing is going to be wrong with it. […] When I think about testing, the pinnacle of testing for me is 100% confidence that I can deploy on my way out the door at 4:55pm on Friday afternoon, with [a high degree of ~] confidence that I am not going to get paged on Saturday at 4am because some of that code that I just deployed… it went "wrong".

Exactly.

Carol @ 25:12:

The difference between the team I'm on and the team you guys have is we have - I think it's 15-ish - people touching the exact same code daily. So a patch I can put out today may have not even been in the codebase they pulled yesterday when they started working on a bug, or a week ago when they had theirs. So me writing that little extra bit of test gives them some accountability for what I've done, and me some.

Again: exactly.

Ben @ 26:36:

Even if you have a huge test suite, I can't help but think you have to do the manual testing, because what if something critical was missed. [...] I think the exhaustive test suite, what that does is it catches unexpected bugs unrelated. Or things that broke because you didn't expect them to break in a certain way. And I think that's very important.

To Ben's first point, you could just as easily (and arguably more validly) switch that around: a human doing ad-hoc manual testing is more likely to miss something, because every manual test run is at their whim and subject to their focus and attention at the time. Whereas the automated tests - which let's not forget were written by a diligent human, but right at the time they were most focused on the requirements - are run by the computer, and it will do exactly the same job every time. What having the historical corpus of automated tests gives you is increased confidence that all that stuff being tested still works the way it is supposed to, so the manual testing - which is always necessary - can be more a case of dotting the Is and crossing the Ts. With no automated tests, the manual tests need to be exhaustive. And the effort needs to be repeated every release (Adam mentions this a few minutes later as well).

To the second point: yeah precisely. Automated tests will pick up regressions. And the effort to do this only needs to be done once (writing the test). Without automated tests, you rely on the manual testing to pick this stuff up, but - being realistic - if your release is focused on PartX of the code, your manual tests are going to focus there, and possibly not bother to re-test PartZ which has just inadvertently been broken by PartX's work.

Ben also mentions this quote from Rich Hickey "Q: What happened to every bug out there? A: it passed the type checker, and it passed all tests." (I found this reference on Google: Simple Made Easy, it's at about 15:45). It's a nifty quote, but what it's also saying is that there wasn't actually a test for the buggy behaviour. Because if there was one: the test would have caught it. The same could be said more readily of manual-only testing. Obviously nothing is going to be 100%, but automated tests are going to be more reliable at maintaining the same confidence level, and be less effort, than manual-only testing.

Ben @ 28:15:

When people say it increases the velocity of development over time. I have trouble embracing that.

(Ben's also alluding back to a comment he made immediately prior to that, relating to always needing to manually test anyhow). "Over time" is one of the keys here. Once a test is written, it's there. It sticks around. Every subsequent test round, there is no extra effort to test that element of the application (Tim draws attention to this a coupla minutes later too). With manual testing the effort needs to be duplicated every time you test. Surely this is not complicated to understand. Ben's point about "you still need to manually test" misses the point that if there's a foundation of automated tests, your manual testing can become far more perfunctory. Without the tests: the manual testing is monolithic. Every. Single. Time. To be honest though, I don't know why I need to point this out. It's a) obvious; and b) very well-trod ground. There's an entire industry that thinks automated tests are the foundation of testing. And then there's Ben who's "just not sure". This is like someone "just not sure" that the world isn't actually flat. It's no small amount of hubris on his part, if I'm honest. And obviously Ben is not the only person out there in the same situation. But he's the one here on this podcast supposedly discussing testing.

Ben @ 37:08:

One thing that I've never connected with: when I hear people talk about testing, there's this idea of being able to - I think they call them spies? - create these spies where you can see if private methods get called in certain ways. And I always think to myself: "why do you care about your private methods?" That's an implementation detail. That private method may not exist next week. Just care about what your public methods are returning and that should inherently test your private methods. And people have tried to explain it to me why actually sometimes you wanna know, but I've just never understood it.

Yes good point. I can try to explain. I think there's some nuance missing in your understanding of what's going on, and what we're testing. It starts with your position that testing is only concerning itself with (my wording, paraphrasing you from earlier) "you're interested in what values a public method takes, and what it returns". Not quite. You care about whether, given inputs to a unit, the behaviour within the unit correctly provides the expected outputs from the unit. The outputs might not be the return value. Think about a unit that takes a username and password, hashes the password, and saves it to the DB. We then return the new ID of the record. Now… we're less interested in the ID returned by the method; we are concerned that the hashing takes place correctly. There is an output boundary of this unit at the database interface. We don't want our tests to actually hit the database (too slow, as Adam found out), but we mock-out the DB connector or the DAO method being called that takes the value that the model layer has hashed. We then spy on the values passed to the DB boundary, and make sure it's worked OK. Something like this:

describe("my new thing", function () {
    it ("hashes the password", function () {
        testPassword = "letmein"
        expectedHash = "whatevs"
        
        myDAO = new Mock(DAO)
        myDAO.insertRecord.should.be.passed(anything(), expectedHash)
        
        myService = new Service(myDAO)
        
        newId = myService.addUser("LOGIN_ID_NOT_TESTED", testPassword)
        
        newId.should.be.integer() // not really that useful
    });
});

class Service {

    private dao

    Service(dao) {
        this.dao = dao
    }
    
    addUser(loginId, password) {
        hashedPassword = excellentHashingFunction(password)
        
        return this.dao.insertRecord(loginId, hashedPassword)
    }
}

class DAO {
    insertRecord(loginId, password) {
        return db.insertQuery("INSERT INTO users (loginId, password) VALUES (:loginId, :password)", [loginId, password])
    }
}

OK so insertRecord isn't a private method here, but the DAO is just an abstraction from the public interface of the unit anyhow, so it amounts to the same thing, and it makes my example clearer. insertRecord could be a private method of Service.

So the thing is that you are checking boundaries, not specifically method inputs/outputs.

Also, yes, the implementation of DAO might change tomorrow. But if we're doing TDD - and we should be - the tests will be updated at the same time. More often than not though, the implementation isn't as temporary as this line of thought often assumes (for the convenience of the argument, I suspect).

Adam @ 48:41:

The more that I learn how to test well, and the more that I write good tests, the more I become a believer in automated testing (Carol: Amen). […] The more I do it the better I get. And the better I get the more I appreciate what I can get from it.

Indeed.

Tim @ 49:32:

In a business I think that short term testing is a sunk cost maybe, but long term I have seen the benefit of it. Particularly whenever you are adding stuff to a mature system, those tests pay dividends later. They don't pay dividends now […] (well they don't pay as many dividends now) […] but they do pay dividends in the long run.

Also a good quote / mindset. Testing is about the subsequent rounds of development as much as the current one.

Ben @ 50:05:

One thing I've never connected with emotionally, when I hear people talk about testing, is when they refer to tests as providing documentation about how a feature is supposed to work. And as someone who has tried to look at tests to understand why something's not working, I have found that they provide no insight into how the feature is supposed to work. Or I guess I should say specifically they don't provide answers to the question that I have.

Different docs. They don't provide developer docs, but if following BDD practices, they can indicate the expected behaviour of the piece of functionality. Here's the test run from some tests I wrote recently:

root@1b011f8852b1:/usr/share/nodeJs# npm test

> nodejs@1.0 test
> mocha test/**/*.js



  Tests for Date methods functions
    Tests Date.getLastDayOfMonth method
      ✓ returns Jan 31, given Jan 1
      ✓ returns Jan 31, given Jan 31
      ✓ returns Feb 28, given Feb 1 in 2021
      ✓ returns Feb 29, given Feb 1 in 2020
      ✓ returns Dec 31, given Dec 1
      ✓ returns Dec 31, given Dec 31
    Tests Date.compare method
      ✓ returns -1 if d1 is before d2
      ✓ returns 1 if d1 is after d2
      ✓ returns 0 if d1 is the same d2
      ✓ returns 0 if d1 is the same d2 except for the time part
    Tests Date.daysBetween method
      ✓ returns -1 if d1 is the day before d2
      ✓ returns 1 if d1 is the day after d2
      ✓ returns 0 if d1 is the same day as d2
      ✓ returns 0 if d1 is the same day as d2 except for the time part
    Tests for addDays method
      ✓ works within a month
      ✓ works across the end of a month
      ✓ works across the end of the year
      ✓ works with zero
      ✓ works with negative numbers

  Tests a method Reading.getEstimatesFromReadingsArray that returns an array of Readings representing month-end estimates for the input range of customer readings
    Tests for validation cases
      ✓ should throw a RangeError if the readings array does not have at least two entries
      ✓ should not throw a RangeError if the readings array has at least two entries
      ✓ should throw a RangeError if the second date is not after the first date
    Tests for returned estimation array cases
      ✓ should not include a final month-end reading in the estimates
      ✓ should return the estimate between two monthly readings
      ✓ should return three estimates between two reading dates with three missing estimates
      ✓ should return the integer part of the estimated reading
      ✓ should return all estimates between each pair of reading dates, for multiple reading dates
      ✓ should not return an estimate if there was an actual reading on that day
      ✓ should return an empty array if all readings are on the last day of the month
      ✓ tests a potential off-by-one scenario when the reading is the day before the end of the month

  Tests for helper functions
    Tests for Reading.getEstimationDatesBetweenDates method
      ✓ returns nothing when there are no estimates dates between the test dates
      ✓ correctly omits the first date if it is an estimation date
      ✓ correctly omits the second date if it is an estimation date
      ✓ correctly returns the last date of the month for all months between the dates

  Test Timer
    ✓ handles a lap (100ms)

  Test TimerViaPrototype
    ✓ handles a lap (100ms)


  36 passing (221ms)

root@1b011f8852b1:/usr/share/nodeJs#

When I showed this to the person I was doing the work for, he immediately said "no, that test case is wrong, you have it around the wrong way", and they were right, and I fixed it. That's the documentation "they" are talking about.

Oh, and Carol goes on to confirm this very thing one minute later.

Also bear in mind that just cos a test could be written in such a way as to impart good clear information doesn't mean that all tests do. In my experience, looking at open-source projects' tests to get any clarity on things (and I include testing frameworks' own tests in this!) leaves me knowing less than I did before I looked. It's almost like there's a rule in OSS projects that the code needs to be shite or they won't accept it ;-)


And that's it. It was an interesting podcast, but I really really strongly disagreed with most of what Ben said, and why he said it. It would be one thing if he was held to account (and the others tried this at times), but as it is, other than joking that Ben is a naysayer, I think there's some dangerous content in here.

Oh, one last thing… in the outro the team suggests some resources for testing. Most of what they suggested seems to be "what to do", not "why you do it". I think the first thing one should do when considering testing is to read Test Driven Development by Kent Beck. Start with that. Oh, this reminds me… there was not actually much discussion of TDD in this episode. TDD is tangential to testing per se, but it's an important topic. Maybe they can do another episode focusing on that.

Follow-up

The Working Code Podcast team have responded to my observations in a subsequent podcast episode, which you can listen to here: 011: Listener Questions #1. Go have a listen.

Righto.

--
Adam

Thursday 21 January 2021

Listening to the console log of a page loaded with Puppeteer

G'day:

This is a follow-up from something I touched on yesterday ("Polishing my Vue / Puppeteer / Mocha / Chai testing some more"). In that exercise I was using Puppeteer to load a web page I was testing, and then pulling some DOM element values out and checking they matched expectations. The relevant bits of code are thus:

describe.only("Tests of githubProfiles page using github data", function () {
    let browser;
    let page;
    let expectedUserData;

    before("Load the page and test data", async function () {
        await loadTestPage();
        expectedUserData = await loadTestUserFromGithub();
    });

    let loadTestPage = async function () {
        browser = await puppeteer.launch( {args: ["--no-sandbox"]});
        page = await browser.newPage();

        await Promise.all([
            page.goto("http://webserver.backend/githubProfiles.html"),
            page.waitForNavigation()
        ]);
    }

    it("should have the expected person's name", async function () {
        let name = await page.$eval("#app>.card>.content>a.header", headerElement => headerElement.innerText);
        name.should.equal(expectedUserData.name);
    });

  • Load the page with Puppeteer
  • Example test checking the page's DOM

This code seemed to be running fine, and the tests were passing. As I was adding more code to my Vue component on the client end, I suddenly found the tests started to fail. Sometimes. If I ran them ten times, they'd fail maybe three times. At the same time, if I was just hitting the page in the browser, it was working 100% of the time. Odd. I mean clearly I was doing something wrong, and I'm new to all this async code I'm using, so I figured I was using values before they were available or something. But it seemed odd that this was only manifesting sometimes. The way the tests were failing was telling though:

1) Tests of githubProfiles page using github data
       should have the expected person's name:

      AssertionError: expected '' to equal 'Alex Kyriakidis'
      + expected - actual

      +Alex Kyriakidis

The values coming from the DOM were blank. And note that it's not a case of the DOM being wrong, because if that was the case, the tests would barf all the time, with something like this:

Error: Error: failed to find element matching selector "#app>.card>.content>a.header"

The relevant mark-up here is:

<a class="header" :href="pageUrl">{{name}}</a>

So {{name}} isn't getting its value sometimes.

I faffed around for a bit reading up on Vue components, and their lifecycle handlers in case created was not the right place to load the data or something like that, but the code seemed legit.

My JS debugging is not very sophisticated: it's basically a matter of console.logging stuff and seeing what happens. So I chucked a bunch of log calls in:

created () {
    console.debug(`before get call [${this.username}]`);
    axios.get(
        `https://api.github.com/users/${this.username}`,
        {
            auth: {
                username: this.$route.query.GITHUB_PERSONAL_ACCESS_TOKEN
            }
        }
    )
    .then(response => {
        console.debug(`beginning of then [${response.data.name}]`);
        this.name = response.data.name;
		// [etc...]
        console.debug("end of then");
    });
    console.debug("after get call");
}

Along with some other ones around the place, these all did what I expected when I hit the page in the browser:

beginning of js
githubProfiles.js:46 before VueRouter
githubProfiles.js:52 before Vue
githubProfiles.js:23 before get call [hootlex]
githubProfiles.js:43 after get call
githubProfiles.js:63 end of js
githubProfiles.js:33 beginning of then [Alex Kyriakidis]
githubProfiles.js:41 end of then

I noted that the then call was being fulfilled after the mainline code had finished, but in my test I was waiting for the page to fully load, so I'd catered for this. Repeated from above:

await Promise.all([
    page.goto("http://webserver.backend/githubProfiles.html"),
    page.waitForNavigation()
]);

I ran my tests, and was not seeing anything in the console which momentarily bemused me. But then I was just "errr… duh, Cameron. That stuff is logging in the web page's console. Not Node's console from the test run". I'm really thick sometimes.

This flummoxed me for a bit as I wondered how the hell I was going to get telemetry out of the page that I was calling in the Puppeteer headless browser. Then it occurred to me that I would not be the first person to wonder this, so I just RTFMed.

It's really easy! The Puppeteer Page object exposes event listeners one can hook into, and one of the events is console. Perfect. All I needed to do was put this into my test code:

page = await browser.newPage();

page.on("console", (log) => console.debug(`Log from client: [${log.text()}] `));

await Promise.all([
    page.goto("http://webserver.backend/githubProfiles.html"),
    page.waitForNavigation()
]);

Then when I ran my tests, I was console-logging the log entries made in the headless browser as they occurred. What I was seeing was:

  Tests of githubProfiles page using github data
Log from client: [beginning of js]
Log from client: [before VueRouter]
Log from client: [before Vue]
Log from client: [before get call [hootlex]]
Log from client: [after get call]
Log from client: [end of js]
    1) should have the expected person's name
    2) should have the expected person's github page URL
    3) should have the expected person's avatar
    4) should have the expected person's joining year
    5) should have the expected person's description
Log from client: [beginning of then [Alex Kyriakidis]]
Log from client: [end of then]
    6) should have the expected person's number of friends
    ✓ should have the expected person's friends URL


  1 passing (4s)
  6 failing

Note how the tests get underway before the then call takes place. And shortly after that, the tests start passing because by then the dynamic values have actually been loaded into the DOM. This is my problem! That page.waitForNavigation() is not waiting long enough! My first reaction was to blame Puppeteer, but I quickly realised that's daft and defensive of me: given this is the first time I've messed with this stuff, almost certainly I was doing something wrong. Then it occurred to me that a page is navigable once the various asset files are loaded, but not necessarily when any code in them has run. Duh. I figured Puppeteer would have thought of this, so there'd be something else I could make it wait for. I googled around and found the docs for page.waitForNavigation, and indeed I needed to be doing this:

page.waitForNavigation({waitUntil: "networkidle0"})

After I did that, I found the tests still failing sometimes, but now due to a time out:

  Tests of githubProfiles page using github data
Log from client: [beginning of js]
Log from client: [before VueRouter]
Log from client: [before Vue]
Log from client: [before get call [hootlex]]
Log from client: [after get call]
Log from client: [end of js]
Log from client: [beginning of then [Alex Kyriakidis]]
Log from client: [end of then]
    1) "before all" hook: Load the page for "should have the expected person's name"


  0 passing (4s)
  1 failing

  1) Tests of githubProfiles page using github data
       "before all" hook: Load the page for "should have the expected person's name":
     Error: Timeout of 5000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/usr/share/fullstackExercise/tests/functional/public/GithubProfilesTest.js)

I had the timeout set for five seconds, but now that the tests were waiting for the client to finish its async call as well, I was just edging over that five-second mark sometimes. So I just bumped it to 10 seconds, and thenceforth the tests all passed all the time.
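
(For the record, the timeout is a Mocha thing rather than a Puppeteer thing. Where you bump it depends on how you set it in the first place; one way - just a sketch, using the before hook from earlier - is like this:)

before("Load the page and test data", async function () {
    this.timeout(10000); // allow for the page's own async call out to GitHub
    await loadTestPage();
    expectedUserData = await loadTestUserFromGithub();
});

Anyway, I've left the telemetry in for one last successful run here: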

  Tests of githubProfiles page using github data
Log from client: [beginning of js]
Log from client: [before VueRouter]
Log from client: [before Vue]
Log from client: [before get call [hootlex]]
Log from client: [after get call]
Log from client: [end of js]
Log from client: [beginning of then [Alex Kyriakidis]]
Log from client: [end of then]
    ✓ should have the expected person's name
    ✓ should have the expected person's github page URL
    ✓ should have the expected person's avatar
    ✓ should have the expected person's joining year
    ✓ should have the expected person's description
    ✓ should have the expected person's number of friends
    ✓ should have the expected person's friends URL


  7 passing (5s)

OK so that was a bit of a newbie exercise, but I'm a noob so yer gonna get that. It was actually pretty fun working through it though. I'm really liking all this tooling I'm checking out ATM, so yer likely get a few more of these basic articles from me.

Righto.

--
Adam

Saturday 9 November 2019

I need to backtrack on some TDD guidance. I think

G'day:
Yes, this thing still exists.

Possibly foolishly my employer has made me tech lead of one of our software applications, and part of that is to lead from the front when it comes to maintaining code quality, including our TDD efforts. So I tend to pester the other devs about TDD techniques and the like. O how they love that. Some of them. Occasionally. Actually seldom. Poor bastards.

Anyway, recently I started to think about the difference between "fixing tests that would break if we made this new change to the code" and "writing a TDD-oriented test case before making the code change". Just to back up a bit... this is in the context of "write a failing test for the case being tested... then write the code to pass the test". The red and green from the "red green refactor" trope. This started from doing code review and seeing existing tests being updated to accommodate new code changes, but not seeing any actual new test case explicitly testing the new requirement. This did not sit well with me, so I've been asking the devs "where's yer actual test case for the change we're making here?"

To help with this notion I wrote a quick article on WTF I was on about. I was initially pleased with it and thought I'd nailed it, but when I asked my more-knowledgeable colleague Brian Sadler to sanity check it, he wasn't convinced I had nailed it, and moreover wasn't sure I was even right. Doh. I've been mulling over this for the rest of the week and I've come to the conclusion... yeah, he's right and I'm wrong. I think. I'm writing this article to put my thinking down in public, and see what you lot think. If anyone still reads this I mean. I will be chasing some ppl to read it, that said...

I can't just reproduce the original article cos it was an internal document and work owns it. So I'm reproducing a similar thing here.

Here's some sample code. I stress the code used here bears no relation to any code or situation we have in our codebase. It's just a simplified case to contextualise the discussion:

class ThingRepository {

    public function getThing($id) {
        $dao = DaoFactory::getThingDao();
        $dbData = $dao->getThing($id);

        return Thing::createFromArray($dbData);
    }
}

It gets raw data for the requested thing from a DAO and returns the data as a Thing object.

It also has an existing test:


class ThingRepositoryTest extends TestCase {

    private $repository;

    protected function setUp() : void {
        $this->repository = new ThingRepository();
    }

    public function testGetThingReturnsCorrectThingForId(){
        $testId = 17;
        $expectedName = "Thing 1";
        $thing = $this->repository->getThing($testId);

        $this->assertInstanceOf(Thing::class, $thing);
        $this->assertSame($testId, $thing->id);
        $this->assertSame($expectedName, $thing->name);
    }
}

And this all works fine:

C:\src\php\general>vendor\bin\phpunit test\tdd\fixingExistingTests\ThingRepositoryTest.php
PHPUnit 8.4.3 by Sebastian Bergmann and contributors.

.                                                                   1 / 1 (100%)

Time: 54 ms, Memory: 4.00 MB

OK (1 test, 3 assertions)

C:\src\php\general>

If you were paying attention, you'll've spotted we're tightly coupling the DAO implementation here to its repository. At least we're using a factory, but still it is tightly coupled to the Repository implementation. We want to use dependency injection to provide the DAO to the repository.

Here's where my stance gets contentious.

Someone has addressed this requirement, and what they've done is updated that existing test to expect the passed-in DAO:

class ThingRepositoryExistingTestUpdatedTest extends TestCase {

    private $repository;
    private $dao;

    protected function setUp() : void {
        $this->dao = $this->createMock(ThingDao::class);
        $this->repository = new ThingRepositoryWithDaoDependency($this->dao);
    }

    public function testGetThingReturnsCorrectThingForId(){
        // implementation that now uses mocked DAO
    }
}

Note that this is exactly how I presented this information in the original work article, with the test implementation changes elided. This becomes relevant later.

Having prepped that test - and it fails nicely, cool - the dev then goes and sorts out the repo class.

I went on to reject that approach as not being valid TDD. My position is that "fixing other tests so they won't fail after we make the change" is not TDD, it's just technical debt recovery. If we're doing TDD, then we need to start with an explicit test case for the change we're going to make. I'll put this in a highlight box to draw attention to it:

Fixing other tests so they won't fail after we make the change is not TDD, it's just technical debt recovery. If we're doing TDD, then we need to start with an explicit test case for the change we're going to make.
This is what I'd expect to see:

class ThingRepositoryTest extends TestCase {

    private $repository;
    private $dao;

    protected function setUp() : void {
        $this->dao = $this->createMock(ThingDao::class);
        $this->repository = new ThingRepositoryWithDaoDependency($this->dao);
    }

    public function testGetThingUsesDaoPassedIntoConstructor(){
        $testId = 17;
        $mockedThingData = ['id'=>$testId, 'thing' => 'stuff'];

        $this->dao
            ->expects($this->once())
                        ->method('getThing')
                        ->with($testId)
                        ->willReturn($mockedThingData);

        $result = $this->repository->getThing($testId);

        $this->assertEquals(
            new Thing($mockedThingData['id'], $mockedThingData['thing']),
            $result
        );
    }
    
    // ... rest of test cases that were already there, including this one...

    public function testGetThingReturnsCorrectThingForId(){
        // updated implementation that doesn't break because of the DAO change
    }
}

Here we have an explicit test case for the change we're making. This is TDD. We're explicitly injecting a DAO into the test repo, and we're checking that DAO is being used to get the data.
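
One assumption worth calling out: for createMock(ThingDao::class) to work, there needs to be a ThingDao type for PHPUnit to mock. It's not shown in the article, so this is a guess at the contract the test relies on:

interface ThingDao {

    // returns the raw row data for the requested thing,
    // eg ['id' => 17, 'name' => 'Thing 1']
    public function getThing($id);
}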

This test is red:

C:\src\php\general>vendor\bin\phpunit test\tdd\fixingExistingTests\ThingRepositoryTest.php
PHPUnit 8.4.3 by Sebastian Bergmann and contributors.

F                                                                   1 / 1 (100%)

Time: 61 ms, Memory: 4.00 MB

There was 1 failure:

1) me\adamcameron\general\test\tdd\fixingExistingTests\ThingRepositoryTest::testGetThingUsesDaoPassedIntoConstructor
Expectation failed for method name is "getThing" when invoked 1 time(s).
Method was expected to be called 1 times, actually called 0 times.

FAILURES!
Tests: 1, Assertions: 4, Failures: 1.

C:\src\php\general>

But that means we're now good to make the changes to the repository:

class ThingRepository {

    private $dao;

    public function __construct($dao) {
        $this->dao = $dao;
    }

    public function getThing($id){
        $dbData = $this->dao->getThing($id);

        return Thing::createFromArray($dbData);
    }
}

Now our test is green, and we're done.
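
For completeness: the wiring that the factory call used to do inside getThing now has to happen wherever the repository is constructed. Exactly how depends on the app (a DI container, most likely), but conceptually it's just this:

// wherever the repository gets built - eg a DI container definition
$repository = new ThingRepository(DaoFactory::getThingDao());
$thing = $repository->getThing(17);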

I rounded out the article with this:

As a rule of thumb... if you don't start your TDD work by typing in a case name - eg testGetThingUsesDaoPassedIntoConstructor - then you're not doing TDD, you are just doing maintenance code. These are... not the same thing.

Brian read the article and - my words not his - didn't get the distinction I was trying to make. Why was the first test update not sufficient? I could not articulate my position any differently than by repeating what I'd already said - which is no help to anyone - so we decided to get together later and look at how the implementation of the test I had elided would pan out, and why it wouldn't itself be a decent test. Work being busy this week, we never had that confab, but I continued to think about it, and this morning I concluded that Brian was right, and I was mistaken.

It comes down to how I omitted the implementation of that first test we updated. I did not do this on purpose; I just didn't want to show an implementation that was not helpful to the article. But in doing that, I never thought about what the implementation would be, and set up a straw man to support my argument. I went back this morning and implemented it...

public function testGetThingReturnsCorrectThingForId(){
    $testId = 17;
    $expectedName = "Thing 1";

    $this->dao->expects($this->once())
        ->method('getThing')
        ->with($testId)
        ->willReturn(['id' => $testId, 'name' => $expectedName]);

    $thing = $this->repository->getThing(17);

    $this->assertInstanceOf(Thing::class, $thing);
    $this->assertSame($testId, $thing->id);
    $this->assertSame($expectedName, $thing->name);
}


This is what I'd need to do to that first test to get it to not break when we change the DAO to being injected. And... erm... it's the same frickin' logic as in the test I said we need to explicitly make. I mean I wrote the two versions at different times, so I solved the same problem slightly differently, but it's the same thing for all intents and purposes.

So... let's revisit my original position then:

Fixing other tests so they won't fail after we make the change is not TDD, it's just technical debt recovery. If we're doing TDD, then we need to start with an explicit test case for the change we're going to make.

It's not intrinsically true. Actually fixing existing tests can indeed be precisely the TDD-oriented cover we need.

I love being wrong. No sarcasm there: I really do. Now I'm thinking better about this topic. Win.

The only thing left after the distillation is the difference between these two:

Original test name (detailing its original test case requirement):

testGetThingReturnsCorrectThingForId

And my suggested case name for the new injected-DAO requirement:

testGetThingUsesDaoPassedIntoConstructor

Now I do still think there's merit in being that explicit about what we're testing. But it's well into the territory of "perfect is the enemy of good". We're fine just updating that original test.

I thought of how to deal with this, and came up with this lunacy:

public function testGetThingUsesDaoPassedIntoConstructor()
{
    $this->testGetThingReturnsCorrectThingForId();
}

But I think that's too pedantic even for me. I think we can generally just take it as given that the existing test covers us.



I'll close by saying I do still think it's good advice, when starting the testing for a code change, to first come up with the test case name. It gets one into the correct mindset for approaching things in a TDD manner. But one oughtn't be as dogmatic about this as I clearly tend to be... existing tests could quite possibly provide the TDD-oriented cover we need for code changes. We do still need to start with a breaking test, and the test does actually need to cover the case for the change... but it might be an existing test.

Oh and making the code change then fixing the tests that break as a result is not TDD. That's still wrong ;-)

I'm dead keen to hear what anyone might have to say about all this...

Righto.

--
Adam

Saturday 19 August 2017

Mishmash: code review as a learning exercise, loops vs higher-order-functions and testing

G'day:
There's a few things going on in my head here, and none individually are worthy of an article, but perhaps if I dwell on a bunch of stuff something might shake out. I have no idea, as I have not drafted this one in my head like I usually do; I'm just going with "let's start typing and see if I get anywhere". Eek.

OK so yesterday I was code reviewing one of my mates' work, and breezed past this code (this is not the exact code, but it's a replication of it):

public function getSomeThings($rowsToGet)
{
    $thingsValues = $this->dao->getThingsFromStorage($rowsToGet);
    
    $things = [];
    foreach ($thingsValues as $thingValues) {
        $things[] = $this->thingFactory->createFromRawData($thingValues);
    }

    return $things;
}

I just put a note in the pull request "array_map?".

Now I do a few things in a code review. Firstly I check for obvious flaws, unclean code, bad test coverage, typos, etc, and I will raise those as "tasks" in BitBucket: stuff that needs to be addressed before I'll press the "Approved" button. This is our standard approach to code review, everyone expects it, and it's fine. It works.

But secondly I'll just make observations about the code, ask questions, suggest alternative approaches we could have taken, etc. Just as food for thought, a discussion point, or a chance for some learning. This was the latter. I was not suggesting to my mate they needed to change their code to match my whim... the code is fine as it is. It was just "did you think of this?".

An interesting conversation ensued, which I will paraphrase here. Note that Fred and I do not work in the same physical office.

Fred (his name's not Fred): I'm not sure the benefit of swapping simple code for a more complex solution, unless there's some performance gain I'm unaware of.

Adam (my name is Adam): Hmmm... well array_map is one expression, and it reduces the complexity here to a single variable assignment statement(*). The loop version requires building the array by hand, and requires two assignments and the whole loop thing too. So I think the array_map approach is less complex innit?

return array_map(function ($thingValues) {
    return $this->thingFactory->createFromRawData($thingValues);
}, $thingsValues);

(*) OK it's two statements. Never let it be said I don't hyperbolise to better make my point.

One thing I did not get around to saying is that performance is not a consideration here. One of the calls is hitting an external DB, so any processing-time differences are going to be eclipsed - by orders of magnitude - by the external call. And, yes, I know that if we were inclined to worry about micro-optimisations, calling a higher-order function is slower than a loop. One should simply not care.

Adam (cont'd): furthermore, the array_map version only requires a single test, whereas the loop version needs three. That in itself demonstrates the increased complexity.

Fred: err... why's that then?

I'll break away from the conversation here.

The dev process here is that we identify the need to have a function to get stuff from the DB, and then model that and return it to the calling code. So doing TDD we write a test which uses a mocked-out DAO to return known data, and we test that the values in the resultant models are what we expected them to be based on the mocked DAO data.

Here's a very simplified, self-contained example:

First our test:

describe('comparing testing for foreach vs array_map', function() {
    given('sut', function () {
        return new ForeachVsArrayMap();
    });
    given('allObjects', function () {
        return [
            (object) ['id' => 1, 'mi' => 'tahi'],
            (object) ['id' => 2, 'mi' => 'rua'],
            (object) ['id' => 3, 'mi' => 'toru'],
        ];
    });
    describe('tests of foreach', function() {
        it('returns the numbers as objects', function () {
            $result = $this->sut->getObjects();
            expect($result)->toEqual($this->allObjects);
        });
    });
});

And now our implementation:

class ForeachVsArrayMap
{
    private $allData;

    public function __construct()
    {
        $this->allData = [
            ['id' => 1, 'mi' => 'tahi'],
            ['id' => 2, 'mi' => 'rua'],
            ['id' => 3, 'mi' => 'toru']
        ];
    }

    private function getNumbers($rows)
    {
        return array_slice($this->allData, 0, $rows);
    }

    public function getObjects($rows = 3)
    {
        $data = $this->getNumbers($rows);
        $numbers = [];
        foreach ($data as $row) {
            $numbers[] = (object) $row;
        }

        return $numbers;
    }
}

And that passes. That's our TDD side of things done.

However we now have to do edge testing as well. We've got a loop in there, and all loops need tests for three things: zero iterations, one iteration, and more than one iteration. It's easy to introduce logic errors around the plurality of loop iterations.

Here our TDD test has covered the "more than one iteration" case, so we only need the other two:

it('handles one result', function () {
    $result = $this->sut->getObjects(1);
    $expected = [$this->allObjects[0]];

    expect($result)->toEqual($expected);
});

it('handles zero results', function () {
    $result = $this->sut->getObjects(0);
    $expected = [];

    expect($result)->toEqual($expected);
});

Pretty simple tests, but still: two additional tests.

Why don't we need these two tests when using array_map? Because array_map's iteration is all handled within its implementation. It's PHP's job to test that, not ours. So all we need to test is that the inline callback works. And we only need to test that once: when it's called, it does what we expect it to do. We can safely stipulate that array_map will always return an array with the same number of elements as the input value. PHP deals with that; we don't need to.
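
If that sounds hand-wavy, here's a quick illustration of why the zero-iteration edge simply can't go wrong with array_map: there's no accumulator for us to forget to initialise, so an empty input is guaranteed to yield an empty array:

$result = array_map(function ($row) {
    return (object) $row;
}, []);

var_dump($result); // array(0) {} - always an array, never an undefined variable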

One good thing about TDD (and test coverage in general) is that now we can safely swap in the array_map version, and the tests still pass, so we know - with a high degree of certainty - that our code is A-OK.

public function getObjects($rows = 3)
{
    return array_map(function ($row) {
        return (object) $row;
    }, $this->getNumbers($rows));
}

Coincidentally I had another conversation, in another code review, on Friday about the necessity of always testing loops for 0, 1, >1 iterations. "One can just look at the code and know it's fine". This statement didn't come up in the conversation, but I've heard it before.

Three weeks ago I was show-stopped by a bug in one of our libraries. Taking the analogy above, this was the code:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    foreach ($data as $row) {
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}

The code had no tests. Can you see what's wrong? The general expectation when calling a function to get data is that there's actually data to get, given the provided criteria. In the case I fell foul of, the value of $rows was zero. And this was legit. So if $rows is 0, what value does $numbers have after the loop? Well: $numbers was undefined.

Had the dev done their testing properly, then they'd've noticed their bug. They never initialised $numbers outside of the code in the loop, so there's a chance it might never get defined. In this case not only did the dev not do the edge-case testing, they didn't even do the baseline TDD testing either, but that's a different story.

I needed to down-tools and fix the library before I could continue my own work. And it was made very tricky by the fact the code in the library had no tests at all, and wasn't written in a clean-code fashion (so was basically untestable). But we are not allowed to work that way any more, so I needed to do a bunch of work the original developer ought to have done.
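
The fix to the loop itself was trivial, which is rather the point - something along these lines (I'm paraphrasing the library code here, just as with the broken version above):

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    $numbers = []; // initialised outside the loop, so zero iterations still yields an empty array
    foreach ($data as $row) {
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}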

I have also fallen foul of this brainfart:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    foreach ($data as $row) {
        $numbers = [];
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}

This usually crops up when code that was previously used once suddenly needs to be used for multiple iterations. Note that the initialisation of $numbers is inside the loop, so it will re-initialise every iteration. This would be caught by the "more than one iteration" test. This is rarer, and usually only crops up in completely untested code: it'd be a shit tester who tested a loop with only one iteration, but I've seen it done. Saves time not having to mock so much data, you see? Berks.

So, anyhow... always test yer loops. And try to avoid loops if there are built-in functions that handle the iteration for you.
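
For the avoidance of doubt about "built-in functions that handle the iteration for you": array_map is not the only one. A couple of other common examples (the data here is just made up for illustration):

$numbers = [1, 2, 3, 4];

// keep only the elements the callback returns true for
$evens = array_filter($numbers, function ($n) {
    return $n % 2 === 0;
});

// boil the whole array down to a single value
$total = array_reduce($numbers, function ($carry, $n) {
    return $carry + $n;
}, 0);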

Also always do TDD, but also understand that there is more testing to do after the TDD cycle. TDD is not the end of the story there.



In the title of this article I mention "code review as a learning exercise". I've greatly summarised the back and forth between Fred and myself here... there was an order of magnitude more discussion than there was code under discussion. This sort of interaction is gold. Some people get the idea that code review is kinda like a teacher marking homework. It's partly that, but that's only part.

As programmers and coders our job is one of communication: communication of instructions to a computer, both to the computer itself, but also to our future colleagues and selves about the nature of those instructions. Programming languages are a means to communicate instructions to other humans; it's almost tangential that the computer can also be made to understand them. Similarly, reviewing each other's code should be a communication and collaboration process, and we absolutely should be having these discussions.

Both Fred and I benefited from this conversation: a lot of what I typed here wasn't formalised in my brain before explaining it, but now I have a much clearer picture of what I do and why. And Fred knows too. This was not a waste of time, as we're both now better developers for it. And it might have seemed like a lot of back and forth in the code review, but it really only represents about 10min of discussion. And all this came from just a conversation point, not me saying "change the code".

If we were in the same office, we might have had the conversation via voice instead of typing in the code review, and some people might think that's better. In a way it is, but it also means that the details of the conversation don't get recorded, so they benefit fewer people. That said, when programmers are not agreeing on things, it is often better to switch to voice rather than continue the discussion via text-based back-and-forth. It's just easier and more collaborative sometimes.

Right, that's that. A bit random, but didn't turn out so bad I think. And I seem to have found a point to make.

Righto.

--
Adam

Monday 31 July 2017

Yeah, you do want 100% test coverage

G'day:
I've read a number of blog articles recently that have been eschewing 100% test coverage. The rationale being that there's little difference between 90% and 100%, and that that last 10% can be quite tricky to attain as - we being human 'n' all - we sometimes leave the worst to last. Obviously we should all be doing TDD so that situation would not arise, but - again - we're human so don't always do TDD for one reason or another ("can't be arsed" is the usual one).

Another rationale is that it's not a race to 100%. IE: the object of the exercise is delivering working software which adds business value; it's not "getting 100% test coverage".

The latter is true. But we still aim for 100% test coverage for basically one good reason: visibility.

First, look at this:



Not bad I guess. Not great.

Now look at this:



We've accidentally forgotten to cover something new. Can you see it? (It's one of the provider methods.)

On the other hand look at these two:



Somewhat easier to spot, yeah?

This is bloody handy. In this case I just commented out one of my tests to get the report off 100%, but in the real world I'd see that and go "huh, WTF?", and perhaps discover I had a logic branch untested, or that a test I thought was covering something wasn't actually doing so. Or that I'd just forgotten to put the @covers annotation on the test. Whatever the situation: it's handy for the shortfall to be so obvious.
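
In case the @covers annotation is unfamiliar: it's PHPUnit's way of declaring exactly which production code a test is claiming to cover, so coverage doesn't get credited to code a test merely happens to execute in passing. The class name here is made up for illustration:

/**
 * @covers \me\example\WidgetRepository::getWidget
 */
public function testGetWidgetReturnsCorrectWidgetForId()
{
    // ... the test itself
}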

What it also means is part of our pull-request process is running a Bamboo build which runs all our tests (plus some other code quality tools):

Everything's green here, so it's OK for code review. If it was failing, I'd get a nudge from my reviewers going "erm... you sure this is ready to go?" And even if I thought it was, we can't merge anything whilst the build is failing anyhow. The issue needs to be addressed.

Now the key thing here is this:


We don't test everything, but we cover everything. Where "cover" might be "we don't need to test this". In this case a single variable assignment does not need testing. We trust PHP to follow simple instructions. The thing is: this still shows up as green in the test coverage report.

We do this in a few situations:

  • the method has no branching logic and is otherwise pretty simple. Ignore it (there's a sketch of how we mark that up just after this list).
  • we're backfilling coverage on old code we need to maintain, but it's such a mess we simply don't have time to solve all the testing shortfalls right now, so we test what is urgent (the bit we're changing), and just mark the rest of it as ignored. Now that file is green, and we can see if it stops being green.
  • similar to the above, but we know the code in question is not long for this world. So we ignore it and stick in a TODO cross-referencing the ticket that will be dealing with it. And the ticket must already be lined up to be worked on "soon".
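
To make the first of those bullet points concrete: we're on PHPUnit, so "mark as ignore" is just an annotation on the method (or class). The class here is made up for illustration:

class WidgetConfig
{
    private $timeout;

    /**
     * Simple assignment; we trust PHP to follow simple instructions.
     * @codeCoverageIgnore
     */
    public function setTimeout($timeout)
    {
        $this->timeout = $timeout;
    }
}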

There might be other situations where we need to be pragmatic and skip some stuff, but this needs to be explained to the code reviewers, and get their agreement.

But being able to run a report that says "100%" - and that shows up anything which isn't 100% - is gold.

Get yer coverage report green, but don't be afraid to be pragmatic and skip stuff, if there's an informed reason to do so.

Now this isn't to say one can just pepper ignore annotations around the place simply to get the report green. That's daft. The object of the exercise is not to make some report green / say 100%. It's to get good code quality. This is just a tool to help identify the quality: it's still up to the devs to provide the quality. Don't forget that. It needs to be an informed, honest decision for one to ignore code.

Righto.

--
Adam