
Friday, 19 February 2021

Part 12: unit testing Vue.js components

G'day

OKOK, another article in that bloody series I've been doing. Same caveats as usual: go have a breeze through the earlier articles if you feel so inclined. That said this is reasonably stand-alone.

  1. Intro / Nginx
  2. PHP
  3. PHPUnit
  4. Tweaks I made to my Bash environment in my Docker containers
  5. MariaDB
  6. Installing Symfony
  7. Using Symfony
  8. Testing a simple web page built with Vue.js using Mocha, Chai and Puppeteer
  9. I mess up how I configure my Docker containers
  10. An article about moving files and changing configuration
  11. Setting up a Vue.js project and integrating some existing code into it
  12. Unit testing Vue.js components (this article)

This is really very frustrating. You might recall I ended my previous article with this:

Now I just need to work out how to implement a test of just the GreetingMessage.vue component discretely, as opposed to the way I'm doing it now: curling a page it's within and checking the page's content. […]

Excuse me whilst I do some reading.

[Adam runs npm install of many combinations of NPM libs]

[Adam downgrades his version of Vue.js and does the npm install crap some more]

OK screw that. Seems to me - on initial examination - that getting all the libs together to make stand-alone components testable is going to take me longer to work out than I have patience for. I'll do it later. […] Sigh.

I have returned to this situation, and have got everything working fine. Along the way I had an issue with webpack that I eventually worked around, but once I circled back to replicate the work and write this article, I could no longer reproduce the issue - even after rolling back to the previous version of the application code and repeating, step by step, what originally got me to the problem. This is very frustrating. However other people have had similar issues in the past, so I'm going to include the steps to solve the problem here, even if I have to fake the issue to get error messages, etc.

Right so I'm back with the Vue.js docs regarding testing: "Vue.JS > Testing > Component Testing". The docs are incredibly content-lite, and pretty much just fob you off onto other people to work out what to do. Not that cool, and kinda suggests Vue.js considers the notion of unit testing pretty superficially. I did glean I needed to install a coupla Node modules.

First, @testing-library/vue, for which I ran into a glitch immediately:

root@00e2ea0a3109:/usr/share/fullstackExercise# npm install --save-dev @testing-library/vue
npm ERR! code ERESOLVE
npm ERR! ERESOLVE unable to resolve dependency tree
npm ERR!
npm ERR! While resolving: full-stack-exercise@2.13.0
npm ERR! Found: vue@3.0.5
npm ERR! node_modules/vue
npm ERR!   vue@"^3.0.0" from the root project
npm ERR!
npm ERR! Could not resolve dependency:
npm ERR! peer vue@"^2.6.10" from @testing-library/vue@5.6.1
npm ERR! node_modules/@testing-library/vue
npm ERR!   dev @testing-library/vue@"*" from the root project
npm ERR!
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force, or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.
npm ERR!
npm ERR! See /var/cache/npm/eresolve-report.txt for a full report.

npm ERR! A complete log of this run can be found in:
npm ERR!     /var/cache/npm/_logs/2021-02-19T11_29_15_901Z-debug.log
root@00e2ea0a3109:/usr/share/fullstackExercise#

The current version of @testing-library/vue doesn't work with the current version of Vue.js. Neato. Nice one Vue team. After some googling of "wtf?", I landed on an issue someone else had raised already: "Support for Vue 3 #176": I need to use the next branch (npm install --save-dev @testing-library/vue@next). This worked OK.

The other module I needed was @vue/cli-plugin-unit-mocha. That installed with no problem. This all gives me the ability to run vue-cli-service test:unit, which will call up Mocha and run some tests. The guidance is to set this up in package.json, thus:

  "scripts": {
    "test": "mocha test/**/*Test.js",
    "serve": "vue-cli-service serve --watch",
    "build": "vue-cli-service build",
    "lint": "vue-cli-service lint",
    "test:unit": "vue-cli-service test:unit test/unit/**/*.spec.js"
  },

Then one can just run it as npm run test:unit.

I looked at the docs for how to test a component ("Unit Testing Vue Components" - for Vue v2, but there's no equivalent page in the v3 docs), and knocked together an initial test which would just load the component and do nothing else: just to check everything was running (frontend/test/unit/GreetingMessage.spec.js (final version) on Github):

import GreetingMessage from "../../src/gdayWorld/components/GreetingMessage";

import { shallowMount } from "@vue/test-utils";

import {expect} from "chai";

describe("Tests of GreetingMessage component", () => {
    it("should successfully load the component", () => {
        let greetingMessage = shallowMount(GreetingMessage, {propsData: {message: "G'day world"}});

        expect(true).to.be.true;
    });
});

Here's where I got to the problem I now can't replicate. When I ran this, I got something like:

root@eba0490b453d:/usr/share/fullstackExercise# npm run test:unit

> full-stack-exercise@2.13.0 test:unit
> vue-cli-service test:unit test/unit/**/*.spec.js

 WEBPACK  Compiling...

  [=========================] 98% (after emitting)

 ERROR  Failed to compile with 1 error

error  in ./node_modules/mocha/lib/cli/cli.js

Module parse failed: Unexpected character '#' (1:0)
File was processed with these loaders:
* ./node_modules/cache-loader/dist/cjs.js
* ./node_modules/babel-loader/lib/index.js
* ./node_modules/eslint-loader/index.js
You may need an additional loader to handle the result of these loaders.
> #!/usr/bin/env node
| 'use strict';
|
[…etc…]

There's a JS file that has a shell-script shebang thing at the start of it, and the Babel transpiler doesn't like that. Fair enough, but I really didn't understand why it was trying to transpile stuff in the node_modules directory at all. At this point in time, though, I just thought "Hey-ho, it knows what it's doing so I'll take its word for it".

Googling about I found a bunch of other people having a similar issue with Webpack compilation, and the solution seemed to be to use a shebang-loader in the compilation process (see "How to keep my shebang in place using webpack?", "How to Configure Webpack with Shebang Loader to Ignore Hashbang…", "Webpack report an error about Unexpected character '#'"). All the guidance for this solution was oriented around sticking stuff in the webpack.config.js file, but of course Vue.js hides that away from you, and you need to do things in a special Vue.js way, by adding stuff with a special syntax to the vue.config.js file. The docs for this are at "Vue.js > Working with Webpack". The docs there showed how to do it using chainWebpack, but I never actually got this approach to solve the problem, so I mention this only because it's "something I tried".
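For the record, the sort of vue.config.js change that chainWebpack guidance amounts to looks roughly like the sketch below (shebang-loader being a separate dev dependency one would have to npm-install). As I say: this did not fix the problem for me, so treat it as "something I tried", not a recommendation.

// vue.config.js - roughly what the chainWebpack guidance amounts to; it did NOT solve my problem
module.exports = {
    chainWebpack: (config) => {
        config.module
            .rule("shebang")
            .test(/\.js$/)
            .use("shebang-loader")
            .loader("shebang-loader");
    }
};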

From there I started thinking, "no, seriously, why is it trying to transpile stuff in the node_modules directory?" This does not seem right. I changed my googling tactic to try to find out what was going on there, and came across "Webpack not excluding node_modules", and that led me to update my vue.config.js file to actively exclude node_modules (copied from that answer):

var nodeExternals = require('webpack-node-externals');
...
module.exports = {
    ...
    target: 'node', // in order to ignore built-in modules like path, fs, etc. 
    externals: [nodeExternals()], // in order to ignore all modules in node_modules folder 
    ...
};
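A note on where that goes: because Vue CLI doesn't expose webpack.config.js, my understanding is that settings like these get passed through via the configureWebpack block of vue.config.js, which would make it something like this (assuming webpack-node-externals has been installed as a dev dependency):

// vue.config.js - my take on where that answer's settings end up in a Vue CLI project
const nodeExternals = require('webpack-node-externals');

module.exports = {
    configureWebpack: {
        target: 'node', // in order to ignore built-in modules like path, fs, etc.
        externals: [nodeExternals()] // in order to ignore all modules in node_modules folder
    }
};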

And that - excluding node_modules - worked. Now when I ran the test I made progress:

root@eba0490b453d:/usr/share/fullstackExercise# npm run test:unit

> full-stack-exercise@2.13.0 test:unit
> vue-cli-service test:unit test/unit/**/*.spec.js

 WEBPACK   Compiling...

  [=========================] 98% (after emitting)

 DONE   Compiled successfully in 1161ms

  [=========================] 100% (completed)

WEBPACK  Compiled successfully in 1161ms

MOCHA  Testing...



  Tests of GreetingMessage component
    ✓ should successfully load the component


  1 passing (17ms)

MOCHA  Tests completed successfully

root@eba0490b453d:/usr/share/fullstackExercise#

From there I rounded out the tests properly (frontend/test/unit/GreetingMessage.spec.js):

import GreetingMessage from "../../src/gdayWorld/components/GreetingMessage";

import { shallowMount } from "@vue/test-utils";

import {expect} from "chai";

describe("Tests of GreetingMessage component", () => {

    let greetingMessage;
    let expectedText = "TEST_MESSAGE";

    before("Load test GreetingMessage component", () => {
        greetingMessage = shallowMount(GreetingMessage, {propsData: {message: expectedText}});
    });

    it("should return the correct heading", () => {
        let heading = greetingMessage.find("h1");
        expect(heading.exists()).to.be.true;

        let headingText = heading.text();
        expect(headingText).to.equal(expectedText);
    });

    it("should return the correct content", () => {
        let contentParagraph = greetingMessage.find("h1+p");
        expect(contentParagraph.exists()).to.be.true;

        let contentParagraphText = contentParagraph.text();
        expect(contentParagraphText).to.equal(expectedText);
    });
});

Oh! Just a reminder of what the component is (frontend/src/gdayWorld/components/GreetingMessage.vue)! Very simple stuff, as the tests indicate:

<template>
    <h1>{{ message }}</h1>
    <p>{{ message }}</p>
</template>

<script>
export default {
  name: 'GreetingMessage',
  props : {
    message : {
      type: String,
      required: true
    }
  }
}
</script>

One thing I found was that every time I touched the test file, I was getting this compilation error:

> full-stack-exercise@2.13.0 test:unit
> vue-cli-service test:unit test/unit/**/*.spec.js

WEBPACK  Compiling...

  [=========================] 98% (after emitting)

ERROR  Failed to compile with 1 error

error  in ./test/unit/GreetingMessage.spec.js

Module Error (from ./node_modules/eslint-loader/index.js):

/usr/share/fullstackExercise/test/unit/GreetingMessage.spec.js
   7:1  error  'describe' is not defined  no-undef
  12:5  error  'before' is not defined    no-undef
  16:5  error  'it' is not defined        no-undef
  24:5  error  'it' is not defined        no-undef

✖ 4 problems (4 errors, 0 warnings)

But if I ran it again, the problem went away. Somehow ESLint was getting confused: it only lints files when they've changed, so on the second - and subsequent - runs it didn't run at all, and the problem didn't occur. More googling, and I found this: "JavaScript Standard Style does not recognize Mocha". The guidance here is to let the linter know I'm running Mocha, the inference being that there are some global functions it can then just assume are legit. This is just an entry in package.json:

  "eslintConfig": {
    "root": true,
    "env": {
      "node": true,
      "mocha": true
    },
    // etc

Having done that, everything works perfectly, and despite the fact that it is a very very simple unit test… I'm quite pleased with myself that I got it running OK.

After sorting it all out, I reverted everything in source control back to how it was before I started the exercise, so as to replicate it and write it up here. This is when I was completely unable to reproduce that shebang issue at all. I cannot - for the life of me - work out why not. Weird. Anyway, I decided to not waste too much time trying to reproduce a bug I had solved, and just decided to mention it here as a possible "thing" that could happen, but otherwise move on with my life.

I have now got to where I wanted to get at the beginning of this series of articles, so I am gonna stop now. My next task with Vue.js testing is to work out how to mock things, because the next piece of work will require me to make remote calls and/or hit a DB. And I don't want to be doing that when I'm running tests. But that was never an aim of this article series, so I'll revert to stand-alone articles now.

Righto.

--
Adam

Thursday, 11 February 2021

Thoughts on Working Code podcast's Testing episode

G'day:

Working Code (@WorkingCodePod on Twitter) is a podcast by some friends and industry colleagues: Tim Cunningham, Carol Hamilton, Ben Nadel and Adam Tuttle.


(apologies for swiping your image without permission there, team)

It's an interesting take on a techo podcast, in their own words from their strapline:

Working Code is a technology podcast unlike all others. Instead of diving deep into specific technologies to learn them better, or focusing on soft-skills, this one is like hanging out together at the water cooler or in the hallway at a technical conference. Working Code celebrates the triumphs and fails of working as a developer, and aims to make your career in coding more enjoyable.

I think they achieve this, and it makes for a good listen.

So that's that, I just wanted to say they've done good work, and go listen.

Oh, just one more thing.

Yesterday they released their episode "Testing". I have to admit my reaction to a lot of what was said was… "poor", so I pinged my namesake and said "I have some feedback". After a brief discussion on Signal, Adam & I concluded that I might try to do a "reaction blog article" on the topic, and they might see if they can respond to the feedback, if warranted, at a later date. They are recording tonight apparently, and I'm gonna try to get this across to them for their morning coffee.

Firstly as a reminder: I'm pretty keen on testing, and I am also keen on TDD as a development practice. I've written a fair bit on unit testing and TDD both. I'm making a distinction between unit testing and TDD very deliberately. I'll come back to this. But anyway this is why I was very very interested to see what the team had to say about testing. Especially as I already knew one of them doesn't do automated testing (Ben), and another (Carol) I believe has only recently got into it (I think that's what she said, a coupla episodes ago). I did not know Adam or Tim's position on it.

And just before I get under way, I'll stick a coupla Twitter messages here I saw recently. At the time I saw them I was thinking about Ben's claim to a lack of testing, and they struck a particular chord with me.




I do not know Maaret or Mathias, but I think they're on the money here.

OK, so I'm listening to the podcast now. I'm going to pull quotes from it, and comment on them where I think it's "necessary".


Ahem.

Adam @ 11:27:

…up front we should acknowledge you know we're not testing experts. None of us [...] have been to like 'testing college'. […] There's a good chance we're going to get something wrong.

I think, in hindsight, this podcast needed this caveat to be made louder and clearer. To be blunt - and I don't think any would disagree with me here - all four are very far from being testing experts. Indeed one is even a testing naysayer. I think there are some dangerously ill-informed opinions being expressed as the podcast progresses, and as these are all people who are looked-up-to in their community, I think there's a risk people will take onboard what they say as "advice". Even if there's this caveat at the beginning. This might seem like a very picky thing to draw on, but perhaps I should have put it at the end of the article, after there's more context.

Carol @ 12:07:

Somebody find that monster already.

Hi guys.

Ben @ 12:41

I test nothing. And it's not like a philosophical approach to life, it's more just I'm not good at testing

Adam @ 13:09:

Clearly that's working out pretty well for you, you've got a good career going.

I hear this a bit. "I don't test and I get by just fine". This is pretty woolly thinking, and it's the sort of false logic people will jump on to justify why they don't do things. And Adam is just perpetuating the myth here. The problem with this rationalisation can be demonstrated with an analogy: going on a journey without a map, wandering aimlessly, and yet still (largely accidentally) arriving at your intended destination. In contrast, had you used a map, you'd've been more efficient with your time and effort, and been able to progress even further on the next leg of your journey sooner. Ben's built a good career for himself. Undoubtedly. Who knows how much better it would be had he… used a map.

Also, going back to Ben's comment where he kind of explains away why he doesn't test because he's not good at it: everyone starts off not being good at testing, mate. The rest of us do something about it. This is a disappointing attitude from someone as clued-up as Ben. Also if you don't know about something… don't talk about it mate. Inform yourself first and then talk about it.

Ben @ 13:21:

[…] some additional context. So - one - I work on a very small team. Two: all the people who work on my team are very very familiar with the software. Three: we will never ever hire a new engineer specifically for my team. Cos I work on the legacy codebase. The legacy codebase is in the process of being phased out. […] I am definitely in a context where I don't have to worry about hiring a new person and training them up on a system and then thinking they'll touch something in the code that they don't understand how it works. That's like the farthest possible thing from my day-to-day operations currently.

Um… so? I'm not being glib. You're still writing new logic, or altering existing logic. If you do that, it intrinsically needs testing. I mean you admitted you do manual testing, but it beggars belief that a person in the computer industry will favour manually performing a pre-defined repetitive task as a (prone-to-error) human, rather than automating this. We're in the business of automating repetitive tasks!

I'd also add that this would be a brilliant, low-risk, environment for you to get yourself up to speed with TDD and unit testing, and work towards the point where it's just (brain) muscle memory to work that way. And then you'll be all ready once you progress to more mission-critical / contemporary codebases.

Ben @ 14:42:

I can wrap my head around testing when it comes to testing a data workflow that is completely pure, meaning you have a function or you have a component that has functions and you give it some inputs and it generates some outputs. I can 100% wrap my head around testing that. And sometimes actually when I'm writing code that deals with something like that, even though I'm not writing tests per se, I might write a scratch file that instantiates that component and sends data to it and checks the output just during the development process that I don't have to load-up the whole application.

Ben. That's a unit test. You have written a unit test there. So why don't you put it in a test class instead of a scratch file, and - hey presto - you have a persistent test that will guard against that code's behaviour somehow changing to break those rules later on. You are doing the work here, you're just not doing it in a sensible fashion.
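To labour the point with a made-up example (the module and numbers here are mine, not Ben's): the scratch-file check and the Mocha / Chai spec are pretty much the same code; it's just that one of them sticks around and keeps earning its keep.

// what the scratch-file check looks like once it lives in a spec file instead
const { expect } = require("chai");
const { calculateShipping } = require("../src/shipping"); // hypothetical module under test

describe("calculateShipping", function () {
    it("charges the flat rate for orders under the free-shipping threshold", function () {
        expect(calculateShipping({ total: 20 })).to.equal(5);
    });

    it("is free for orders over the threshold", function () {
        expect(calculateShipping({ total: 100 })).to.equal(0);
    });
});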

Ben @ 15:18:

Where it breaks down immediately for me is when I have to either a) involve a database, or b) involve a user interface. And I know that there's all kinds of stuff that the industry has brought to cater to those problems. I've just never taken the time to learn.

There's a bit of a blur here between the previous train of thought, which was definitely talking about unit tests of a unit of code, and now we're talking about end-to-end testing. These are two different things. I am not saying Ben doesn't realise this, but they're jammed up next to each other in the podcast so the distinction is not being made. These two kinds of testing are separate ideas, only really coupled by the fact they are both types of testing. Ben's right, the tooling is there, and - in my experience at least with the browser emulation stuff - it's pretty easy and almost fun to use. Ben already tests his stuff manually every time he does a release, so it would seem sensible to me to take the small amount of time it takes to get up to speed with these things, and then instead of testing something manually, take the time to automate the same testing; then it's taken care of thenceforth. It's just a matter of being a bit more wise with one's time usage.
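To give a sense of what "pretty easy" means: a browser-emulation test with Puppeteer (the same tool I used in the Mocha / Chai / Puppeteer article of my Vue.js series) can be as small as the sketch below. The URL and the expected text are placeholders of mine, not anything from the podcast.

// a minimal Puppeteer + Mocha + Chai end-to-end check
const puppeteer = require("puppeteer");
const { expect } = require("chai");

describe("Greeting page", function () {
    this.timeout(10000); // browser start-up is not instant

    let browser;
    let page;

    before(async function () {
        browser = await puppeteer.launch(); // might need { args: ["--no-sandbox"] } when running as root in a container
        page = await browser.newPage();
    });

    after(async function () {
        await browser.close();
    });

    it("displays the greeting heading", async function () {
        await page.goto("http://localhost:8080/gdayWorld.html"); // placeholder URL
        const headingText = await page.$eval("h1", (el) => el.textContent);
        expect(headingText).to.equal("G'day world");
    });
});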

Adam @ 16:39

The reason that we don't have a whole lot of automated tests for our CFML code is simply performance. So when we started our product I tried really hard to do TDD. If I was writing a new module or a new section of that module I would work on tests along with that code, and would try to stay ahead of the game there. And what ended up happening was I had for me - let's say - 500 functions that could run, I had 400 tests. And I don't want to point a finger at any particular direction, but when you take the stack as a whole and you say "OK now run my test suite" and it takes ten minutes to run those tests and [my product, the project I was working on] is still in its infancy, and you can see this long road of so much more work that has to be done, and it takes ten minutes to run the tests - you know, early on - there was no way that that was going to be sustainable. So we kind of abandoned hope there. [...] I have, in more recent years, on a more recent stack seen way better performance of tests. [...] So we are starting to get more into automated testing and finding it actually really helpful. [...] I guess what I wanted to say there is that a perfectly valid reason to have fewer or no tests is if it doesn't work well on your platform.

Adam starts off well here, both in what he's saying and in his historical efforts with tests, but he then goes on to pretty much blame ColdFusion for not being very good at running tests. This is just untrue, sorry mate. We had thousands upon thousands of tests running on ColdFusion, and they ran in an amount of time best measured in seconds. And when we ported that codebase to PHP, we had a similar number of test cases, and they ran in round about the same amount of time (PHP was faster, that said, but also the tests were better). I think the issue here - and Adam pretty much confirms this about 30sec after the quote above, and again about 15min later when he comes back to it - is that the tests weren't written so well, and they were not focused on the logic (as TDD-oriented tests ought to be): they were basically full integration tests. Full integration tests are excellent, but you don't want your tight red / green / refactor testing cycle to be slowed down by external services like databases. That's the wrong sort of testing there. My reaction to you saying your test runs are slow is not to say "ColdFusion's fault", it's to say "your tests' fault". And that's not a reason to not test. It's a reason to check what you've been doing, and fix it. I'm applying hindsight here for you obviously, but this ain't the conclusion/message you should be delivering here.

Carol @ 20:03:

I also want to say that if you are starting out and you're starting to add tests, don't let slowness stop you from doing it.

Spot on. I hope Ben was listening there. When you start to learn something new, it is going to take more time. I think this is sometimes why people conclude that testing takes a lot of time: the people arriving at that conclusion are basing it on their time spent on the learning curve. Accept that things go slow when you are learning, but also accept that things will become second nature. And especially with writing automated tests it's not exactly rocket-science, the initial learning time is not that long. Just… decide to learn how to test stuff, start automating your testing, and stick at it.

Ben @ 21:24:

I was thinking about debugging incidents and getting a page in the middle of the night and having to jump on a call and you seeing the problem, and now you have to do a hotfix, and push a deployment in the middle of the night […]. And imagine having to sit there for 30 minutes for your tests to run just so you can push out a hotfix. Which I thought to myself: that would drive me crazy.

At this point I think Ben is just trying to invent excuses to justify to himself why he's right to eschew testing. I'm reminded of Maaret's Twitter message I included above. The subtext of Ben's comment here is that if one tests manually, then one is more flexible in what one can choose to re-test when hotfixing. Well obviously if you can make that call re manual tests, then you can make the same call with automated tests! So his position here is just specious. Doubly so because automated tests are intrinsically going to be faster than the equivalent manual tests to start with. Another thing I'll note about this entire analogy: you've already got yourself in a shit situation by needing to hotfix stuff in the middle of the night. Are you really sure you want to be less diligent in how you implement that fix? In my experience that approach can lead to a second / third / fourth hotfix being needed in rapid succession. Hotfix situations are definitely ones of "work smarter, not faster".
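And the tooling already supports exactly that kind of cherry-picking anyhow. Sticking with the Mocha stack from my Vue.js articles, something like this sketch (the names are made up) runs just the specs relevant to the hotfix, either via .only in the code or --grep on the command line:

// .only restricts the run to this describe block; mocha --grep "Payments" does much the same from the CLI
describe.only("Payments hotfix", function () {
    it("still applies the discount after the fix", function () {
        // just the checks relevant to tonight's change
    });
});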

Ben @ 21:56:

I'm wondering if there should be a test budget that you can have for your team where you like have "here is the largest amount of time we're willing to let testing block a deployment". And anything above that have to be tests that sit in an optional bucket where it's up to the developer to run them as they see fit, but isn't necessarily tests that would block deployment. I don't know if that's totally crazy.

Adam continues @ 22:36:

You have to figure out which tests are critical path, which ones are "must pass", and these ones are like "low risk areas" […] are the things I would look for to make optional.

Yep, OK there's some sense here, but I can't help thinking that we are talking about testing in this podcast, and we're spending our time inventing situations in which we're not gonna test. It all seems a bit inverted to me. How about instead you just do your testing, and then if/when a situation arises you deal with it then. Instead of deciding ahead of time that there will be situations, and justifying to yourselves why you oughtn't test in the first place.

But I also have to wonder: why the perceived rush here? What's wrong with putting over 30min into testing stuff if it "proves" that your work has maintained stability, and means you'll be less likely to need that midnight hotfix? What percentage of the whole cycle time from feature request to delivery is that 30 minutes? Especially if taking the effort to write the tests in the first place will innately improve the stability of your code, and then help to keep it stable? It's a false economy.

Tim @ 23:15:

When we have contractors do work for us. I require unit tests. I require so much testing just because it's a way for me to validate the truth of what they're saying they've done. So that everything that we have that's done by third parties is very well tested, and it's fantastic because I have a high level of confidence.

Well: precisely. Why do you not want that same level of confidence in your in-house work? Like you say: confidence is fantastic. Be fantastic, Tim. Also: any leader should eat their own dogfood I think. If there's sense in you making the contractors work like this, clearly you ought to be working that way yourself.

Tim @ 23:36:

Any time I start a new project, if I have a greenfields project, I always start with some level of unit tests, and then I get so involved in the actual architecture of the system that I put it off, and like "well I don't really need a test for this", "I'm not really sure where I'm going with this, so I'm not going to write a test first" because I'm kinda experimenting. Then my experiment becomes reality, then my reality becomes the released version. And then it's like "well what's the point of writing a test now?"

I think we've all been there. I think what Tim needs here is just a bit more self-discipline in identifying what is "architectural spike" and what's "now doing the work". If one is doing TDD, then the spike can be used to identify the test cases (eg "it's going to need to capture their phone number") without necessarily writing the test to prove the phone number has been captured. So you write this:

describe("my new thing", function () {
    it ("needs to capture the phone number", function () {
        // @todo need to test this
    });
});

And then when you detect you are not spiking any more, you write the test, and then introduce the code to make the test pass. I also think Tim is overlooking that the tests are not simply there for that first iteration: they are there proving that the code remains stable for the rest of its life. This… builds confidence.
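(As an aside, Mocha even has first-class support for this: a spec declared without a callback is reported as "pending" in the test run, so the placeholder can be terser still and you get a visible reminder for free:)

describe("my new thing", function () {
    it("needs to capture the phone number"); // no callback: Mocha reports this as pending
});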

Adam @ 24:22:

That's what testing is all about, right? It's increasing confidence that you can deploy this code and nothing is going to be wrong with it. […] When I think about testing, the pinnacle of testing for me is 100% confidence that I can deploy on my way out the door at 4:55pm on Friday afternoon, with [a high degree of ~] confidence that I am not going to get paged on Saturday at 4am because some of that code that I just deployed… it went "wrong".

Exactly.

Carol @ 25:12:

The difference between the team I'm on and the team you guys have is we have - I think it's - 15-ish people touching the exact same code daily. So a patch I can put out today may have not even been in the codebase they pulled yesterday when they started working on a bug, or a week ago when they had theirs. So me writing that little extra bit of test gives them some accountability for what I've done, and me some.

Again: exactly.

Ben @ 26:36:

Even if you have a huge test suite, I can't help but think you have to do the manual testing, because what if something critical was missed. [...] I think the exhaustive test suite, what that does is it catches unexpected bugs unrelated. Or things that broke because you didn't expect them to break in a certain way. And I think that's very important.

To Ben's first point, you could just as easily (and arguably more validly) switch that around: a human doing ad-hoc manual testing is more likely to miss something, because every manual test run is at their whim and subject to their focus and attention at the time. Whereas the automated tests - which, let's not forget, were written by a diligent human, right at the time they were most focused on the requirements - are run by the computer, and it will do exactly the same job every time. What having the historical corpus of automated tests gives you is increased confidence that all that stuff being tested still works the way it is supposed to, so the manual testing - which is always necessary - can be more a case of dotting the Is and crossing the Ts. With no automated tests, the manual tests need to be exhaustive. And the effort needs to be repeated every release (Adam mentions this a few minutes later as well).

To the second point: yeah, precisely. Automated tests will pick up regressions. And the effort to do this only needs to be made once (writing the test). Without automated tests, you rely on the manual testing to pick this stuff up, but - being realistic - if your release is focused on PartX of the code, your manual tests are going to focus there, and possibly not bother to re-test PartZ, which has just inadvertently been broken by PartX's work.

Ben also mentions this quote from Rich Hickey "Q: What happened to every bug out there? A: it passed the type checker, and it passed all tests." (I found this reference on Google: Simple Made Easy, it's at about 15:45). It's a nifty quote, but what it's also saying is that there wasn't actually a test for the buggy behaviour. Because if there was one: the test would have caught it. The same could be said more readily of manual-only testing. Obviously nothing is going to be 100%, but automated tests are going to be more reliable at maintaining the same confidence level, and be less effort, than manual-only testing.

Ben @ 28:15:

When people say it increases the velocity of development over time. I have trouble embracing that.

(Ben's also alluding back to a comment he made immediately prior to that, relating to always needing to manually test anyhow). "Over time" is one of the keys here. Once a test is written once, it's there. It sticks around. Every subsequent test round there is no extra effort to test that element of the application (Tim draws attention to this a coupla minutes later too). With manual testing the effort needs to be duplicated every time you test. Surely this is not complicated to understand. Ben's point about "you still need to manually test" misses the point that if there's a foundation of automated tests, your manual testing can become far more perfunctory. Without the tests: the manual testing is monolithic. Every. Single. Time. To be honest though, I don't know why I need to point this out. It's a) obvious; and b) very well-trod ground. There's an entire industry that thinks automated tests are the foundation of testing. And then there's Ben who's "just not sure". This is like someone "just not sure" that the world isn't actually flat. It's no small amount of hubris on his part, if I'm honest. And obviously Ben is not the only person out there in the same situation. But he's the one here on this podcast supposedly discussing testing.

Ben @ 37:08:

One thing that I've never connected with: when I hear people talk about testing, there's this idea of being able to - I think they call them spies? - create these spies where you can see if private methods get called in certain ways. And I always think to myself: "why do you care about your private methods?" That's an implementation detail. That private method may not exist next week. Just care about what your public methods are returning and that should inherently test your private methods. And people have tried to explain it to me why actually sometimes you wanna know, but I've just never understood it.

Yes, good point. I can try to explain. I think there's some nuance missing in your understanding of what's going on, and what we're testing. It starts with your position that testing is only concerning itself with (my wording, paraphrasing you from earlier) "what values a public method takes, and what it returns". Not quite. You care about whether, given inputs to a unit, the behaviour within the unit correctly provides the expected outputs from the unit. The outputs might not be the return value. Think about a unit that takes a username and password, hashes the password, and saves it to the DB, then returns the new ID of the record. Now… we're less interested in the ID returned by the method; we are concerned that the hashing takes place correctly. There is an output boundary of this unit at the database interface. We don't want our tests to actually hit the database (too slow, as Adam found out), so we mock-out the DB connector or the DAO method being called that takes the value the model layer has hashed. We then spy on the values passed to that DB boundary, and make sure it's all worked OK. Something like this:

describe("my new thing", function () {
    it ("hashes the password", function () {
        testPassword = "letmein"
        expectedHash = "whatevs"
        
        myDAO = new Mock(DAO)
        myDAO.insertRecord.should.be.passed(anything(), expectedHash)
        
        myService = new Service(myDAO)
        
        newId = myService.addUser("LOGIN_ID_NOT_TESTED", testPassword)
        
        newId.should.be.integer() // not really that useful
    });
});

class Service {

    private dao

    Service(dao) {
        this.dao = dao
    }
    
    addUser(loginId, password) {
        hashedPassword = excellentHashingFunction(password)
        
        return this.dao.insertRecord(loginId, hashedPassword)
    }
}

class DAO {
    insertRecord(loginId, password) {
        return db.insertQuery("INSERT INTO users (loginId, password) VALUES (:loginId, :password)", [loginId, password])
    }
}

OK so insertRecord isn't a private method here, but the DAO is just an abstraction from the public interface of the unit anyhow, so it amounts to the same thing, and it makes my example clearer. insertRecord could be a private method of Service.

So the thing is that you are checking boundaries, not specifically method inputs/outputs.

Also, yes, the implementation of DAO might change tomorrow. But if we're doing TDD - and we should be - the tests will be updated at the same time. More often than not though, the implementation isn't as temporary as this line of thought often assumes (for the convenience of the argument, I suspect).
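If that pseudocode is too hand-wavy, here's roughly how the same test pans out with real tooling - Mocha, Chai and Sinon - with the class names and the hashing being placeholders of my own rather than anything from the podcast:

const sinon = require("sinon");
const { expect } = require("chai");
const crypto = require("crypto");

class UserDao {
    insertRecord(loginId, hashedPassword) {
        // the real thing would hit the DB; the test stubs this method out entirely
        throw new Error("not implemented in this sketch");
    }
}

class UserService {
    constructor(dao) {
        this.dao = dao;
    }

    addUser(loginId, password) {
        const hashedPassword = crypto.createHash("sha256").update(password).digest("hex");
        return this.dao.insertRecord(loginId, hashedPassword);
    }
}

describe("UserService.addUser", function () {
    it("passes a hashed password across the DB boundary", function () {
        const dao = sinon.createStubInstance(UserDao);
        dao.insertRecord.returns(1);

        const service = new UserService(dao);
        const newId = service.addUser("LOGIN_ID_NOT_TESTED", "letmein");

        const expectedHash = crypto.createHash("sha256").update("letmein").digest("hex");
        sinon.assert.calledWith(dao.insertRecord, sinon.match.any, expectedHash);
        expect(newId).to.equal(1); // not really that useful, as per the original sketch
    });
});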

Adam @ 48:41:

The more that I learn how to test well, and the more that I write good tests, the more I become a believer in automated testing (Carol: Amen). […] The more I do it the better I get. And the better I get the more I appreciate what I can get from it.

Indeed.

Tim @ 49:32:

In a business I think that short term testing is a sunk cost maybe, but long term I have seen the benefit of it. Particularly whenever you are adding stuff to a mature system, those tests pay dividends later. They don't pay dividends now […] (well they don't pay as many dividends now) […] but they do pay dividends in the long run.

Also a good quote / mindset. Testing is about the subsequent rounds of development as much as the current one.

Ben @ 50:05:

One thing I've never connected with emotionally, when I hear people talk about testing, is when they refer to tests as providing documentation about how a feature is supposed to work. And as someone who has tried to look at tests to understand why something's not working, I have found that they provide no insight into how the feature is supposed to work. Or I guess I should say specifically they don't provide answers to the question that I have.

Different docs. They don't provide developer docs, but if following BDD practices, they can indicate the expected behaviour of the piece of functionality. Here's the test run from some tests I wrote recently:

root@1b011f8852b1:/usr/share/nodeJs# npm test

> nodejs@1.0 test
> mocha test/**/*.js



  Tests for Date methods functions
    Tests Date.getLastDayOfMonth method
      ✓ returns Jan 31, given Jan 1
      ✓ returns Jan 31, given Jan 31
      ✓ returns Feb 28, given Feb 1 in 2021
      ✓ returns Feb 29, given Feb 1 in 2020
      ✓ returns Dec 31, given Dec 1
      ✓ returns Dec 31, given Dec 31
    Tests Date.compare method
      ✓ returns -1 if d1 is before d2
      ✓ returns 1 if d1 is after d2
      ✓ returns 0 if d1 is the same d2
      ✓ returns 0 if d1 is the same d2 except for the time part
    Tests Date.daysBetween method
      ✓ returns -1 if d1 is the day before d2
      ✓ returns 1 if d1 is the day after d2
      ✓ returns 0 if d1 is the same day as d2
      ✓ returns 0 if d1 is the same day as d2 except for the time part
    Tests for addDays method
      ✓ works within a month
      ✓ works across the end of a month
      ✓ works across the end of the year
      ✓ works with zero
      ✓ works with negative numbers

  Tests a method Reading.getEstimatesFromReadingsArray that returns an array of Readings representing month-end estimates for the input range of customer readings
    Tests for validation cases
      ✓ should throw a RangeError if the readings array does not have at least two entries
      ✓ should not throw a RangeError if the readings array has at least two entries
      ✓ should throw a RangeError if the second date is not after the first date
    Tests for returned estimation array cases
      ✓ should not include a final month-end reading in the estimates
      ✓ should return the estimate between two monthly readings
      ✓ should return three estimates between two reading dates with three missing estimates
      ✓ should return the integer part of the estimated reading
      ✓ should return all estimates between each pair of reading dates, for multiple reading dates
      ✓ should not return an estimate if there was an actual reading on that day
      ✓ should return an empty array if all readings are on the last day of the month
      ✓ tests a potential off-by-one scenario when the reading is the day before the end of the month

  Tests for helper functions
    Tests for Reading.getEstimationDatesBetweenDates method
      ✓ returns nothing when there are no estimates dates between the test dates
      ✓ correctly omits the first date if it is an estimation date
      ✓ correctly omits the second date if it is an estimation date
      ✓ correctly returns the last date of the month for all months between the dates

  Test Timer
    ✓ handles a lap (100ms)

  Test TimerViaPrototype
    ✓ handles a lap (100ms)


  36 passing (221ms)

root@1b011f8852b1:/usr/share/nodeJs#

When I showed this to the person I was doing the work for, he immediately said "no, that test case is wrong, you have it around the wrong way", and they were right, and I fixed it. That's the documentation "they" are talking about.

Oh, and Carol goes on to confirm this very thing one minute later.

Also bear in mind that just cos a test could be written in such a way as to impart good clear information doesn't mean that all tests do. My experience of looking at open-source projects' tests to get any clarity on things (and I include testing frameworks' own tests in this!) is that I am left knowing less than I did before I looked. It's almost like there's a rule in OSS projects that the code needs to be shite or they won't accept it ;-)


And that's it. It was an interesting podcast, but I really really strongly disagreed with most of what Ben said, and why he said it. It would be one thing if he was held to account (and the others tried this at times), but as it is, other than joking that Ben is a naysayer, I think there's some dangerous content in here.

Oh, one last thing… in the outro the team suggests some resources for testing. Most of what they suggested seems to be "what to do", not "why you do it". I think the first thing one should do when considering testing is to read Test Driven Development by Kent Beck. Start with that. Oh, this reminds me… there was not actually much discussion on TDD in this episode. TDD is tangential to testing per se, but it's an important topic. Maybe they can do another episode focusing on that.

Follow-up

The Working Code Podcast team have responded to my observations here in a subsequent podcast episode, which you can listen to here: 011: Listener Questions #1. Go have a listen.

Righto.

--
Adam

Saturday, 19 August 2017

Mishmash: code review as a learning exercise, loops vs higher-order-functions and testing

G'day:
There's a few things going on in my head here, and none individually are worthy of an article, but perhaps if I dwell on a bunch of stuff something might shake out. I have no idea as I have not drafted this one in my head like I usually do, I'm just "let's start typing and see if I get anywhere". Eek.

OK so yesterday I was code reviewing one of my mates' work, and breezed past this code (this is not the exact code, but it's a replication of it):

public function getSomeThings($rowsToGet)
{
    $thingsValues = $this->dao->getThingsFromStorage($rowsToGet);
    
    $things = [];
    foreach ($thingsValues as $thingValues) {
        $things[] = $this->thingFactory->createFromRawData($thingValues);
    }

    return $things;
}

I just put a note in the pull request "array_map?".

Now I do a few things in a code review. Firstly I check for obvious flaws, unclean code, bad test coverage, typos, etc, and I will raise those as "tasks" in BitBucket: stuff that needs to be addressed before I'll press the "Approved" button. This is our standard approach to code review, everyone expects it, and it's fine. It works.

But secondly I'll just make observations about the code, ask questions, suggest alternative approaches we could have taken, etc. Just as food for thought, a discussion point, or a chance for some learning. This was the latter. I was not suggesting to my mate they needed to change their code to match my whim... the code is fine as it is. It was just "did you think of this?".

An interesting conversation ensued, which I will paraphrase here. Note that Fred and I do not work in the same physical office.

Fred (his name's not Fred): I'm not sure the benefit of swapping simple code for a more complex solution, unless there's some performance gain I'm unaware of.

Adam (my name is Adam): Hmmm... well array_map is one expression, and it reduces the complexity here to a single variable assignment statement(*). The loop version requires building the array by hand, and requires two assignments and the whole loop thing too. So I think the array_map approach is less complex innit?

return array_map(function ($thingValues) {
    return $this->thingFactory->createFromRawData($thingValues);
}, $thingsValues);

(*) OK it's two statements. Never let it be said I don't hyperbolise to better make my point.

One thing I did not get around to saying is that performance is not a consideration here. One of the calls is hitting an external DB, so any processing time differences are going to be eclipsed - by orders of magnitude - by the external call. And, yes, I know that if we'd be inclined to worry about micro-optimisations, then calling a higher-order function is slower than a loop. One should simply not care.

Adam (cont'd): furthermore the array_map version only requires a single test, whereas the loop version needs three. That in itself demonstrates the increased complexity.

Fred: err... why's that then?

I'll break away from the conversation here.

The dev process here is that we identify the need to have a function to get stuff from the DB, and then model that and return it to the calling code. So doing TDD we write a test which uses a mocked-out DAO to return known data, and we test that the values in the resultant models are what we expected them to be based on the mocked DAO data.

Here's a very simplified, self-contained example:

First our test:

describe('comparing testing for foreach vs array_map', function() {
    given('sut', function () {
        return new ForeachVsArrayMap();
    });
    given('allObjects', function () {
        return [
            (object) ['id' => 1, 'mi' => 'tahi'],
            (object) ['id' => 2, 'mi' => 'rua'],
            (object) ['id' => 3, 'mi' => 'toru'],
        ];
    });
    describe('tests of foreach', function() {
        it('returns the numbers as objects', function () {
            $result = $this->sut->getObjects();
            expect($result)->toEqual($this->allObjects);
        });
    });
});

And now our implementation:

class ForeachVsArrayMap
{
    private $allData;

    public function __construct()
    {
        $this->allData = [
            ['id' => 1, 'mi' => 'tahi'],
            ['id' => 2, 'mi' => 'rua'],
            ['id' => 3, 'mi' => 'toru']
        ];
    }

    private function getNumbers($rows)
    {
        return array_slice($this->allData, 0, $rows);
    }

    public function getObjects($rows = 3)
    {
        $data = $this->getNumbers($rows);
        $numbers = [];
        foreach ($data as $row) {
            $numbers[] = (object) $row;
        }

        return $numbers;
    }
}

And that passes. That's our TDD side of things done.

However we also now have to do edge testing as well. We've got a loop in there, and all loops need tests for three things: zero iterations, one iteration, and more than one iteration. It is easy to introduce logic errors around the plurality of loop iterations.

Here our TDD test has covered the "more than one iteration" case, so we only need the other two:

it('handles one result', function () {
    $result = $this->sut->getObjects(1);
    $expected = [$this->allObjects[0]];

    expect($result)->toEqual($expected);
});

it('handles zero results', function () {
    $result = $this->sut->getObjects(0);
    $expected = [];

    expect($result)->toEqual($expected);
});

Pretty simple tests, but still: two additional tests.

Why don't we need these two tests when using array_map? Because array_map's iteration is all handled within its implementation. It's PHP's job to test that, not ours. So all we need to test is that the inline callback works. And we only need to test that once: when it's called, it does what we expect it to do. We can safely stipulate that array_map will always return an array with the same number of elements as the input value. PHP deals with that; we don't need to.

One good thing about TDD (and test coverage in general) is now we can safely swap in the array_map version, and the test coverage still passes, so we know - with a high degree of certainty - that our code is A-OK.

public function getObjects($rows = 3)
{
    return array_map(function ($row) {
        return (object) $row;
    }, $this->getNumbers($rows));
}

Coincidentally I had another conversation, in another code review, on Friday about the necessity of always testing loops for 0, 1, >1 iterations. "One can just look at the code and know it's fine". This statement didn't come up in the conversation, but I've heard it before.

Three weeks ago I was show-stopped by a bug in one of our libraries. Taking the analogy above, this was the code:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    foreach ($data as $row) {
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}

The code had no tests. Can you see what's wrong? The general expectation when calling a function to get data is that there's actually data to get, given the provided criteria. In the case I fell foul of, the value of $rows was zero. And this was legit. So if $rows is 0, what value does $numbers have after the loop? Well: $numbers was undefined.

Had the dev done their testing properly, then they'd've noticed their bug. They never initialised $numbers outside of the code in the loop, so there's a chance it might never get defined. In this case not only did the dev not do the edge-case testing, they didn't even do the baseline TDD testing either, but that's a different story.

I needed to down-tools and fix the library before I could continue my own work. And it was made very tricky by the fact the code in the library had no tests at all, and wasn't written in a clean-code fashion (so was basically untestable). But we are not allowed to work that way any more, so I needed to do a bunch of work the original developer ought to have done.

I have also fallen foul of this brainfart:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    foreach ($data as $row) {
        $numbers = [];
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}

This usually crops up when code that was previously used once suddenly needs to be used for multiple iterations. Note that the initialisation of $numbers is inside the loop, so it will re-initialise every iteration. This would be caught by the "more than one iteration" test. This is rarer, and usually only crops up in completely untested code: it'd be a shit tester who tested a loop with only one iteration as the test, but I've seen it done. Saves time not having to mock so much data, you see? Berks.

So, anyhow... always test yer loops. And try to avoid loops if there are built-in functions that handle the iteration for you.

Also always do TDD, but also understand that there is more testing to do after the TDD cycle. TDD is not the end of the story there.



In the title of this article I mention "code review as a learning exercise". I greatly summarised the back and forth between Fred and myself here... there was an order of magnitude more discussion than there was code under discussion. This sort of interaction is gold. Some people get the idea that code review is kinda like a teacher marking homework. It's partly that, but that's only part.

As programmers and coders our job is one of communication: communication of instructions to a computer, but also to our future colleagues and selves about the nature of those instructions. Programming languages are a means to communicate instructions to other humans; it's almost tangential that the computer can also be made to understand them. Similarly, reviewing each other's code should be a communication and collaboration process, and we absolutely should be having these discussions. Both Fred and I benefited from this conversation: a lot of what I typed here wasn't formalised in my brain before explaining it, but now I have a much clearer picture of what I do and why. And Fred knows too. This was not a waste of time, as we're both now better developers for it. And it might have seemed like a lot of back and forth in the code review, but it really only represents about 10min of discussion. And all this came from just a conversation point, not me saying "change the code".

If we were in the same office, we might have had the conversation via voice instead of typing in the code review, and some people might think that's better. In a way it is, but it also means that the details of the conversation don't get recorded, so benefit fewer people. That said, when programmers are not agreeing on things, often it is better to switch to voice instead of continue the discussion via text-based back-and-forth. It's just easier and more collaborative sometimes.

Right, that's that. A bit random, but didn't turn out so bad I think. And I seem to have found a point to make.

Righto.

--
Adam

Monday, 31 July 2017

Yeah, you do want 100% test coverage

G'day:
I've read a number of blog articles recently that have been eschewing 100% test coverage. The rationale being that there's little difference between 90% and 100%, and that that last 10% can be quite tricky to attain, as we - being human 'n' all - sometimes leave the worst to last. Obviously we should all be doing TDD so that situation would not arise, but - again - we're human so don't always do TDD for one reason or another ("can't be arsed" is the usual one).

Another rationale is that it's not a race to 100%. IE: the object of the exercise is delivering working software which adds business value; it's not "getting 100% test coverage".

The latter is true. But we still aim for 100% test coverage for basically one good reason: visibility.

First, look at this:



Not bad I guess. Not great.

Now look at this:



We've accidentally forgotten to cover something new. Can you see it? (It's one of the provider methods.)

On the other hand look at these two:



Somewhat easier to spot, yeah?

This is bloody handy. In this case I just commented-out one of my tests to get the report off 100%, but in the real world I'd see that and go "huh, WTF?", and perhaps discover I had an untested logic branch, or that a test I thought was covering something wasn't actually doing so. Or that I'd just forgotten to put the @covers annotation on the test. Whatever the situation: it's handy for the shortfall to be so obvious.

What it also means is that part of our pull-request process is running a Bamboo build which runs all our tests (plus some other code quality tools):

Everything's green here, so it's OK for code review. If it was failing, I'd get a nudge from my reviewers going "erm... you sure this is ready to go?" And even if I thought it was, we can't merge anything whilst the build is failing anyhow. The issue needs to be addressed.

Now the key thing here is this:


We don't test everything, but we cover everything. Where "cover" might be "we don't need to test this". In this case a single variable assignment does not need testing. We trust PHP to follow simple instructions. The thing is: this still shows up as green in the test coverage report.

We do this in a few situations:

  • the method has no branching logic and is otherwise pretty simple. Ignore it.
  • we're backfilling coverage on old code we need to maintain, but it's such a mess we simply don't have time to solve all the testing shortfalls right now, so we test what is urgent (the bit we're changing), and just mark the rest of it as ignored. Now that file is green, and we can see if it stops being green.
  • similar to the above, but we know the code in question is not long for this world. So we ignore it and stick in a TODO cross-referencing the ticket that will be dealing with it. And the ticket must already be lined-up to be worked on "soon".

There might be other situations where we need to be pragmatic and skip some stuff, but this needs to be explained to the code reviewers, and get their agreement.

But being able to run a report that says "100%", and that shows up anything which isn't 100%, is gold.

Get yer coverage report green, but don't be afraid to be pragmatic and skip stuff, if there's an informed reason to do so.

Now this isn't to say one can just pepper ignore annotations around the place to get the report green. That's daft. The object of the exercise is not to make some report green / say 100%; it's to get good code quality. This is just a tool to help identify the quality: it's still up to the devs to provide the quality. Don't forget that. Ignoring code needs to be an informed, honest decision.

Righto.

--
Adam

Thursday, 9 March 2017

PHPUnit: setMethods and how I've been writing too much code

G'day:
Seems our technical architect is becoming my PHP muse. Good on ya, Pete.

Anyway, he just sent out what I thought was a controversial email in which he said (paraphrase) "we're calling setMethods way too often: in general we don't need to call it". To contextualise, I mean this:

$dependency = $this
    ->getMockBuilder(MyDependency::class)
    ->setMethods(['someMethod'])
    ->getMock();

$dependency
    ->method('someMethod')
    ->willReturn('MOCKED RESULT');


See here I'm explicitly saying "mock out the someMethod method". Pete's contention is that that's generally unnecessary, cos getMockBuilder mocks 'em all by default anyhow. This goes against my understanding of how it works, and I've written bloody thousands of unit tests with PHPUnit so it surprised me I was getting it wrong. I kinda believed him cos he said he tested it, but I didn't believe him so much as to not go test it myself.

So I knocked out a quick service and dependency for it:

namespace me\adamcameron\myApp;

class MyService
{

    private $myDecorator;

    public function __construct(MyDecorator $myDecorator)
    {
        $this->myDecorator = $myDecorator;
    }

    public function decorateMessage($message)
    {
        $message = $this->myDecorator->addPrefix($message);
        $message = $this->myDecorator->addSuffix($message);
        return $message;
    }

}

namespace me\adamcameron\myApp;

class MyDecorator
{

    public function addPrefix($message)
    {
        return "(ACTUAL PREFIX) $message";
    }

    public function addSuffix($message)
    {
        return "$message (ACTUAL SUFFIX)";
    }

}

Nothing surprising there: the service takes that decorator as a constructor arg, and then when calling decorateMessage the decorator's methods are called.

In our tests of decorateMessage, we don't want to use the real decorator methods, we want to use mocks.

First I have a test the way I'd normally mock things: explicitly calling setMethods with the method names:

public function testWithExplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(['addPrefix', 'addSuffix'])
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $decorator
        ->method('addSuffix')
        ->willReturn('(MOCKED SUFFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

And this mocks out the method correctly. This is my baseline.

Secondly, I still call setMethods, but I give it an empty array:

public function testWithImplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods([])
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $decorator
        ->method('addSuffix')
        ->willReturn('(MOCKED SUFFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

This also mocks out the method correctly. Passing an empty array mocks out all the methods in the mocked class.

Next I try passing null to setMethods:

public function testWithNull()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(null)
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $decorator
        ->method('addSuffix')
        ->willReturn('(MOCKED SUFFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(ACTUAL PREFIX) TEST MESSAGE (ACTUAL SUFFIX)',
        $result
    );
}

This creates the mocked object, but the methods are not mocked. So the result here is using the actual methods.

In the last example I mock one of the methods (addPrefix), but not the other (addSuffix):

public function testWithOneExplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(['addPrefix'])
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(MOCKED PREFIX) (ACTUAL SUFFIX)',
        $result
    );
}

Here we see the important bit: when one explicitly mocks one method (or a subset of methods), then the other methods are not mocked. Which is kinda obvious now I say it, but I guess we had been thinking one needed to call setMethods to be able to then mock the behaviour of the method. And, TBH, I don't think we thought through what was happening to the methods we were not mocking. I think we started calling setMethods all the time because we were slavishly following the PHPUnit docs, which always call it too - a less than ideal precedent to "recommend".

Generally speaking, one wants to mock all the methods of a dependency; explicitly not mocking a method - ie, leaving it actually callable - should really be the exception.

The win here is that we can now de-clutter our tests a bit. We really seldom need to call setMethods, and we sure want to make it clear when we do want to only mock some of a dependency's methods. So this will make our code cleaner and clearer, and easier to code review and maintain. Win!
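
To make that concrete, here's what the de-cluttered version of that first test might look like, with no setMethods call at all (my sketch, not code from Pete's email):

public function testWithDefaultMocking()
{
    // no setMethods() call: getMock() stubs every method of the class by default
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $decorator
        ->method('addSuffix')
        ->willReturn('(MOCKED SUFFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

Same result as the setMethods([]) version, just with less noise.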

Cheers Pete!

Righto.

--
Adam

Monday, 9 January 2017

Testing: Too much detail, Cameron?

G'day:
Just really quickly here. Here's a question I posted on the Testing sub-channel of the #CFML Slack channel:

Question: I have a function which is basically a cron job which gets stuff from one data repository, monkeys with it, then sends the monkeying back to a different repository.

I log stuff along the way, for example:
  • "process started"
  • "got n records"
  • "need to update m records"
  • "job done"

Currently I am actively testing that "got n records" reflects the right value for n (where n in the context of the test is a test value coming back from a mock), and similarly with "need to update m records".

On one hand I think "well that's an important partial 'result' from the process, so - yes - test for it".

On the other hand I go "hmmm... is it? Or is that implementation detail?"

Thoughts?

That's it, really. Not an answer to anything today; just a question for you. What do you think?

--
Adam

Tuesday, 1 November 2016

PHP: what exactly are we testing?

G'day:
"the rumours of this blog's demise... [etc]". Yeah. I'm still here. More about that in a separate article.

Here's a completely fictitious scenario which bears absolutely no relation to a code review discussion I had at work today. I don't even know why I mention work. This, after all, has nothing to do with my day job. Or my colleagues. I'm just making it all up.

Here's some code I just made up for the sake of discussion:

<?php 

namespace me\adamcameron\someApp\service;

use \me\adamcameron\someApp\exception;

class SomeService {

    public function getTheThingFromTheWebService($id){
        // bunch of stuff elided
        
        try {
            $result = $this->connector->getThing($id);
            
            // bunch of stuff elided
            
            return $finalResult;
        } catch (exception\NotFoundException $e) {
            // bunch of stuff elided
            
            throw new exception\ServiceException("The Thing with ID $id could not be retrieved");
         }
    }
}


Basically we're calling a web service, and the connector we use wraps up all the HTTP bumpf, and with stuff like 404 responses it throws a NotFoundException up to the service tier so that the service doesn't need to concern itself with the fact the connector is dealing with a REST web service. It could just as easily be connecting straight to a DB and the SELECT statement returned zero rows: that's still a NotFoundException situation. That's cool. We're all-good with this approach to things. For reasons that are outwith this discussion (TBH, I'm slightly contriving the situation in the code above, but maintaining the correct context) we don't want to let the NotFoundException bubble any further, we want to throw a different exception (one that other upstream code is waiting to catch), and provide a human-friendly message with said exception.

But for testing, I need to actually make sure that the Service handles this exception properly, and the two elements of this are that it chucks its own exception, and it includes that human-friendly message when it does.

So I've tested both of those.

Because I'm a well-behaved, team-playing dev, I TDD my code as I go, and having just added that exception-handling, I need to test it - er, I mean: being about to add that exception-handling code, I need to write my tests first. So, anyway, whether I wrote the test before or after I wrote the code is neither here nor there. I came up with this test case:

<?php

namespace me\adamcameron\someApp\test\service;

use \me\adamcameron\someApp\service\SomeService;

/** @coversDefaultClass \me\adamcameron\someApp\service\SomeService */
class SomeServiceTest extends PHPUnit_Framework_TestCase {

    public function setUp() {
        $this->setMockedConnector();
        $this->testService = new SomeService($this->mockedConnector);
    }

    /**
        @covers ::getTheThingFromTheWebService
        @expectedException \me\adamcameron\someApp\exception\ServiceException
        @expectedExceptionMessage The Thing with ID 1 could not be retrieved
    */
    public function testGetTheThingsFromTheWebServiceWillThrowServiceExceptionWhenIdNotFound() {
        $testId = 1;

        $this->mockedConnector
            ->method('getThing')
            ->with($testId)
            ->will($this->throwException(new \me\adamcameron\someApp\exception\NotFoundException()));
            
        $this->testService->getTheThingFromTheWebService($testId);
    }
    
    private function setMockedConnector(){
        $this->mockedConnector = $this->getMockBuilder('\me\adamcameron\connector\MyConnector')
            ->disableOriginalConstructor()
            ->setMethods(['getThing'])
            ->getMock();
    }

}

I've left a bunch of contextual and mocking code in there so you can better see what's going on. But the key parts of this test are the annotations regarding the exception.

(we're using an older version of PHPUnit, so we need to do this via annotations, not expectations, unfortunately. But anyway).
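
For reference, on a more recent PHPUnit (5.2 onwards, if I recall correctly) the same expectations could be set via methods rather than annotations. A sketch of how that might look - not the code that actually went into review:

/** @covers ::getTheThingFromTheWebService */
public function testGetTheThingsFromTheWebServiceWillThrowServiceExceptionWhenIdNotFound() {
    $testId = 1;

    $this->mockedConnector
        ->method('getThing')
        ->with($testId)
        ->will($this->throwException(new \me\adamcameron\someApp\exception\NotFoundException()));

    $this->expectException(\me\adamcameron\someApp\exception\ServiceException::class);
    $this->expectExceptionMessage("The Thing with ID $testId could not be retrieved");

    $this->testService->getTheThingFromTheWebService($testId);
}

Note the expectations get set just before the call that's expected to throw.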

This went into code review, and I got some feedback (this is a paraphrase. Of, obviously, a completely fictitious conversation. Remember this is not really work code I'm discussing here):

  • Not too sure about the merits of testing the whole message string here. For the purposes of testing, we don't care about text, we just care about the $id being correct. Perhaps use @expectedExceptionMessageRegExp instead, with like "/.*1.*/" as the pattern.
  • not sure 1 is a great ID to mock, TBH

Well I can't fault the latter bit. If one is gonna be actually checking for values, then use values that are really unlikely to accidentally occur as side-effects of something else going on. Basically 0 and 1 are dumb test values. There's always gonna be a risk they'll end up bubbling up via an accident rather than on purpose. I usually use random prime numbers (often taken from this list of the first 1000 primes). They're just not likely to coincidentally show up somewhere else in one's results at random.

But the first bit is interesting. My first reaction was "well what's the important bit here? The number or the error message?" We wanna check the error message cos, given that suggested regex pattern, the exception message could simply be "1", which is no help to anyone. And yet that'd pass the test. Surely the important thing is the guidance: "The Thing with ID [whatever] could not be retrieved".

But that's woolly thinking on my part. I'm not doing tests for the benefit of humans here. And no code is ever gonna rely on the specifics of that message. TDD tests (and unit tests in general) are about testing logic, not "results". And they don't give a shit about the Human Interface. We've got QA for that.

There's no need to test PHP here. If we have this code:

throw new exception\ServiceException("The Thing with ID $id could not be retrieved");

We don't need to test:
  • that PHP knows what a string is
  • that PHP knows how to interpolate a variable
  • that one can pass a string to an Exception in PHP
  • etc
That's testing an external system (and showing no small amount of hubris whilst we do so!), and is outwith the remit of TDD or unit testing in general. If we can't trust PHP to know what a string is: we're in trouble.

What we do need to test is this:

public function getTheThingFromTheWebService($id){
    // bunch of stuff elided
    
    try {
        $result = $this->connector->getThing($id);
        
        // bunch of stuff elided
        
        return $finalResult;
    } catch (exception\NotFoundException $e) {
        // bunch of stuff elided
        
        throw new exception\ServiceException("The Thing with ID $id could not be retrieved"); 
    }
}


  • The ID comes in...
  • ... it's used...
  • ... and when the call that uses it fails...
  • ... we use it in the exceptional scenario.

The logic here is that the ID that's passed to the function is the same one reported in the exception message. Not what the exception message is: that's PHP's job.

So in reality here... we just need to test the ID is referenced in the exception. TBH, I'd not even usually bother doing that. It's just we're pushing to have more useful exception messages at the moment, so we're trying out "enforcing" this behaviour. I think it's a worthwhile exercise at least whilst the devs are getting used to it. Once it's part of our standard practice, I reckon we can stop worrying about what's in the error message.

But in the interim I'm gonna fix me test and see if I can pass this code review...
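
For what it's worth, my guess is the revised test ends up looking something like this - a prime for the ID, and only checking that the ID appears in the message. This is my sketch, not the code that went back into review:

/**
    @covers ::getTheThingFromTheWebService
    @expectedException \me\adamcameron\someApp\exception\ServiceException
    @expectedExceptionMessageRegExp /\b7919\b/
*/
public function testGetTheThingsFromTheWebServiceWillThrowServiceExceptionWhenIdNotFound() {
    $testId = 7919;

    $this->mockedConnector
        ->method('getThing')
        ->with($testId)
        ->will($this->throwException(new \me\adamcameron\someApp\exception\NotFoundException()));

    $this->testService->getTheThingFromTheWebService($testId);
}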

Righto.

--
Adam

Monday, 19 September 2016

Another Friday puzzle; and the importance of tests that try to break code

G'day:
There was another Friday Puzzle on the CFML Slack channel this week. The question was:

The maximum sum subarray problem consists in finding the maximum sum of a contiguous subsequence in an array or list of integers:

maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4])
// should be 6: [4, -1, 2, 1]

Apparently the word "contiguous" is something that will flummox some people (judging by the discussion on the Slack channel). In this sense a contiguous sub-sequence is one that is made from adjacent array elements. So you can see the highlighted sub-sequence - [4, -1, 2, 1] in the example above - is a subset of adjacent array elements.

Eyeballing the example series here reveals the tricky bit one needs to watch for... negative values. Take a sub-sequence of 3,4. That has a sum of 7. Now if the next number is negative - 3,4,-1 (sum 6) - the sum of this longer sub-sequence is less than that of the preceding shorter one, so the shorter sub-sequence is still the "winner". However if the next number (or numbers) sum to greater than the negative number, eg: 3,4,-1,2 (sum 8), then we've got the highest sum again. And of course there could repeatedly be more negative numbers followed by positive numbers with a higher sum.

Update:

I did not thoroughly read the requirement here, so got some of this wrong. See "TIL: Cameron is bloody wrong again".

A couple of the bods on the Slack channel quickly came up with solutions. Here they are, respectively: Isaiah's and John's. I'll only repeat one of them here because both used pretty much the same algorithm, just one used procedural code, and the other a more functional approach (in that they used higher-order functions). But they've both got the same logic error in them, because the approach is flawed. Let's have a look at John's one:

numeric function maxSequence(required array arr){
    var currentSum = 0;
    return arr.reduce(function(maxSum, number){
        currentSum = max(currentSum+number, 0);
        return max(currentSum, maxSum);
    }, 0);
}

WriteDump(maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4]));

WriteDump(maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4, 3]));

WriteDump(maxSequence([-2, -3, -1, -5]));

WriteDump(maxSequence([1, 4, 2, 1, 4, 3]));

This is nice and simple, and it took me a while to get what he's doing here. But we're keeping a running sum of the sequence outside the callback, and the callback itself returns the greater of the running sum or whatever has previously been the maximum sum. So taking the first series there, we get the following iterations:

maxSum | number | currentSum | newMax
   0   |   -2   |     0      |   0
   0   |    1   |     1      |   1
   1   |   -3   |     0      |   1
   1   |    4   |     4      |   4
   4   |   -1   |     3      |   4
   4   |    2   |     5      |   5
   5   |    1   |     6      |   6
   6   |   -5   |     1      |   6
   6   |    4   |     5      |   6

Superficially this seems OK, except for one thing: it considers the minimum possible sum to be 0. Which is not correct. In a sequence with only negative numbers... the maximum sum will be the highest negative number: in [-1,-2,-3] the highest sum is -1, but sums less than zero are ignored by this algorithm, so it incorrectly returns 0. Equally, for an empty sequence the answer is null or undefined, so this algorithm fails there too, as it returns 0 for that as well (it defaults to a maximum sum of zero, whereas we don't know there'll be a sum at all when we start the process). So I guess that's two slightly different logic errors there.
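
To make the failure concrete, here's a quick PHP port of John's reduction (my translation, not something from the Slack thread), showing it returning 0 for both the all-negative sequence and the empty one:

function maxSequence(array $arr) {
    $currentSum = 0;
    return array_reduce($arr, function ($maxSum, $number) use (&$currentSum) {
        $currentSum = max($currentSum + $number, 0);
        return max($currentSum, $maxSum);
    }, 0);
}

var_dump(maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4])); // int(6): correct
var_dump(maxSequence([-2, -1, -3, -5]));                // int(0): should be -1
var_dump(maxSequence([]));                              // int(0): there's no sum to report at all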

I'll come back to John's solution in a bit, but first here's my solution. It's not as elegant as John's, and I'm not entirely happy with it, but it does work.

Here's the code (I've opted to do mine in ES2015 rather than CFML):

let sumLongestContiguousSubsequence = function (array) {
    let subSequences = array.map((_,i,a)=>a.slice(i));

    return subSequences.reduce(function(max, subsequence){
        let runningSum = subsequence[0];
        let maximumSubSequenceSum = subsequence.reduce(function(max, element){
            return (runningSum += element) > max ? runningSum : max;
        });
        return Math.max(max||maximumSubSequenceSum, maximumSubSequenceSum);
    }, null);
};

My conceit is that one cannot make assumptions about any earlier maximums, so I don't default the answer to anything.

Also I don't default the running sum to anything, I just take its starting point as the first element of the sequence I'm checking.

The general approach here is slightly more ham-fisted than John's approach.
  • I figure I need to scan each separate sub-sequence of the main sequence, starting each sub-sequence from each subsequent value in the main sequence. EG: if I have a sequence of [1,2,3,4], then subSequences will be [[1,2,3,4],[2,3,4],[3,4],[4]].
  • For each of those I do much the same thing as John: just run a cumulative sum, and remember whatever the highest value for that was.
  • So that'll bring me back to an array of maximums (in my [1,2,3,4] example, this'd be [10,9,7,4]).
  • And as I reduce that lot, I simply return whichever is the higher of the previous ones and the current one.

I also wrote actual tests here. And this is where my approach differs from John's (or Isaiah's for that matter). When I test things I don't try to demonstrate to myself it works, I try to break the thing.

Here are my tests:

"use strict";

let assert = require("chai").assert;

let sumLongestContiguousSubsequence = require("./sumLongestContiguousSubsequence.js");

describe("Test of puzzle requirement", function(){
    it("returns the highest contiguous subseries sum for baseline requirement", function(){
        let sequence = [-2, 1, -3, 4, -1, 2, 1, -5, 4];
        let expectation = 4 + -1 + 2 + 1; // 6
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

describe("Tests of other puzzle submission sequences", function(){
    it("returns the highest contiguous subseries sum for variation 1", function(){
        let sequence = [-2, 1, -3, 4, -1, 2, 1, -5, 4, 3];
        let expectation = 4 + -1 + 2 + 1 + -5 + 4 + 3; // 8
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum for variation 2", function(){
        let sequence = [-2, -1, -3, -5];
        let expectation = -1;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum for variation 3", function(){
        let sequence = [1, 4, 2, 1, 4, 3];
        let expectation = 1 + 4 + 2 + 1 + 4 + 3; // 15
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

describe("Test edge cases", function(){
    it("returns the highest contiguous subseries sum with an empty array", function(){
        let sequence = [];
        let expectation = null;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum with just zero", function(){
        let sequence = [0];
        let expectation = 0;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum with just -1", function(){
        let sequence = [-1];
        let expectation = -1;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum with just 1", function(){
        let sequence = [1];
        let expectation = 1;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

describe("Better described tests", function(){
    it("returns the highest contiguous subseries sum when the sequence has negative values", function(){
        let sequence = [1,2,3,-11];
        let expectation = 1 + 2 + 3; // 6
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum when the sequence has negative values followed by a greater positive value", function(){
        let sequence = [1,2,3,-4,5];
        let expectation = 1 + 2 + 3 + -4 + 5; // 7
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum when the sequence has negative values followed by a subseries positive values that are net greater than the negative one", function(){
        let sequence = [2,4,6,-8,3,7];
        let expectation = 2 + 4 + 6 + -8 + 3 + 7; // 14
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum when the sequence has repeated negative values followed by a subseries positive values that are net greater than the negative one", function(){
        let sequence = [12,14,16,-8,3,7,-12,5,9];
        let expectation = 12 + 14 + 16 + -8 + 3 +7 + -12 + 5 + 9; // 46
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

These are fairly deliberate in what they test, but don't try to do anything clever (so there's a lot of repeated code). I make a point of testing edge cases like empty sequences, and sequences with only one element, and all negative values and all positive values etc. My philosophy with testing is that I'm not trying to demonstrate the code works, I'm trying to demonstrate it doesn't break. This amounts to the same thing, but it starts from a slightly different mindset. Testing is one situation where "glass half empty" rather than "glass half full" is the better way to look at things.

So this version of the solution works, but I'm not happy about it. For one thing, I don't think one can look at it and go "right, it's clear what's going on there". I looked for refactoring opportunities, but am not seeing any. Maybe subSequences could have a better name? I looked at extracting the callbacks into named function expressions, but that didn't look any clearer to me either.

let sumLongestContiguousSubsequence = function (array) {
    let runningSum;
    
    let getCurrentMaxSum = function(max, element){
        return (runningSum += element) > max ? runningSum : max;
    };
    
    let findSumOfLongestSub = function(max, subsequence){
        runningSum = subsequence[0];
        let maximumSubSequenceSum = subsequence.reduce(getCurrentMaxSum);
        return Math.max(max||maximumSubSequenceSum, maximumSubSequenceSum);
    };

    let subSequencesSlicedAtEachIndex = array.map((_,i,a)=>a.slice(i));

    return subSequencesSlicedAtEachIndex.reduce(findSumOfLongestSub, null);
};

What do you think: is that clearer? I guess it's a bit better, innit? It's straying away from "elegance in simplicity" though.

Ha, just for a laugh I took the code clarity in the opposite direction. How about this mess:

let f=a=>a.map((_,i,a)=>a.slice(i)).reduce((m1,s)=>{
    let t=s[0],m2=s.reduce((m,v)=>(t+=v)>m?t:m)
    return Math.max(m1||m2,m2)
},null)

It's the same logic as my original version.

The other issue that Sean and Ryan are likely to pull me up on is that one of my reductions relies on side-effects spilling out into the intermediary calling code:

let sumLongestContiguousSubsequence = function (array) {
    let subSequences = array.map((_,i,a)=>a.slice(i));

    return subSequences.reduce(function(max, subsequence){
        let runningSum = subsequence[0];
        let maximumSubSequenceSum = subsequence.reduce(function(max, element){
            return (runningSum += element) > max ? runningSum : max;
        });
        return Math.max(max||maximumSubSequenceSum, maximumSubSequenceSum);
    }, null);
};


This didn't actually bother me until I did the PHP version of this, where the side-effects are more glaring:

$sumLongestContiguousSubsequence = function ($array) {
    $subSequences = array_map(function($i) use ($array) {
        return array_slice($array, $i);
    }, array_keys($array));

    return array_reduce($subSequences, function($max, $subSequence){
        $runningSum = 0;
        $maximumSubSequenceSum = array_reduce($subSequence, function($max, $element) use (&$runningSum){
            return ($runningSum += $element) > $max ? $runningSum : $max;
        }, $subSequence[0]);
        return max($max?:$maximumSubSequenceSum, $maximumSubSequenceSum);
    }, null);
};

PHP sux a bit because it doesn't do closure implicitly: one needs to tell it what to enclose, which is clumsy IMO. But it's also a bit of an orange flag. This orange flag becomes a red flag when I need to actually use a reference here, cos I'm changing the original value, not simply using it.

That guilt-tripped me into revising my JS version so as to not need the side effects:

let sumLongestContiguousSubsequence = function (array) {
    let subSequences = array.map((_,i,a)=>a.slice(i));

    return subSequences.reduce(function(max, subsequence){
        let maximumSubSequence = subsequence.reduce(function(working, element){
            working.runningSum += element;
            working.max = working.max || working.runningSum;
            working.max = working.runningSum > working.max ? working.runningSum : working.max
            return {max:working.max, runningSum:working.runningSum};
        }, {runningSum:0});
        return Math.max(max||maximumSubSequence.max, maximumSubSequence.max);
    }, null);
};


Now I'm carrying both the runningSum and the max value through the first argument in the reduction, by using an object rather than just the simple value. It's a small change but gets rid of the smell. To be completely honest though... I prefer my original version. I realise it's frowned-upon to bleed side-effects (which I then leverage), but this refactoring seems like an exercise in pedantic/dogmatic busy-work, and makes the code slightly less clear in the process. I dunno.

Oh, for completeness I did a CFML conversion too:

sumLongestContiguousSubsequence = function (array) {
    var subSequences = array.map((_,i,a) => a.slice(i));
    return subSequences.reduce(function(maxSum, subSequence){
        var runningSum = 0;
        var maximumSubSequenceSum = subSequence.reduce(
            (maxSum, element) => (runningSum += element) > maxSum ? runningSum : maxSum,
            subSequence[1]
        );
        return max(maxSum ?: maximumSubSequenceSum, maximumSubSequenceSum);
    }, null);
};

Note that this solution only works on Lucee, not ColdFusion, for three reasons:
  • ColdFusion does not have arrow functions;
  • ColdFusion has a bug in its parser which breaks on this expression (see 4190163);
  • Lucee has a null keyword whereas ColdFusion does not.

The Lucee version is OK though. Same as the JS version for the most part: just 1-based arrays, not 0-based; and Lucee has the ?: operator whereas JS uses ||.

That's about all I can think to say about this. I might see if I can mess with John's approach to make it work for those two edge-cases it fails on. I much prefer his way compared to my way... but only provided it can be made to work! I'd also be keen to see treatments of this puzzle in other languages, or better/different solutions for JS, PHP or CFML.
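
For what it's worth, one way of adapting John's running-sum approach so it copes with those two edge cases might be something like the following. This is my sketch in PHP, not something from the post or the Slack thread (and bear in mind the update above: my reading of the requirement was later revised):

function maxSequence(array $sequence) {
    // no elements: there's no sum to report at all
    if (empty($sequence)) {
        return null;
    }

    $currentSum = $maxSum = $sequence[0];
    foreach (array_slice($sequence, 1) as $number) {
        // either extend the current contiguous run, or start a new one at this element
        $currentSum = max($currentSum + $number, $number);
        $maxSum = max($maxSum, $currentSum);
    }
    return $maxSum;
}

var_dump(maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4])); // int(6)
var_dump(maxSequence([-2, -1, -3, -5]));                // int(-1)
var_dump(maxSequence([]));                              // NULL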

Righto.

--
Adam