Wednesday, 24 February 2021

Docker: using TDD to initialise my app's DB with some data

G'day:

I'll start by saying I am not convinced this exercise really ought to be a TDD-oriented one, but I'm gonna approach it that way anyhow because I suspect I'm going to need to mess around a bit to get this working. Secondly, this is very much a log of what I'm (trying to ~) work on today, and I doubt there will be any shrewd insights going on, given I'm basically googling and RTFMing, then doing what the docs say.

The exercise here is to take the MariaDB database that I already have in my Docker set up (see "Creating a web site with Vue.js, Nginx, Symfony on PHP8 & MariaDB running in Docker containers - Part 5: MariaDB"), which is currently empty and only accessible via the root login; and add some baseline tables and data to it. At the same time also create a user for code to connect to the DB with so I don't need code using root access. Another thing I want to do is stop storing the DB passwords in the Docker .env file like I am now:

COMPOSE_PROJECT_NAME=fullStackExercise
DATABASE_ROOT_PASSWORD=123

The data I need is to fulfil an exercise I have given myself (well: it wasn't me who gave me the original exercise, but I'm reinventing it a bit here) to construct an event registration form (personal details, a selection of workshops to register for), save the details to the DB and echo back a success page. Simple stuff. Less so for me given I'm using tooling I'm still only learning (Vue.js, Symfony, Docker, Kahlan, Mocha, MariaDB).

For my tests, I can already derive a bunch of test specs from those first few paragraphs above, so let's put them together now in spec/integration/baselineDatabase.spec.php:

<?php

namespace adamCameron\fullStackExercise\spec\integration;

describe('Tests for registration database', function () {
    describe('Connectivity tests', function () {
        it('can connect to the database with environment-based credentials', function () {
        });
    });

    describe('Schema tests', function () {
        it('has a workshops table with the required schema', function () {
        });

        it('has a registrations table with the required schema', function () {
        });

        it('has a registeredWorkshops table with the required schema', function () {
        });
    });

    describe('Data tests', function () {
        it('has the required baseline workshop data', function () {
        });
    });
});

And I can now run those to see them… not be implemented:

root@fde4be76c908:/usr/share/fullstackExercise# composer spec -- --spec=spec/integration/baselineDatabase.spec.php --reporter=verbose
> vendor/bin/kahlan '--spec=spec/integration/baselineDatabase.spec.php' '--reporter=verbose'


  Tests for registration database
    Connectivity tests
      ✓ it can connect to the database with environment-based credentials
    Schema tests
      ✓ it has a workshops table with the required schema
      ✓ it has a registrations table with the required schema
      ✓ it has a registeredWorkshops table with the required schema
    Data tests
      ✓ it has the required baseline workshop data


  Pending specifications: 5
  .spec/integration/baselineDatabase.spec.php, line 8
  .spec/integration/baselineDatabase.spec.php, line 13
  .spec/integration/baselineDatabase.spec.php, line 16
  .spec/integration/baselineDatabase.spec.php, line 19
  .spec/integration/baselineDatabase.spec.php, line 24


Expectations   : 0 Executed
Specifications : 5 Pending, 0 Excluded, 0 Skipped

Passed 0 of 0 PASS in 0.015 seconds (using 4MB)

root@fde4be76c908:/usr/share/fullstackExercise#

And now I can implement that first test:

describe('Tests for registration database', function () {

    $this->getConnectionDetailsFromEnvironment = function () {
        return (object) [
            'database' => $_ENV['MYSQL_DATABASE'],
            'user' => $_ENV['MYSQL_USER'],
            'password' => $_ENV['MYSQL_PASSWORD']
        ];
    };

    describe('Connectivity tests', function () {
        it('can connect to the database with environment-based credentials', function () {
            $connectionDetails = $this->getConnectionDetailsFromEnvironment();
            $connection = new PDO(
                "mysql:dbname=$connectionDetails->database;host=database.backend",
                $connectionDetails->user,
                $connectionDetails->password
            );
            $statement = $connection->query("SELECT 'OK' AS test FROM dual");
            $statement->execute();

            $testResult = $statement->fetch(PDO::FETCH_ASSOC);

            expect($testResult)->toContainKey('test');
            expect($testResult['test'])->toBe('OK');
        });
    });

There's not much to this. I'm reading the DB connectivity details from the environment variables Docker has set for me, and using those to do a simple DB query from the database, and just verify the DB is responding as expected. To be honest I don't think I need / ought to be using the environment variable for the database name here: that environment variable is just for MariaDB to create a DB of that name when it first starts up. In the app itself, we'll have a static value for the database name, because the app wants to use that exact database, not simply whatever DB is in that environment variable. Hopefully you see the subtle difference in intent there. Anyhow, we now run our tests:

> vendor/bin/kahlan '--spec=spec/integration/baselineDatabase.spec.php' '--reporter=verbose'


  Tests for registration database
    Connectivity tests
      ✖ it can connect to the database with environment-based credentials
        an uncaught exception has been thrown in `spec/integration/baselineDatabase.spec.php` line 11

        message:`Kahlan\PhpErrorException` Code(0) with message "`E_WARNING` Undefined array key \"MYSQL_DATABASE\""

          [NA] - spec/integration/baselineDatabase.spec.php, line 7 to 11
          […etc…]

Cool. Now we can sort those credentials out and watch that test pass. The first thing I have done is to update docker/.env (see above) to get rid of the root password, and add the other credentials MariaDB expects to initialise a database when its container is first built (see the "Environment Variables" section in mariadb - Docker Official Images for info about that):

COMPOSE_PROJECT_NAME=fullStackExercise
MYSQL_DATABASE=fullstackExercise
# the following are to be provided to `docker-compose up`
DATABASE_ROOT_PASSWORD=
MYSQL_USER=
MYSQL_PASSWORD=

Those empty entries are not necessary, I've just left them there for the sake of documentation. The bit I do actually need to do is in docker/docker-compose.yml. This is best shown with a diff I think:

$ git diff docker/docker-compose.yml
diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml
index bc399a0..0eec553 100644
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -24,7 +24,9 @@ services:
       context: ../backend
       dockerfile: ../docker/php-fpm/Dockerfile
     environment:
-      - DATABASE_ROOT_PASSWORD=${DATABASE_ROOT_PASSWORD}
+      - MYSQL_DATABASE=${MYSQL_DATABASE}
+      - MYSQL_USER=${MYSQL_USER}
+      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
     volumes:
       - ../backend/config:/usr/share/fullstackExercise/config
       - ../backend/public:/usr/share/fullstackExercise/public
@@ -52,6 +54,9 @@ services:
       context: ./mariadb
     environment:
       - MYSQL_ROOT_PASSWORD=${DATABASE_ROOT_PASSWORD}
+      - MYSQL_DATABASE=${MYSQL_DATABASE}
+      - MYSQL_USER=${MYSQL_USER}
+      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
     ports:
       - "3306:3306"
     volumes:

I've taken out PHP's access to the DB root password as it doesn't need it any more. It has two tests that will now fail, but they were only ever temporary ones until I did this work anyhow, so I'll be deleting those when I verify they now fail. And I've also added in the three new environment variables to both the MariaDB service, and the PHP one. MariaDB uses it to create the fullstackExercise DB, and PHP will use the same credentials to connect to it. I now have no DB credentials anywhere in the codebase. Instead, I pass them in when I first bring the containers up:

adam@DESKTOP-QV1A45U:/mnt/c/src/fullstackExercise/docker$ DATABASE_ROOT_PASSWORD=123 MYSQL_USER=fullstackExercise MYSQL_PASSWORD=1234 docker-compose up --build --detach

This is not completely secure. One can still see the passwords if one terminals into the containers, eg:

adam@DESKTOP-QV1A45U:/mnt/c/src/fullstackExercise/docker$ docker exec --interactive --tty fullstackexercise_php-fpm_1 /bin/bash
root@ac3872091c8e:/usr/share/fullstackExercise# set | grep MYSQL
MYSQL_DATABASE=fullstackExercise
MYSQL_PASSWORD=1234
MYSQL_USER=fullstackExercise
root@ac3872091c8e:/usr/share/fullstackExercise#

A better way would perhaps be to use Docker Secrets, but I could not work out how to get the values from the files it creates into environment variables in the docker-compose.yml file. But will also admit I pretty much read the docs and went "yeah CBA with that right now". It might be dead easy (UPDATE: just now when linking to the MariaDB docker image page a coupla paragraphs up, I noticed it's all actually explained there, and it is dead easy. I might look at doing this later then).

Now I will run my tests again. My expectations are that that test that failed before will now be passing; and one each of the Kahlan and PHPUnit tests will start to fail because they are testing connecting to the DB using the root credentials, which I've removed.

root@ac3872091c8e:/usr/share/fullstackExercise# composer spec
> vendor/bin/kahlan

..........E.PPPP.                                                 17 / 17 (100%)


  Pending specifications: 4
  .spec/integration/baselineDatabase.spec.php, line 37
  .spec/integration/baselineDatabase.spec.php, line 40
  .spec/integration/baselineDatabase.spec.php, line 43
  .spec/integration/baselineDatabase.spec.php, line 48

Tests database availability
  ✖ it should return the expected database version
    an uncaught exception has been thrown in `spec/integration/database.spec.php` line 14

    message:`Kahlan\PhpErrorException` Code(0) with message "`E_WARNING` Undefined array key \"DATABASE_ROOT_PASSWORD\""

      [NA] - spec/integration/database.spec.php, line 11 to 14
      Kahlan\Filter\Filters::run() - vendor/kahlan/kahlan/src/Suite.php, line 236
      […etc…]


Expectations   : 18 Executed
Specifications : 4 Pending, 0 Excluded, 0 Skipped

Passed 12 of 13 FAIL (EXCEPTION: 1) in 0.491 seconds (using 6MB)

Script vendor/bin/kahlan handling the spec event returned with error code 255

This is good: only one failing test: the one we expect to fail, and it's failing for the right reason. And with PHPUnit:

PHPUnit 9.5.2 by Sebastian Bergmann and contributors.

.....E                                                              6 / 6 (100%)

Time: 00:00.268, Memory: 14.00 MB

There was 1 error:

1) adamCameron\fullStackExercise\tests\integration\DatabaseTest::testDatabaseVersion
Undefined array key "DATABASE_ROOT_PASSWORD"

/usr/share/fullstackExercise/tests/integration/DatabaseTest.php:16

ERRORS!
Tests: 6, Assertions: 13, Errors: 1.

Generating code coverage report in HTML format ... done [00:00.374]
Script vendor/bin/phpunit handling the test event returned with error code 2

I'll get rid of those failing tests. They are redundant now.

The next test cases we have to address are these ones:

    Schema tests
      ✓ it has a workshops table with the required schema
      ✓ it has a registrations table with the required schema
      ✓ it has a registeredWorkshops table with the required schema

Looking at the docs for MariaDB's Docker image ("Docker Official Images > mariadb > Initializing a fresh instance"), when the DB starts up, it looks for files in a docker-entrypoint-initdb.d directory, and runs any scripts it finds in there. This makes things easy.

However let's not get ahead of ourselves. We need tests first. But first… a bit of an aside. I'm actually questioning the merits of these tests. They are handy when I'm doing the initial DB setup though. Later as the application develops, we'll have more finely-tuned integration tests that will implicitly test the table schemata are correct; but I guess at the moment all we need to have is the schema (then some baseline data), so as transient tests I suppose the have some merit. I'm not sure. One one hand it might be overkill; on another hand we're supposed to be developing the application iteratively, and these are a first iteration. I guess the situation is similar to the DB tests I had that were using the root connectivity details, because for that iteration that's where we were at. Now we've moved on so those tests are redundant, and these new tests replace them. And these tests will likely be replaced in the next coupla iterations as we go. Anyhow: I'm writing them. Here we go.

describe('Schema tests', function () {
    $schemata = [
        [
            'tableName' => 'workshops',
            'schema' => [
                ['Field' => 'id', 'Type' => 'int(11)'],
                ['Field' => 'name', 'Type' => 'varchar(500)']
            ]
        ],
        [
            'tableName' => 'registrations',
            'schema' => [
                ['Field' => 'id', 'Type' => 'int(11)'],
                ['Field' => 'fullName', 'Type' => 'varchar(100)'],
                ['Field' => 'phoneNumber', 'Type' => 'varchar(50)'],
                ['Field' => 'emailAddress', 'Type' => 'varchar(320)'],
                ['Field' => 'password', 'Type' => 'varchar(255)'],
                ['Field' => 'ipAddress', 'Type' => 'varchar(15)'],
                ['Field' => 'uniqueCode', 'Type' => 'varchar(36)'],
                ['Field' => 'created', 'Type' => 'timestamp']
            ]
        ],
        [
            'tableName' => 'registeredWorkshops',
            'schema' => [
                ['Field' => 'id', 'Type' => 'int(11)'],
                ['Field' => 'registrationId', 'Type' => 'int(11)'],
                ['Field' => 'workshopId', 'Type' => 'int(11)']
            ]
        ]
    ];

    array_walk($schemata, function ($tableSchema) {
        $tableName = $tableSchema['tableName'];
        $expectedSchema = $tableSchema['schema'];

        it("has a $tableName table with the required schema", function () use ($tableName, $expectedSchema) {
            $statement = $this->connection->query("SHOW COLUMNS FROM $tableName");
            $statement->execute();

            $columns = $statement->fetchAll(PDO::FETCH_ASSOC);

            expect($columns)->toHaveLength(count($expectedSchema));
            foreach ($expectedSchema as $i => $column) {
                expect($columns[$i]['Field'])->toBe($expectedSchema[$i]['Field']);
                expect($columns[$i]['Type'])->toBe($expectedSchema[$i]['Type']);
            }
        });
    });
});

There was an intermediary refactoring here: initially I had three "hard-coded" cases, as listed further up. As I wrote the test for the second case I noticed I was duplicating everything from the first test except the table name and the details of the schema, so I extracted those as test data, and looped over them. All the test does here is to get the table columns description, and verify they match the name, type and length of my expectations. The expectations were taken directly from the requirement I had been given to implement.

If I now run the tests, those three cases fail, as we'd expect given the tables don't yet exist:

Tests for registration database
  Schema tests
    ✖ it has a workshops table with the required schema
      an uncaught exception has been thrown in `spec/integration/baselineDatabase.spec.php` line 74

      message:`PDOException` Code(42S02) with message "SQLSTATE[42S02]: Base table or view not found: 1146 Table 'fullstackexercise.workshops' doesn't exist"

        [NA] - spec/integration/baselineDatabase.spec.php, line 73 to 74
        […etc…]

    ✖ it has a registrations table with the required schema
      an uncaught exception has been thrown in `spec/integration/baselineDatabase.spec.php` line 74

      message:`PDOException` Code(42S02) with message "SQLSTATE[42S02]: Base table or view not found: 1146 Table 'fullstackexercise.registrations' doesn't exist"

        [NA] - spec/integration/baselineDatabase.spec.php, line 73 to 74
        […etc…]

    ✖ it has a registeredWorkshops table with the required schema
      an uncaught exception has been thrown in `spec/integration/baselineDatabase.spec.php` line 74

      message:`PDOException` Code(42S02) with message "SQLSTATE[42S02]: Base table or view not found: 1146 Table 'fullstackexercise.registeredWorkshops' doesn't exist"

        [NA] - spec/integration/baselineDatabase.spec.php, line 73 to 74
        […etc…]
[…etc…]

Now to add the tables.I've set up these files:

adam@DESKTOP-QV1A45U:/mnt/c/src/ttct$ tree docker/mariadb/do*
docker/mariadb/docker-entrypoint-initdb.d
├── 1.createAndPopulateWorkshops.sql
├── 2.createRegistrations.sql
└── 3.createRegisteredWorkshops.sql

Note: for now that first file name is slightly misnamed, as it'll only have the DDL statement in it at the moment, and the data-insertion will come in a subsequent step. The file contents are as follows:

/* docker/mariadb/docker-entrypoint-initdb.d/1.createAndPopulateWorkshops.sql */

USE fullstackExercise;

CREATE TABLE workshops (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
    
    PRIMARY KEY (id)
) ENGINE=InnoDB;


/* docker/mariadb/docker-entrypoint-initdb.d/2.createRegistrations.sql */

USE fullstackExercise;

CREATE TABLE registrations (
   id INT NOT NULL AUTO_INCREMENT,
   fullName VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
   phoneNumber VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
   emailAddress VARCHAR(320) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
   password VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
   ipAddress VARCHAR(15) NOT NULL,
   uniqueCode VARCHAR(36) NOT NULL DEFAULT (UUID()),
   created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
   
   PRIMARY KEY (id)
) ENGINE=InnoDB;


/* docker/mariadb/docker-entrypoint-initdb.d/3.createRegisteredWorkshops.sql */

USE fullstackExercise;

CREATE TABLE registeredWorkshops (
   id INT NOT NULL AUTO_INCREMENT,
   registrationId INT NOT NULL,
   workshopId INT NOT NULL,
   PRIMARY KEY (id),
   FOREIGN KEY (registrationId) REFERENCES registrations(id),
   FOREIGN KEY (workshopId) REFERENCES workshops(id)
);

And lastly I need to copy that directory into my MariaDB container when I build it (docker/mariadb/Dockerfile):

FROM mariadb:latest
COPY ./docker-entrypoint-initdb.d/ /docker-entrypoint-initdb.d/
CMD ["mysqld"]
EXPOSE 3306

After I rebuild my containers, I run the tests and we're all good:

    Schema tests
       it has a workshops table with the required schema
       it has a registrations table with the required schema
       it has a registeredWorkshops table with the required schema

Finally I need some seed data in the workshops table. First I'm going to write my test cases for this:

describe('Data tests', function () {
    it('has the required baseline workshop data', function () {
        $expectedWorkshops = [
            ['id' => '2', 'name' => 'TEST_WORKSHOP 1'],
            ['id' => '3', 'name' => 'TEST_WORKSHOP 2'],
            ['id' => '5', 'name' => 'TEST_WORKSHOP 3'],
            ['id' => '7', 'name' => 'TEST_WORKSHOP 4']
        ];

        $statement = $this->connection->query("SELECT id, name FROM workshops ORDER BY id");
        $statement->execute();
        $workshops = $statement->fetchAll(PDO::FETCH_ASSOC);

        expect($workshops)->toEqual($expectedWorkshops);
    });

    it('correctly auto-increments the ID on new insertions', function () {
        $$expectedWorkshopName = 'TEST_WORKSHOP 5';

        $this->connection->beginTransaction();

        $statement = $this->connection->prepare(query: "INSERT INTO workshops (name) VALUES (:name)");
        $statement->execute(['name' => $expectedWorkshopName]);
        $id = $this->connection->lastInsertId();

        $statement = $this->connection->prepare("SELECT id, name FROM workshops WHERE id = :id");
        $statement->execute(['id' => $id]);
        $workshops = $statement->fetchAll(PDO::FETCH_ASSOC);

        expect($workshops)->toHaveLength(1)
        expect($workshops[0])->toContainKey('name')
        expect($workshops[0]['name'])->toBe($expectedWorkshopName)

        $this->connection->rollback();
    });
});

Those are reasonably self-explanatory. I need to insert four baseline workshop records, and in the first case I just SELECT the data and check it's what I expect it to be. The second case only occurred to me when I went to look at the changes to the SQL I needed to make in 1.createAndPopulateWorkshops.sql to insert that data. I needed to take the auto-increment off the table-create statement so I could insert records with the specific IDs I need, then after doing that I altered the table to have the ID auto-increment. I figured I had better test that that worked too. So I insert a new record (just the name, letting the DB handle the ID), get the ID back and use that to get the whole record back for that ID, verifying it's also got the correct name. I do no want that data cluttering my DB so I put the whole thing in a transaction so that it rolls-back when I'm done or if there's an error.

Running those, only the first one errors:

> vendor/bin/kahlan '--spec=spec/integration/baselineDatabase.spec.php'

F.                                                                  2 / 2 (100%)


Tests for registration database
  Data tests
    ✖ it has the required baseline workshop data
      expect->toEqual() failed in `.spec/integration/baselineDatabase.spec.php` line 102

      It expect actual to be equal to expected (==).

      actual:
        (array) []
      expected:
        (array) [
            0 => [
                "id" => "2",
                "name" => "TEST_WORKSHOP 1"
            ],
            […etc…]


Expectations   : 4 Executed
Specifications : 0 Pending, 0 Excluded, 0 Skipped

Passed 1 of 2 FAIL (FAILURE: 1) in 0.025 seconds (using 4MB)

Focus Mode Detected in the following files:
fdescribe - spec/integration/baselineDatabase.spec.php, line 89 to 124
exit(-1)

Script vendor/bin/kahlan handling the spec event returned with error code 255
root@e7d6aa6cf839:/usr/share/fullstackExercise#

This puzzled me at first, but then it occurred to me that the auto-increment test case really ought to have been added when I did the first round of tests before creating the table, because that is when that functionality was added. All I'm doing with the changes I'm about to make is insert some data-insertion code into the script. It's already doing the auto-increment on the ID, and all I'm doing with that is changing when it's being applied: from the table-creation statement to its own statement after the inserts are done. See below for what I mean.

And now I'll now update that docker/mariadb/docker-entrypoint-initdb.d/1.createAndPopulateWorkshops.sql to also insert the baseline data:

USE fullstackExercise;

CREATE TABLE workshops (
    id INT NOT NULL /* AUTO_INCREMENT <- this has been removed from here */,
    name VARCHAR(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
    
    PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO workshops (id, name)
VALUES
    (2, 'TEST_WORKSHOP 1'),
    (3, 'TEST_WORKSHOP 2'),
    (5, 'TEST_WORKSHOP 3'),
    (7, 'TEST_WORKSHOP 4')
;

ALTER TABLE workshops MODIFY COLUMN id INT auto_increment;

(Note I've moved the auto-increment on the ID field when I create the table now, so the seed data can have specific IDs. Once I do the insert, then I make the column auto-increment).

Once I rebuild my containers, all the tests now pass:

> vendor/bin/kahlan '--spec=spec/integration/baselineDatabase.spec.php' '--reporter=verbose'


  Tests for registration database
    Connectivity tests
       it can connect to the database with environment-based credentials
    Schema tests
       it has a workshops table with the required schema
       it has a registrations table with the required schema
       it has a registeredWorkshops table with the required schema
    Data tests
       it has the required baseline workshop data
       it correctly auto-increments the ID on new insertions



Expectations   : 35 Executed
Specifications : 0 Pending, 0 Excluded, 0 Skipped

Passed 6 of 6 PASS in 0.033 seconds (using 5MB)

root@44850303b17a:/usr/share/fullstackExercise#

And I think that's about it. I'm not doing anything with the data yet, but that'll start to be fleshed out in the next article (or maybe the following one. Not sure). This was just an exercise in doing some stuff with Docker and MariaDB, and thinking about the merits of TDDing exercises like this. I think it was worth it, especially during this early phase of working with these containers as I'm still reconfiguring stuff a lot, so it's good to know things don't get messed up when I'm monkeying with stuff.

Righto.

--
Adam