Friday 20 January 2023

PHP: PrimaryReadReplicaConnection - configuration / usage example

G'day:

I've been dusting off my out-of-date PHP skills (such as they are), and I had a right arse of a time finding any documentation for PrimaryReadReplicaConnections via Google, so I'm adding something here for the next time I need to find it. Because knowing me I won't remember.

Back when I was last doing PHP for a living, DBAL had a MasterSlaveConnection class. This was used to create a single composite connnection object that comprising one primary connection - likely to a read/write DB user - and an array of secondary connections - likely to read-replicas of the same. The way it works is that for read-only operations (SELECTs basically). The reasoning being that this would ease up traffic on the primary DB for queries that don't need to be done on it, using the read replicas instead. But for writes, it uses the primary. It's a good idea.

I'm returning to PHP and I need to mentor my team, so I've been writing some code examples to show them how things work in PHP. I googled "MasterSlaveConnection" to RTFM, and found this issue on Github: "RFC: Rename MasterSlaveConnection". OK so MasterSlaveConnection connection has been renamed to PrimaryReadReplicaConnection. Makes sense. So I googled that instead. Frickin nothing. Nothing useful anyhow. Some stuff on setting up a connection in a Doctrine YAML file, and some other references, but I can't find actual docs with an actual useful example of "do it like this".

Before I start, here's a vanilla DBAL connection example for context:

private function getDbalConnection() : Connection
{
    $parameters = $this->getConnectionParameters();
    return DriverManager::getConnection([
        'dbname' => $parameters->database,
        'user' => $parameters->username,
        'password' => $parameters->password,
        'host' => $parameters->host,
        'port' => $parameters->port,
        'driver' => 'pdo_mysql'
    ]);
}

And here's the equivalent for a PrimaryReadReplicaConnection:

private function getPrimaryReadReplicaConnection() : PrimaryReadReplicaConnection
{
    $parameters = $this->getConnectionParameters();
    return DriverManager::getConnection([
        'wrapperClass' => PrimaryReadReplicaConnection::class,
        'driver' => 'pdo_mysql',
        'primary' => [
            'host' => $parameters->host,
            'port' => $parameters->port,
            'user' => $parameters->username,
            'password' => $parameters->password,
            'dbname' => $parameters->database
        ],
        'replica' => [[
            'host' => $parameters->readOnlyHost,
            'port' => $parameters->port,
            'user' => $parameters->username,
            'password' => $parameters->password,
            'dbname' => $parameters->database
        ]]
    ]);
}

It's pretty basic really. As with a normal DBAL connection one still uses the DriverManager::getConnection factory method to create a connection.

The difference is with the PrimaryReadReplicaConnection case is one needs to pass the deets of the primary and all the replicas instead of just the one set of connection config:

  • wrapperClass: the class name of the PrimaryReadReplicaConnection class.
  • driver: which DB I'm using.
  • primary; an associative array of the usual connection params for the connection to the primary (read/write) DB connection.
  • replica; an indexed array of associative arrays of connection params to the read replicas. One can pass as many of these as one likes, and I believe the connection wrapper controls how to pick one to sue for a given query. I'm only using one here because I only have one read replica.

I think that replica key should be plural, right? Oh well: can't always get it right I guess.

Here are some tests describing its operations / behaviour:

/** @testdox it will use a read-only connection if it can */
public function testPrimaryReadReplicaConnectionReadOnlyConnection()
{
    $connection = $this->getPrimaryReadReplicaConnection();

    $result = $connection->executeQuery("SELECT @@VERSION");

    $this->assertFalse($connection->isConnectedToPrimary());
}

false because it's not connected to the primary: it's connected to a replica because it's only doing reading.

/** @testdox a connection will start on a replica then change to the primary and stay there after a write operation */
public function testPrimaryReadReplicaConnectionSwitchToPrimary()
{
    $testName = "TEST_NAME_" . uniqid();

    $connection = $this->getPrimaryReadReplicaConnection();

    $this->assertFalse($connection->isConnectedToPrimary(), "Should start on a replica");
    $connection->beginTransaction();
    try {
        $this->assertTrue($connection->isConnectedToPrimary(), "Should be on the primary in a transaction");
        $connection->executeStatement(
            "INSERT INTO features (name,value) VALUES (:name, :value)", [
            "name" => $testName,
            "value" => "TEST_VALUE",
        ]);

        $newRow = $connection->executeQuery("SELECT value FROM features WHERE name = ?", [$testName]);
        $this->assertEquals("TEST_VALUE", $newRow->fetchOne());
        $this->assertTrue($connection->isConnectedToPrimary(), "Should still be on the primary after a read following a write");
    } finally {
        $connection->rollBack();
    }
    $this->assertTrue($connection->isConnectedToPrimary(), "Should still be on the primary after a rollback");

    $connection = $this->getPrimaryReadReplicaConnection();
    $this->assertFalse($connection->isConnectedToPrimary(), "A new connection should start on a replica");
}

This second one shows something important. A given connection might start on a read replica, but once there's been a first write operation (in this case the transaction singles some writing about to happen, intrinsically) - so the connection switches to the primary writeable connection - it stays on that primary connection. This is for the sake of data reliability: if one writes some data to primary and then tries to read it back later (for whatever reason), there'd be no guarantee the read replica(s) will have received the updates themselves yet, so it's safer to stay on the primary. Good thinking.

And a new connection within the same code will start afresh using a replica if it can.

OK, that's it. I just wanted this noted down some place.

Righto.

--
Adam