Showing posts with label GuzzleHttp. Show all posts
Showing posts with label GuzzleHttp. Show all posts

Saturday 21 January 2023

PHP: looking at ways of making HTTP requests

G'day:

I'm reacquainting myself with PHP, and part of this process is chucking some tests together to demonstrate to myself how bits and pieces of it works. This has the added bonus of being able to put the code in front of my team, to help provide learning info for them. This article is pretty much just showing sample code, and it's for the reader to compare and contrast. There's likely not gonna be too much exposition from me once we get to the code. I'm sure I can pad things out by a few hundred words before we get there though. I am me after all.

This time, I've decided to revisit how to make HTTP requests.

I've got four candidate solutions to look at:

I am aware of PHP's curl extension always being available, but its API is a bit of a mess (it's been part of PHP since the bad old days).

I've also used Guzzle in the past, with mixed success. It started out being simple and handy, and I liked it. But then between a major version bump (I can't remember which versions this occurred between), the old API was basically dumped in favour of a new, non-backwards-compatible, and largely (and pointlessly IMO) overly complex promise-based approach. To provide asynchronicity in HTTP requests. Which was something I never needed and seemed like an odd addition to an HTTP library. I suspect the author had started to look at Node.jS with all its async HTTP shiz, and went "I know… I'll ruin something perfectly useful by adding this crap into it as well". Ugh. However I note Guzzle is still around, so - armed with an open mind - I'll look at that too.

During my googling I have also spied that Symfony has an HTTP client too. It probably always did, but in my last gig we went the Guzzle route, so I had not looked further afield.

Also during my googling (and reading the Guzzle docs), I discovered PHP's own streams extension can be used to make HTTP requests. That sounds interesting, so have decided to give that a go too.

My approach is to create a test class, and add a test for each of those four platforms, to do each of a GET and a POST. They are not complicated tests, it's just a case of getting the thing to do something simple that I can expect results from.

I will also concede that I used Copilot to do probably 80% of the work here, with my polishing that last 20%.

Installation

  • curl needs ext-curl installed. It ships with the Docker image, so I didn't have to do anything for this.
  • Installing Guzzle is a matter of adding it as a dependency in composer.json: "guzzlehttp/guzzle": "^7.5.0" at time of writing.
  • Similarly with Symfony's HTTP client: "symfony/http-client": "^6.2.2"
  • And PHP's streams lib is native to PHP. No installation necessary.

Curl

/** @testdox it can make a GET request */
public function testGet()
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL => 'https://api.github.com/users/adamcameron',
        CURLOPT_USERAGENT => $this->getUserAgentForCurl(),
        CURLOPT_RETURNTRANSFER => 1
    ]);
    $response = curl_exec($ch);
    curl_close($ch);

    $this->assertEquals(200, curl_getinfo($ch, CURLINFO_HTTP_CODE));
    $this->assertJson($response);
    $this->assertGitInfoIsCorrect($response);
}

It also uses these two helper methods:

private function getUserAgentForCurl(): string
{
    return sprintf("curl/%s", curl_version()['version']);
}
protected function assertGitInfoIsCorrect(string $response): void
{
    $myGitMetadata = json_decode($response);
    $this->assertEquals('adamcameron', $myGitMetadata->login);
    $this->assertEquals('Adam Cameron', $myGitMetadata->name);
}

(A bunch of the other tests below also use that one above).

The GET test in each case will be to get my own GitHub profile and to superficially check it's been fetched properly. BTW I needed that getUserAgentForCurl carry-on because curl by itself does notsent a user agent, and Github says "nuh-uh" if it doesn't get one. So I've just contrived a user agent that is the one that the underlying curl implementation would use (eg: ike if one was doing a curl from bash).

The POST test will post to https://httpbin.org/post. I only discovered httpbin.org when I was doing this exercise, and I wanted a simple (public) way of echoing back a post request. Handy.

/** @testdox it can make a POST request */
public function testPost()
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL => 'https://httpbin.org/post',
        CURLOPT_USERAGENT => $this->getUserAgentForCurl(),
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_POST => 1,
        CURLOPT_POSTFIELDS => ['foo' => 'bar']
    ]);
    $response = curl_exec($ch);
    curl_close($ch);

    $this->assertEquals(200, curl_getinfo($ch, CURLINFO_HTTP_CODE));
    $this->assertJson($response);
    $httpBinResponse = json_decode($response);

    $this->assertEquals('bar', $httpBinResponse->form->foo);
}

An example of what https://httpbin.org/post returns is:

{
    "args":{
        
    },
    "data":"",
    "files":{
        
    },
    "form":{
        "foo":"bar"
    },
    "headers":{
        "Accept":"*/*",
        "Content-Length":"141",
        "Content-Type":"multipart/form-data; boundary=------------------------b0133bb008e6829b",
        "Host":"httpbin.org",
        "User-Agent":"curl/7.74.0",
        "X-Amzn-Trace-Id":"Root=1-63cc3251-4a3f92c809ad00f75261466a"
    },
    "json":null,
    "origin":"82.8.81.31",
    "url":"https://httpbin.org/post"
}

Guzzle

First up: I'm really pleased how compact and straight-forward Guzzle's code is for these exercises. And also that one is not forced to write async code for a non-async situation.

/** it can make a GET request */
public function testGet()
{
    $client = new Client();
    $response = $client->request('GET', 'https://api.github.com/users/adamcameron');
    $this->assertEquals(200, $response->getStatusCode());
    $this->assertJson($response->getBody());
    $this->assertGitInfoIsCorrect($response->getBody());
}
/** @testdox it can make a POST request */
public function testPost()
{
    $client = new Client();
    $response = $client->request(
        'POST',
        'https://httpbin.org/post',
        ['form_params' => ['foo' => 'bar']]
    );
    $this->assertEquals(200, $response->getStatusCode());
    $this->assertJson($response->getBody());
    $httpBinResponse = json_decode($response->getBody());
    $this->assertEquals('bar', $httpBinResponse->form->foo);
}

I also decided to revisit the async side of things:

/** it can make an asynchronous GET request */
public function testAsyncGet()
{
    $client = new Client();
    $promise = $client->requestAsync('GET', 'https://api.github.com/users/adamcameron');
    $response = $promise->wait();
    $this->assertEquals(200, $response->getStatusCode());
    $this->assertJson($response->getBody());
    $this->assertGitInfoIsCorrect($response->getBody());
}

Simple. It seems the current implementation is taking the "async-await" approach with these things like JS has these days.

That was not much of a test though. This time I am gonna make a bunch of requests (which are artificially slow) and make sure they do seem to run asynchronously. I've slung this in my web directory:

// html/test-fixtures/slow.php
$timeToWait = $_GET['timeToWait'] ?? 0;
sleep($timeToWait);
echo "waited $timeToWait seconds";

And calling that with varying delays:

/** it can make multiple asynchronous GET requests */
public function testMultipleAsyncGet()
{
    $client = new Client();
    $requestsToMakeConcurrently = [
        $client->getAsync('http://nginx/test-fixtures/slow.php?timeToWait=1'),
        $client->getAsync('http://nginx/test-fixtures/slow.php?timeToWait=2'),
        $client->getAsync('http://nginx/test-fixtures/slow.php?timeToWait=3')
    ];
    $startTime = microtime(true);
    $responses = Promise\Utils::unwrap($requestsToMakeConcurrently);
    $endTime = microtime(true);

    $totalTime = $endTime - $startTime;
    $this->assertGreaterThan(3, $totalTime);
    $this->assertLessThan(4, $totalTime);

    array_walk($responses, function ($response, $i) {
        $this->assertEquals(200, $response->getStatusCode());
        $this->assertEquals(sprintf("waited %d seconds", $i+1), $response->getBody());
    });
}

The assertions there are a bit woolly. I figured it should def take longer than 3sec cos at least one of the requests will take 3sec. Plus there'll be a wee bit of overhead. That overhead ought not be more than a second, so if the whole lot finishes in less than 4sec, it's a pretty good indicator that all three requests were being made simultaneously. It occurs to me now I could perhaps look @ the Nginx activity logs for when the requests come in. Please hold…

172.31.0.4 - - [21/Jan/2023:18:57:49 +0000] "GET /test-fixtures/slow.php?timeToWait=1 HTTP/1.1" 200 27 "-" "GuzzleHttp/7"
172.31.0.4 - - [21/Jan/2023:18:57:50 +0000] "GET /test-fixtures/slow.php?timeToWait=2 HTTP/1.1" 200 27 "-" "GuzzleHttp/7"
172.31.0.4 - - [21/Jan/2023:18:57:51 +0000] "GET /test-fixtures/slow.php?timeToWait=3 HTTP/1.1" 200 27 "-" "GuzzleHttp/7"

Now Nginx is logging when it responds to the request, not when it receives it, so what we can infer from this is that the requests all arrived at 18:57:48, and the 1sec request finished after 1sec at 18:57:49; the 2sec request finished after 2sec @ 18:57:50, and similarly the third one, 3sec, finished after 3 seconds at 18:57:51.

It's easier to see if I make the requests hang on for different periods of time. Here's an example where they take 1sec, 12sec and 23sec respectively:

172.31.0.4 - - [21/Jan/2023:18:59:07 +0000] "GET /test-fixtures/slow.php?timeToWait=1 HTTP/1.1" 200 27 "-" "GuzzleHttp/7"
172.31.0.4 - - [21/Jan/2023:18:59:18 +0000] "GET /test-fixtures/slow.php?timeToWait=12 HTTP/1.1" 200 28 "-" "GuzzleHttp/7"
172.31.0.4 - - [21/Jan/2023:18:59:29 +0000] "GET /test-fixtures/slow.php?timeToWait=23 HTTP/1.1" 200 28 "-" "GuzzleHttp/7"

We can infer they all arrived at 18:59:06. 1sec later at 18:59:07 the first request completed; 12sec later the second one completed at 18:59:18 (18:59:18 - 18:59:06 is 12sec); and lastly the third request - which will take 23sec to run - indeed finishes at 18:59:29 - 18:59:06 = 23sec later.

Excellent. Working as expected.

For completeness I also tested an async POST request:

/** @testdox it can make an asynchronous POST request */
public function testAsyncPost()
{
    $client = new Client();
    $promise = $client->requestAsync(
        'POST',
        'https://httpbin.org/post',
        ['form_params' => ['foo' => 'bar']]
    );
    $response = $promise->wait();
    $this->assertEquals(200, $response->getStatusCode());
    $this->assertJson($response->getBody());
    $httpBinResponse = json_decode($response->getBody());
    $this->assertEquals('bar', $httpBinResponse->form->foo);
}

No surprises.


Symfony

/** @testdox it can make a GET request */
public function testGet()
{
    $client = HttpClient::create();
    $response = $client->request('GET', 'https://api.github.com/users/adamcameron');
    $this->assertEquals(200, $response->getStatusCode());
    $this->assertJson($response->getContent());
    $this->assertGitInfoIsCorrect($response->getContent());
}

This is identical to the Guzzle example except Symfony uses a factory method to create the client object compared Guzzle just using new; and Guzzle uses getBody instead of Symfony's getContent.

/** @testdox it can make a POST request */
public function testPost()
{
    $client = HttpClient::create();
    $response = $client->request(
        'POST',
        'https://httpbin.org/post',
        ['body' => ['foo' => 'bar']]
    );
    $this->assertEquals(200, $response->getStatusCode());
    $this->assertJson($response->getContent());
    $httpBinResponse = json_decode($response->getContent());
    $this->assertEquals('bar', $httpBinResponse->form->foo);
}

It's just occurred to me that knowing Symfony, it can likely do async request collections too. And after some googling: sure enough I've found a way ("Symfony › HTTP Client › Concurrent Requests"):

/** it can make multiple asynchronous GET requests */
public function testMultipleAsyncGet()
{
    $client = HttpClient::create();
    $requestsToMakeConcurrently = [
        $client->request('GET', 'http://nginx/test-fixtures/slow.php?timeToWait=1'),
        $client->request('GET', 'http://nginx/test-fixtures/slow.php?timeToWait=2'),
        $client->request('GET', 'http://nginx/test-fixtures/slow.php?timeToWait=3')
    ];
    $stream = $client->stream($requestsToMakeConcurrently);

    $i = 1;
    $startTime = microtime(true);
    foreach ($stream as $response => $chunk) {
        if ($chunk->isLast()) {
            $this->assertEquals(200, $response->getStatusCode());
            $this->assertEquals("waited $i seconds", $response->getContent());
            $i++;
        }
    }
    $endTime = microtime(true);
    $totalTime = $endTime - $startTime;
    $this->assertGreaterThan(3, $totalTime);
    $this->assertLessThan(4, $totalTime);
}

This is analogous to the Guzzle version. It's implementation is not as nice though IMO.


PHP Streams

/** @testdox it can make a GET request */
public function testGet()
{
    $context = stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => ['User-Agent: ' . $this->getUserAgentForCurl()]
        ]
    ]);
    $response = file_get_contents('https://api.github.com/users/adamcameron', false, $context);
    $this->assertJson($response);
    $this->assertGitInfoIsCorrect($response);
}
/** @testdox it can make a POST request */
public function testPost()
{
    $context = stream_context_create([
        'http' => [
            'method' => 'POST',
            'header' => [
                'User-Agent: ' . $this->getUserAgentForCurl(),
                'Content-Type: application/x-www-form-urlencoded'
            ],
            'content' => http_build_query(['foo' => 'bar'])
        ]
    ]);
    $response = file_get_contents('https://httpbin.org/post', false, $context);
    $this->assertJson($response);
    $httpBinResponse = json_decode($response);
    $this->assertEquals('bar', $httpBinResponse->form->foo);
}

OK. It's poss just me being pedantic, but I get a bit itchy looking at file_get_contents on a URL. I mean I know an HTTP request is fetching a file - so semantically that's fine - but it still seems odd.


Conclusion

For these superficial test cases, I prefer Guzzle. Doubtless there more one can do with Symfony's HTTP client, because there's always more one can do with Symfony's stuff; but the same will apply with Guzzle too no doubt. I did not know about PHP's streams before, and whilst this might not be a good use of it, there'll likely be other situations to use them.

I'm mostly pleased that Guzzle seems easy to use again, and for both sync and async stuff. Cool.

All the code shown in here is @ /test/integration/http on Github.

Righto.

--
Adam

Monday 22 February 2021

Troubleshooting an issue with Kahlan

G'day:

I'm ramping-up to do some more development with this fullstackExercise codebase I've been working on (see "Creating a web site with Vue.js, Nginx, Symfony on PHP8 & MariaDB running in Docker containers", parts 1-12), and as part of the yak-shaving phase of things, I decided to port my PHP tests from PHPUnit to Kahlan. The reason for this is twofold. Firstly, I much prefer the describe / it syntax to writing and running tests over xUnit; and secondly - other than perpetually finding new ways to be frustrated by how PHPUnit handles/documents things - I am not learning a great deal about PHPUnit any more. So Kahlan fits in well with this learning exercise.

Porting the tests is easy, largely cos my tests are currently minimal in number and in complexity. But this PHPUnit class (test/functional/ExampleComTest.php):

namespace adamCameron\kahlanIssue\test\functional;

use GuzzleHttp\Client;
use PHPUnit\Framework\TestCase;
use Symfony\Component\HttpFoundation\Response;

class ExampleComTest extends TestCase
{
    /** @coversNothing */
    public function testExampleDotComReturnsExpectedContent()
    {
        $expectedContent = "Example Domain";


        $client = new Client([
            'base_uri' => 'http://example.com/'
        ]);

        $response = $client->get('index.html');

        $this->assertEquals(Response::HTTP_OK, $response->getStatusCode());

        $html = $response->getBody();
        $document = new \DOMDocument();
        $document->loadHTML($html);

        $xpathDocument = new \DOMXPath($document);

        $hasTitle = $xpathDocument->query('/html/head/title[text() = "' . $expectedContent . '"]');
        $this->assertCount(1, $hasTitle);

        $hasHeading = $xpathDocument->query('/html/body/div/h1[text() = "' . $expectedContent . '"]');
        $this->assertCount(1, $hasHeading);
    }
}

Becomes this Kahlan spec(spec/functional/example.com.spec.php):

namespace adamCameron\fullStackExercise\spec\functional;

use GuzzleHttp\Client;
use Symfony\Component\HttpFoundation\Response;

describe('Tests that example.com can be curled', function () {

    beforeAll(function () {
        $client = new Client([
            'base_uri' => 'http://example.com/'
        ]);

        $this->response = $client->get('index.html');

        $html = $this->response->getBody();
        $document = new \DOMDocument();
        $document->loadHTML($html);
        $this->xpathDocument = new \DOMXPath($document);
    });

    it("should have expected status code", function () {
        expect($this->response->getStatusCode())->toBe(Response::HTTP_OK);
    });

    it("should have expected title", function () {
        $hasTitle = $this->xpathDocument->query('/html/head/title[text() = "Example Domain"]');
        expect($hasTitle)->toHaveLength(1);
    });

    it("should have expected heading", function () {
        $hasHeading = $this->xpathDocument->query('/html/body/div/h1[text() = "Example Domain"]');
        expect($hasHeading)->toHaveLength(1);
    });
});

The problems for me started when I came to run that test in Kahlan. I got this:

            _     _
  /\ /\__ _| |__ | | __ _ _ __
 / //_/ _` | '_ \| |/ _` | '_ \
/ __ \ (_| | | | | | (_| | | | |
\/  \/\__,_|_| |_|_|\__,_|_| |_|

The PHP Test Framework for Freedom, Truth and Justice.

src directory  :
spec directory : /mnt/c/src/kahlanIssue/spec

                                                                    3 / 3 (100%)


Tests that example.com can be curled
  an uncaught exception has been thrown in `vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php` line 158

  message:`ParseError` Code(0) with message "Unclosed '{' on line 142 does not match ')'"

    [NA] - vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php, line  to 158
    Kahlan\Jit\ClassLoader::loadFile() - vendor/kahlan/kahlan/src/Jit/ClassLoader.php, line 759
    Kahlan\Jit\ClassLoader::loadClass() - vendor/guzzlehttp/guzzle/src/Utils.php, line 95
    [… etc …]


Expectations   : 0 Executed
Specifications : 0 Pending, 0 Excluded, 0 Skipped

Passed 0 of 1 FAIL (EXCEPTION: 1) in 0.015 seconds (using 7MB)

Eek, OK that's a compilation error, not a runtime exception, which is odd. I checked the code of CurlMultiHandler.php to see if anything was broken in there, and unsurprisingly it wasn't. I decided to chuck a try / catch around the code in question, and see if the actual issue was something other than being reported. I found it odd that Kahlan was reporting it as an Exception, when do me a code-parsing issue is an out-and-out Error. Two different things. So I wondered if Kahlan was doing something odd. I ran the code again and the error spewed out again. What the hell? Then I looked more closely. The error had changed slightly. Now it says this:

an uncaught exception has been thrown in `vendor/guzzlehttp/guzzle/src/Handler/StreamHandler.php` line 201

message:`ParseError` Code(0) with message "Unclosed '{' on line 192 does not match ')'"

Note before it was complaining about CurlMultiHandler.php. Now it's StreamHandler.php. OK. I'll play yer silly game. I stuck a try / catch around that code, and re-ran. Now it errors with this:

an uncaught exception has been thrown in `vendor/guzzlehttp/promises/src/Promise.php` line 151

message:`ParseError` Code(0) with message "Unclosed '{' on line 148 does not match ')'"

Grrr. OK, one last time with the try / catch:

            _     _
  /\ /\__ _| |__ | | __ _ _ __
 / //_/ _` | '_ \| |/ _` | '_ \
/ __ \ (_| | | | | | (_| | | | |
\/  \/\__,_|_| |_|_|\__,_|_| |_|

The PHP Test Framework for Freedom, Truth and Justice.

src directory  :
spec directory : /mnt/c/src/kahlanIssue/spec

...                                                                 3 / 3 (100%)



Expectations   : 3 Executed
Specifications : 0 Pending, 0 Excluded, 0 Skipped

Passed 3 of 3 PASS in 0.565 seconds (using 7MB)

Um. OK. Why's it working? Next I backed-out all my debug code and ran it again. And the tests worked still. Next I blew-away the entire vendor directory and re-did composer install, and the tests continued to work.

It's worth noting here that I know those are all Guzzle files, but If I run the equivalent code in PHPUnit, there is never any issue, hence me doubting it's anything to do with Guzzle specifically, it's how Kahlan is calling Guzzle. Probably with vendor/kahlan/kahlan/src/Jit/ClassLoader.php, which is the preceding file in the callstack, each time. I have to admit I did not look into that file to see what it is doing. I'll crack on with that shortly.

At this point I was thinking that it might be something to do with me running all this in a Docker container, and decided I should verify this in the native file system. At the same time I decided to create a simple repro case: my initial experiments at this stage had not been with the example spec above, they were with one of the fullstack-exercise test specs, and those required the rest of the Docker containers to be up and running, so too complicated to be portable (see "Short, Self Contained, Correct (Compilable), Example"). Accordingly I created the spec above, in a stand-alone application (as per the kahlan-issue repo on Github).

I sighed and installed PHP 8 and Composer on my PC - something I was hoping to avoid needing to do every again, now that I'm using Docker - and ran the test again. Again PHPUnit ran fine, but Kahlan errored-out. The act of putting debug code into the files erroring was enough to make that file now parse and compile, and the error moved on to the next file. Always the same three files. After touching the files once, the problem stayed away. Blowing away vendor and re-doing composer install did not make the erroring situation recur. However if I stuck the app in a different base directory - eg: I had it set up in C:\src\kahlanIssue initially, but moved it to C:\temp\kahlanIssue - then the error recurred again, until I tweakd the files. I noted that putting just whitespace into the file did not "fix" the issue; it needed to be something that PHP would need to parse. Even if I restarted the PC, once I had "touched" those three files once, even if I reverted the touches, the problem did not recur. That was weird.

I was also able to replicate the issue in all of:

  • on Ubuntu within a Docker container (host machine is a Windows PC);
  • on Ubuntu via WSL on that same Windows PC;
  • natively on that Windows PC;
  • natively on Windows on a different PC.

It was only when I downgraded from PHP 8.0.2 to 7.4.15 that I could no longer replicate the issue. So I guess something odd Kahlan is doing to load the PHP files worked fine in previous versions of PHP, but PHP 8 doesn't like it. Or something like that.

I think now I have enough to go on to raise an issue with the Kahlan bods, which I shall do now… done: 370.

All that kinda killed my Sunday afternoon (and now Monday morning writing this article). Hopefully I can get some actual work done this evening though…

Status update on this

The bod from the Kahlan project (specifically Simon Jaillet) jumped on this straight away. In the mean time I looked at what was going on in that vendor/kahlan/kahlan/src/Jit/ClassLoader.php file I mentioned earlier, which was the last touch point in Kahlan before the error. Part of that loading process was to take the original files, monkey-patch the hell out of them to make mocking easier, and then load the monkey-patched version of the file instead of the original one. It also cached the monkey-patched version so it only had to do that exercise once. When it was monkey-patching the erroring files on PHP8, it was ending up with this sort of thing. Original code:

// Step through the task queue which may add additional requests.
P\Utils::queue()->run();

And patching it to be this:

// Step through the task queue which may add additional requests.($__KMONKEY__396__?$__KMONKEY__396__:
$__KMONKEY__396::queue())->run();

Note how the first part of the patched code is stuck on the comment line, not the code line. It should be like this:

// Step through the task queue which may add additional requests.
($__KMONKEY__396__?$__KMONKEY__396__:$__KMONKEY__396::queue())->run();

And when I ran Kahlan on PHP 7.4, it was patching it correctly (as per the last block, above).

The bloke from Kahlan located the issue, which is down to a change in how PHP8 tokenises comments (see "Backward Incompatible Changes > Tokenizer":

T_COMMENT tokens will no longer include a trailing newline. The newline will instead be part of a following T_WHITESPACE token.

That explains that.

I raised the issue at 13:27, and it was fixed by 15:16. I've updated my Kahlan version, and it does indeed sort-out the issue. Brilliant work!

Righto.

--
Adam

Friday 25 November 2016

PHP: decorating async GuzzleHttp calls - more methods

G'day:
Here's yet another article about decorating an async GuzzleHttp adapter. So far in the series I've got this lot:
To date my adapter has only had the one method to decorate: the get method. I also need to add more methods like post and put, which have their own considerations when doing decoration. Let's have a look.

Firstly I need some end points to test against. I decided I'd better shuffle-on from those CFML based ones I had, and use PHP instead. So I've rejigged them all as follows:

data.php

class Person{
    public $id;
    public $firstName;
    public $lastName;

    function __construct($id, $firstName, $lastName){
        $this->id = $id;
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }
}

$people = [
    new Person(1, "Donald", "Duck"),
    new Person(2, "Donald", "Cameron"),
    new Person(3, "Donald", "Trump"),
    new Person(4, "Donald", "Wearsyatroosas")
];

getById.php

$id = $_GET["id"];

require __DIR__ . "/data.php";

$filteredPeople = array_filter($people, function($person) use ($id) {
    return $person->id == $id;
});
$person = array_shift($filteredPeople);

$result = (array) $person;

sleep(5);
$result["retrieved"] = (new DateTime())->format("Y-m-d H:i:s");

header("type:application/json");
echo json_encode($result);

create.php

require __DIR__ . "/data.php";

$newPerson = new Person(
    uniqid(),
    $_POST['firstName'],
    $_POST['lastName']
);

$result = (array) $newPerson;

$result["retrieved"] = (new DateTime())->format("Y-m-d H:i:s");

header("type:application/json");
echo json_encode($result);

update.php

$id = $_GET["id"];
parse_str(file_get_contents("php://input"),$put_vars);

require __DIR__ . "/data.php";

$filteredPeople = array_filter($people, function($person) use ($id) {
    return $person->id == $id;
});

if (empty($filteredPeople)){
    http_response_code(400);
    throw new InvalidArgumentException("No Person with ID $id found");
}

$person = array_shift($filteredPeople);

$newPerson = new Person(
    $person->id,
    $put_vars['firstName'],
    $put_vars['lastName']
);

$result = [
    'before' => (array) $person,
    'after' => (array) $newPerson,
    'updated' => (new DateTime())->format("Y-m-d H:i:s")
];

header("type:application/json");
echo json_encode($result);

That lot is all pretty perfunctory. There was a coupla PHP things I didn't know about before hand, namely:
  • how to extract the body for a put request. I was initially expecting them to be in the $_POST array for some reason, but that makes no sense.
  • how to set the HTTP status. I'd never had to do that before with PHP (we use Silex / Symfony, and we just use API methods to do same).

(readers need to remember I'm fairly new to PHP ;-)

The next update is to the Adapter interface:

namespace me\adamcameron\testApp\adapter;

use GuzzleHttp\Promise\Promise;

interface Adapter {
    public function get($url, $parameters) : Promise;
    public function post($url, $body) : Promise;
    public function put($url, $body, $parameters) : Promise;
}

I've added the two new methods there. This now breaks all my Adapter implementations.

First I need to add the actual method implementations to the base adapter: the ones that actually makes the calls:

namespace me\adamcameron\testApp\adapter;

use GuzzleHttp\Client;
use GuzzleHttp\Promise\Promise;

class GuzzleAdapter implements Adapter {

    private $client;

    public function __construct() {
        $this->client = new Client();
    }

    public function get($url, $parameters) : Promise {
        $fullUrl = sprintf("%s?%s", $url, http_build_query($parameters));

        $response = $this->client->requestAsync("get", $fullUrl);

        return $response;
    }

    public function post($url, $body) : Promise {
        $options = ['form_params' => $body];

        $response = $this->client->requestAsync("post", $url, $options);

        return $response;
    }

    public function put($url, $body, $parameters) : Promise {
        $fullUrl = sprintf("%s?%s", $url, http_build_query($parameters));
        $options = ['form_params' => $body];

        $response = $this->client->requestAsync("put", $fullUrl, $options);

        return $response;
    }
}

Nothing surprising there. Note that each of the new methods have different argument requirements. This becomes relevant later.

Now I need to add equivalent methods to the decorators. I'll start with the caching one as that's easy: we have no caching requirements for POST or PUT requests, so the methods just pass through to the underlying adapter:

public function post($url, $body) : Promise {
    return $this->adapter->post($url, $body);
}

public function put($url, $body, $parameters) : Promise {
    return $this->adapter->put($url, $body, $parameters);
}


See: easy.

The logging adapter was more of a challenge. The first pass of the implementation looked like this sort of thing:

public function get($url, $parameters) : Promise {
    $encodedParameters = json_encode($parameters);
    $this->logger->logMessage(sprintf("%s: Requesting for %s", $this->thisFile, $encodedParameters));

    $response = $this->adapter->get($url, $parameters);

    $response->then(function($response) use ($encodedParameters) {
        $body = $response->getBody();
        $this->logger->logMessage(sprintf("%s: Response for %s: %s", $this->thisFile, $encodedParameters, $body));
        $body->rewind();
    });

    return $response;
}

public function post($url, $body) : Promise {
    $logDetails = json_encode($body);
    $this->logger->logMessage(sprintf("%s: Requesting for %s", $this->thisFile, $logDetails));

    $response = $this->adapter->post($url, $body);

    $response->then(function($response) use ($logDetails) {
        $body = $response->getBody();
        $this->logger->logMessage(sprintf("%s: Response for %s: %s", $this->thisFile, $logDetails, $body));
        $body->rewind();
    });

    return $response;
}

public function post($url, $body, $parameters) : Promise {
    $logDetails = json_encode([
        'parameters' => $parameters,
        'body' => $body
    ]);
    $this->logger->logMessage(sprintf("%s: Requesting for %s", $this->thisFile, $logDetails));

    $response = $this->adapter->put($url, $body, $parameters);

    $response->then(function($response) use ($logDetails) {
        $body = $response->getBody();
        $this->logger->logMessage(sprintf("%s: Response for %s: %s", $this->thisFile, $logDetails, $body));
        $body->rewind();
    });

    return $response;
}

That's fine, but look at how much similarity there is in there:
  • We log some stuff
  • We peform a request to a method
  • We have a resolution handler
What we log is different for all three methods, but the handling after that is the same. So I figured there was a refactoring opportunity there. One slight challenge is that which I mentioned above: each method passes different arguments. That'll be tricky to refactor into one method. Well so I thought. Enter PHP's variadic functions. These allow me to specify a method signature that has a concrete argument list, then one last argument that captures any other arguments passed into the function when it's called.

This way I have three congruent parts:
  • capture some stuff to log. The composition of what to log is method-specific, but that's fine
  • call a method
  • with some arguments

I was able to factor-out this commonality into a helper method:

private function performLoggedRequest($method, $logDetails, ...$requestArgs) : Promise {
    $this->logger->logMessage(sprintf("%s: Requesting for %s", $this->thisFile,  $logDetails));

    $response = call_user_func_array([$this->adapter, $method], $requestArgs);

    $response->then(function($response) use ($logDetails) {
        $body = $response->getBody();
        $this->logger->logMessage(sprintf("%s: Response for %s: %s", $this->thisFile, $logDetails, $body));
        $body->rewind();
    });

    return $response;
}

From there, it's just a matter of refactoring all three public methods to call this helper method:

public function get($url, $parameters) : Promise {
    $logDetails = json_encode($parameters);

    return $this->performLoggedRequest(__FUNCTION__, $logDetails, $url, $parameters);
}

public function post($url, $body) : Promise {
    $logDetails = json_encode($body);

    return $this->performLoggedRequest(__FUNCTION__, $logDetails, $url, $body);
}

public function put($url, $body, $parameters) : Promise {
    $logDetails = json_encode([
        'parameters' => $parameters,
        'body' => $body
    ]);

    return $this->performLoggedRequest(__FUNCTION__, $logDetails, $url, $body, $parameters);
}

For each call to performLoggedRequest, all the passed-in arguments after $logDetails end up in a single array, as $requestArgs All because of the ... in the method signature. Cool.

This enables use to call call_user_func_array. with both a dynamic method name, and a dynamic number of arguments in that array.

I then needed to do much the same with the StatusToExceptionAdapter:

public function get($url, $parameters) : Promise {
    return $this->request(__FUNCTION__, $url, $parameters);
}

public function post($url, $body) : Promise {
    return $this->request(__FUNCTION__, $url, $body);
}

public function put($url, $parameters, $body) : Promise {
    return $this->request(__FUNCTION__, $url, $parameters, $body);
}

private function request($method, ...$args) : Promise {
    return call_user_func_array([$this->adapter, $method], $args)
        ->then(function (Response $response) {
            return $this->handleThen($response);
        })
        ->otherwise(function ($exception) {
            $this->handleOtherwise($exception);
        });
}

Easy!

Finally a test. I used the same basic test rig as last time, so I'll just focus on the individual tests here. I also ran the equivalent test of getById as before: no change there so I'll not bother reproducing it.

The test for post was as follows:

function testCreate(PersonRepository $personRepository, LoggingService $loggingService) {
    $personDetails = [
        'firstName' => 'Donald',
        'lastName' => 'McLean'
    ];

    $loggingService->logMessage(sprintf("Test: calling create(%s)", json_encode($personDetails)));

    $response = $personRepository->create($personDetails);
    $body = (string) $response->wait()->getBody();

    $loggingService->logMessage(sprintf("Test: called create(): [%s]", $body));
}

That's all obvious. The log output for this was:

[2016-11-24 20:04:11] testPost.INFO: Test: calling create({"firstName":"Donald","lastName":"McLean"}) [] []
[2016-11-24 20:04:11] testPost.INFO: LoggingAdapter: Requesting for {"firstName":"Donald","lastName":"McLean"} [] []
[2016-11-24 20:04:11] testPost.INFO: LoggingAdapter: Response for {"firstName":"Donald","lastName":"McLean"}: {"id":"583747bb1a6d9","firstName":"Donald","lastName":"McLean","retrieved":"2016-11-24 20:04:11"} [] []
[2016-11-24 20:04:11] testPost.INFO: StatusToExceptionAdapter: non-exception status encountered: 200 [] []
[2016-11-24 20:04:11] testPost.INFO: Test: called create(): [{"id":"583747bb1a6d9","firstName":"Donald","lastName":"McLean","retrieved":"2016-11-24 20:04:11"}] [] []

We can see no caching was called (good), but the approrpiate logging and StatusToException checks were made. And we got a new record ID back after the insert. Excellent.

The test of the update had two things to do: doing an update of an existing record (happy path) then trying to update a non-existent record (which should yield an exception):

function testUpdate(PersonRepository $personRepository, LoggingService $loggingService) {
    testUpdateByPhase("Valid ID", 3, $personRepository, $loggingService);
    testUpdateByPhase("Invalid ID", -1, $personRepository, $loggingService);
}

function testUpdateByPhase($phase, $id, PersonRepository $personRepository, LoggingService $loggingService) {
    $personDetails = [
        'firstName' => 'Donald',
        'lastName' => 'Corleone'
    ];

    $logDetails = [
        'id' => $id,
        'details' => $personDetails
    ];

    $loggingService->logMessage(sprintf("Test (phase: %s): calling update(%s)", $phase, json_encode($logDetails)));

    $response = $personRepository->update($id, $personDetails);
    try {
        $body = (string) $response->wait()->getBody();
        $loggingService->logMessage(sprintf("Test (phase %s): called update(): [%s]", $phase, $body));
    } catch (Exception $e) {
        $previous = $e->getPrevious();
        $loggingService->logMessage(
            sprintf(
                "Test (phase %s): call to update failed: "
                . "Class: %s; "
                . "Code: %s; "
                . "Message: %s"
                . "Previous class: %s"
                . "Previous message: %s"
                , $phase, get_class($e), $e->getCode(), $e->getMessage()
                , get_class($previous), $previous->getMessage()
            )
        );
    } finally {
        $loggingService->logMessage(sprintf("Test (phase %s): complete", $phase));
    }
}

That's more code, but most of it is error handling, and there's nothing noteworthy. The log shows the outcome. First the valid update:

[2016-11-24 20:04:11] testPost.INFO: Test (phase: Valid ID): calling update({"id":3,"details":{"firstName":"Donald","lastName":"Corleone"}}) [] []
[2016-11-24 20:04:11] testPost.INFO: LoggingAdapter: Requesting for {"parameters":{"id":3},"body":{"firstName":"Donald","lastName":"Corleone"}} [] []
[2016-11-24 20:04:11] testPost.INFO: LoggingAdapter: Response for {"parameters":{"id":3},"body":{"firstName":"Donald","lastName":"Corleone"}}: {"before":{"id":3,"firstName":"Donald","lastName":"Trump"},"after":{"id":3,"firstName":"Donald","lastName":"Corleone"},"updated":"2016-11-24 20:04:11"} [] []
[2016-11-24 20:04:11] testPost.INFO: StatusToExceptionAdapter: non-exception status encountered: 200 [] []
[2016-11-24 20:04:11] testPost.INFO: Test (phase Valid ID): called update(): [{"before":{"id":3,"firstName":"Donald","lastName":"Trump"},"after":{"id":3,"firstName":"Donald","lastName":"Corleone"},"updated":"2016-11-24 20:04:11"}] [] []
[2016-11-24 20:04:11] testPost.INFO: Test (phase Valid ID): complete [] []

Success!

And the invalid ID:

[2016-11-24 20:04:11] testPost.INFO: LoggingAdapter: Requesting for {"parameters":{"id":-1},"body":{"firstName":"Donald","lastName":"Corleone"}} [] []
[2016-11-24 20:04:11] testPost.INFO: StatusToExceptionAdapter: remapped status encountered: 400 [] []
[2016-11-24 20:04:11] testPost.INFO: Test (phase Invalid ID): call to update failed: Class: me\adamcameron\testApp\exception\BadRequestException; Code: 400; Message: A response of 400 was received from the service Previous class: GuzzleHttp\Exception\ClientExceptionPrevious message: Client error: `PUT http://php.local:8070/experiment/guzzle/update.php?id=-1` resulted in a `400 Bad Request` response: <br /> <font size='1'><table class='xdebug-error xe-uncaught-exception' dir='ltr' border='1' cellspacing='0' cellpadding (truncated...)  [] []
[2016-11-24 20:04:11] testPost.INFO: Test (phase Invalid ID): complete [] []


That worked as expected too. Although I now note I should have returned a 404 there, not a 400. Oops. But it's doing what I told it to do, anyhow. And pleasingly, the StatusToExceptionAdapter did its job.

And that's about it. I learned a few new things whilst investigating this one, which is good.

That's the end of my foray into decorating GuzzleHttp adapters. I'll get onto something different next.

Righto.

--
Adam

Wednesday 23 November 2016

PHP: decorating async GuzzleHttp calls - multiple decorators

G'day:
Here's yet another article about decorating an async GuzzleHttp adapter. So far in the series I've got this lot:

Over the course of those articles, I've created three decorators:

  • logging - wraps the async call in "before" and "after" logging;
  • exception handling - remapping GuzzleHttp's default exceptions to HTTP-response-status-specific exceptions;
  • caching - caching successful responses so future calls for the same info won't need to hit the underlying  web service.

These all mirror real-world requirements I have.

I've tested them all separately, but the theory with decoration is that one should be able to further-decorate an already-decorated object. I've decided to give this a go.

Firstly I've revised my approach somewhat. Previously I was baking my endpoint into the adapter as that was convenient for how i was demonstrating the code. This isn't a good approach as the adapter should just focus on making the calls and handling them, not having a position on the destination of the calls. So I've removed the endpoint from the basic adapter:

use GuzzleHttp\Client;
use GuzzleHttp\Promise\Promise;

class GuzzleAdapter implements Adapter {

    private $client;

    public function __construct() {
        $this->client = new Client();
    }

    public function get($url, $parameters) : Promise {
        $fullUrl = sprintf("%s?%s", $url, http_build_query($parameters));

        $response = $this->client->requestAsync("get", $fullUrl);

        return $response;
    }
}

Now I pass the URL into the call to get, and append the passed-in parameters to make a complete URL. In real life I'd probably also be optionally passing in some headers, and perhaps some GuzzleHttp options, but that's just noise for the purposes of what I'm doing here.

Also note I've gone all PHP7 for a change, and added in some stricter typing. To that end, I'm also enforcing an interface on these classes now:

use GuzzleHttp\Promise\Promise;

interface Adapter {
    public function get($url, $parameters) : Promise;
}

In dynamic languages like PHP and CFML I generally try to shy away from being to forceful about type checking as it doesn't really add much, and sometimes gets in the way. But I think in this situation it makes some sense. And I also want to start getting into the swing of things of writing PHP 7 code too.

I've also changed my approach to calling the adapter somewhat. Now that the endpoint has been moved out of the adapter, I need it to go somewhere else. I don't want to be passing it in with every get call. We use the Repository Pattern at work to wrap up details of external service implementation, so I've run with that here.

use GuzzleHttp\Promise\Promise;
use me\adamcameron\testApp\adapter\Adapter;

class PersonRepository {

    private $adapter;

    private static $baseUrl = "http://cf2016.local:8516/cfml/misc/guzzleTestEndpoints/";
    private static $getByIdUrl = "getById.cfm";
    private static $getByNameUrl = "getByName.cfm";
    private static $returnStatusCodeUrl = "returnStatusCode.cfm";

    public function __construct(Adapter $adapter)
    {
        $this->adapter = $adapter;
    }

    public function getById($id) : Promise {
        $response = $this->adapter->get(
            self::$baseUrl . self::$getByIdUrl,
            ["id" => $id]
        );

        return $response;
    }

    public function getByName($firstName, $lastName) : Promise {
        $response = $this->adapter->get(
            self::$baseUrl . self::$getByNameUrl,
            [
                "firstName" => $firstName,
                "lastName" => $lastName
            ]
        );

        return $response;
    }

    public function returnStatusCode($statusCode) : Promise {
        $response = $this->adapter->get(
            self::$baseUrl . self::$returnStatusCodeUrl,
            ["statusCode" => $statusCode]
        );

        return $response;
    }
}

Notes on this one:

  • The repo knows about the endpoints it calls. Usually we'd probably have the base URL as a constructor param so we can easily change version of the web service, but that's just noise here, so I've "hard-coded" it.
  • I've added a new method - getByName - that I've not used before. I just wanted to have a second method which had a different method signature from the others I'd previously used. This was mostly for an experiment that I was gonna do but decided against for this article, but left the method in anyhow.
For the sake of completeness, here's the revised endpoint code:

getById.cfm

<cfscript>
cfcontent(type="application/json");

writeLog(file="testApp", text="[ID: #URL.id#] request received");

sleep(5000);
include "./data.cfm";

record = queryExecute(
    "SELECT id, firstName, lastName FROM people WHERE id = :id",
    {id=URL.id},
    {dbtype="query", query="people", maxrows=1}
);

result = structNew("ordered");
result["id"] = record.id;
result["firstName"] = record.firstName;
result["lastName"] = record.lastName;
result["retrieved"] = now().dateTimeFormat("HH:nn:ss.lll");

writeOutput(serializeJson(result));
writeLog(file="testApp", text="[ID: #URL.id#] response returned");
</cfscript>


getByName.cfm

<cfscript>
cfcontent(type="application/json");

writeLog(file="testApp", text="[firstName: #URL.firstName#][lastName: #URL.lastName#] request received");

sleep(2500);
include "./data.cfm";

record = queryExecute(
    "SELECT id, firstName, lastName FROM people WHERE firstName = :firstName AND lastName = :lastName",
    {firstName=URL.firstName, lastName=URL.lastName},
    {dbtype="query", query="people", maxrows=1}
);

result = structNew("ordered");
result["id"] = record.id;
result["firstName"] = record.firstName;
result["lastName"] = record.lastName;
result["retrieved"] = now().dateTimeFormat("HH:nn:ss.lll");

sleep(2500);

writeOutput(serializeJson(result));
writeLog(file="testApp", text="[firstName: #URL.firstName#][lastName: #URL.lastName#] response returned");
</cfscript>

data.cfm

<cfscript>
people = queryNew("id,firstName,lastName", "integer,varchar,varchar", [
    {id=1, firstName="Donald", lastName="Duck"},
    {id=2, firstName="Donald", lastName="Cameron"},
    {id=3, firstName="Donald", lastName="Trump"},
    {id=4, firstName="Donald", lastName="Wearsyatroosas"}
]);
</cfscript>

returnStatusCode.cfm

Remains unchanged:

<cfset sleep(2000)>
<cfheader statusCode="#URL.statusCode#">
<cfoutput>The request resulted in a #URL.statusCode# status code @ #now().dateTimeFormat("HH:nn:ss.lll")#</cfoutput>

The only thing to note really is that these endpoints are kinda data-driven now.

And I have a test of all this. It's long-winded, so I'll go through it section by section, before presenting it again in its entirety.

Firstly - and the only really important bit for the purposes of this article - is setting our test objects:

$loggingService = getLoggingService();
$personRepository = getPersonRepository($loggingService);

// ...


function getLoggingService() : LoggingService {
    $ts = (new DateTime())->format("His");
    $loggingService = new LoggingService("all_$ts");

    return $loggingService;
}

function getPersonRepository(LoggingService $loggingService) : PersonRepository {
    $guzzleAdapter = getGuzzleAdapter();
    $loggingAdapter = getLoggingAdapter($guzzleAdapter, $loggingService);
    $statusToExceptionAdapter = getStatusToExceptionAdapter($loggingAdapter, $loggingService);
    $cachingAdapter = getCachingAdapter($statusToExceptionAdapter, $loggingService);

    $personRepository = new PersonRepository($cachingAdapter);

    return $personRepository;
}

function getGuzzleAdapter() : GuzzleAdapter {
    $guzzleAdapter = new GuzzleAdapter();

    return $guzzleAdapter;
}

function getLoggingAdapter(Adapter $adapter, LoggingService $loggingService) : LoggingAdapter {
    $loggingAdapter = new LoggingAdapter($adapter, $loggingService);

    return $loggingAdapter;
}

function getStatusToExceptionAdapter(Adapter $adapter, LoggingService $loggingService) : StatusToExceptionAdapter {
    $exceptionMap = [
        400 => '\me\adamcameron\testApp\exception\BadRequestException',
        503 => '\me\adamcameron\testApp\exception\ServerException'
    ];

    $statusToExceptionAdapter = new StatusToExceptionAdapter($adapter, $exceptionMap, $loggingService);

    return $statusToExceptionAdapter;
}

function getCachingAdapter(Adapter $adapter, LoggingService $loggingService) : CachingAdapter {
    $cachingService = new CachingService($loggingService);
    $cachingAdapter = new CachingAdapter($adapter, $cachingService);

    return $cachingAdapter;
}

That looks like a chunk of code (well: it is), but it's mostly just compartmentalising some calls. Notes:

  • we need a LoggingService and a PersonRepository for our tests.
  • The PersonRepository needs an Adapter.
  • For our purposes we are decorating the baseline GuzzleAdapter with:
    • a LoggingAdapter. We only want to log calls made to the service, so we put this closest to the baseline Adapter.
    • a StatusToExceptionAdapter. We only want to cache successful calls, and intrinsically previously successful calls will not need the logic in this adapter, so put this next.
    • a CachingAdapter. This goes closest to the calling code so that there's a minimum of complexity if a successful call has been previously made.

The order that the decorations are layers is important, as hopefully I've explained above. The ordering needs to be considered on a case-by-case basis, determined by the purpose of each decorator, and how one decorator might interrupt the execution of other ones; eg: if the StatusToExceptionAdapter throws an exception, the CachingAdapter never does its work. And depending on what those two do, the GuzzleAdapter itself might not be needed, so the LoggingAdapter is skipped too.

From there I test each of the three endpoints again, in much the same way I did before, but noting how the decorators combine where appropriate.

makeRequestsToGetById($personRepository, $loggingService);
logTestSeparator($loggingService);

makeRequestsToGetByName($personRepository, $loggingService);
logTestSeparator($loggingService);

makeRequestsToReturnStatusCode($personRepository, $loggingService);
logTestSeparator($loggingService);

Testing getById I just check the caching still works:

function makeRequestsToGetById(PersonRepository $personRepository, LoggingService $loggingService){
    $id = 4;

    makeRequestToGetById($id, "Uncached", $personRepository, $loggingService);
    pause(5, $loggingService);
    makeRequestToGetById($id, "Cached", $personRepository, $loggingService);
}

function makeRequestToGetById($id, $phase, PersonRepository $personRepository, LoggingService $loggingService){
    $startTime = time();

    $loggingService->logMessage("Test (phase: $phase): calling getById($id)");
    $response = $personRepository->getById($id);
    $loggingService->logMessage("Test (phase: $phase): call to getById complete");
    $body = (string) $response->wait()->getBody();
    $loggingService->logMessage("Test (phase: $phase): Body: $body");

    $duration = time() - $startTime;
    $loggingService->logMessage("Test (phase: $phase): Process duration: {$duration}sec");
}


Sample output:

[2016-11-22 20:39:21] all_203921.INFO: Test: Getting not-yet cached results [] []
[2016-11-22 20:39:21] all_203921.INFO: Test (phase: Uncached): calling getById(4) [] []
[2016-11-22 20:39:21] all_203921.INFO: LoggingAdapter: Requesting for {"id":4} [] []
[2016-11-22 20:39:21] all_203921.INFO: Test (phase: Uncached): call to getById complete [] []
[2016-11-22 20:39:26] all_203921.INFO: LoggingAdapter: Response for {"id":4}: {"id":4,"firstName":"Donald","lastName":"Wearsyatroosas","retrieved":"20:39:26.161"} [] []
[2016-11-22 20:39:26] all_203921.INFO: StatusToExceptionAdapter: non-exception status encountered: 200 [] []
[2016-11-22 20:39:26] all_203921.INFO: CachingService: put called with cd51039fde58b80cf261c3f1a99da3ef [] []
[2016-11-22 20:39:26] all_203921.INFO: Test (phase: Uncached): Body: {"id":4,"firstName":"Donald","lastName":"Wearsyatroosas","retrieved":"20:39:26.161"} [] []
[2016-11-22 20:39:26] all_203921.INFO: Test (phase: Uncached): Process duration: 5sec [] []
[2016-11-22 20:39:26] all_203921.INFO: Test: waiting 5sec [] []
[2016-11-22 20:39:31] all_203921.INFO: Test: waited 5sec [] []
[2016-11-22 20:39:31] all_203921.INFO: Test (phase: Cached): calling getById(4) [] []
no log entry here, because the LoggingAdapter was never hit: the CachingAdapter took care of it
[2016-11-22 20:39:31] all_203921.INFO: Test (phase: Cached): call to getById complete [] []
[2016-11-22 20:39:31] all_203921.INFO: CachingService: get called with cd51039fde58b80cf261c3f1a99da3ef [] []
[2016-11-22 20:39:31] all_203921.INFO: Test (phase: Cached): Body: {"id":4,"firstName":"Donald","lastName":"Wearsyatroosas","retrieved":"20:39:26.161"} [] []
[2016-11-22 20:39:31] all_203921.INFO: Test (phase: Cached): Process duration: 0sec [] []

Also note this demonstrates the LoggingAdapter at work too. Specifically note that there's only a log entry for the request when the web service is actually being called, not when it's coming from cache, further down.

Next I test getByName. It's exactly the same test as the getById one, so I won't bore you with it. It demonstrates the same thing.

Lastly I test returnStatusCode:

function makeRequestsToReturnStatusCode(PersonRepository $personRepository, LoggingService $loggingService) {
    makeRequestToReturnStatusCode200("First", $personRepository, $loggingService);
    pause(2, $loggingService);
    makeRequestToReturnStatusCode200("Second", $personRepository, $loggingService);
    logTestSeparator($loggingService);
    makeRequestToMappedStatusCode("First", $personRepository, $loggingService);
    $loggingService->logMessage("Test: Repeating an erroring call will not touch cache");
    makeRequestToMappedStatusCode("Second", $personRepository, $loggingService);
}

The test here is twofold. First I test that requests that are not intercepted and don't error are logged and cached correctly:

function makeRequestToReturnStatusCode200($phase, PersonRepository $personRepository, LoggingService $loggingService){
    $statusCode = 200;
    $loggingService->logMessage("Test (phase: $phase): calling returnStatusCode($statusCode)");
    $response = $personRepository->returnStatusCode($statusCode);
    $loggingService->logMessage("Test (phase: $phase): call to returnStatusCode complete");
    $body = (string) $response->wait()->getBody();
    $loggingService->logMessage("Test (phase: $phase): Body: $body");
}

Sample log output:

[2016-11-22 20:39:41] all_203921.INFO: Test (phase: First): calling returnStatusCode(200) [] []
[2016-11-22 20:39:41] all_203921.INFO: LoggingAdapter: Requesting for {"statusCode":200} [] []
[2016-11-22 20:39:41] all_203921.INFO: Test (phase: First): call to returnStatusCode complete [] []
[2016-11-22 20:39:43] all_203921.INFO: LoggingAdapter: Response for {"statusCode":200}: The request resulted in a 200 status code @ 20:39:43.319 [] []
[2016-11-22 20:39:43] all_203921.INFO: StatusToExceptionAdapter: non-exception status encountered: 200 [] []
[2016-11-22 20:39:43] all_203921.INFO: CachingService: put called with 8670b5e29a709e36ae186bce7715de68 [] []
[2016-11-22 20:39:43] all_203921.INFO: Test (phase: First): Body: The request resulted in a 200 status code @ 20:39:43.319 [] []
[2016-11-22 20:39:43] all_203921.INFO: Test: waiting 2sec [] []
[2016-11-22 20:39:45] all_203921.INFO: Test: waited 2sec [] []
[2016-11-22 20:39:45] all_203921.INFO: Test (phase: Second): calling returnStatusCode(200) [] []
[2016-11-22 20:39:45] all_203921.INFO: Test (phase: Second): call to returnStatusCode complete [] []
[2016-11-22 20:39:45] all_203921.INFO: CachingService: get called with 8670b5e29a709e36ae186bce7715de68 [] []
[2016-11-22 20:39:45] all_203921.INFO: Test (phase: Second): Body: The request resulted in a 200 status code @ 20:39:43.319 [] []

The second test of returnStatusCode test a remapped status to make sure it works to spec:

function makeRequestToMappedStatusCode($phase, PersonRepository $personRepository, LoggingService $loggingService) {
    $statusCode = 400;
    $loggingService->logMessage("Test (phase: $phase): calling returnStatusCode($statusCode)");
    $response = $personRepository->returnStatusCode($statusCode);
    $loggingService->logMessage("Test (phase: $phase): call to returnStatusCode complete");
    try {
        $response->wait();
        $loggingService->logMessage("Test (phase: $phase): failed to throw expected exception");
    } catch (Exception $e){
        logException($e, $loggingService);
    }
}

Sample logging:

[2016-11-22 20:39:45] all_203921.INFO: Test (phase: First): calling returnStatusCode(400) [] []
[2016-11-22 20:39:45] all_203921.INFO: LoggingAdapter: Requesting for {"statusCode":400} [] []
[2016-11-22 20:39:45] all_203921.INFO: Test (phase: First): call to returnStatusCode complete [] []
[2016-11-22 20:39:47] all_203921.INFO: StatusToExceptionAdapter: remapped status encountered: 400 [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Request errored with 400  [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Class: me\adamcameron\testApp\exception\BadRequestException  [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Message A response of 400 was received from the service  [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Previous class GuzzleHttp\Exception\ClientException  [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Previous code 400  [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Previous message Client error: `GET http://cf2016.local:8516/cfml/misc/guzzleTestEndpoints/returnStatusCode.cfm?statusCode=400` resulted in a `400 Bad Request` response: The request resulted in a 400 status code @ 20:39:47.359   [] []
[2016-11-22 20:39:47] all_203921.INFO: Test: Repeating an erroring call will not touch cache [] []
[2016-11-22 20:39:47] all_203921.INFO: Test (phase: Second): calling returnStatusCode(400) [] []
[2016-11-22 20:39:47] all_203921.INFO: LoggingAdapter: Requesting for {"statusCode":400} [] []
[2016-11-22 20:39:47] all_203921.INFO: Test (phase: Second): call to returnStatusCode complete [] []
[2016-11-22 20:39:49] all_203921.INFO: StatusToExceptionAdapter: remapped status encountered: 400 [] []
[2016-11-22 20:39:49] all_203921.INFO: Test: Request errored with 400  [] []
[2016-11-22 20:39:49] all_203921.INFO: Test: Class: me\adamcameron\testApp\exception\BadRequestException  [] []
[2016-11-22 20:39:49] all_203921.INFO: Test: Message A response of 400 was received from the service  [] []
[2016-11-22 20:39:49] all_203921.INFO: Test: Previous class GuzzleHttp\Exception\ClientException  [] []
[2016-11-22 20:39:49] all_203921.INFO: Test: Previous code 400  [] []
[2016-11-22 20:39:49] all_203921.INFO: Test: Previous message Client error: `GET http://cf2016.local:8516/cfml/misc/guzzleTestEndpoints/returnStatusCode.cfm?statusCode=400` resulted in a `400 Bad Request` response: The request resulted in a 400 status code @ 20:39:49.392   [] []

So:
  • The status code is converted to the correct exception.
  • The initial log entry is made for the request, but not the one for the response (see the log extract from the getById tests for that one.
  • Nothing is cached or attempted to be fetched from cache at any point because the appropriate handlers are never invoked.

That'sit really. I'm happy with that.

Oh, I use a coupla other helper functions in that lot. For completeness here they are:

function logException(Exception $e, LoggingService $loggingService) {
    $previous = $e->getPrevious();
    $loggingService->logMessage(sprintf("Test: Request errored with %s%s", $e->getCode(), PHP_EOL));
    $loggingService->logMessage(sprintf("Test: Class: %s%s", get_class($e), PHP_EOL));
    $loggingService->logMessage(sprintf("Test: Message %s%s", $e->getMessage(), PHP_EOL));
    $loggingService->logMessage(sprintf("Test: Previous class %s%s", get_class($previous), PHP_EOL));
    $loggingService->logMessage(sprintf("Test: Previous code %s%s", $previous->getCode(), PHP_EOL));
    $loggingService->logMessage(sprintf("Test: Previous message %s%s", $previous->getMessage(), PHP_EOL));
}

function logTestSeparator(LoggingService $loggingService) {
    $loggingService->logMessage("==================================================================");
}

function pause($seconds, LoggingService $loggingService) {
    $loggingService->logMessage("Test: waiting {$seconds}sec");
    sleep($seconds);
    $loggingService->logMessage("Test: waited {$seconds}sec");
}

Indeed the code for everything here is up at GitHub, but I'm changing it with each article I write, so it gets out of date quickly. Here's the current commit, anyhow, and that's the code I've been using for this article.

I've got yet another article to write on this, but I haven't even worked out how I'm gonna write the code yet, so that might be a day or so still.

Righto.

--
Adam

Tuesday 22 November 2016

PHP: decorating async GuzzleHttp calls - caching

G'day:
This continues my series of exercise in decorating GuzzleHttp calls. Recap:

The last piece of the puzzle I need to deal with is caching. We need to decorate our web service adapter with a caching tier so that we don't continually pester it over the wire for info we have already asked for.

The treatment here is traditional:
  • we get a request for data  based on an ID;
  • we check the cache for data stored against that ID;
  • if we have it, we returned the cached version;
  • if not, we call the web service;
  • and before returning the result, we cache it for "next time".
I've implemented a stub caching service which simply sticks stuff in an array. This'll do for the purposes of testing:

class CachingService {

    public $cache;

    public function __construct(){
        $this->cache = [];
    }

    public function contains($id){
        return array_key_exists($id, $this->cache);
    }

    public function get($id){
        echo "Getting $id from cache" . PHP_EOL . PHP_EOL;
        return $this->cache[$id];
    }

    public function put($id, $value){
        echo "Putting $id into cache" . PHP_EOL . PHP_EOL;
        return $this->cache[$id] = $value;
    }
}

In reality we'd use redis or something like that, but it's the interface that matters here, not the implementation. Note I'm also outputing a message when the cache methods are hit, just so we can tell what's going on when the code is running.

Next I have a CachingGuzzleAdapter which is the decorator for a GuzzleAdapter. This has a chunk of code in it, so I'll break it down:

class CachingGuzzleAdapter {

    private $adapter;
    private $cache;

    public function __construct($adapter, $cache) {
        $this->adapter = $adapter;
        $this->cache = $cache;
    }

    // ...

}

No surprises here. The decorator takes the adapter it's decorating, as well as an instance of the cache service it'll be talking to. I probably shoulda called that $cachingService, now that I look at it. We're taking the same approach here as we did with the other decorators (in the earlier articles): all the guzzling is offloaded to the adapter here; this class only deals with the caching side of things.

To that end, our get method is very simple:

public function get($id){
    if ($this->cache->contains($id)) {
        return $this->returnResponseFromCache($id);
    }
    return $this->returnResponseFromWebService($id);
}

If it's in the cache: use that; if not: go get it. The details of how to get the thing from cache or how to get things from the adapter are hidden in helper functions.

Here's the tricky bit: getting the response back from cache:

private function returnResponseFromCache($id) {
    $p = new Promise(function() use (&$p, $id){
        echo "GETTING FROM CACHE" . PHP_EOL;
        $cachedResult = $this->cache->get($id);

        $newResponse = new Response(
            $cachedResult['status'],
            $cachedResult['headers'],
            $cachedResult['body']
        );

        $p->resolve($newResponse);
    });

    return $p;
}

Here we:

  • get some data from the cache. This method only gets called if the data is there, so we can dive straight in.
  • We create a new Guzzle Response object, and put the data back into it.
  • We resolve a promise with that as its value. We need to return a promise here because that's what the get method we're decorating returns.

I'm not entirely comfortable with the self-referential bit around $p, but that was from the docs as the recommended way of doing this. So be it.

If the stuff wasn't in cache, we need to make an actual web service call:

private function returnResponseFromWebService($id){
    $response = $this->adapter->get($id);

    $response->then(function(Response $response) use ($id) {
        echo "PUTTING IN CACHE" . PHP_EOL;

        $detailsToCache = [
            'status' => $response->getStatusCode(),
            'headers' => $response->getHeaders(),
            'body' => $response->getBody()->getContents()
        ];
        $this->cache->put($id, $detailsToCache);
    });

    return $response;
}

This is straight forward:

  • make the call;
  • in its resolution handler grab the data and bung it in the cache.
  • Note that I'm only caching the significant bits of the response, not the response itself. My original version of this cached the entire response object: because the cache is just an in-memory array I can do that; in reality I'd need to serialise it somehow, so I wanted to make a point of only caching the data here.

That's it!

Of course I need to test this, so I wrote a test rig, similar to the ones I've already done:

$endPoint  = "http://cf2016.local:8516/cfml/misc/guzzleTestEndpoints/getById.cfm?id=";

$guzzleAdapter = new GuzzleAdapter($endPoint);
$cache = new CachingService();
$adapter = new CachingGuzzleAdapter($guzzleAdapter, $cache);

printf("Getting not-yet cached results @ %s%s", (new DateTime())->format("H:i:s"), PHP_EOL . PHP_EOL);
makeRequests($adapter);

echo "===================================================" .PHP_EOL . PHP_EOL;

sleep(10);

printf("Getting results again (from cache) @ %s%s", (new DateTime())->format("H:i:s"), PHP_EOL . PHP_EOL);
makeRequests($adapter);


function makeRequests($adapter){
    $startTime = time();
    $ids = ["001", "002"];
    $responses = [];
    foreach ($ids as $id) {
        echo "Making requests" . PHP_EOL . PHP_EOL;
        echo "Requesting: $id" . PHP_EOL;
        $responses[] = $adapter->get($id);
        echo "Requests made" . PHP_EOL . PHP_EOL;
    }
    echo "Fetching responses" . PHP_EOL . PHP_EOL;
    foreach ($responses as $response){
        $body = (string) $response->wait()->getBody();
        echo "Body: $body" . PHP_EOL . PHP_EOL;
    }
    echo "Responses fetched" . PHP_EOL . PHP_EOL;
    $duration = time() - $startTime;
    echo "Process duration: {$duration}sec" . PHP_EOL . PHP_EOL;
}

This just:
  • calls the adapter for a coupla IDs,
  • waits 10sec
  • and gets the same data again.
The expectation is that the first call will take around 5sec as that's how long each web service call takes:

<cfscript>
cfcontent(type="application/json");

writeLog(file="testApp", text="[ID: #URL.id#] request received");

sleep(5000);

writeOutput(serializeJson({"id"=URL.id, "retrieved"=now().dateTimeFormat("HH:nn:ss.lll")}));
writeLog(file="testApp", text="[ID: #URL.id#] response returned");
</cfscript>

It'll only be 5sec in total not 10sec because the calls are being made asynchronously, remember?

The second round of calls should take no time at all because the adapter will be able to get the results from cache. In theory. Let's see:

>php testCachingAdapter.php
Getting not-yet cached results @ 09:24:25

Making requests

Requesting: 001
Requests made

Making requests

Requesting: 002
Requests made

Fetching responses

PUTTING IN CACHE
Putting 001 into cache

PUTTING IN CACHE
Putting 002 into cache

Body: {"retrieved":"09:24:30.453","id":"001"}

Body: {"retrieved":"09:24:30.457","id":"002"}

Responses fetched

Process duration: 5sec

===================================================

Getting results again (from cache) @ 09:24:40

Making requests

Requesting: 001
Requests made

Making requests

Requesting: 002
Requests made

Fetching responses

GETTING FROM CACHE
Getting 001 from cache

Body: {"retrieved":"09:24:30.453","id":"001"}

GETTING FROM CACHE
Getting 002 from cache

Body: {"retrieved":"09:24:30.457","id":"002"}

Responses fetched

Process duration: 0sec


>

Hurrah! As predicted the first call takes 5sec; the second takes 0sec. And the intermediary telemetry confirms that the cache is being used correctly. Also note the timestamps on the returned data reflect when the initial requests were made, not when the second round of calls were made.

That was pretty easy this time. It took me a while to find the code for returning a resolved promise with the Response value in it, but wiring everything up was simple.

As an exercise I'm now gonna wire up a CachedLoggedErrorMappedGuzzleAdapter. Or perhaps a LoggedErrorMappedCacheGuzzleAdapter. Not sure. But it's all a difference in wiring, and that's it really. I'll write that up next.

Righto.

--
Adam