Showing posts with label PHP. Show all posts
Showing posts with label PHP. Show all posts

Saturday 13 May 2017

PHPUnit: using a mocked-method call log to track its activity

G'day:
This is just a quick PHPUnit one, inspired by a trick my team lead suggested to me. I think it's quite handy and works around what seems to be a shortcoming of how PHPUnit mocks stuff. I'm sure there's strong opinions held on this sort of thing somewhere as to how I'm not supposed to want to do what my requirement has me needing to do here. I do so look forward to being told about it.

Anyway, I'm writing a script to repair some bung data we have. The details are irrelevant, but it's fairly "important" data (well: not to me... but to someone, apparently...), and I want to make sure we can track when things go wrong. And they will go wrong. Firstly I'm talking to an external third-party service, and it's foolhardy to trust those; secondly I'm relying on some of our own internal services which... ah... well yeah they were written in a different era by people with an unusual understanding of how to write managable, testable, and reliable code. And in the course of doing this work I turned-up a coupla bugs and had to fix those, so I don't back our own code to be stable here. I don't mean that judgementally: all code bases have better and not so good code in them. Anyway...

The upshot of this is the script logs a bunch of stuff - said script will be run by a cron job - and part of my testing is to verify the right things gets logged at the right time. A lot of it is fairly perfunctory "belt and braces" sort of logging, say "Records 1000-1999 processed OK", and I don't really care about that; but there's other stuff where the script deals with unexpected circumstances and I want to make sure that's done right.

For the purposes of this test I am mocking out all the dependencies so the script doesn't actually do anything, but the mocks return "workable" mocked data sufficient to walk through the stages of the script. And the logger is one of these dependencies, and is also mocked.

The test I was stuck on was one where an unhandled exception was thrown, and the script basically shuts itself down; having first logged what went wrong.

Here's a very pared-back version of the script:

public function myMethod($times){
    try {
        $this->logger->logMessage("Starting off");
        for ($i=1; $i <= $times; $i++) {
            $this->logger->logMessage("Starting processing iteration $i");
            $this->helper->doThing();
            $this->logger->logMessage("Finished processing iteration $i");
        }
        $this->logger->logMessage("Finishing off");
    }
    catch (\Exception $e) {
        $this->logger->logMessage(sprintf("Something went wrong: %s", $e->getMessage()));
        throw $e;
    }
    finally {
        $this->logger->logMessage("All done");
    }
}

Here I've just got the one doThing dependency call in the loop, but it makes the point. And around that, and before and after the the loop I log a line to track progress. And I log a line if there's an unhandled exception. As I said the real script is somewhat more complex than this, but this is an adequate analogy.

I have another test which tests the ongoing logging of a successful run with 0, 1, >1 records, but this test is to make sure the script a) reports the exception; b) still shuts down OK.

PHPUnit's approach to checking a specific argument was passed to a particular mock call is to use the at matcher method when setting an expectation. My first pass at the test used this matcher:

function testMyMethodUsingAt()
{
    $exceptionMessage = "EXPECT_THIS";
    $this->expectException(\RuntimeException::class);
    $this->expectExceptionMessage($exceptionMessage);

    $testIterations = 6;
    $errorAt = $testIterations / 2;

    $this->helper
        ->expects($this->at($errorAt))
        ->method("doThing")
        ->willThrowException(new \RuntimeException($exceptionMessage));

    $this->logger
         ->expects($this->at(7))
         ->method("logMessage")
         ->with("Starting processing iteration 4");
    $this->logger
         ->expects($this->at(8))
         ->method("logMessage")
         ->with("Something went wrong: $exceptionMessage");
    $this->logger
         ->expects($this->at(9))
         ->method("logMessage")
         ->with("All done");

    $this->sut->myMethod($testIterations);
}

Here I worked out the the call history for info would be nine entries long, and the entries I wanted to confirm the exception-tracking logic was correct was the 7th, 8th and 9th ones.

But this is problematic for a number of ways:
  • it's not immediately clear what hat I pulled 7th, 8th and 9th from;
  • it's fragile because it relies on there being precisely seven earlier log entries for it to work. EG: if we were to add another log line into the method, that would break this test, which is rubbish;
  • semantically it's incorrect, as it's not the 7th, 8th and 9th entries I'm interested in; it's the last three. It'd be better to code it accordingly so the code was more readable to maintainers, later on.
  • Lastly the at approach to things in PHPUnit is just a bit crap because the index is based on the mock object, not the mock method, despite almost all use-cases I've had for this sort of thing have been wanting to check sequential calls to the mocked method, not just to any method in the mocked object. So I try to steer clear of at at the best of times.
Other mocking frameworks I've used expose the call-log for a mocked method, but I cannot find any way to get it from PHPUnit. I can see where the data is stored, but to get it out would be deeply "Heath Robinson", and would be even worse than the approach above.

I was stumped.

Then me boss said "just use a callback in the will expectation, and build the log yerself". Initially I just looked at him like he was a loony (he's used to this), but then it dawned on me what he was suggesting. And it was easy, and not altogether awful, either:

function testMyMethodUsingCallLog()
{
    $exceptionMessage = "EXPECT_THIS";
    $this->expectException(\RuntimeException::class);
    $this->expectExceptionMessage($exceptionMessage);

    $testIterations = 6;
    $errorAt = $testIterations / 2;

    $this->helper
        ->expects($this->at($errorAt))
        ->method("doThing")
        ->willThrowException(new \RuntimeException($exceptionMessage));

    $callLog = [];
    $this->logger
        ->method("logMessage")
        ->will($this->returnCallback(function ($message) use (&$callLog) {
            $callLog[] = $message;
        }));

    try {
        $this->sut->myMethod($testIterations);
    }
    catch (\Exception $e) {
        $this->assertSame("All done", array_pop($callLog));
        $this->assertSame("Something went wrong: $exceptionMessage", array_pop($callLog));
        $this->assertNotSame("Finishing off", array_pop($callLog));

        throw $e;
    }
}

The first key bit is capturing the call-log. This creates an array, and with each call to the mocked logging method, we capture what was logged. Note that these expectations don't enforce any rules here. They just capture what's going on.

Because this method call throws an exception, I have to try/catch it so that I can regain control once it's errored, and inspect the call log to see it's what I expect. My approach here is far more semantic: I check the last entry is the "Finishing off" one, the one before that is logging the correct exception, and - this is slightly different now - as a failsafe I check that the one before that is anything other than "All Done". Because if it was that value, it'd mean the logic flow did not branch at the right stage of things. Lastly I throw the exception I caught so PHPUnit's own exception-expectation-checking also validates that too.

One of the best things about this is that this test doesn't care about what goes on in the log prior to the stuff it's testing, it is able to focus on precisely what it cares about. This is much clearer code. An example of this would be to change this line in each test:

$testIterations = 6;

Change it to be any other number, and the first test breaks, whereas the second test keeps working. Win.

That's pretty cool.

I hasten to add I would probably not usually be focusing quite this closely on this sort of sequencing, but in this case we need to know things are working properly.

It's a shame PHPUnit doesn't just expose the call-log like this, as I can't be the first person who's been in this situation. But I guess the tools are there to DIY like I have here, so no harm done.

Righto.

--
Adam

Thursday 9 March 2017

PHPUnit: setMethods and how I've been writing too much code

G'day:
Seems our technical architect is becoming my PHP muse. Good on ya, Pete.

Anyway, he just sent out what I thought was a controversial email in which he said (paraphrase) "we're calling setMethods way too often: in general we don't need to call it". To contextualise, I mean this:

$dependency = $this
    ->getMockBuilder(MyDependency::class)
    ->setMethods(['someMethod'])
    ->getMock();

$dependency
    ->method('someMethod')
    ->willReturn('MOCKED RESULT');


See here I'm explicitly saying "mock out the someMethod method". Pete's contention is that that's generally unnecessary, cos getMockBuilder mocks 'em all by default anyhow. This goes against my understanding of how it works, and I've written bloody thousands of unit tests with PHPUnit so it surprised me I was getting it wrong. I kinda believed him cos he said he tested it, but I didn't believe him so much as to not go test it myself.

So I knocked out a quick service and dependency for it:

namespace me\adamcameron\myApp;

class MyService
{

    private $myDecorator;

    public function __construct(MyDecorator $myDecorator)
    {
        $this->myDecorator = $myDecorator;
    }

    public function decorateMessage($message)
    {
        $message = $this->myDecorator->addPrefix($message);
        $message = $this->myDecorator->addSuffix($message);
        return $message;
    }

}

namespace me\adamcameron\myApp;

class MyDecorator
{

    public function addPrefix($message)
    {
        return "(ACTUAL PREFIX) $message";
    }

    public function addSuffix($message)
    {
        return "$message (ACTUAL SUFFIX)";
    }

}

Nothing surprising there: the service takes that decorator as a constructor arg, and then when calling decorateMessage the decorator's methods are called.

In our tests of decorateMessage, we don't want to use the real decorator methods, we want to use mocks.

First I have a test the way I'd normally mock things: explicitly calling setMethods with the method names:

public function testWithExplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(['addPrefix', 'addSuffix'])
        ->getMock();

 $decorator
  ->method('addPrefix')
  ->willReturn('(MOCKED PREFIX)');

 $decorator
  ->method('addSuffix')
  ->willReturn('(MOCKED SUFFIX)');

 $service = new MyService($decorator);

 $result = $service->decorateMessage('TEST MESSAGE');

 $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

And this mocks out the method correctly. This is my baseline.

Secondly, I still call setMethods, but I give it an empty array:

public function testWithImplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods([])
        ->getMock();

 $decorator
  ->method('addPrefix')
  ->willReturn('(MOCKED PREFIX)');

 $decorator
  ->method('addSuffix')
  ->willReturn('(MOCKED SUFFIX)');

 $service = new MyService($decorator);

 $result = $service->decorateMessage('TEST MESSAGE');

 $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

This also mocks out the method correctly. Passing an empty array mocks out all the methods in the mocked class.

Next I try passing null to setMethods:

public function testWithNull()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(null)
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $decorator
        ->method('addSuffix')
        ->willReturn('(MOCKED SUFFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(ACTUAL PREFIX) TEST MESSAGE (ACTUAL SUFFIX)',
        $result
    );
}

This creates the mocked object, but the methods are not mocked. So the result here is using the actual methods.

In the last example I mock one of the methods (addPrefix), but not the other (addSuffix):

public function testWithOneExplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(['addPrefix'])
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(MOCKED PREFIX) (ACTUAL SUFFIX)',
        $result
    );
}

Here we see the important bit: when one explicitly mocks one or a subset of methods, then the other methods are not mocked. Which is kinda obvious now I say it, but I guess we had been thinking one needed to call setMethods to be able to then mock the behaviour of the method. And, TBH, I don't think we thought through what was happening to the methods we were not mocking. I think we started calling setMethods all the time because we were slavishly following the PHPUnit docs, which always call it too, which is a less than ideal precedent to "recommend".

Generally speaking, one should want to mock all methods of a dependency, and it should really be the exception when explicitly not wanting to mock a method: ie, to leave it actually callable.

The win here is that we can now de-clutter our tests a bit. We really seldom need to call setMethods, and we sure want to make it clear when we do want to only mock some of a dependency's methods. So this will make our code cleaner and clearer, and easier to code review and maintain. Win!

Cheers Pete!

Righto.

--
Adam

Wednesday 8 February 2017

PHP: more array dereferencing weirdness

G'day:
(Firstly: I hasten to add that the "weirdness" is "weird to me". It might be completely predictable to everyone else in the PHP world).

This is a follow-on from my earlier article "PHP: accessing an undefined array element yields a notice. Or does it? WTF, PHP?". Now before I start: Kalle Sommer Nielsen has come back to me in a comment on that article and mentions it's expected behaviour, and explained it... but I don't quite follow yet. I'm hoping some ensuing experimentation will cast some light on the scene.

My next variation of this has me scratching my head more. Well: it's demonstrating the same thing, but perhaps more clearly.

<?php

echo PHP_EOL . "Test with array" . PHP_EOL;

$a = [1, 2, 3];
var_dump($a[0]);
var_dump($a[0]['id']);
var_dump($a[0][0]['id']);
var_dump($a[0]['id'][0]);

echo PHP_EOL . "Test with int" . PHP_EOL;
$i = 1;
var_dump($i['id']);
var_dump($i[0]['id']);
var_dump($i['id'][0]);

echo PHP_EOL;

echo PHP_EOL . "Test with string" . PHP_EOL;
$s = '1';
var_dump($s['id']);
var_dump($s[0]['id']);
var_dump($s['id'][0]);

This yields:


C:\temp>php test.php

Test with array
C:\temp\test.php:6:
int(1)
C:\temp\test.php:7:
NULL
C:\temp\test.php:8:
NULL
C:\temp\test.php:9:
NULL

Test with int
C:\temp\test.php:13:
NULL
C:\temp\test.php:14:
NULL
C:\temp\test.php:15:
NULL


Test with string
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 21
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 21

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:21:
string(1) "1"
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 22
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 22

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:22:
string(1) "1"
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 23
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 23

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:23:
string(1) "1"

C:\temp>

OK, so here I factor-out the top-level array from the mix. That test was always referring to the first element of the array, which is just the integer 1, so we might as well just test with that. Needless to say, there's the same results as with the first array element. But I still don't get it. What is using array notation to dereference an integer supposedly doing? It's a non-sensical operation, as far as I understand things. And if it is a nonsensical operation: it should error.

For completeness I also try a variation with a string "1". This does error like I'd expect. Well: it gives a warning and then returns an inappropriate (IMO) result anyhow. This is probably PHP doing it's combo of telling me I did it wrong but then going ahead and guessing what I meant anyhow. Grrr.

I'm gonna mess around with de-referencing integers some more when I get a moment, and try to work out WTF.

Righto.

--
Adam

Monday 30 January 2017

PHP: accessing an undefined array element yields a notice. Or does it? WTF, PHP?

G'day:
So we encountered this today:

<?php
$array = [1, 2, 3];

var_dump($array['id']); // this yields a warning
var_dump($array[0]['id']); // none of these do
var_dump($array[0][0]['id']);
var_dump($array[0]['id'][0]);

As indicated, this yields results:

C:\temp>php shite.php
PHP Notice:  Undefined index: id in C:\temp\shite.php on line 4
PHP Stack trace:
PHP   1. {main}() C:\temp\shite.php:0

Notice: Undefined index: id in C:\temp\shite.php on line 4

Call Stack:
    0.0002     353472   1. {main}() C:\temp\shite.php:0

C:\temp\shite.php:4:
NULL
C:\temp\shite.php:5:
NULL
C:\temp\shite.php:6:
NULL
C:\temp\shite.php:7:
NULL

C:\temp>

I think this behaviour is bloody daft. Accessing something that doesn't exist should be more than a notice, and it shouldn't then go ahead and return "null" anyhow. And giving a notice and returning null is possibly the worst way of combining possible actions here. What am I supposed to do with that? Other than bury my head in my hands a weep.

Now this is partially documented on the Arrays page of the docs:

Note:
Attempting to access an array key which has not been defined is the same as accessing any other undefined variable: an E_NOTICE-level error message will be issued, and the result will be NULL.
Note:
Array dereferencing a scalar value which is not a string silently yields NULL, i.e. without issuing an error message.

That's shit, but... well: it's not that surprising (I mean... it's PHP, right?). But how come

$array['id']

incurs the warning, but this doesn't:

$array[0]['id']

This just seems to make fairly unhelpful behaviour just that much less helpful.

I'm sure there's a reason for this. Not a good one, I hasten to add, but no doubt there's a reason.

Anyone know what it is?

Update:

Thanks to Dave (comment below) for pointing out there's a bug been raised for this: 68110.


Righto.

--
Adam

Saturday 14 January 2017

Incredibly: a reader asks me for help with PHP & Nginx

G'day:
Jesus fuck, you lot. Don't ask me shit about systems support / server application config / all that godawful shit that should be consigned to the Systems Support Team (sorry to my mates in this role, but I fucking hate it, and became a dev so I didn't have to do it any more).

Right so ages ago I wrote an article "PHP: getting PHP 5 and PHP 7 running side by side on the same machine (a better way)", and someone recently asked me how do do the same on Nginx. Well it's "slow news night" here in my life: I'm just at the pub in Galway passing time by getting pissed on Guinness (I'm on me fifth pint, and even I can tell the writing here is reflecting that) and writing blog articles, so I had a look at it. It's pretty easy, as it turns out.

Firstly: I am no expert on Nginx. I don't like web servers. I try not to ever have to use them. I know enough about IIS to know it's a pain in the arse but I can do what I need if I have to; and I know enough about Apache to get things working. I don't even know why Nginx exists. I'm guessing the reason is "to annoy Adam, cos it's just one more bloody thing he'll need to know about at some point". Sigh. I got PHP working on Nginx once before ("PHP: getting my dev environment running on Nginx instead of Apache"), and other than that have converted some rewrite rules from Apache to Nginx (man: does Nginx suck compared to Apache for those!).

This is a different laptop from the one I did that other exercise on, so whilst I had Nginx on here (for the rewrite exercise, which was work-related and this is my work laptop), I decided to start from scratch. I deleted what I had installed (where for Nginx "installed" means "unzipped").

Here's what I did:
  1. grabbed the latest Windows Nginx download from their site. I just googled "nginx windows download" to find that.
  2. Unzipped it. Relocated it to my apps dir: c:\apps\nginx
  3. Stopped Apache (which listens on port 80), and instead started Nginx (just run nginx.exe from the dir above). By default it listens on localhost and port 80.
  4. Browse to http://localhost/ and verified the Nginx default index.html page was served. OK, so it works as a baseline. Always test the baseline before proceeding with any customisations of things.
  5. Set-up a coupla test hosts in my hosts file (C:\windows\system32\drivers\etc\hosts... make sure to start yer text editor as an admin, otherwise you won't be able to save it):
    127.0.0.1 php5.nginx.local
    127.0.0.1 php7.nginx.local

    I foresaw that to run both PHP5 and PHP7, Nginx is gonna need to differentiate between which one wants to use, and using different host names seemed to make sense and be easy.
  6. I edited my nginx.conf file (in the conf subdir of the one above) to know about these two hosts:
    http {
        # [...]
    
        server {
            listen       8800;
            server_name  php5.nginx.local;
            root   c:/src/php/php.local/www;
    
            # [...]
        }
    
        server {
            listen       8800;
            server_name  php7.nginx.local;
            root   c:/src/php/php.local/www;
    
            # [...]
        }
    }
    

    Where I've elided stuff, it's the same as it was before. The chief considerations here are:
    • having a server for both PHP5 and PHP7;
    • having them listen on each of those two new hosts I set up;
    • setting the root to be where my PHP code is. This is the same for both in this case, as I want to serve exactly the same code via both 5 and 7;
    • oh and I'm listening on port 8800 as I don't want Nginx to interfere with my normal Apache install (I'll be sticking with Apache after this experiment, thanks).
    Note that this config will still not serve PHP, but it'll at least run.
  7. When one runs Nginx from the console it hogs the prompt, so to stop it one needs to run another console and call nginx -s stop. We need to do this to test the config changes. So I did that, and used the other console to start it again.
  8. I browse to each of http://php5.nginx.local:8800 and http://php7.nginx.local:8800 to test they were working. They were running, but giving a 403 cos I didn't have an index.html in that directory, nor did I have directory browsing switched on (which for my test code I do, as it makes finding stuff easier).
  9. I told Nginx to allow directory browsing for each of the server configs:
    location / {
        index  index.html index.htm;
        autoindex on;
    }
    

    Do not do this in production. Well: don't do anything I say in production.
  10. I cycled Nginx again and tested both hosts:

    Index of /


    ../
    code-coverage-reports/                             15-Nov-2016 13:33                   -
    community/                                         30-Jun-2016 11:47                   -
    experiment/                                        15-Nov-2016 08:47                   -
    library/                                           30-Jun-2016 11:47                   -
    stackoverflow/                                     30-Jun-2016 11:47                   -
    deleteme.php                                       30-Jun-2016 11:47                 233
    gdayWorld.html                                     30-Jun-2016 11:47                  11
    gdayWorld.php                                      30-Jun-2016 11:47                  50
    phpinfo.php                                        30-Jun-2016 11:47                  21
    utf8.html                                          21-Dec-2016 08:21                 133
    


    So that's all good except for me not having deleted that file that perhaps I meant to, a while back ;-)
  11. Next I just followed the instructions from me other blog article, getting PHP to listen out to traffic coming in from Nginx (the code below is in a batch file):
    start C:\bin\RunHiddenConsole.exe C:\apps\php\5\5\php-cgi.exe -b 127.0.0.1:8550 start C:\bin\RunHiddenConsole.exe C:\apps\php\7\1\php-cgi.exe -b 127.0.0.1:8710
    Note that each of them is listening on a different port.
  12. And then tell Nginx to pass PHP requests across to PHP:
    server {
        # [...]
        location ~ \.php$ {
            fastcgi_pass   127.0.0.1:8710;
            fastcgi_index  index.php;
            fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
            include        fastcgi_params;
        }
        # [...]
    }
    

    That's the one for PHP7, but the PHP5 one is the same except for the port.
  13. I restart Nginx again, and this time hit phpinfo.php on each host:
    {i'd show you proof, but BlogSpot is refusing to let me put two images inline in an <ol> list, it seems. Trust me... it works).
  14. Hurrah!
  15. The last thing I tried to do is to wrap my batch file that starts Nginx into a service, using RunAsService.exe, but this is my work laptop and it's locked-down too tight for me to run that. But... well... the batch file works.
So, anyway... that's it. That's what one needs to do to get Nginx serving two different versions of PHP. My will to live has been sapped by even having to think about this sort of shit, so I'm going back to my Guinness (two pint further ahead than I was when I started this exercise).

Sorry this one is a bit scrappy but... well... I'm drunk.

Righto.

--
Adam

PHP: 2000 words on elseif (and PHPMD)

G'day:
This is another one that just lives up to the definitive notion of a blog: a web log; a log of what I'm doing. But, like... web-based. A blog. What I basically mean is that I'm not sure I'm showing any insight here (look: I basically know I'm not), but it's what I looked at today, so it's what I'm writing about today.

We're trying to lift our game when it comes to our code "Cleanliness" (in the RC Martin "Clean Code" sense), and we've started more rigorously using some code review tools recently, such as PHPCS and PHPMD. They're both pretty good, and I recommend them to anyone using PHP. Almost everyone probably already is: I'm always late to these games.

One PHPMD rule that tends to irk me is the one that declares "[one's code] uses an else expression. Else is never necessary and you can simplify the code to work without else". Now: generally this is true. One can usually refactor one's code to stick it in a helper method and early return in the if logic, then just the rest of the code after the if block is what happens in the else case. This is good because it helps one's code be "closer to the left-edge"; ie: it's not indented because it's not in a logic block any more. The more one needs to indent one's code: the more it probably smells. It's got to the stage with our code now that a double nesting of logic-control statements almost gets the same reaction from code reviewers as when they see a comment in our code; "hey, gather round everyone: Adam's made a comment. Let's look at it!" (Adam shrinks into the carpet and starts getting his excuses ready. Later he refactors the code to be clearer, and the comment goes; the code reviewers are happy). The problem for me is that I'm not clever enough sometimes, and it takes me a coupla takes before I see how to get rid of a wayward else. At least now I usually pick it up and work it out during my "self code review" phase, and sort it out. And PHPMD helps with this.

Today I was reviewing a colleague's code, and noted they had not an else but an elseif. And I triumphantly(*) pinged them with a comment reminding them to run PHPMD before submitting their pull request.

(*) because I'm petty.

It turns out that they had run PHPMD, and it hadn't pulled 'em up. Initially (again: cos I'm petty) I didn't believe them and knocked together my own SSCCE and looked at it.

This demonstrates the problem:

function takeAChance(){
    $firstChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true";
    } else {
        echo "firstChance was false";
    }
}

takeAChance();

And now running PHPMD over it:

C:\src\php\php.local\src\phpmd>phpmd else_block.php text phpmd.xml
C:\src\php\php.local\src\phpmd\else_block.php:8 The method takeAChance uses an else expression.
Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd>


OK, fair enough. Oh btw, the solution to this is to early return:

function takeAChance(){
    $firstChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true";
        return;
    }
    echo "firstChance was false";
}

(note: yeah yeah, in the real world I'd not be echoing from a function. I get that. Not the point here)

OK, now here's an SSCCE of the sort of thing we had in this pull request:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
    }elseif($secondChance){
        echo "well at least secondChance was true";
    }
}

And running PHPMD:

C:\src\php\php.local\src\phpmd>phpmd elseif_block.php text phpmd.xml

C:\src\php\php.local\src\phpmd>

No problems. Grrr. This does not make sense to me. If we add an actual else in there:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
    }elseif($secondChance){
        echo "well at least secondChance was true";
    }else{
        echo "both were false";
    }
}

And PHPMD says:
C:\src\php\php.local\src\phpmd>phpmd if_elseif_else_block.php text phpmd.xml
C:\src\php\php.local\src\phpmd\if_elseif_else_block.php:11
The method takeAChance uses an else expression.
Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd>


And, yeah, it's true... but the elseif seems to me to be as much an issue here as an else. We can short-circuit both easily enough with early returns:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
        return;
    }
    if($secondChance){
        echo "well at least secondChance was true";
        return;
    }
    echo "both were false";
}

Equally, to me a PHP elseif (note this is an actual command, it's not a concatenation of else if like one might expect), really is, functionally, a "single-statement else", just with the statement itself being an if statement. What I mean is remember that the if / elseif / else statement doesn't need to use blocks (ie: a bunch of statements within braces), it could be a single statement:

function takeAChance() {
    $firstChance = (bool) rand(0,1);

    if ($firstChance)
        echo "firstChance was true";
    else
        echo "firstChance was false";

}

(what an abomination: almost looks like Python. Bleah).

Anyway... so yeah, one doesn't need to use braces if there's only one statement to execute based on the result of the condition. Don't do this. But it's possible.

What I'm getting at is this:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}elseif($secondChance){
    echo "well at least secondChance was true";
}

Is basically like this:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}else{
    if($secondChance){
        echo "well at least secondChance was true";
    }
}

Just without the braces on the else:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}else
    if ($secondChance){
        echo "well at least secondChance was true";
    }

Compare this again to the elseif version:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}elseif ($secondChance){
    echo "well at least secondChance was true";
}

The only difference there is the whitespace between the else and the if. And logically there is no difference, as far as I can tell.

I tested all variations I could think of of elseifs, elses, braces, no braces, and the like (see the samples on GitHub), and ran PHPMD over them:

C:\src\php\php.local\src\phpmd\samples>phpmd . text phpmd.xml
C:\src\php\php.local\src\phpmd\samples\else_block.php:8 The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.
C:\src\php\php.local\src\phpmd\samples\else_if_block.php:9      The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.
C:\src\php\php.local\src\phpmd\samples\if_elseif_else_block.php:11      The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd\samples>

Hmmmm... it's not a case of else clauses being a problem it seems. It's a problem with else blocks being a problem.

It's complaining about this:

if ($firstChance) {
    echo "firstChance was true";
} else {
    echo "firstChance was false";
}


But not about this:

if ($firstChance) {
    echo "firstChance was true";
} else 
    echo "firstChance was false";
 


The only difference being the lack of braces on the second one.

Now I guess this makes an element of sense. The "untidiness" of large logic blocks is that they are more of a pain to navigate across when maintaining the code. With an if/else statement one needs to pay attention to where the if is, where the else is, which code is in which clause, and where the whole thing ends. This is not a problem if one keeps the code clean so each block has only 1-2 statements in it, but we've all seen if/elses that go on for miles and miles worth of scrolling down (and, it's worth pointing out: that is shit code if it's written like that. Don't do it). It's exacerbated by further nested logic within that. Fair enough.

But with a non-block else case - which intrinsically can only be one statement - the problem of unclean block sizes can simply not occur. Also fair enough.

But in that case, the warning should not be able else, intrinsically; it should be about the size of the blocks. And the size rule should probably apply to the if block too! If a single-statement else is OK, then a single-statement else block should be OK too.

I think the solution to this is that any else clause should be flagged: be it a single statement, or be it a block. So intrinsically an elseif should be flagged too. I'll raise this with the PHPMD bods.

Whilst on this: they should probably swing the other way and actually flag any instances of single-statement logic clauses: ifs and loops should always have braces. That's a widely accepted truism in curly-brace-language programming. I realise PHPCS does pick up on these (I only realise it just now cos I just tested it), but if PHPMD is busying itself with this sort of thing too; it should do a thorough job. In my view brace-less control logic is mess.


Lastly on this, I'm slightly irked by this, from the PHPMD Clean Code Rules page:

ElseExpression

[...]

An if expression with an else branch is never necessary.

It is terribly pedantic of me, but… it's not an expression. PHP flow control statements are…statements, not expressions. An expression is a specific thing: it is some code that represents a value. In PHP if/else statement does not represent a value. One cannot do this:
$message = if ($firstChance) {
    "firstChance was true";
} else {
    "firstChance was false";
}

If you cannot have some code on the right-hand side of an assignment (or anywhere else a value might be expected), then it's not an expression. A counterpoint to this, one can do this in Ruby:

message = if firstChance
    "firstChance was true";
else
    "firstChance was false";
end

puts message

This is because "everything is an object" in Ruby, I guess: even statements. In this case it seems the value of the if/else statement is that of the last statement in the relevant clause that gets executed by the logic. I'm buggered if I know whether Ruby best practices would encourage that (and by that I mean: I really actually don't know!), but I dunno that I'd be doing that sort of thing. Still: it demonstrates an if statement that is also an expression. Which a PHP one is not.

This sort of thing wouldn't bug me if it was in a casual conversation, but in an official implementation, I think it's a bit… ah… well let's saynit could just be done properly. But then again I guess this is PHP. And "doing it right" doesn't seem to be something PHP concerns itself with too much. It seems to be more "just get it done".

All this aside, I think PHPMD is a bloody handy tool, and whether or not I agree with the minutiae of some of its analysis or its turn of phrase is minor. It's helped my code a lot in the last coupla weeks. And I think our code in general is better of for using tools like this.

So… yeah. That was what I was doing yesterday (it was Fri when I started, it's now Sat evening and I'm in the pub). not scintillating stuff, but the examination of what was going on and the general experimentation was useful (to me, anyhow), I think. And it's enabled me to write a 2000-word blog article which is basically about elseif expressionsclauses.

Righto.

--
Adam

Saturday 7 January 2017

PHP: looking into usage of the ::class constant

G'day:
There's not much to this article. It's arisen from the fact there's a paucity of docs on this subject on the PHP website, and they're quite hard to google for (given Google still insists on ignoring punctuation in searches, for some daftly ignorant reason. So searching for "php ::class" just gets one results for "php class". Not the same thing. Similarly seaching for "php ::class constant" doesn't help.

For the record, the two relevant docs pages are:


Update:

Kalle Sommer Nielsen has been in touch via Twitter and has pointed me to the original RFC for ::class, too: Request for Comments: Class Name Resolution As Scalar Via "class" Keyword. Now to be honest, the content of that RFC would make for better documentation than the current docs furnish. Obviously it'd need a slight tweak to reword some of the questions as statements, and if there were any changes between RFC and what was adopted, those should be included too. Kalle also said he's tweaked the docs slightly too, which is cool. Nice work fella!

I include those there as much for me to be able to easily re-find them as anything else.

In truth there's not much to this ::class construct. I just wish the docs were easier to locate. When discussing the feature, it's nice to be able to point to some docs. Or now: a blog article ;-)

So what does it do?

All classes have a built-in constant (::class) that contains a string that is the class's name. Same as what the get_class function will return for an object, but it works on the actual class too (because it's a class constant). Here are some examples:

<?php

namespace me\adamcameron\cc;

use com\example\other\SomeClassToAlias as AliasedClass;
use com\example\other\SomeClassToAlias;
use com\example\other\SomeOtherClass;

require_once realpath(__DIR__ . '/../vendor/autoload.php');

echo "Using ::class" . PHP_EOL;
printf("SomeClass: %s%s", SomeClass::class, PHP_EOL);
printf("SomeOtherClass: %s%s", SomeOtherClass::class, PHP_EOL);
printf("SomeClassToAlias as AliasedClass: %s%s", AliasedClass::class, PHP_EOL);
printf("SomeClassToAlias: %s%s", SomeClassToAlias::class, PHP_EOL);
printf("PHPUnit_Framework_Exception: %s%s", \PHPUnit_Framework_Exception::class, PHP_EOL);

$someClass = new SomeClass();
$someOtherClass = new SomeOtherClass();
$someAliasedClass = new AliasedClass();
$someClassToAlias = new SomeClassToAlias();
$phpunitFrameworkException = new \PHPUnit_Framework_Exception();

echo PHP_EOL . "Using get_class() on instance" . PHP_EOL;
printf("SomeClass: %s%s", get_class($someClass), PHP_EOL);
printf("SomeOtherClass: %s%s", get_class($someOtherClass), PHP_EOL);
printf("AliasedClass: %s%s", get_class($someAliasedClass), PHP_EOL);
printf("SomeClassToAlias: %s%s", get_class($someClassToAlias), PHP_EOL);
printf("PHPUnit_Framework_Exception: %s%s", get_class($phpunitFrameworkException), PHP_EOL);

In the example all the classes in the me\adamcameron or com\example namespaces are just empty classes, eg:

<?php

namespace me\adamcameron\cc;

class SomeClass {}

I've also included a PHPUnit class there as PHPUnit is a lib I'm aware of that doesn't use namespaces (dunno why), and wanted to see how that behaved, in case there were quirks: seemingly not.

Here's the output:

C:\src\php\php.local\src\oo\classConstant\src>php testWithNamespace.php
Using ::class
SomeClass: me\adamcameron\cc\SomeClass
SomeOtherClass: com\example\other\SomeOtherClass
SomeClassToAlias as AliasedClass: com\example\other\SomeClassToAlias
SomeClassToAlias: com\example\other\SomeClassToAlias
PHPUnit_Framework_Exception: PHPUnit_Framework_Exception

Using get_class() on instance
SomeClass: me\adamcameron\cc\SomeClass
SomeOtherClass: com\example\other\SomeOtherClass
AliasedClass: com\example\other\SomeClassToAlias
SomeClassToAlias: com\example\other\SomeClassToAlias
PHPUnit_Framework_Exception: PHPUnit_Framework_Exception

C:\src\php\php.local\src\oo\classConstant\src>

So you see there are no surprises really: ::class just returns the fully-qualified path of the class concerned. One thing to not is that it does not pay any attention to aliasing, whether on the class itself or on an object of that aliased class. I guess an alias is just for clarity in source code, rather than any sort of "renaming" exercise.

Where does one use ::class constants?

Well my primary usage is when mocking, using PHPUnit. Instead of this:

$mockedThing = $this->getMockBuilder('path\to\class\being\mocked\MockThisClass')
    ->disableOriginalConstructor()
    ->setMethod(['someMethod'])
    ->getMock();

We just have this:

$mockedThing = $this->getMockBuilder(MockThisClass::class)
    ->disableOriginalConstructor()
    ->setMethod(['someMethod'])
    ->getMock();

(and PHPStorm even includes the use statement for me, automatically):

use path\to\class\being\mocked\MockThisClass;

It's good to keep all the class pathing references together at the top of the file.

One flaw in this constant is this behaviour:
<?php

require_once realpath(__DIR__ . '/../vendor/autoload.php');

echo "Using ::class" . PHP_EOL;
printf("SomeNonExistentClass: %s%s", SomeNonExistentClass::class, PHP_EOL);

Any sensible person would probably want an error to be thrown there. But... no. PHP does this:

C:\src\php\php.local\src\oo\classConstant\src>php testWithNonExistentClass.php
Using ::class
SomeNonExistentClass: SomeNonExistentClass

C:\src\php\php.local\src\oo\classConstant\src>

Groan. Why does PHP insist on being "helpful" like this. The class doesn't exist! Just say that. FFS.

Oh well... that sort of thing is almost to be expected of PHP, I guess. Sigh. Well this ::class constant mostly a well-implemented, if not glamourous, small feature in PHP. And now I seem to have documented ::class more than PHP itself has. Heh.

Speaking of documentation, I can't help but think this ::class constant should also be mentioned on a coupla other pages in the PHP docs:

It'd be a good fit for both of those pages, plus that's where Google lands you if you search for it.

That's it. I have to dash to go visit my son for a few hours. Thanks to my colleague Carlos for pointing this whole thing out to me, btw. Both the ::class constant construct itself, but also the observation to be made about aliases. Nice one, fella.

Righto.

--
Adam

PS: apologies for the blatant SEO keyword stuffing of ::class constant in this article ("oops... I did it again"). It was by design.

Wednesday 4 January 2017

PHP: looking at the Chain of Responsibility pattern

G'day:
We're working on a proof of concept for some stuff at work at the moment, and I've been put on a code review of the first round of the code. Part of the code is dealing with conditionally getting some data from cache, or if it ain't there, getting it from the DB instead: a fairly common trope. Normally I'd fall back on the decorator pattern for this, indeed I've written about it in the past ("Using a decorator pattern to reduce inappropriate code complexity"). I didn't see this in play in this first tranche of code, so suggested its use. The authoring dev suggested they felt that the decorator pattern was not a good fit for this, so they were using the Chain of Responsibility pattern instead. Another of our devs - when prodded - tended to agree.

I gave this some thought, read-up on the Chain of Responsibility pattern (and some more on the Decorator Pattern), and wasn't quite seeing it.

Just to back up a bit... if yer like me and kinda knew about the Chain of Responsibility pattern, but wasn't intimately familiar with it: read that Wikipedia link above, and do some googling of yer own. But in short it's characterised by having a task that needs doing, and having a sequence (a chain) of agents that possibly can (or possibly cannot) resolve all or part of the task. The process is that the task is passed to the chain of agents, and each in turn decides for itself whether it can fulfil the task in whole or in part. If an agent can resolve the whole task it does so: returning the result. If a given agent cannot fulfil the whole task (either it cannot fulfil it at all, or can only do part of it), it does its work, then passes responsibility on to the next task in the chain. Each agent knows two things: how to do its part of the job, and that it can pass work on to the "next" agent. It does not (and should not!) know what the next agent does, nor should it rely on what the next agent does. It simply does its job and returns it or passes on incomplete work to the next agent in the chain. I can't be arsed with a diagram: go google one.

Now: for two thirds of our challenge, I think the Chain of Responsibility pattern is a good fit: for the "Is it in the cache? Yes [/No. Is it in the DB? Yes/No]" bit. No qualms there. However it sticks in my craw once we get to the step after getting from the DB: putting it in the cache for next time. That step does not belong in the chain as it's not performing the same sort of task as the other two: getting the data. Second to that it's hitting the cache before and after the DB step, which makes it more seem like a decoration to me (of sorts... I might get back to the "of sorts..." bit later). Even if one decides "get from cache" is one agent's job, and "get from DB, oh and put it in the cache" is another agent's job, that latter agent is doing a mishmash of "unrelated" work, and seems to be a bit tightly-coupled to me, as well as having a wodge of code (that's a technical term) doing two things, rather than two wodges of code doing one thing each.

That said, I don't like being so emphatic about "that's wrong" without being happy about my understanding of things, so figured I should try to make it work in a way that sits well with me in a proof of concept. That and it seemed like good material for a blog article.

In my proof of concept I am fetching some Person data, for example:

$person = $personService->getById(1);

The PersonService defers to a "RetrievalHandler" for this:

class PersonService {

    private $retrievalHandler;

    public function __construct(PersonRetrievalHandler $retrievalHandler) {
        $this->retrievalHandler = $retrievalHandler;
    }

    public function getById($id) : Person {
        $record = $this->retrievalHandler->getById($id);

        if (is_array($record)){
            $person = new Person($record['firstName'], $record['lastName']);

            return $person;
        }
        return new Person();
    }
}


This just finds and returns a Person object, implement thus:

class Person {

    public $firstName;
    public $lastName;

    public function __construct($firstName=null, $lastName=null) {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }
}

As noted, the service uses a the RetrievalHandler to retreive stuff. A handler is implementation of this:

abstract class PersonRetrievalHandler {

    protected $nextHandler;
    protected $logger;

    public function __construct() {
        $this->nextHandler = new class {
            function getById(){
                return null;
            }
        };
    }

    public abstract function getById(int $id) : array;

    public function setNextHandler(PersonRetrievalHandler $nextHandler) {
        $this->nextHandler = $nextHandler;
    }
}

All it's required to do here is know how to use getById, and also it can setNextHandler. I'll come back to the code in the constructor later. Don't worry about that for now.

I've got two implementations of this. One for getting the thing from a caching service:

class CachedPersonHandler extends PersonRetrievalHandler {

    private $cacheService;

    public function __construct(CacheService $cacheService){
        parent::__construct();
        $this->cacheService = $cacheService;
    }

    public function getById(int $id) : array {
        if ($this->cacheService->exists($id)) {
            return $cachedPerson = $this->cacheService->get($id);
        }

        return $this->nextHandler->getById($id);
    }
}


And one for getting it from a database repository:

class DatabasePersonHandler extends PersonRetrievalHandler {

    private $repository;

    public function __construct(PersonRepository $repository) {
        parent::__construct();
        $this->repository = $repository;
    }

    public function getById($id) : array {
        $record = $this->repository->getById($id);
        if (count($record)) {
            return $record;
        }

        return $this->nextHandler->getById($id);
    }
}


Note this is very woolly and not production-sound code. It's just to demonstrate the concept.

Important to note in both these situations is the agents don't themselves do any of the work; they just try to get the work done by something else, and return it if done, or pass the task to the next agent if the current agent itself couldn't fulfil the requirement.

I've created a stub caching service and repository:

class CacheService {

    private $cache = [];
    private $logger;

    public function __construct(LoggingService $logger)
    {
        $this->logger = $logger;
    }

    public function get($key) {
        $this->logger->info("Cache hit for $key");
        return $this->cache[$key];
    }

    public function exists(string $key) : bool {
        $exists = array_key_exists($key, $this->cache);

        $this->logger->info(sprintf("Cache %s for %d", $exists ? 'hit' : 'miss', $key));

        return $exists;
    }

    public function put(string $key, $value) {
        $this->cache[$key] = $value;
    }
}


class PersonRepository {

    private $data = [
        [],
        ['firstName' => 'Tui', 'lastName' => 'Tahi'],
        ['firstName' => 'Rewi', 'lastName' => 'Rua'],
        ['firstName' => 'Tama', 'lastName' => 'Toru'],
        ['firstName' => 'Whina', 'lastName' => 'Wha']
    ];
    private $logger;

    public function __construct(LoggingService $logger)
    {
        $this->logger = $logger;
    }

    public function getById($id) {
        if (array_key_exists($id, $this->data)){
            $this->logger->info("Database hit for $id");
            return $this->data[$id];
        }
        $this->logger->info("Database miss for $id");
        return [];
    }
}


These also log activity in a very naïve way:

class LoggingService {
    public function info(string $message) {
        echo $message . PHP_EOL;
    }
}


All of this is stitched together in a factory, as there's a lot of moving parts:

class PersonServiceFactory {

    public static function getPersonService()
    {
        $loggingService = new LoggingService();
        $cacheService = new CacheService($loggingService);
        $personRepository = new PersonRepository($loggingService);

        $cachedHandler = new CachedPersonHandler($cacheService);
        $databaseHandler = new DatabasePersonHandler($personRepository);

        $cachedHandler->setNextHandler($databaseHandler);

        $personService = new PersonService($cachedHandler);

        return $personService;
    }

}


Note how the only knowledge each agent has of another is what a given agent's next handler might be. Speaking of next handlers. let's go back to that code in the PersonRetrievalHandler's constructor:

public function __construct() {
    $this->nextHandler = new class {
        function getById(){
            return null;
        }
    };
}


You'll remember each getById implementation will either fulfil the task and return the result, or it'll defer to the next agent in the chain. This is just a mechanism to make this "work" even if the last agent in the chain calls return $this->nextHandler->getById($id). The base class will take the call and just return null. I was semi-pleased to be able to use a class literal here (or anonymous class as PHP wants to call them). So this means I do not need to set the next handler on the $databaseHandler: it'll just use the inbuilt one if it cannot fulfil the requirement itself.

Right so we can now run this to see what happens. Remember the expectation is that the requested data will be fetched from the cache as a priority, but if not found: get it from the DB. Note that I am not doing anything here about putting the data into the cache if it wasn't there in the first place. I'll get to that. Anyway, here's some tests:

$personService = PersonServiceFactory::getPersonService();
$person = $personService->getById(1);
var_dump($person);
$person = $personService->getById(1);
var_dump($person);
$person = $personService->getById(4);
var_dump($person);
$person = $personService->getById(5);
var_dump($person);

Here we're:
  • getting the same person twice to start with: ostensibly this'd demonstrate a cache-miss then a cache-hit;
  • getting a different person from the first (back to a cache miss);
  • getting a person who is neither in the cache nor the DB.

Results:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTest.php
Cache miss for 1
Database hit for 1
object(me\adamcameron\cor\model\Person)#9 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache miss for 1
Database hit for 1
object(me\adamcameron\cor\model\Person)#10 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache miss for 4
Database hit for 4
object(me\adamcameron\cor\model\Person)#9 (2) {
  ["firstName"]=>
  string(5) "Whina"
  ["lastName"]=>
  string(3) "Wha"
}
Cache miss for 5
Database miss for 5
object(me\adamcameron\cor\model\Person)#10 (2) {
  ["firstName"]=>  NULL
  ["lastName"]=>  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

That's all good:
  • the cache never finds anything (although it does look),
  • so passes the job on to the DB. The DB will return the data if found,
  • otherwise it'll fall back to the default agent which just returns null.


So far: so good. But why did I omit the "putting it into the cache" step. Well... using this model, it simply won't work. The rule is that if an agent can resolve the work, then it returns it. Otherwise it defers to the next agent. I figured I could implement a "pass-thru" agent which just caches the result found by the DB agent: job done. But if the DB agent finds the record - read: "it can fulfil the task" - then it doesn't pass the task on to the next agent. It considers the task fulfilled, so just returns its result. So the only time the "put it in cache" agent will be called is if the DB agent doesn't find anything. Precisely the wrong time to cache something. That aside, the agent only passes on the inputs (the $id) to the next agent, so there's no mechanism here for an agent to receive the result from the previous agent even if the result was there to pass.

What we'd need to do here is just use the chain of responsibility to get the data, and perhaps leave it to the PersonService to cache the result it receives. However this is a bit unbalanced in my view, as caching considerations of the same data are being handled at not only two different places in the code, but two different "levels" of the task at hand. That's a bit clumsy in my book.

Next I thought I could adjust the DB handler to unconditionally pass its result on to the next handler, rather than break out of the chain once done. But that's a bit crap as it makes the DB handler behave differently than the other handlers in the chain, which also implies it knows more about what's going on than it should: ie that there's a reason to always pass the result on, rather than return it if it's done the work. Not cool.

Update:

My colleague (a different one) suggested a third open: roll the cache-put into the CachedPersonHandler:

class CachedPersonHandler extends PersonRetrievalHandler {

    private $cacheService;

    public function __construct(CacheService $cacheService){
        parent::__construct();
        $this->cacheService = $cacheService;
    }

    public function getById(int $id) : array {
        if ($this->cacheService->exists($id)) {
            return $cachedPerson = $this->cacheService->get($id);
        }

        $person = $this->nextHandler->getById($id);
        $this->cacheService->put($id, $person);
        return $person;
    }
}

Well: indeed. Now it's basically a decorator: you've decorated a DB call with some caching. Contrast it with this from my earlier article on the Decorator Pattern:

class CachedUserRepository implements UserRepositoryInterface {

    private $repository;
    private $cacheService;

    public function __construct(UserRepositoryInterface $repository, CacheServiceInterface $cacheService) {
        $this->repository = $repository;
        $this->cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this->cacheService->isCached($id)) {
            return $this->cacheService->get($id);
        }
        $object = $this->repository->getById($id);

        $this->cacheService->put($id, $object);
        return $object;
    }

}

You've arrived at the same conclusion I started with ;-)

I did try one other tactic. Instead of passing just the inputs to the agents, I had a look at passing the inputs and the outputs through the chain, and also always passing on to the next agent, irrespective of whether a given handler arrived at the result. This was very proof-of-concept-y. It's possibly a legit approach if the Chain of Responsibility was one in that a given agent was only supposed to fulfil part of the solution, relying on all links in the chain to arrive at the complete result. This is not what I'm doing here, but decided to leverage this approach to get the caching working "properly".


Intermission

Go have a pee or get some more popcorn or something.


Now I have an interface for a Handler:

interface Handler {

    public function perform($request, $response);

}


This just reflects a more generic solution than getById. More importantly note that unlike getById, perform takes a request and a response. For our examples the "request" would be the ID. And the response would be any result a given agent was able to come up with.

This is implemented in an abstract way by PersonRetrievalHandler:

abstract class PersonRetrievalHandler implements Handler {

    protected $nextHandler;
    protected $logger;

    public function __construct() {
        $this->nextHandler = $this->getPassThroughHandler();
    }

    public function setNextHandler(PersonRetrievalHandler $nextHandler) {
        $this->nextHandler = $nextHandler;
    }

    private function getPassThroughHandler(){
        return new class  {
            function perform($_, $response){
                return $response;
            }
        };
    }
}

It's still down to the individual handlers to implement the perform method though.

I've also refactored that class literal out into a helper method; and its perform method now returns whatever response it was passed, rather than returning null.

The perform method in CachedPersonHandler is thus:

public function perform($request, $response=null) {
    $id = $request;
    if ($this->cacheService->exists($id)) {
        $response = $this->cacheService->get($id);
    }

    return $this->nextHandler->perform($request, $response);
}


Note that it doesn't return the result itself, it just passes it on to the next agent. Otherwise it's doing the same thing.

And the DatabasePersonHandler has also been slightly refactored:

public function perform($request, $response=null) {
    if (is_null($response)){
        $id = $request;
        $response = $this->repository->getById($id);
    }

    return $this->nextHandler->perform($request, $response);
}


It infers that the cache didn't find anything by it not receiving anything in the response, and only if so does it hit the DB. Otherwise it just passes everything through to the next agent.

Now we have a third handler, CacheablePersonHandler:

class CacheablePersonHandler extends PersonRetrievalHandler {

    private $cacheService;

    public function __construct(CacheService $cacheService){
        parent::__construct();
        $this->cacheService = $cacheService;
    }

    public function perform($request, $response=null) {
        if (!is_null($response)) {
            $id = $request;
            $this->cacheService->put($id, $response);
        }

        return $this->nextHandler->perform($request, $response);
    }
}


This one makes no pretence about finding data, it just takes any received response and caches it, keyed on the request. And passes the response on. At this point remember the default handler will just return whatever response it receives.

We wire all this together in the factory again:

public static function getPersonService()
{
    $loggingService = new LoggingService();
    $cacheService = new CacheService($loggingService);
    $personRepository = new PersonRepository($loggingService);

    $cachedHandler = new CachedPersonHandler($cacheService);
    $databaseHandler = new DatabasePersonHandler($personRepository);
    $cacheableHandler = new CacheablePersonHandler($cacheService);

    $databaseHandler->setNextHandler($cacheableHandler);
    $cachedHandler->setNextHandler($databaseHandler);

    $personService = new PersonService($cachedHandler);

    return $personService;
}


This differs only in that it also creates the CacheablePersonHandler too, and stick it at the end of the chain.

Here's our revised test:

$personService = PersonServiceFactory::getPersonService();

foreach ([1,1,4,5,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}


(it's the same, just without the tautological repeated duplication of the same stuff again).

And the result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTest.php
Cache miss for 1
Database hit for 1
Cache put for 1
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache hit for 1
Cache put for 1
object(me\adamcameron\cor\model\Person)#11 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache miss for 4
Database hit for 4
Cache put for 4
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  string(5) "Whina"
  ["lastName"]=>
  string(3) "Wha"
}
Cache miss for 5
Database miss for 5
object(me\adamcameron\cor\model\Person)#11 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}
Cache miss for 5
Database miss for 5
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

So it "works", in that DB hits are now going into the cache.

But it also re-caches cache hits too :-(

Sigh. I could work around this by doing an exists call in the latter handler too:

public function perform($request, $response=null) {
    if (!is_null($response)) {
        $id = $request;
        if (!$this->cacheService->exists($id)) {
            $this->cacheService->put($id, $response);
        }
    }

    return $this->nextHandler->perform($request, $response);
}


But it all seems a bit clumsy to me. And when something seems clumsy to me, I start wondering if this square peg is supposed to be going in this round hole. And my conclusion is generally: "no".

The problem here goes back to the "cache the data" part of this chain doesn't belong in the chain. It's not a congruent task compared to the data-fetch tasks. The "responsibility" in the "Chain of Responsibility" is that the responsibility is "fulfilling the task". The caching agent makes no attempt to fulfil the requirement of the chain, so it doesn't belong in the chain to me.

So where does it belong? Well it belongs as near as possible to when we know the database was hit, and we have the result. So we could put it in the database agent. But then we have this imbalance that the cache-look-up is part of the chain, and the cache-put is within part of the chain. I don't like that asymmetry. Also it means the DB agent is doing two things: getting data and caching it, which seems like it's doing too much to me: it needs to know how to do two different things. We could go down the route of having like a CachedPersonRepository, and that at least removes those two responsibilities from the agent, but it does just move them: now the repo needs to know about the DB and the caching.

What I'd end up doing is using a caching decorator around a DB-only repository. That's symmetrical, and all concerns are separated.

Hmmm.

Second Intermission

Go find a suitable blade so you can open a vein and end it all instead of having to read on… (and on…)


Oh, as a footnote to this another requirement I wanted to prove is that the chain links behaved "sensibly" if called without their adjacent agents. IE: the DatabasePersonAgent agent didn't need the CachedPersonAgent, nor did it need the CacheablePersonAgent.

Here's some examples:

Just the database handler:

$loggingService = new LoggingService();
$personRepository = new PersonRepository($loggingService);

$databaseHandler = new DatabasePersonHandler($personRepository);

$personService = new PersonService($databaseHandler);

foreach ([1,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}


Result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTestWithJustDatabaseHandler.php
Database hit for 1
object(me\adamcameron\cor\model\Person)#7 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Database miss for 5
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

Yup: it's getting data from the DB, and doesn't require the other agents.

Now one with just the CachedPersonHandler:

$loggingService = new LoggingService();
$cachingService = new CacheService($loggingService);

$cachedHandler = new CachedPersonHandler($cachingService);

$personService = new PersonService($cachedHandler);

$cachingService->put(5, ['firstName'=>'Ria', 'lastName'=>'Rima']);

foreach ([1,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}

Result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTestWithJustCachedHandler.php
Cache put for 5
Cache miss for 1
object(me\adamcameron\cor\model\Person)#7 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}
Cache hit for 5
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  string(3) "Ria"
  ["lastName"]=>
  string(4) "Rima"
}

C:\src\php\php.local\src\chainOfResponsibility\public>

Here I'm priming the cache manually, just to make sure it'll find the cached item OK. And it does.

For the sake of completeness I also test with just the cache-put agent. It won't do anything useless (a warning flag in itself), but it should "work":

$loggingService = new LoggingService();
$cachingService = new CacheService($loggingService);

$cacheableHandler = new CacheablePersonHandler($cachingService);

$personService = new PersonService($cacheableHandler);

foreach ([1,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}


Result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTestWithJustCacheableHandler.php
object(me\adamcameron\cor\model\Person)#7 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

This is all unhelpful to the stated goal, but it's doing what it's told and is self-contained. "Win".

In conclusion I think this has helped me get a handle on the Chain of Responsibility pattern, which is good. I still want to have a look at chains wherein the agents achieve partial fulfilment, and possibly some recursion as well. That'd be cool.

But I really don't think this is the right solution to the job at hand at work.

I'll leave you be now. Oh... as per usual: all the code is on Github, including the working application. The first iteration of the work is tagged separately.

Righto.

--
Adam

Friday 30 December 2016

PHP: using SoapClient to consume a SOAP web service

G'day:
Yeah, OK, just don't ask why I'm needing to consume SOAP web services. It's Adobe's fault. Not because of ColdFusion for a change, but because an app they offer which only exposes its API via SOAP. Because it's still 2006 at Adobe, it seems.

For the last few weeks, this has been running through my head:



But anyway.

Fighting with the Adobe web service was a right pain in the arse, but I won't go into that - it's not useful info to share - but I did have to monkey around with how PHP consumes SOAP web services, and piecing it together took a chunk of googling for me to get my brain around it, so I thought I'd write it up here.

I hasten to add that most of the hassle getting it to work was because of the shitty Adobe web service being unhelpful, so it's perhaps not a challenge for most other people.

I cannot deal with the Adobe web service in this article as it's subscription-only, and it's kinda work-related stuff I'm not allowed to expose here. So I knocked together my own web service to consume. I used CFML for this as all it takes to create a SOAP web service in CFML is to give a method an access qualifier of remote, and have it web-accessible. ColdFusion handles everything else.

Here's the web service code (I'll keep the CFML code to a minimum, but it's all on GitHub anyhow):

import me.adamcameron.accounts.*;

component wsversion=1 {

    remote Invoice function getById(numeric id) returnformat="wddx" {
        var address = new Address(1, "London", "United Kingdom", "E18");

        var account = new PersonalAccount(2, "Adam", "Cameron", "1970-02-17", address);

        var penguin = new Product(3, "Penguin", 4.56);
        var pangolin = new Product(7, "Pangolin", 8.90);
        var platypus = new Product(11, "Playtpus", 12.13);

        var lines = [
            new InvoiceLine(14, penguin, 15, 16.17),
            new InvoiceLine(18, pangolin, 19, 20.21),
            new InvoiceLine(22, platypus, 23, 24.25)
        ];

        invoice = new Invoice(id, account, lines);

        return invoice;        
    }

}

Note that I should not have to state the wsversion there, but for some reason I could not get the default - version 2 - to work, so I had to force it to be version 1. I didn't look into why it didn't work on version 2, as I really didn't want to spend more time than necessary on the CFML code for this exercise.

Other than that, this code is straight forward: it returns an Invoice object which comprises a Person (with an Address), and an array of InvoiceLines each of which have a Product. the modelling for those is the same as in the receiving PHP code, further down.

On the PHP side of things, I just call this code to consume it:

<?php

namespace me\adamcameron\accounts;

require __DIR__ . '/model.php';

$wsdl = "http://localhost:8516/cfml/webservices/soap/accounts/public/Invoices.cfc?wsdl";


$options = [
    'trace' => true
];

$client = new \SoapClient($wsdl, $options);

$invoice = $client->getById(2011);

var_dump($invoice);
echo "==============================" . PHP_EOL . PHP_EOL;
var_dump($client->__getLastResponse());

So it's simply a matter of creating a SoapClient with the WSDL URL and some options. I'm using the trace option here so I can do that call to __getLastResponse.

Let's have a look at the raw SOAP response first (warning: it's pretty turgid reading):

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soapenv:Body>
        <ns1:getByIdResponse xmlns:ns1="http://public.accounts.soap.webservices.cfml" soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
            <getByIdReturn xmlns:ns2="http://accounts.adamcameron.me" xsi:type="ns2:Invoice">
                <account xsi:type="ns2:PersonalAccount">
                    <address xsi:type="ns2:Address">
                        <country xsi:type="xsd:string">United Kingdom</country>
                        <id xsi:type="xsd:double">1.0</id>
                        <localPart xsi:type="xsd:string">London</localPart>
                        <postcode xsi:type="xsd:string">E18</postcode>
                    </address>
                    <dateOfBirth xsi:type="xsd:string">1970-02-17</dateOfBirth>
                    <firstName xsi:type="xsd:string">Adam</firstName>
                    <id xsi:type="xsd:double">2.0</id>
                    <lastName xsi:type="xsd:string">Cameron</lastName>
                </account>
                <id xsi:type="xsd:double">2011.0</id>
                <items xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" soapenc:arrayType="xsd:anyType[3]" xsi:type="soapenc:Array">
                    <items xsi:type="ns2:InvoiceLine">
                        <count xsi:type="xsd:double">15.0</count>
                        <id xsi:type="xsd:double">14.0</id>
                        <price xsi:type="xsd:double">16.17</price>
                        <product xsi:type="ns2:Product">
                            <description xsi:type="xsd:string">Penguin</description>
                            <id xsi:type="xsd:double">3.0</id>
                            <rrp xsi:type="xsd:double">4.56</rrp>
                        </product>
                    </items>
                    <items xsi:type="ns2:InvoiceLine">
                        <count xsi:type="xsd:double">19.0</count>
                        <id xsi:type="xsd:double">18.0</id>
                        <price xsi:type="xsd:double">20.21</price>
                        <product xsi:type="ns2:Product">
                            <description xsi:type="xsd:string">Pangolin</description>
                            <id xsi:type="xsd:double">7.0</id>
                            <rrp xsi:type="xsd:double">8.9</rrp>
                        </product>
                    </items>
                    <items xsi:type="ns2:InvoiceLine">
                        <count xsi:type="xsd:double">23.0</count>
                        <id xsi:type="xsd:double">22.0</id>
                        <price xsi:type="xsd:double">24.25</price>
                        <product xsi:type="ns2:Product">
                            <description xsi:type="xsd:string">Playtpus</description>
                            <id xsi:type="xsd:double">11.0</id>
                            <rrp xsi:type="xsd:double">12.13</rrp>
                        </product>
                    </items>
                </items>
            </getByIdReturn>
        </ns1:getByIdResponse>
    </soapenv:Body>
</soapenv:Envelope>

Wot a bloody mouthful!

PHP takes all that in its stride, and rehydrates that into an object:

object(stdClass)#2 (3) {
    ["account"]=>
    object(stdClass)#3 (5) {
        ["address"]=>
        object(stdClass)#4 (4) {
            ["country"]=>
            string(14) "United Kingdom"
            ["id"]=>
            float(1)
            ["localPart"]=>
            string(6) "London"
            ["postcode"]=>
            string(3) "E18"
        }
        ["dateOfBirth"]=>
        string(10) "1970-02-17"
        ["firstName"]=>
        string(4) "Adam"
        ["id"]=>
        float(2)
        ["lastName"]=>
        string(7) "Cameron"
    }
    ["id"]=>
    float(2011)
    ["items"]=>
    array(3) {
        [0]=>
        object(stdClass)#5 (4) {
            ["count"]=>
            float(15)
            ["id"]=>
            float(14)
            ["price"]=>
            float(16.17)
            ["product"]=>
            object(stdClass)#6 (3) {
                ["description"]=>
                string(7) "Penguin"
                ["id"]=>
                float(3)
                ["rrp"]=>
                float(4.56)
            }
        }
        [1]=>
        object(stdClass)#7 (4) {
            ["count"]=>
            float(19)
            ["id"]=>
            float(18)
            ["price"]=>
            float(20.21)
            ["product"]=>
            object(stdClass)#8 (3) {
                ["description"]=>
                string(8) "Pangolin"
                ["id"]=>
                float(7)
                ["rrp"]=>
                float(8.9)
            }
        }
        [2]=>
        object(stdClass)#9 (4) {
            ["count"]=>
            float(23)
            ["id"]=>
            float(22)
            ["price"]=>
            float(24.25)
            ["product"]=>
            object(stdClass)#10 (3) {
                ["description"]=>
                string(8) "Playtpus"
                ["id"]=>
                float(11)
                ["rrp"]=>
                float(12.13)
            }
        }
    }
}

That's cool, but if we look at the SOAP response, we do have the object information in there too. Here's an extract:

xsi:type="ns2:Invoice"&gt;
    &lt;account xsi:type="ns2:PersonalAccount"&gt;
        &lt;address xsi:type="ns2:Address"&gt;
            &lt;country xsi:type="xsd:string"&gt;United Kingdom&lt;/country&gt;
            &lt;id xsi:type="xsd:double"&gt;1.0&lt;/id&gt;
            &lt;localPart xsi:type="xsd:string"&gt;London&lt;/localPart&gt;
            &lt;postcode xsi:type="xsd:string"&gt;E18&lt;/postcode&gt;
        &lt;/address&gt;
        &lt;dateOfBirth xsi:type="xsd:string"&gt;1970-02-17&lt;/dateOfBirth&gt;
        &lt;firstName xsi:type="xsd:string"&gt;Adam&lt;/firstName&gt;
        &lt;id xsi:type="xsd:double"&gt;2.0&lt;/id&gt;
        &lt;lastName xsi:type="xsd:string"&gt;Cameron&lt;/lastName&gt;
    &lt;/account&gt;
    &lt;id xsi:type="xsd:double"&gt;2011.0&lt;/id&gt;
    &lt;items xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" soapenc:arrayType="xsd:anyType[3]" xsi:type="soapenc:Array"&gt;
        &lt;items xsi:type="ns2:InvoiceLine"&gt;
            &lt;count xsi:type="xsd:double"&gt;15.0&lt;/count&gt;
            &lt;id xsi:type="xsd:double"&gt;14.0&lt;/id&gt;
            &lt;price xsi:type="xsd:double"&gt;16.17&lt;/price&gt;
            &lt;product xsi:type="ns2:Product"&gt;
                &lt;description xsi:type="xsd:string"&gt;Penguin&lt;/description&gt;
                &lt;id xsi:type="xsd:double"&gt;3.0&lt;/id&gt;
                &lt;rrp xsi:type="xsd:double"&gt;4.56&lt;/rrp&gt;
            &lt;/product&gt;
        &lt;/items&gt;
        &lt;items xsi:type="ns2:InvoiceLine"&gt;
            &lt;!-- etc &gt;


Hidden away in there we see there's an Invoice, PersonalAccount, Address, InvoiceLines and Products.

We can leverage this information to point PHP at classes to use when rehydrating the response, which is cool. We give the SoapClient a type map:

$options = [
    'trace' => true,
    'typemap' => [[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Address',
        'from_xml' => ['\me\adamcameron\accounts\Address', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Account',
        'from_xml' => ['\me\adamcameron\accounts\Account', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Product',
        'from_xml' => ['\me\adamcameron\accounts\Product', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'InvoiceLine',
        'from_xml' => ['\me\adamcameron\accounts\InvoiceLine', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Invoice',
        'from_xml' => ['\me\adamcameron\accounts\Invoice', 'createFromXml']
    ]]
];

$client = new \SoapClient($wsdl, $options);

This maps the SOAP type to a PHP class, and method to use to rehydrate the object.

Running the code again with the typemap yields a fully-modelled result:

object(me\adamcameron\accounts\Invoice)#3 (3) {
    // ...
    ["account"]=>
    object(me\adamcameron\accounts\Account)#5 (5) {
        // ...
        ["address"]=>
        object(me\adamcameron\accounts\Address)#9 (5) {
            // ...
        }
    }
    ["items"]=>
    array(3) {
        [0]=>
        object(me\adamcameron\accounts\InvoiceLine)#13 (4) {
            // ...
            ["product"]=>
            object(me\adamcameron\accounts\Product)#15 (3) {
                // ...
            }
            // ...
        }
        [1]=>
        object(me\adamcameron\accounts\InvoiceLine)#14 (4) {
            // ...
            ["product"]=>
            object(me\adamcameron\accounts\Product)#17 (3) {
                // ...
            }
            // ...
        }
        [2]=>
        object(me\adamcameron\accounts\InvoiceLine)#16 (4) {
            // ...

            ["product"]=>
            object(me\adamcameron\accounts\Product)#19 (3) {
                // ...
            }
            // ...
        }
    }
}

(I've elided the irrelevant stuff from that one).

So that's pretty handy.

I do wish I could provide the typemap with each call though if I wanted to, not just when creating the SoapClient object. Oh well: it was no huge hardship.

There's no trick on the class side of things at all. Here's the InvoiceLine class:

<?php

namespace me\adamcameron\accounts;

class InvoiceLine {

    public $id;
    public $product;
    public $count;
    public $price;
        
    function __construct(int $id, Product $product, int $count, float $price){
        $this->id = $id;
        $this->product = $product;
        $this->count = $count;
        $this->price = $price;
    }
    
    static function createFromXml ($xml){
        $sxe = is_string($xml) ? new \SimpleXMLElement($xml) : $xml;
        $obj = new InvoiceLine(
            (int)$sxe->id,
            Product::createFromXml($sxe->product),
            (int)$sxe->count,
            (float)$sxe->price
        );
        return $obj;            
    }
}


OK the one trick is that the SoapClient only passes the "top level" object's XML through to the top level method. So in this case SoapClient calls InvoiceLine->createFromXml as per the type map, but from there it's up to me to do the same for any other objects this one uses, for example a Product in this case.

Oh and the other trick is that because of this, I need that XML / string differentiator at the beginning of the method. SoapClient passes in a string, but once I've converted that to XML, that's what the other calls will receive (ie: Product::createFromXml receives XML, not a string). That took me a while to nut-out, but that's just me being a div.

Oh yeah, and don't forget to explicitly cast the values going into yer properties, otherwise you'll end up with XML in there, not the actual values. That threw me for a while as well :-/

All the rest of the PHP code for this is on GitHub too.

The problem I had with the Adobe web service is that it didn't bother including the type attributes in its responses, other than on the container element (so in my example the array of InvoiceItems would have had a type on it). This means I had to pass the entire collection container into a handler method which then read the XML tag names and hand-cranked a switch based on that as to what rehydration method to use. I'm gonna try to contrive a non-business-sensitive version of that to show how I dealt with it.

Now... I've received orders from a mate that I'm expected at the pub in an hour. So I had better get cracking. Sorry to make you think about SOAP. And Adobe.

Righto.

--
Adam