Showing posts with label PHP. Show all posts
Showing posts with label PHP. Show all posts

Wednesday, 8 February 2017

PHP: more array dereferencing weirdness

G'day:
(Firstly: I hasten to add that the "weirdness" is "weird to me". It might be completely predictable to everyone else in the PHP world).

This is a follow-on from my earlier article "PHP: accessing an undefined array element yields a notice. Or does it? WTF, PHP?". Now before I start: Kalle Sommer Nielsen has come back to me in a comment on that article and mentions it's expected behaviour, and explained it... but I don't quite follow yet. I'm hoping some ensuing experimentation will cast some light on the scene.

My next variation of this has me scratching my head more. Well: it's demonstrating the same thing, but perhaps more clearly.

<?php

echo PHP_EOL . "Test with array" . PHP_EOL;

$a = [1, 2, 3];
var_dump($a[0]);
var_dump($a[0]['id']);
var_dump($a[0][0]['id']);
var_dump($a[0]['id'][0]);

echo PHP_EOL . "Test with int" . PHP_EOL;
$i = 1;
var_dump($i['id']);
var_dump($i[0]['id']);
var_dump($i['id'][0]);

echo PHP_EOL;

echo PHP_EOL . "Test with string" . PHP_EOL;
$s = '1';
var_dump($s['id']);
var_dump($s[0]['id']);
var_dump($s['id'][0]);

This yields:


C:\temp>php test.php

Test with array
C:\temp\test.php:6:
int(1)
C:\temp\test.php:7:
NULL
C:\temp\test.php:8:
NULL
C:\temp\test.php:9:
NULL

Test with int
C:\temp\test.php:13:
NULL
C:\temp\test.php:14:
NULL
C:\temp\test.php:15:
NULL


Test with string
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 21
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 21

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:21:
string(1) "1"
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 22
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 22

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:22:
string(1) "1"
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 23
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 23

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:23:
string(1) "1"

C:\temp>

OK, so here I factor-out the top-level array from the mix. That test was always referring to the first element of the array, which is just the integer 1, so we might as well just test with that. Needless to say, there's the same results as with the first array element. But I still don't get it. What is using array notation to dereference an integer supposedly doing? It's a non-sensical operation, as far as I understand things. And if it is a nonsensical operation: it should error.

For completeness I also try a variation with a string "1". This does error like I'd expect. Well: it gives a warning and then returns an inappropriate (IMO) result anyhow. This is probably PHP doing it's combo of telling me I did it wrong but then going ahead and guessing what I meant anyhow. Grrr.

I'm gonna mess around with de-referencing integers some more when I get a moment, and try to work out WTF.

Righto.

--
Adam

Monday, 30 January 2017

PHP: accessing an undefined array element yields a notice. Or does it? WTF, PHP?

G'day:
So we encountered this today:

<?php
$array = [1, 2, 3];

var_dump($array['id']); // this yields a warning
var_dump($array[0]['id']); // none of these do
var_dump($array[0][0]['id']);
var_dump($array[0]['id'][0]);

As indicated, this yields results:

C:\temp>php shite.php
PHP Notice:  Undefined index: id in C:\temp\shite.php on line 4
PHP Stack trace:
PHP   1. {main}() C:\temp\shite.php:0

Notice: Undefined index: id in C:\temp\shite.php on line 4

Call Stack:
    0.0002     353472   1. {main}() C:\temp\shite.php:0

C:\temp\shite.php:4:
NULL
C:\temp\shite.php:5:
NULL
C:\temp\shite.php:6:
NULL
C:\temp\shite.php:7:
NULL

C:\temp>

I think this behaviour is bloody daft. Accessing something that doesn't exist should be more than a notice, and it shouldn't then go ahead and return "null" anyhow. And giving a notice and returning null is possibly the worst way of combining possible actions here. What am I supposed to do with that? Other than bury my head in my hands a weep.

Now this is partially documented on the Arrays page of the docs:

Note:
Attempting to access an array key which has not been defined is the same as accessing any other undefined variable: an E_NOTICE-level error message will be issued, and the result will be NULL.
Note:
Array dereferencing a scalar value which is not a string silently yields NULL, i.e. without issuing an error message.

That's shit, but... well: it's not that surprising (I mean... it's PHP, right?). But how come

$array['id']

incurs the warning, but this doesn't:

$array[0]['id']

This just seems to make fairly unhelpful behaviour just that much less helpful.

I'm sure there's a reason for this. Not a good one, I hasten to add, but no doubt there's a reason.

Anyone know what it is?

Update:

Thanks to Dave (comment below) for pointing out there's a bug been raised for this: 68110.


Righto.

--
Adam

Saturday, 14 January 2017

Incredibly: a reader asks me for help with PHP & Nginx

G'day:
Jesus fuck, you lot. Don't ask me shit about systems support / server application config / all that godawful shit that should be consigned to the Systems Support Team (sorry to my mates in this role, but I fucking hate it, and became a dev so I didn't have to do it any more).

Right so ages ago I wrote an article "PHP: getting PHP 5 and PHP 7 running side by side on the same machine (a better way)", and someone recently asked me how do do the same on Nginx. Well it's "slow news night" here in my life: I'm just at the pub in Galway passing time by getting pissed on Guinness (I'm on me fifth pint, and even I can tell the writing here is reflecting that) and writing blog articles, so I had a look at it. It's pretty easy, as it turns out.

Firstly: I am no expert on Nginx. I don't like web servers. I try not to ever have to use them. I know enough about IIS to know it's a pain in the arse but I can do what I need if I have to; and I know enough about Apache to get things working. I don't even know why Nginx exists. I'm guessing the reason is "to annoy Adam, cos it's just one more bloody thing he'll need to know about at some point". Sigh. I got PHP working on Nginx once before ("PHP: getting my dev environment running on Nginx instead of Apache"), and other than that have converted some rewrite rules from Apache to Nginx (man: does Nginx suck compared to Apache for those!).

This is a different laptop from the one I did that other exercise on, so whilst I had Nginx on here (for the rewrite exercise, which was work-related and this is my work laptop), I decided to start from scratch. I deleted what I had installed (where for Nginx "installed" means "unzipped").

Here's what I did:
  1. grabbed the latest Windows Nginx download from their site. I just googled "nginx windows download" to find that.
  2. Unzipped it. Relocated it to my apps dir: c:\apps\nginx
  3. Stopped Apache (which listens on port 80), and instead started Nginx (just run nginx.exe from the dir above). By default it listens on localhost and port 80.
  4. Browse to http://localhost/ and verified the Nginx default index.html page was served. OK, so it works as a baseline. Always test the baseline before proceeding with any customisations of things.
  5. Set-up a coupla test hosts in my hosts file (C:\windows\system32\drivers\etc\hosts... make sure to start yer text editor as an admin, otherwise you won't be able to save it):
    127.0.0.1 php5.nginx.local
    127.0.0.1 php7.nginx.local

    I foresaw that to run both PHP5 and PHP7, Nginx is gonna need to differentiate between which one wants to use, and using different host names seemed to make sense and be easy.
  6. I edited my nginx.conf file (in the conf subdir of the one above) to know about these two hosts:
    http {
        # [...]
    
        server {
            listen       8800;
            server_name  php5.nginx.local;
            root   c:/src/php/php.local/www;
    
            # [...]
        }
    
        server {
            listen       8800;
            server_name  php7.nginx.local;
            root   c:/src/php/php.local/www;
    
            # [...]
        }
    }
    

    Where I've elided stuff, it's the same as it was before. The chief considerations here are:
    • having a server for both PHP5 and PHP7;
    • having them listen on each of those two new hosts I set up;
    • setting the root to be where my PHP code is. This is the same for both in this case, as I want to serve exactly the same code via both 5 and 7;
    • oh and I'm listening on port 8800 as I don't want Nginx to interfere with my normal Apache install (I'll be sticking with Apache after this experiment, thanks).
    Note that this config will still not serve PHP, but it'll at least run.
  7. When one runs Nginx from the console it hogs the prompt, so to stop it one needs to run another console and call nginx -s stop. We need to do this to test the config changes. So I did that, and used the other console to start it again.
  8. I browse to each of http://php5.nginx.local:8800 and http://php7.nginx.local:8800 to test they were working. They were running, but giving a 403 cos I didn't have an index.html in that directory, nor did I have directory browsing switched on (which for my test code I do, as it makes finding stuff easier).
  9. I told Nginx to allow directory browsing for each of the server configs:
    location / {
        index  index.html index.htm;
        autoindex on;
    }
    

    Do not do this in production. Well: don't do anything I say in production.
  10. I cycled Nginx again and tested both hosts:

    Index of /


    ../
    code-coverage-reports/                             15-Nov-2016 13:33                   -
    community/                                         30-Jun-2016 11:47                   -
    experiment/                                        15-Nov-2016 08:47                   -
    library/                                           30-Jun-2016 11:47                   -
    stackoverflow/                                     30-Jun-2016 11:47                   -
    deleteme.php                                       30-Jun-2016 11:47                 233
    gdayWorld.html                                     30-Jun-2016 11:47                  11
    gdayWorld.php                                      30-Jun-2016 11:47                  50
    phpinfo.php                                        30-Jun-2016 11:47                  21
    utf8.html                                          21-Dec-2016 08:21                 133
    


    So that's all good except for me not having deleted that file that perhaps I meant to, a while back ;-)
  11. Next I just followed the instructions from me other blog article, getting PHP to listen out to traffic coming in from Nginx (the code below is in a batch file):
    start C:\bin\RunHiddenConsole.exe C:\apps\php\5\5\php-cgi.exe -b 127.0.0.1:8550 start C:\bin\RunHiddenConsole.exe C:\apps\php\7\1\php-cgi.exe -b 127.0.0.1:8710
    Note that each of them is listening on a different port.
  12. And then tell Nginx to pass PHP requests across to PHP:
    server {
        # [...]
        location ~ \.php$ {
            fastcgi_pass   127.0.0.1:8710;
            fastcgi_index  index.php;
            fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
            include        fastcgi_params;
        }
        # [...]
    }
    

    That's the one for PHP7, but the PHP5 one is the same except for the port.
  13. I restart Nginx again, and this time hit phpinfo.php on each host:
    {i'd show you proof, but BlogSpot is refusing to let me put two images inline in an <ol> list, it seems. Trust me... it works).
  14. Hurrah!
  15. The last thing I tried to do is to wrap my batch file that starts Nginx into a service, using RunAsService.exe, but this is my work laptop and it's locked-down too tight for me to run that. But... well... the batch file works.
So, anyway... that's it. That's what one needs to do to get Nginx serving two different versions of PHP. My will to live has been sapped by even having to think about this sort of shit, so I'm going back to my Guinness (two pint further ahead than I was when I started this exercise).

Sorry this one is a bit scrappy but... well... I'm drunk.

Righto.

--
Adam

PHP: 2000 words on elseif (and PHPMD)

G'day:
This is another one that just lives up to the definitive notion of a blog: a web log; a log of what I'm doing. But, like... web-based. A blog. What I basically mean is that I'm not sure I'm showing any insight here (look: I basically know I'm not), but it's what I looked at today, so it's what I'm writing about today.

We're trying to lift our game when it comes to our code "Cleanliness" (in the RC Martin "Clean Code" sense), and we've started more rigorously using some code review tools recently, such as PHPCS and PHPMD. They're both pretty good, and I recommend them to anyone using PHP. Almost everyone probably already is: I'm always late to these games.

One PHPMD rule that tends to irk me is the one that declares "[one's code] uses an else expression. Else is never necessary and you can simplify the code to work without else". Now: generally this is true. One can usually refactor one's code to stick it in a helper method and early return in the if logic, then just the rest of the code after the if block is what happens in the else case. This is good because it helps one's code be "closer to the left-edge"; ie: it's not indented because it's not in a logic block any more. The more one needs to indent one's code: the more it probably smells. It's got to the stage with our code now that a double nesting of logic-control statements almost gets the same reaction from code reviewers as when they see a comment in our code; "hey, gather round everyone: Adam's made a comment. Let's look at it!" (Adam shrinks into the carpet and starts getting his excuses ready. Later he refactors the code to be clearer, and the comment goes; the code reviewers are happy). The problem for me is that I'm not clever enough sometimes, and it takes me a coupla takes before I see how to get rid of a wayward else. At least now I usually pick it up and work it out during my "self code review" phase, and sort it out. And PHPMD helps with this.

Today I was reviewing a colleague's code, and noted they had not an else but an elseif. And I triumphantly(*) pinged them with a comment reminding them to run PHPMD before submitting their pull request.

(*) because I'm petty.

It turns out that they had run PHPMD, and it hadn't pulled 'em up. Initially (again: cos I'm petty) I didn't believe them and knocked together my own SSCCE and looked at it.

This demonstrates the problem:

function takeAChance(){
    $firstChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true";
    } else {
        echo "firstChance was false";
    }
}

takeAChance();

And now running PHPMD over it:

C:\src\php\php.local\src\phpmd>phpmd else_block.php text phpmd.xml
C:\src\php\php.local\src\phpmd\else_block.php:8 The method takeAChance uses an else expression.
Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd>


OK, fair enough. Oh btw, the solution to this is to early return:

function takeAChance(){
    $firstChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true";
        return;
    }
    echo "firstChance was false";
}

(note: yeah yeah, in the real world I'd not be echoing from a function. I get that. Not the point here)

OK, now here's an SSCCE of the sort of thing we had in this pull request:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
    }elseif($secondChance){
        echo "well at least secondChance was true";
    }
}

And running PHPMD:

C:\src\php\php.local\src\phpmd>phpmd elseif_block.php text phpmd.xml

C:\src\php\php.local\src\phpmd>

No problems. Grrr. This does not make sense to me. If we add an actual else in there:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
    }elseif($secondChance){
        echo "well at least secondChance was true";
    }else{
        echo "both were false";
    }
}

And PHPMD says:
C:\src\php\php.local\src\phpmd>phpmd if_elseif_else_block.php text phpmd.xml
C:\src\php\php.local\src\phpmd\if_elseif_else_block.php:11
The method takeAChance uses an else expression.
Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd>


And, yeah, it's true... but the elseif seems to me to be as much an issue here as an else. We can short-circuit both easily enough with early returns:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
        return;
    }
    if($secondChance){
        echo "well at least secondChance was true";
        return;
    }
    echo "both were false";
}

Equally, to me a PHP elseif (note this is an actual command, it's not a concatenation of else if like one might expect), really is, functionally, a "single-statement else", just with the statement itself being an if statement. What I mean is remember that the if / elseif / else statement doesn't need to use blocks (ie: a bunch of statements within braces), it could be a single statement:

function takeAChance() {
    $firstChance = (bool) rand(0,1);

    if ($firstChance)
        echo "firstChance was true";
    else
        echo "firstChance was false";

}

(what an abomination: almost looks like Python. Bleah).

Anyway... so yeah, one doesn't need to use braces if there's only one statement to execute based on the result of the condition. Don't do this. But it's possible.

What I'm getting at is this:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}elseif($secondChance){
    echo "well at least secondChance was true";
}

Is basically like this:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}else{
    if($secondChance){
        echo "well at least secondChance was true";
    }
}

Just without the braces on the else:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}else
    if ($secondChance){
        echo "well at least secondChance was true";
    }

Compare this again to the elseif version:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}elseif ($secondChance){
    echo "well at least secondChance was true";
}

The only difference there is the whitespace between the else and the if. And logically there is no difference, as far as I can tell.

I tested all variations I could think of of elseifs, elses, braces, no braces, and the like (see the samples on GitHub), and ran PHPMD over them:

C:\src\php\php.local\src\phpmd\samples>phpmd . text phpmd.xml
C:\src\php\php.local\src\phpmd\samples\else_block.php:8 The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.
C:\src\php\php.local\src\phpmd\samples\else_if_block.php:9      The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.
C:\src\php\php.local\src\phpmd\samples\if_elseif_else_block.php:11      The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd\samples>

Hmmmm... it's not a case of else clauses being a problem it seems. It's a problem with else blocks being a problem.

It's complaining about this:

if ($firstChance) {
    echo "firstChance was true";
} else {
    echo "firstChance was false";
}


But not about this:

if ($firstChance) {
    echo "firstChance was true";
} else 
    echo "firstChance was false";
 


The only difference being the lack of braces on the second one.

Now I guess this makes an element of sense. The "untidiness" of large logic blocks is that they are more of a pain to navigate across when maintaining the code. With an if/else statement one needs to pay attention to where the if is, where the else is, which code is in which clause, and where the whole thing ends. This is not a problem if one keeps the code clean so each block has only 1-2 statements in it, but we've all seen if/elses that go on for miles and miles worth of scrolling down (and, it's worth pointing out: that is shit code if it's written like that. Don't do it). It's exacerbated by further nested logic within that. Fair enough.

But with a non-block else case - which intrinsically can only be one statement - the problem of unclean block sizes can simply not occur. Also fair enough.

But in that case, the warning should not be able else, intrinsically; it should be about the size of the blocks. And the size rule should probably apply to the if block too! If a single-statement else is OK, then a single-statement else block should be OK too.

I think the solution to this is that any else clause should be flagged: be it a single statement, or be it a block. So intrinsically an elseif should be flagged too. I'll raise this with the PHPMD bods.

Whilst on this: they should probably swing the other way and actually flag any instances of single-statement logic clauses: ifs and loops should always have braces. That's a widely accepted truism in curly-brace-language programming. I realise PHPCS does pick up on these (I only realise it just now cos I just tested it), but if PHPMD is busying itself with this sort of thing too; it should do a thorough job. In my view brace-less control logic is mess.


Lastly on this, I'm slightly irked by this, from the PHPMD Clean Code Rules page:

ElseExpression

[...]

An if expression with an else branch is never necessary.

It is terribly pedantic of me, but… it's not an expression. PHP flow control statements are…statements, not expressions. An expression is a specific thing: it is some code that represents a value. In PHP if/else statement does not represent a value. One cannot do this:
$message = if ($firstChance) {
    "firstChance was true";
} else {
    "firstChance was false";
}

If you cannot have some code on the right-hand side of an assignment (or anywhere else a value might be expected), then it's not an expression. A counterpoint to this, one can do this in Ruby:

message = if firstChance
    "firstChance was true";
else
    "firstChance was false";
end

puts message

This is because "everything is an object" in Ruby, I guess: even statements. In this case it seems the value of the if/else statement is that of the last statement in the relevant clause that gets executed by the logic. I'm buggered if I know whether Ruby best practices would encourage that (and by that I mean: I really actually don't know!), but I dunno that I'd be doing that sort of thing. Still: it demonstrates an if statement that is also an expression. Which a PHP one is not.

This sort of thing wouldn't bug me if it was in a casual conversation, but in an official implementation, I think it's a bit… ah… well let's saynit could just be done properly. But then again I guess this is PHP. And "doing it right" doesn't seem to be something PHP concerns itself with too much. It seems to be more "just get it done".

All this aside, I think PHPMD is a bloody handy tool, and whether or not I agree with the minutiae of some of its analysis or its turn of phrase is minor. It's helped my code a lot in the last coupla weeks. And I think our code in general is better of for using tools like this.

So… yeah. That was what I was doing yesterday (it was Fri when I started, it's now Sat evening and I'm in the pub). not scintillating stuff, but the examination of what was going on and the general experimentation was useful (to me, anyhow), I think. And it's enabled me to write a 2000-word blog article which is basically about elseif expressionsclauses.

Righto.

--
Adam

Saturday, 7 January 2017

PHP: looking into usage of the ::class constant

G'day:
There's not much to this article. It's arisen from the fact there's a paucity of docs on this subject on the PHP website, and they're quite hard to google for (given Google still insists on ignoring punctuation in searches, for some daftly ignorant reason. So searching for "php ::class" just gets one results for "php class". Not the same thing. Similarly seaching for "php ::class constant" doesn't help.

For the record, the two relevant docs pages are:


Update:

Kalle Sommer Nielsen has been in touch via Twitter and has pointed me to the original RFC for ::class, too: Request for Comments: Class Name Resolution As Scalar Via "class" Keyword. Now to be honest, the content of that RFC would make for better documentation than the current docs furnish. Obviously it'd need a slight tweak to reword some of the questions as statements, and if there were any changes between RFC and what was adopted, those should be included too. Kalle also said he's tweaked the docs slightly too, which is cool. Nice work fella!

I include those there as much for me to be able to easily re-find them as anything else.

In truth there's not much to this ::class construct. I just wish the docs were easier to locate. When discussing the feature, it's nice to be able to point to some docs. Or now: a blog article ;-)

So what does it do?

All classes have a built-in constant (::class) that contains a string that is the class's name. Same as what the get_class function will return for an object, but it works on the actual class too (because it's a class constant). Here are some examples:

<?php

namespace me\adamcameron\cc;

use com\example\other\SomeClassToAlias as AliasedClass;
use com\example\other\SomeClassToAlias;
use com\example\other\SomeOtherClass;

require_once realpath(__DIR__ . '/../vendor/autoload.php');

echo "Using ::class" . PHP_EOL;
printf("SomeClass: %s%s", SomeClass::class, PHP_EOL);
printf("SomeOtherClass: %s%s", SomeOtherClass::class, PHP_EOL);
printf("SomeClassToAlias as AliasedClass: %s%s", AliasedClass::class, PHP_EOL);
printf("SomeClassToAlias: %s%s", SomeClassToAlias::class, PHP_EOL);
printf("PHPUnit_Framework_Exception: %s%s", \PHPUnit_Framework_Exception::class, PHP_EOL);

$someClass = new SomeClass();
$someOtherClass = new SomeOtherClass();
$someAliasedClass = new AliasedClass();
$someClassToAlias = new SomeClassToAlias();
$phpunitFrameworkException = new \PHPUnit_Framework_Exception();

echo PHP_EOL . "Using get_class() on instance" . PHP_EOL;
printf("SomeClass: %s%s", get_class($someClass), PHP_EOL);
printf("SomeOtherClass: %s%s", get_class($someOtherClass), PHP_EOL);
printf("AliasedClass: %s%s", get_class($someAliasedClass), PHP_EOL);
printf("SomeClassToAlias: %s%s", get_class($someClassToAlias), PHP_EOL);
printf("PHPUnit_Framework_Exception: %s%s", get_class($phpunitFrameworkException), PHP_EOL);

In the example all the classes in the me\adamcameron or com\example namespaces are just empty classes, eg:

<?php

namespace me\adamcameron\cc;

class SomeClass {}

I've also included a PHPUnit class there as PHPUnit is a lib I'm aware of that doesn't use namespaces (dunno why), and wanted to see how that behaved, in case there were quirks: seemingly not.

Here's the output:

C:\src\php\php.local\src\oo\classConstant\src>php testWithNamespace.php
Using ::class
SomeClass: me\adamcameron\cc\SomeClass
SomeOtherClass: com\example\other\SomeOtherClass
SomeClassToAlias as AliasedClass: com\example\other\SomeClassToAlias
SomeClassToAlias: com\example\other\SomeClassToAlias
PHPUnit_Framework_Exception: PHPUnit_Framework_Exception

Using get_class() on instance
SomeClass: me\adamcameron\cc\SomeClass
SomeOtherClass: com\example\other\SomeOtherClass
AliasedClass: com\example\other\SomeClassToAlias
SomeClassToAlias: com\example\other\SomeClassToAlias
PHPUnit_Framework_Exception: PHPUnit_Framework_Exception

C:\src\php\php.local\src\oo\classConstant\src>

So you see there are no surprises really: ::class just returns the fully-qualified path of the class concerned. One thing to not is that it does not pay any attention to aliasing, whether on the class itself or on an object of that aliased class. I guess an alias is just for clarity in source code, rather than any sort of "renaming" exercise.

Where does one use ::class constants?

Well my primary usage is when mocking, using PHPUnit. Instead of this:

$mockedThing = $this->getMockBuilder('path\to\class\being\mocked\MockThisClass')
    ->disableOriginalConstructor()
    ->setMethod(['someMethod'])
    ->getMock();

We just have this:

$mockedThing = $this->getMockBuilder(MockThisClass::class)
    ->disableOriginalConstructor()
    ->setMethod(['someMethod'])
    ->getMock();

(and PHPStorm even includes the use statement for me, automatically):

use path\to\class\being\mocked\MockThisClass;

It's good to keep all the class pathing references together at the top of the file.

One flaw in this constant is this behaviour:
<?php

require_once realpath(__DIR__ . '/../vendor/autoload.php');

echo "Using ::class" . PHP_EOL;
printf("SomeNonExistentClass: %s%s", SomeNonExistentClass::class, PHP_EOL);

Any sensible person would probably want an error to be thrown there. But... no. PHP does this:

C:\src\php\php.local\src\oo\classConstant\src>php testWithNonExistentClass.php
Using ::class
SomeNonExistentClass: SomeNonExistentClass

C:\src\php\php.local\src\oo\classConstant\src>

Groan. Why does PHP insist on being "helpful" like this. The class doesn't exist! Just say that. FFS.

Oh well... that sort of thing is almost to be expected of PHP, I guess. Sigh. Well this ::class constant mostly a well-implemented, if not glamourous, small feature in PHP. And now I seem to have documented ::class more than PHP itself has. Heh.

Speaking of documentation, I can't help but think this ::class constant should also be mentioned on a coupla other pages in the PHP docs:

It'd be a good fit for both of those pages, plus that's where Google lands you if you search for it.

That's it. I have to dash to go visit my son for a few hours. Thanks to my colleague Carlos for pointing this whole thing out to me, btw. Both the ::class constant construct itself, but also the observation to be made about aliases. Nice one, fella.

Righto.

--
Adam

PS: apologies for the blatant SEO keyword stuffing of ::class constant in this article ("oops... I did it again"). It was by design.

Wednesday, 4 January 2017

PHP: looking at the Chain of Responsibility pattern

G'day:
We're working on a proof of concept for some stuff at work at the moment, and I've been put on a code review of the first round of the code. Part of the code is dealing with conditionally getting some data from cache, or if it ain't there, getting it from the DB instead: a fairly common trope. Normally I'd fall back on the decorator pattern for this, indeed I've written about it in the past ("Using a decorator pattern to reduce inappropriate code complexity"). I didn't see this in play in this first tranche of code, so suggested its use. The authoring dev suggested they felt that the decorator pattern was not a good fit for this, so they were using the Chain of Responsibility pattern instead. Another of our devs - when prodded - tended to agree.

I gave this some thought, read-up on the Chain of Responsibility pattern (and some more on the Decorator Pattern), and wasn't quite seeing it.

Just to back up a bit... if yer like me and kinda knew about the Chain of Responsibility pattern, but wasn't intimately familiar with it: read that Wikipedia link above, and do some googling of yer own. But in short it's characterised by having a task that needs doing, and having a sequence (a chain) of agents that possibly can (or possibly cannot) resolve all or part of the task. The process is that the task is passed to the chain of agents, and each in turn decides for itself whether it can fulfil the task in whole or in part. If an agent can resolve the whole task it does so: returning the result. If a given agent cannot fulfil the whole task (either it cannot fulfil it at all, or can only do part of it), it does its work, then passes responsibility on to the next task in the chain. Each agent knows two things: how to do its part of the job, and that it can pass work on to the "next" agent. It does not (and should not!) know what the next agent does, nor should it rely on what the next agent does. It simply does its job and returns it or passes on incomplete work to the next agent in the chain. I can't be arsed with a diagram: go google one.

Now: for two thirds of our challenge, I think the Chain of Responsibility pattern is a good fit: for the "Is it in the cache? Yes [/No. Is it in the DB? Yes/No]" bit. No qualms there. However it sticks in my craw once we get to the step after getting from the DB: putting it in the cache for next time. That step does not belong in the chain as it's not performing the same sort of task as the other two: getting the data. Second to that it's hitting the cache before and after the DB step, which makes it more seem like a decoration to me (of sorts... I might get back to the "of sorts..." bit later). Even if one decides "get from cache" is one agent's job, and "get from DB, oh and put it in the cache" is another agent's job, that latter agent is doing a mishmash of "unrelated" work, and seems to be a bit tightly-coupled to me, as well as having a wodge of code (that's a technical term) doing two things, rather than two wodges of code doing one thing each.

That said, I don't like being so emphatic about "that's wrong" without being happy about my understanding of things, so figured I should try to make it work in a way that sits well with me in a proof of concept. That and it seemed like good material for a blog article.

In my proof of concept I am fetching some Person data, for example:

$person = $personService->getById(1);

The PersonService defers to a "RetrievalHandler" for this:

class PersonService {

    private $retrievalHandler;

    public function __construct(PersonRetrievalHandler $retrievalHandler) {
        $this->retrievalHandler = $retrievalHandler;
    }

    public function getById($id) : Person {
        $record = $this->retrievalHandler->getById($id);

        if (is_array($record)){
            $person = new Person($record['firstName'], $record['lastName']);

            return $person;
        }
        return new Person();
    }
}


This just finds and returns a Person object, implement thus:

class Person {

    public $firstName;
    public $lastName;

    public function __construct($firstName=null, $lastName=null) {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }
}

As noted, the service uses a the RetrievalHandler to retreive stuff. A handler is implementation of this:

abstract class PersonRetrievalHandler {

    protected $nextHandler;
    protected $logger;

    public function __construct() {
        $this->nextHandler = new class {
            function getById(){
                return null;
            }
        };
    }

    public abstract function getById(int $id) : array;

    public function setNextHandler(PersonRetrievalHandler $nextHandler) {
        $this->nextHandler = $nextHandler;
    }
}

All it's required to do here is know how to use getById, and also it can setNextHandler. I'll come back to the code in the constructor later. Don't worry about that for now.

I've got two implementations of this. One for getting the thing from a caching service:

class CachedPersonHandler extends PersonRetrievalHandler {

    private $cacheService;

    public function __construct(CacheService $cacheService){
        parent::__construct();
        $this->cacheService = $cacheService;
    }

    public function getById(int $id) : array {
        if ($this->cacheService->exists($id)) {
            return $cachedPerson = $this->cacheService->get($id);
        }

        return $this->nextHandler->getById($id);
    }
}


And one for getting it from a database repository:

class DatabasePersonHandler extends PersonRetrievalHandler {

    private $repository;

    public function __construct(PersonRepository $repository) {
        parent::__construct();
        $this->repository = $repository;
    }

    public function getById($id) : array {
        $record = $this->repository->getById($id);
        if (count($record)) {
            return $record;
        }

        return $this->nextHandler->getById($id);
    }
}


Note this is very woolly and not production-sound code. It's just to demonstrate the concept.

Important to note in both these situations is the agents don't themselves do any of the work; they just try to get the work done by something else, and return it if done, or pass the task to the next agent if the current agent itself couldn't fulfil the requirement.

I've created a stub caching service and repository:

class CacheService {

    private $cache = [];
    private $logger;

    public function __construct(LoggingService $logger)
    {
        $this->logger = $logger;
    }

    public function get($key) {
        $this->logger->info("Cache hit for $key");
        return $this->cache[$key];
    }

    public function exists(string $key) : bool {
        $exists = array_key_exists($key, $this->cache);

        $this->logger->info(sprintf("Cache %s for %d", $exists ? 'hit' : 'miss', $key));

        return $exists;
    }

    public function put(string $key, $value) {
        $this->cache[$key] = $value;
    }
}


class PersonRepository {

    private $data = [
        [],
        ['firstName' => 'Tui', 'lastName' => 'Tahi'],
        ['firstName' => 'Rewi', 'lastName' => 'Rua'],
        ['firstName' => 'Tama', 'lastName' => 'Toru'],
        ['firstName' => 'Whina', 'lastName' => 'Wha']
    ];
    private $logger;

    public function __construct(LoggingService $logger)
    {
        $this->logger = $logger;
    }

    public function getById($id) {
        if (array_key_exists($id, $this->data)){
            $this->logger->info("Database hit for $id");
            return $this->data[$id];
        }
        $this->logger->info("Database miss for $id");
        return [];
    }
}


These also log activity in a very naïve way:

class LoggingService {
    public function info(string $message) {
        echo $message . PHP_EOL;
    }
}


All of this is stitched together in a factory, as there's a lot of moving parts:

class PersonServiceFactory {

    public static function getPersonService()
    {
        $loggingService = new LoggingService();
        $cacheService = new CacheService($loggingService);
        $personRepository = new PersonRepository($loggingService);

        $cachedHandler = new CachedPersonHandler($cacheService);
        $databaseHandler = new DatabasePersonHandler($personRepository);

        $cachedHandler->setNextHandler($databaseHandler);

        $personService = new PersonService($cachedHandler);

        return $personService;
    }

}


Note how the only knowledge each agent has of another is what a given agent's next handler might be. Speaking of next handlers. let's go back to that code in the PersonRetrievalHandler's constructor:

public function __construct() {
    $this->nextHandler = new class {
        function getById(){
            return null;
        }
    };
}


You'll remember each getById implementation will either fulfil the task and return the result, or it'll defer to the next agent in the chain. This is just a mechanism to make this "work" even if the last agent in the chain calls return $this->nextHandler->getById($id). The base class will take the call and just return null. I was semi-pleased to be able to use a class literal here (or anonymous class as PHP wants to call them). So this means I do not need to set the next handler on the $databaseHandler: it'll just use the inbuilt one if it cannot fulfil the requirement itself.

Right so we can now run this to see what happens. Remember the expectation is that the requested data will be fetched from the cache as a priority, but if not found: get it from the DB. Note that I am not doing anything here about putting the data into the cache if it wasn't there in the first place. I'll get to that. Anyway, here's some tests:

$personService = PersonServiceFactory::getPersonService();
$person = $personService->getById(1);
var_dump($person);
$person = $personService->getById(1);
var_dump($person);
$person = $personService->getById(4);
var_dump($person);
$person = $personService->getById(5);
var_dump($person);

Here we're:
  • getting the same person twice to start with: ostensibly this'd demonstrate a cache-miss then a cache-hit;
  • getting a different person from the first (back to a cache miss);
  • getting a person who is neither in the cache nor the DB.

Results:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTest.php
Cache miss for 1
Database hit for 1
object(me\adamcameron\cor\model\Person)#9 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache miss for 1
Database hit for 1
object(me\adamcameron\cor\model\Person)#10 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache miss for 4
Database hit for 4
object(me\adamcameron\cor\model\Person)#9 (2) {
  ["firstName"]=>
  string(5) "Whina"
  ["lastName"]=>
  string(3) "Wha"
}
Cache miss for 5
Database miss for 5
object(me\adamcameron\cor\model\Person)#10 (2) {
  ["firstName"]=>  NULL
  ["lastName"]=>  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

That's all good:
  • the cache never finds anything (although it does look),
  • so passes the job on to the DB. The DB will return the data if found,
  • otherwise it'll fall back to the default agent which just returns null.


So far: so good. But why did I omit the "putting it into the cache" step. Well... using this model, it simply won't work. The rule is that if an agent can resolve the work, then it returns it. Otherwise it defers to the next agent. I figured I could implement a "pass-thru" agent which just caches the result found by the DB agent: job done. But if the DB agent finds the record - read: "it can fulfil the task" - then it doesn't pass the task on to the next agent. It considers the task fulfilled, so just returns its result. So the only time the "put it in cache" agent will be called is if the DB agent doesn't find anything. Precisely the wrong time to cache something. That aside, the agent only passes on the inputs (the $id) to the next agent, so there's no mechanism here for an agent to receive the result from the previous agent even if the result was there to pass.

What we'd need to do here is just use the chain of responsibility to get the data, and perhaps leave it to the PersonService to cache the result it receives. However this is a bit unbalanced in my view, as caching considerations of the same data are being handled at not only two different places in the code, but two different "levels" of the task at hand. That's a bit clumsy in my book.

Next I thought I could adjust the DB handler to unconditionally pass its result on to the next handler, rather than break out of the chain once done. But that's a bit crap as it makes the DB handler behave differently than the other handlers in the chain, which also implies it knows more about what's going on than it should: ie that there's a reason to always pass the result on, rather than return it if it's done the work. Not cool.

Update:

My colleague (a different one) suggested a third open: roll the cache-put into the CachedPersonHandler:

class CachedPersonHandler extends PersonRetrievalHandler {

    private $cacheService;

    public function __construct(CacheService $cacheService){
        parent::__construct();
        $this->cacheService = $cacheService;
    }

    public function getById(int $id) : array {
        if ($this->cacheService->exists($id)) {
            return $cachedPerson = $this->cacheService->get($id);
        }

        $person = $this->nextHandler->getById($id);
        $this->cacheService->put($id, $person);
        return $person;
    }
}

Well: indeed. Now it's basically a decorator: you've decorated a DB call with some caching. Contrast it with this from my earlier article on the Decorator Pattern:

class CachedUserRepository implements UserRepositoryInterface {

    private $repository;
    private $cacheService;

    public function __construct(UserRepositoryInterface $repository, CacheServiceInterface $cacheService) {
        $this->repository = $repository;
        $this->cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this->cacheService->isCached($id)) {
            return $this->cacheService->get($id);
        }
        $object = $this->repository->getById($id);

        $this->cacheService->put($id, $object);
        return $object;
    }

}

You've arrived at the same conclusion I started with ;-)

I did try one other tactic. Instead of passing just the inputs to the agents, I had a look at passing the inputs and the outputs through the chain, and also always passing on to the next agent, irrespective of whether a given handler arrived at the result. This was very proof-of-concept-y. It's possibly a legit approach if the Chain of Responsibility was one in that a given agent was only supposed to fulfil part of the solution, relying on all links in the chain to arrive at the complete result. This is not what I'm doing here, but decided to leverage this approach to get the caching working "properly".


Intermission

Go have a pee or get some more popcorn or something.


Now I have an interface for a Handler:

interface Handler {

    public function perform($request, $response);

}


This just reflects a more generic solution than getById. More importantly note that unlike getById, perform takes a request and a response. For our examples the "request" would be the ID. And the response would be any result a given agent was able to come up with.

This is implemented in an abstract way by PersonRetrievalHandler:

abstract class PersonRetrievalHandler implements Handler {

    protected $nextHandler;
    protected $logger;

    public function __construct() {
        $this->nextHandler = $this->getPassThroughHandler();
    }

    public function setNextHandler(PersonRetrievalHandler $nextHandler) {
        $this->nextHandler = $nextHandler;
    }

    private function getPassThroughHandler(){
        return new class  {
            function perform($_, $response){
                return $response;
            }
        };
    }
}

It's still down to the individual handlers to implement the perform method though.

I've also refactored that class literal out into a helper method; and its perform method now returns whatever response it was passed, rather than returning null.

The perform method in CachedPersonHandler is thus:

public function perform($request, $response=null) {
    $id = $request;
    if ($this->cacheService->exists($id)) {
        $response = $this->cacheService->get($id);
    }

    return $this->nextHandler->perform($request, $response);
}


Note that it doesn't return the result itself, it just passes it on to the next agent. Otherwise it's doing the same thing.

And the DatabasePersonHandler has also been slightly refactored:

public function perform($request, $response=null) {
    if (is_null($response)){
        $id = $request;
        $response = $this->repository->getById($id);
    }

    return $this->nextHandler->perform($request, $response);
}


It infers that the cache didn't find anything by it not receiving anything in the response, and only if so does it hit the DB. Otherwise it just passes everything through to the next agent.

Now we have a third handler, CacheablePersonHandler:

class CacheablePersonHandler extends PersonRetrievalHandler {

    private $cacheService;

    public function __construct(CacheService $cacheService){
        parent::__construct();
        $this->cacheService = $cacheService;
    }

    public function perform($request, $response=null) {
        if (!is_null($response)) {
            $id = $request;
            $this->cacheService->put($id, $response);
        }

        return $this->nextHandler->perform($request, $response);
    }
}


This one makes no pretence about finding data, it just takes any received response and caches it, keyed on the request. And passes the response on. At this point remember the default handler will just return whatever response it receives.

We wire all this together in the factory again:

public static function getPersonService()
{
    $loggingService = new LoggingService();
    $cacheService = new CacheService($loggingService);
    $personRepository = new PersonRepository($loggingService);

    $cachedHandler = new CachedPersonHandler($cacheService);
    $databaseHandler = new DatabasePersonHandler($personRepository);
    $cacheableHandler = new CacheablePersonHandler($cacheService);

    $databaseHandler->setNextHandler($cacheableHandler);
    $cachedHandler->setNextHandler($databaseHandler);

    $personService = new PersonService($cachedHandler);

    return $personService;
}


This differs only in that it also creates the CacheablePersonHandler too, and stick it at the end of the chain.

Here's our revised test:

$personService = PersonServiceFactory::getPersonService();

foreach ([1,1,4,5,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}


(it's the same, just without the tautological repeated duplication of the same stuff again).

And the result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTest.php
Cache miss for 1
Database hit for 1
Cache put for 1
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache hit for 1
Cache put for 1
object(me\adamcameron\cor\model\Person)#11 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Cache miss for 4
Database hit for 4
Cache put for 4
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  string(5) "Whina"
  ["lastName"]=>
  string(3) "Wha"
}
Cache miss for 5
Database miss for 5
object(me\adamcameron\cor\model\Person)#11 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}
Cache miss for 5
Database miss for 5
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

So it "works", in that DB hits are now going into the cache.

But it also re-caches cache hits too :-(

Sigh. I could work around this by doing an exists call in the latter handler too:

public function perform($request, $response=null) {
    if (!is_null($response)) {
        $id = $request;
        if (!$this->cacheService->exists($id)) {
            $this->cacheService->put($id, $response);
        }
    }

    return $this->nextHandler->perform($request, $response);
}


But it all seems a bit clumsy to me. And when something seems clumsy to me, I start wondering if this square peg is supposed to be going in this round hole. And my conclusion is generally: "no".

The problem here goes back to the "cache the data" part of this chain doesn't belong in the chain. It's not a congruent task compared to the data-fetch tasks. The "responsibility" in the "Chain of Responsibility" is that the responsibility is "fulfilling the task". The caching agent makes no attempt to fulfil the requirement of the chain, so it doesn't belong in the chain to me.

So where does it belong? Well it belongs as near as possible to when we know the database was hit, and we have the result. So we could put it in the database agent. But then we have this imbalance that the cache-look-up is part of the chain, and the cache-put is within part of the chain. I don't like that asymmetry. Also it means the DB agent is doing two things: getting data and caching it, which seems like it's doing too much to me: it needs to know how to do two different things. We could go down the route of having like a CachedPersonRepository, and that at least removes those two responsibilities from the agent, but it does just move them: now the repo needs to know about the DB and the caching.

What I'd end up doing is using a caching decorator around a DB-only repository. That's symmetrical, and all concerns are separated.

Hmmm.

Second Intermission

Go find a suitable blade so you can open a vein and end it all instead of having to read on… (and on…)


Oh, as a footnote to this another requirement I wanted to prove is that the chain links behaved "sensibly" if called without their adjacent agents. IE: the DatabasePersonAgent agent didn't need the CachedPersonAgent, nor did it need the CacheablePersonAgent.

Here's some examples:

Just the database handler:

$loggingService = new LoggingService();
$personRepository = new PersonRepository($loggingService);

$databaseHandler = new DatabasePersonHandler($personRepository);

$personService = new PersonService($databaseHandler);

foreach ([1,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}


Result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTestWithJustDatabaseHandler.php
Database hit for 1
object(me\adamcameron\cor\model\Person)#7 (2) {
  ["firstName"]=>
  string(3) "Tui"
  ["lastName"]=>
  string(4) "Tahi"
}
Database miss for 5
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

Yup: it's getting data from the DB, and doesn't require the other agents.

Now one with just the CachedPersonHandler:

$loggingService = new LoggingService();
$cachingService = new CacheService($loggingService);

$cachedHandler = new CachedPersonHandler($cachingService);

$personService = new PersonService($cachedHandler);

$cachingService->put(5, ['firstName'=>'Ria', 'lastName'=>'Rima']);

foreach ([1,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}

Result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTestWithJustCachedHandler.php
Cache put for 5
Cache miss for 1
object(me\adamcameron\cor\model\Person)#7 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}
Cache hit for 5
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  string(3) "Ria"
  ["lastName"]=>
  string(4) "Rima"
}

C:\src\php\php.local\src\chainOfResponsibility\public>

Here I'm priming the cache manually, just to make sure it'll find the cached item OK. And it does.

For the sake of completeness I also test with just the cache-put agent. It won't do anything useless (a warning flag in itself), but it should "work":

$loggingService = new LoggingService();
$cachingService = new CacheService($loggingService);

$cacheableHandler = new CacheablePersonHandler($cachingService);

$personService = new PersonService($cacheableHandler);

foreach ([1,5] as $id){
    $person = $personService->getById($id);
    var_dump($person);
}


Result:

C:\src\php\php.local\src\chainOfResponsibility\public>php manualTestWithJustCacheableHandler.php
object(me\adamcameron\cor\model\Person)#7 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}
object(me\adamcameron\cor\model\Person)#8 (2) {
  ["firstName"]=>
  NULL
  ["lastName"]=>
  NULL
}

C:\src\php\php.local\src\chainOfResponsibility\public>

This is all unhelpful to the stated goal, but it's doing what it's told and is self-contained. "Win".

In conclusion I think this has helped me get a handle on the Chain of Responsibility pattern, which is good. I still want to have a look at chains wherein the agents achieve partial fulfilment, and possibly some recursion as well. That'd be cool.

But I really don't think this is the right solution to the job at hand at work.

I'll leave you be now. Oh... as per usual: all the code is on Github, including the working application. The first iteration of the work is tagged separately.

Righto.

--
Adam

Friday, 30 December 2016

PHP: using SoapClient to consume a SOAP web service

G'day:
Yeah, OK, just don't ask why I'm needing to consume SOAP web services. It's Adobe's fault. Not because of ColdFusion for a change, but because an app they offer which only exposes its API via SOAP. Because it's still 2006 at Adobe, it seems.

For the last few weeks, this has been running through my head:



But anyway.

Fighting with the Adobe web service was a right pain in the arse, but I won't go into that - it's not useful info to share - but I did have to monkey around with how PHP consumes SOAP web services, and piecing it together took a chunk of googling for me to get my brain around it, so I thought I'd write it up here.

I hasten to add that most of the hassle getting it to work was because of the shitty Adobe web service being unhelpful, so it's perhaps not a challenge for most other people.

I cannot deal with the Adobe web service in this article as it's subscription-only, and it's kinda work-related stuff I'm not allowed to expose here. So I knocked together my own web service to consume. I used CFML for this as all it takes to create a SOAP web service in CFML is to give a method an access qualifier of remote, and have it web-accessible. ColdFusion handles everything else.

Here's the web service code (I'll keep the CFML code to a minimum, but it's all on GitHub anyhow):

import me.adamcameron.accounts.*;

component wsversion=1 {

    remote Invoice function getById(numeric id) returnformat="wddx" {
        var address = new Address(1, "London", "United Kingdom", "E18");

        var account = new PersonalAccount(2, "Adam", "Cameron", "1970-02-17", address);

        var penguin = new Product(3, "Penguin", 4.56);
        var pangolin = new Product(7, "Pangolin", 8.90);
        var platypus = new Product(11, "Playtpus", 12.13);

        var lines = [
            new InvoiceLine(14, penguin, 15, 16.17),
            new InvoiceLine(18, pangolin, 19, 20.21),
            new InvoiceLine(22, platypus, 23, 24.25)
        ];

        invoice = new Invoice(id, account, lines);

        return invoice;        
    }

}

Note that I should not have to state the wsversion there, but for some reason I could not get the default - version 2 - to work, so I had to force it to be version 1. I didn't look into why it didn't work on version 2, as I really didn't want to spend more time than necessary on the CFML code for this exercise.

Other than that, this code is straight forward: it returns an Invoice object which comprises a Person (with an Address), and an array of InvoiceLines each of which have a Product. the modelling for those is the same as in the receiving PHP code, further down.

On the PHP side of things, I just call this code to consume it:

<?php

namespace me\adamcameron\accounts;

require __DIR__ . '/model.php';

$wsdl = "http://localhost:8516/cfml/webservices/soap/accounts/public/Invoices.cfc?wsdl";


$options = [
    'trace' => true
];

$client = new \SoapClient($wsdl, $options);

$invoice = $client->getById(2011);

var_dump($invoice);
echo "==============================" . PHP_EOL . PHP_EOL;
var_dump($client->__getLastResponse());

So it's simply a matter of creating a SoapClient with the WSDL URL and some options. I'm using the trace option here so I can do that call to __getLastResponse.

Let's have a look at the raw SOAP response first (warning: it's pretty turgid reading):

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soapenv:Body>
        <ns1:getByIdResponse xmlns:ns1="http://public.accounts.soap.webservices.cfml" soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
            <getByIdReturn xmlns:ns2="http://accounts.adamcameron.me" xsi:type="ns2:Invoice">
                <account xsi:type="ns2:PersonalAccount">
                    <address xsi:type="ns2:Address">
                        <country xsi:type="xsd:string">United Kingdom</country>
                        <id xsi:type="xsd:double">1.0</id>
                        <localPart xsi:type="xsd:string">London</localPart>
                        <postcode xsi:type="xsd:string">E18</postcode>
                    </address>
                    <dateOfBirth xsi:type="xsd:string">1970-02-17</dateOfBirth>
                    <firstName xsi:type="xsd:string">Adam</firstName>
                    <id xsi:type="xsd:double">2.0</id>
                    <lastName xsi:type="xsd:string">Cameron</lastName>
                </account>
                <id xsi:type="xsd:double">2011.0</id>
                <items xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" soapenc:arrayType="xsd:anyType[3]" xsi:type="soapenc:Array">
                    <items xsi:type="ns2:InvoiceLine">
                        <count xsi:type="xsd:double">15.0</count>
                        <id xsi:type="xsd:double">14.0</id>
                        <price xsi:type="xsd:double">16.17</price>
                        <product xsi:type="ns2:Product">
                            <description xsi:type="xsd:string">Penguin</description>
                            <id xsi:type="xsd:double">3.0</id>
                            <rrp xsi:type="xsd:double">4.56</rrp>
                        </product>
                    </items>
                    <items xsi:type="ns2:InvoiceLine">
                        <count xsi:type="xsd:double">19.0</count>
                        <id xsi:type="xsd:double">18.0</id>
                        <price xsi:type="xsd:double">20.21</price>
                        <product xsi:type="ns2:Product">
                            <description xsi:type="xsd:string">Pangolin</description>
                            <id xsi:type="xsd:double">7.0</id>
                            <rrp xsi:type="xsd:double">8.9</rrp>
                        </product>
                    </items>
                    <items xsi:type="ns2:InvoiceLine">
                        <count xsi:type="xsd:double">23.0</count>
                        <id xsi:type="xsd:double">22.0</id>
                        <price xsi:type="xsd:double">24.25</price>
                        <product xsi:type="ns2:Product">
                            <description xsi:type="xsd:string">Playtpus</description>
                            <id xsi:type="xsd:double">11.0</id>
                            <rrp xsi:type="xsd:double">12.13</rrp>
                        </product>
                    </items>
                </items>
            </getByIdReturn>
        </ns1:getByIdResponse>
    </soapenv:Body>
</soapenv:Envelope>

Wot a bloody mouthful!

PHP takes all that in its stride, and rehydrates that into an object:

object(stdClass)#2 (3) {
    ["account"]=>
    object(stdClass)#3 (5) {
        ["address"]=>
        object(stdClass)#4 (4) {
            ["country"]=>
            string(14) "United Kingdom"
            ["id"]=>
            float(1)
            ["localPart"]=>
            string(6) "London"
            ["postcode"]=>
            string(3) "E18"
        }
        ["dateOfBirth"]=>
        string(10) "1970-02-17"
        ["firstName"]=>
        string(4) "Adam"
        ["id"]=>
        float(2)
        ["lastName"]=>
        string(7) "Cameron"
    }
    ["id"]=>
    float(2011)
    ["items"]=>
    array(3) {
        [0]=>
        object(stdClass)#5 (4) {
            ["count"]=>
            float(15)
            ["id"]=>
            float(14)
            ["price"]=>
            float(16.17)
            ["product"]=>
            object(stdClass)#6 (3) {
                ["description"]=>
                string(7) "Penguin"
                ["id"]=>
                float(3)
                ["rrp"]=>
                float(4.56)
            }
        }
        [1]=>
        object(stdClass)#7 (4) {
            ["count"]=>
            float(19)
            ["id"]=>
            float(18)
            ["price"]=>
            float(20.21)
            ["product"]=>
            object(stdClass)#8 (3) {
                ["description"]=>
                string(8) "Pangolin"
                ["id"]=>
                float(7)
                ["rrp"]=>
                float(8.9)
            }
        }
        [2]=>
        object(stdClass)#9 (4) {
            ["count"]=>
            float(23)
            ["id"]=>
            float(22)
            ["price"]=>
            float(24.25)
            ["product"]=>
            object(stdClass)#10 (3) {
                ["description"]=>
                string(8) "Playtpus"
                ["id"]=>
                float(11)
                ["rrp"]=>
                float(12.13)
            }
        }
    }
}

That's cool, but if we look at the SOAP response, we do have the object information in there too. Here's an extract:

xsi:type="ns2:Invoice"&gt;
    &lt;account xsi:type="ns2:PersonalAccount"&gt;
        &lt;address xsi:type="ns2:Address"&gt;
            &lt;country xsi:type="xsd:string"&gt;United Kingdom&lt;/country&gt;
            &lt;id xsi:type="xsd:double"&gt;1.0&lt;/id&gt;
            &lt;localPart xsi:type="xsd:string"&gt;London&lt;/localPart&gt;
            &lt;postcode xsi:type="xsd:string"&gt;E18&lt;/postcode&gt;
        &lt;/address&gt;
        &lt;dateOfBirth xsi:type="xsd:string"&gt;1970-02-17&lt;/dateOfBirth&gt;
        &lt;firstName xsi:type="xsd:string"&gt;Adam&lt;/firstName&gt;
        &lt;id xsi:type="xsd:double"&gt;2.0&lt;/id&gt;
        &lt;lastName xsi:type="xsd:string"&gt;Cameron&lt;/lastName&gt;
    &lt;/account&gt;
    &lt;id xsi:type="xsd:double"&gt;2011.0&lt;/id&gt;
    &lt;items xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" soapenc:arrayType="xsd:anyType[3]" xsi:type="soapenc:Array"&gt;
        &lt;items xsi:type="ns2:InvoiceLine"&gt;
            &lt;count xsi:type="xsd:double"&gt;15.0&lt;/count&gt;
            &lt;id xsi:type="xsd:double"&gt;14.0&lt;/id&gt;
            &lt;price xsi:type="xsd:double"&gt;16.17&lt;/price&gt;
            &lt;product xsi:type="ns2:Product"&gt;
                &lt;description xsi:type="xsd:string"&gt;Penguin&lt;/description&gt;
                &lt;id xsi:type="xsd:double"&gt;3.0&lt;/id&gt;
                &lt;rrp xsi:type="xsd:double"&gt;4.56&lt;/rrp&gt;
            &lt;/product&gt;
        &lt;/items&gt;
        &lt;items xsi:type="ns2:InvoiceLine"&gt;
            &lt;!-- etc &gt;


Hidden away in there we see there's an Invoice, PersonalAccount, Address, InvoiceLines and Products.

We can leverage this information to point PHP at classes to use when rehydrating the response, which is cool. We give the SoapClient a type map:

$options = [
    'trace' => true,
    'typemap' => [[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Address',
        'from_xml' => ['\me\adamcameron\accounts\Address', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Account',
        'from_xml' => ['\me\adamcameron\accounts\Account', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Product',
        'from_xml' => ['\me\adamcameron\accounts\Product', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'InvoiceLine',
        'from_xml' => ['\me\adamcameron\accounts\InvoiceLine', 'createFromXml']
    ],[
        'type_ns' => 'http://accounts.adamcameron.me',
        'type_name' => 'Invoice',
        'from_xml' => ['\me\adamcameron\accounts\Invoice', 'createFromXml']
    ]]
];

$client = new \SoapClient($wsdl, $options);

This maps the SOAP type to a PHP class, and method to use to rehydrate the object.

Running the code again with the typemap yields a fully-modelled result:

object(me\adamcameron\accounts\Invoice)#3 (3) {
    // ...
    ["account"]=>
    object(me\adamcameron\accounts\Account)#5 (5) {
        // ...
        ["address"]=>
        object(me\adamcameron\accounts\Address)#9 (5) {
            // ...
        }
    }
    ["items"]=>
    array(3) {
        [0]=>
        object(me\adamcameron\accounts\InvoiceLine)#13 (4) {
            // ...
            ["product"]=>
            object(me\adamcameron\accounts\Product)#15 (3) {
                // ...
            }
            // ...
        }
        [1]=>
        object(me\adamcameron\accounts\InvoiceLine)#14 (4) {
            // ...
            ["product"]=>
            object(me\adamcameron\accounts\Product)#17 (3) {
                // ...
            }
            // ...
        }
        [2]=>
        object(me\adamcameron\accounts\InvoiceLine)#16 (4) {
            // ...

            ["product"]=>
            object(me\adamcameron\accounts\Product)#19 (3) {
                // ...
            }
            // ...
        }
    }
}

(I've elided the irrelevant stuff from that one).

So that's pretty handy.

I do wish I could provide the typemap with each call though if I wanted to, not just when creating the SoapClient object. Oh well: it was no huge hardship.

There's no trick on the class side of things at all. Here's the InvoiceLine class:

<?php

namespace me\adamcameron\accounts;

class InvoiceLine {

    public $id;
    public $product;
    public $count;
    public $price;
        
    function __construct(int $id, Product $product, int $count, float $price){
        $this->id = $id;
        $this->product = $product;
        $this->count = $count;
        $this->price = $price;
    }
    
    static function createFromXml ($xml){
        $sxe = is_string($xml) ? new \SimpleXMLElement($xml) : $xml;
        $obj = new InvoiceLine(
            (int)$sxe->id,
            Product::createFromXml($sxe->product),
            (int)$sxe->count,
            (float)$sxe->price
        );
        return $obj;            
    }
}


OK the one trick is that the SoapClient only passes the "top level" object's XML through to the top level method. So in this case SoapClient calls InvoiceLine->createFromXml as per the type map, but from there it's up to me to do the same for any other objects this one uses, for example a Product in this case.

Oh and the other trick is that because of this, I need that XML / string differentiator at the beginning of the method. SoapClient passes in a string, but once I've converted that to XML, that's what the other calls will receive (ie: Product::createFromXml receives XML, not a string). That took me a while to nut-out, but that's just me being a div.

Oh yeah, and don't forget to explicitly cast the values going into yer properties, otherwise you'll end up with XML in there, not the actual values. That threw me for a while as well :-/

All the rest of the PHP code for this is on GitHub too.

The problem I had with the Adobe web service is that it didn't bother including the type attributes in its responses, other than on the container element (so in my example the array of InvoiceItems would have had a type on it). This means I had to pass the entire collection container into a handler method which then read the XML tag names and hand-cranked a switch based on that as to what rehydration method to use. I'm gonna try to contrive a non-business-sensitive version of that to show how I dealt with it.

Now... I've received orders from a mate that I'm expected at the pub in an hour. So I had better get cracking. Sorry to make you think about SOAP. And Adobe.

Righto.

--
Adam

Friday, 16 December 2016

PHP: adolescent weirdness with arrays

G'day:
This was put on my radar the other day, by Alexander Lisachenko:


Well put. The code concerned is thus:

<?php
$A = [1 => 1, 2 => 0, 3 => 1];
$B = [1 => 1, 3 => 0, 2 => 1];

var_dump($A > $B);
var_dump($B > $A);


var_dump($A < $B);
var_dump($B < $A);

And it outputs...

C:\temp>php array.php
bool(true)
bool(true)
bool(true)
bool(true)

C:\temp>

Well that's quite... contrary.

PHP Team member Kalle Sommer Nielsen explained that this sort of comparison (between arrays) is "unsupported" in PHP.

Further conversation ensued, which you can read on Twitter, rather than me repeating it here.

My thoughts on this are that it's completely fine for PHP to not support this sort of comparison, but in that case the way it should deal with it is actively not support it: raise an IllegalOperationException or some such (some extension of LogicException, anyhow). It should not just issue forth illogical and misleading results. Kalle explains some of the background to the situation, which was helpful.

Another issue I have is that if we conclude that asking "is this array greater than that array" doesn't simply result in an error, and it must answer "true" or "false", then the answer ought to be false. Kalle contended it didn't matter, but my parallel is "cheese is greater than blue (true or false)?". It's a nonsensical question, But the answer is not "true".

The conclusion was: well it's done now... one shouldn't try to do these things, and whilst PHP might not handle it in an ideal fashion, it's a sufficient edge-case to not warrant remedial action. All fair enough.

I had a quick look at what PHP does with various comparisons:

<?php

$i1 = 5;
$i2 = 7;
echo "$i1 > $i2: " . booleanAsString($i1 > $i2) . PHP_EOL;
echo "$i1 < $i2: " . booleanAsString($i1 < $i2) . PHP_EOL;

$f1 = 5.5;
$f2 = 7.7;
echo "$f1 > $f2: " . booleanAsString($f1 > $f2) . PHP_EOL;
echo "$f1 < $f2: " . booleanAsString($f1 < $f2) . PHP_EOL;

$s1 = "a";
$s2 = "z";
echo "$s1 > $s2: " . booleanAsString($s1 > $s2) . PHP_EOL;
echo "$s1 < $s2: " . booleanAsString($s1 < $s2) . PHP_EOL;

$s1 = "A";
$s2 = "a";
echo "$s1 > $s2: " . booleanAsString($s1 > $s2) . PHP_EOL;
echo "$s1 < $s2: " . booleanAsString($s1 < $s2) . PHP_EOL;

$b1 = true;
$b2 = false;
echo sprintf("%s > %s: ", booleanAsString($b1), booleanAsString($b2)) . booleanAsString($b1 > $b2) . PHP_EOL;
echo sprintf("%s < %s: ", booleanAsString($b1), booleanAsString($b2)) . booleanAsString($b1 < $b2) . PHP_EOL;

$o1 = (object) ["i"=>5];
$o2 = (object) ["i"=>7];
echo sprintf("%s > %s: ", json_encode($o1), json_encode($o2)) . booleanAsString($o1 > $o2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($o1), json_encode($o2)) . booleanAsString($o1 < $o2) . PHP_EOL;

$a1 = ["i"=>5];
$a2 = ["i"=>7];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;


$a1 = [5];
$a2 = [7];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

$a1 = [5=>5];
$a2 = [7=>7];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

$a1 = [1 => 1, 2 => 0, 3 => 1];
$a2 = [1 => 1, 3 => 0, 2 => 1];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

$a1 = [1 => 1, 2 => 0, 3 => 1];
$a2 = [1 => 1, 3 => 1, 2 => 0];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;


function booleanAsString($expression){
    return $expression ? "true" : "false";
}

I don't think any of that lot needs explaining: it's straight forward enough. There's a few baseline control expressions which apply the operator to numeric operands, which is clearly gonna have predictable results. Then I check with strings (including case differences), booleans, and then arrays (indexed and associative). The output:

5 > 7: false
5 < 7: true
5.5 > 7.7: false
5.5 < 7.7: true
a > z: false
a < z: true
A > a: false
A < a: true
true > false: true
true < false: false
{"i":5} > {"i":7}: false
{"i":5} < {"i":7}: true
{"i":5} > {"i":7}: false
{"i":5} < {"i":7}: true
[5] > [7]: false
[5] < [7]: true
{"5":5} > {"7":7}: false
{"5":5} < {"7":7}: false
{"1":1,"2":0,"3":1} > {"1":1,"3":0,"2":1}: true
{"1":1,"2":0,"3":1} < {"1":1,"3":0,"2":1}: true
{"1":1,"2":0,"3":1} > {"1":1,"3":1,"2":0}: false
{"1":1,"2":0,"3":1} < {"1":1,"3":1,"2":0}: false

Things work reasonably sensibly until we get to associative arrays, at which point PHP starts spouting nonsense.

I had a look around for something in the docs saying "don't do this", but what I did find kinda indicates there might indeed be a problem here (at least in the docs, if not in PHP's behaviour).

I've extracted this from "Comparison with Various Types":


Array with fewer members is smaller, if key from operand 1 is not found in operand 2 then arrays are uncomparable, otherwise - compare value by value (see following example)
(obviously that's my emphasis)

But this isn't so. See this example:

$a1 = ["a" => 1, "b" => 0, "c" => 1];
$a2 = ["a" => 1, "c" => 0, "b" => 1];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

All the keys are present in both: "a", "b", "c", but the comparison doesn't work. I suspect it's the case of  poor wording in the docs, and it's down to whether the keys are in the same order too.

Also I'm not sure that suggesting they're "uncomparable" is helpful wording here. To me that indicates one would expect an exception if one tried. It should say "then the result is unreliable, and the operation should not be used".

I also note there's a big, highlighted warning about comparing floats:



Perhaps the same treatment should be given to array comparisons. If it tripped-up Mr Lisachenko - who seems to be pretty bloody good & experienced PHP dev - then it's gonna trip-up a lot of people.

Bottom line I agree that the handling here is "unfortunate", but it is pretty predictable given how PHP tends to handle not-obvious things, so one kinda oughta expect it. Equally it could be remediated by a documentation improvement, and I don't think it's worth anyone putting any effort into to "fixing" it.

Righto.

--
Adam