Adam Cameron's Dev Blog: PHP

Showing posts with label PHP. Show all posts

Thursday 17 November 2016

PHP: decorating async GuzzleHttp calls

G'day:
A while back I looked at the decorator pattern in the context of some stuff we were doing:

That's all surprisingly straight forward and handy stuff.

We're revisiting some of this for a new implementation of our HTTP adapter library, which had everything baked into it before, and the decorating has had me scratching my head a bit as we're implementing an adapter for GuzzleHttp, and it does its requests asynchronously, and we want to leverage that.

Just to recap, previously we'd have this sort of thing:

namespace me\adamcameron\testApp;

use GuzzleHttp\Client;

class GuzzleAdapter {

    private $client;
    private $endPoint;

    public function __construct($endPoint){
        $this->endPoint = $endPoint;
        $this->client = new Client();
    }

    public function get($id){
        $response = $this->client->requestAsync(
            "get",
            $this->endPoint . $id
        );

        return $response;
    }

}

And say a logging decorator for that:

namespace me\adamcameron\testApp;

class LoggedGuzzleAdapter {

    private $adapter;
    private $logger;

    public function __construct($adapter, $logger) {
        $this->adapter = $adapter;
        $this->logger = $logger;
    }

    public function get($id){
        $this->logger->logMessage(sprintf("Requesting for %s", $id));

        $response = $this->adapter->get($id);

        $body = $response->getBody();
        $this->logger->logMessage(sprintf("Response for %s: %s", $id, $body));
        
        return $response;
    }

}

And we'd init our adapter thus:

$endPoint  = "http://cf2016.local:8516/cfml/misc/guzzleTestEndpoints/getById.cfm?id=";

$guzzleAdapter = new GuzzleAdapter($endPoint);
$logger = new LoggingService();
$adapter = new LoggedGuzzleAdapter($guzzleAdapter, $logger);

So the underlying GuzzleAdapter handles the Guzzle stuff, the LoggedGuzzleAdapter handles just the logging stuff, but defers to its GuzzleAdapter to do its part of the job, and keeps all the moving parts and bits of functionality sensibly separated. And it's pretty simple. And as detailed in those earlier articles, we can keep layering decorators around an adapter to add caching or what-have-you in a similar way. Easy. Nice.

However this only works cos the call to Guzzle actually returns the result on the spot. And this is cos we were using it synchronously: we make a call to it, it blocks until it gets the answer back from the target and gives us the answer.

Now that we're using async calls, Guzzle doesn't give us the answer, it just gives us a Promise which will eventually resolve to be the answer. This is in theory good cos it means the calling code can make a bunch of HTTP calls, and not wait around for each of them to resolve in series: Guzzle will actually make them all in parallel. I have a look at this in article "PHP: async requests using Guzzle and request pools".

If we go back to our decorator we can see the problem:

public function get($id){
    $this->logger->logMessage(sprintf("Requesting for %s", $id));

    $response$promisedResponse = $this->adapter->get($id);

    $body = $response$promisedResponse->getBody();
    $this->logger->logMessage(sprintf("Response for %s: %s", $id, $body));
    
    return $response$promisedResponse;
}

At the point at which our decorator needs the body... we don't have it yet. All we have is a promise that at some point we'll have a body (or we'll have response object via the resolved promise, anyhow; and the response object will have the body).

To get the body we first need to wait for the promise to resolve... which is a blocking operation and kinda defeats the purpose of using the async approach in the first place. IE: we could do this sort of thing:

public function get($id){
    $this->logger->logMessage(sprintf("Requesting for %s", $id));

    $promisedResponse = $this->adapter->get($id);
    $response = $promisedResponse->wait();
    
    $body = $response->getBody();
    $this->logger->logMessage(sprintf("Response for %s: %s", $id, $body));
    
    return $response;
}

But doing the wait immediately defeats the purpose of making the call async in the first place. We want the wait call to... err... wait... until the calling code says "right I need to actually use that data now".

Our initial attempt to sort this out was to analyse the issue as being one of "well we don't have the data we need until we call wait, so then we need to do the logging then... which means we need to intercept the wait call... which means we need to return our own decorated version of the response from the call to the adapter with its own wait and have a LoggedResponse, and return that from the code above..." But... and I hope my colleague doesn't mind me saying... when I saw this code in code review I kinda went "there must be a better, more semantic way of doing this". I won't repeat the code as I cannot use our actual work code in my blog, and I am not in the office right now and can't remember the detail of the implementation anyhow. But it's not ideal, so I don't want to share it anyhow.

I've been off sick for the last coupla days, which gives me a lot of time to deliberate and mess around with stuff and google a lot more than I allow myself time for when I'm in the office. With a wee bit of "standing back and having another think about it", the solution became clear. We don't need to explicitly override the wait method before we call it... the promise builds that capability in! At the time we're making the call, we get to tell the Promise what happens when it gets resolved. So I've knocked together this proof of concept:

public function get($id){
    $this->logger->logMessage(sprintf("(%s) Requesting for %s", $this->thisFile, $id));

    $response = $this->adapter->get($id);

    $response->then(function($response) use ($id){
        $body = $response->getBody();
        $this->logger->logMessage(sprintf("(%s) Response for %s: %s", $this->thisFile, $id, $body));
        $body->rewind();
    });

    return $response;
}

This was a bit of a head-slap moment - in that it took me a while to work out what to do - the whole thing about Promises is that one's able to chain functionality onto their lifecycle. if we wanna run some code once the Promise resolves, we just sling a then handler onto it.

This is a good solution IMO as it keeps all the logging decoration in one place, doesn't require any blocking of the async part of the operation, and is still very clear what's going on, even if the inner workings of Promises are a bit opaque (which TBH, they still are for me. Every time I work with 'em I have to re-read the docs. Which in Guzzle's case are pretty bloody awful, if I'm to be unkindly frank).

I chucked all this into a test rig to demonstrate to myself it was sound.

First up I contrived an endpoint in CFML for the PHP code to hit. I wanted to do this outside of PHP so it didn't in any way interfere with the PHP code running. It's simple:

<cfscript>
cfcontent(type="application/json");

writeLog(file="testApp", text="[ID: #URL.id#] request received");

sleep(5000);

writeOutput(serializeJson({"id"=URL.id, "retrieved"=now().dateTimeFormat("HH:nn:ss.lll")}));
writeLog(file="testApp", text="[ID: #URL.id#] response returned");
</cfscript>

This just takes an ID parameter and waits 5sec then returns it in a JSON packet. I wait 5sec to make the async-ness of the PHP calls easier to see. I'm also logging some stuff here to compare when the requests got to CF compared to when they were sent by PHP.

On the PHP end of things, I've got this:

use \me\adamcameron\testApp\GuzzleAdapter;
use \me\adamcameron\testApp\LoggedGuzzleAdapter;
use \me\adamcameron\testApp\LoggingService;

require_once __DIR__ . "/../vendor/autoload.php";

$endPoint = "http://cf2016.local:8516/cfml/misc/guzzleTestEndpoints/getById.cfm?id=";

$guzzleAdapter = new GuzzleAdapter($endPoint);
$logger = new LoggingService();
$adapter = new LoggedGuzzleAdapter($guzzleAdapter, $logger);

$ids = ["001", "002", "003", "004"];

$thisFile = basename(__FILE__);

$logger->logMessage(sprintf("(%s) Making requests...", $thisFile));

$responses = [];
foreach ($ids as $id){
    $logger->logMessage(sprintf("(%s) Requesting for %s", $thisFile, $id));
    $responses[] = $adapter->get($id);
}
$logger->logMessage(sprintf("(%s) Requests made", $thisFile));

$logger->logMessage(sprintf("(%s) Getting bodies from requests...", $thisFile));
foreach ($responses as $response){
    $logger->logMessage(sprintf("(%s) before calling wait()", $thisFile));
    $body = $response->wait()->getBody()->getContents();
    $logger->logMessage(sprintf("(%s) Response Body: %s", $thisFile, $body));
}
$logger->logMessage("Done");

This does the following:

creates the GuzzleAdapter;
uses a LoggedGuzzleAdapter to decorate it with some logging (the LoggingService just wraps some Monolog stuff);
loops over four IDs;
and requests their bumpf;
we then loop over the responses we've accumulated;
wait for them to resolve (calling wait blocks until they are);
and outputs the result;
all the way along, writing to the log at key points.

The actual LoggedGuzzleAdapter I'm using here is:

class LoggedGuzzleAdapter {

    private $adapter;
    private $logger;
    private $thisFile;

    public function __construct($adapter, $logger) {
        $this->adapter = $adapter;
        $this->logger = $logger;
        $this->thisFile = basename(__FILE__);
    }

    public function get($id){
        $this->logger->logMessage(sprintf("(%s) Requesting for %s", $this->thisFile, $id));

        $response = $this->adapter->get($id);

        $response->then(function($response) use ($id){
            $body = $response->getBody();
            $this->logger->logMessage(sprintf("(%s) Response for %s: %s", $this->thisFile, $id, $body));
            $body->rewind();
        });

        return $response;
    }

}

And the underlying GuzzleAdapter:

use GuzzleHttp\Client;

class GuzzleAdapter {

    private $client;
    private $endPoint;

    public function __construct($endPoint){
        $this->endPoint = $endPoint;
        $this->client = new Client();
    }

    public function get($id){
        $response = $this->client->requestAsync(
            "get",
            $this->endPoint . $id,
            ["http_errors" => true]
        );

        return $response;
    }

    public function getHandlerStack()
    {
        return $this->client->getConfig('handler');
    }

}

The things to note here are:

I'm logging the file name the log entry was made from. This is just to make it clearer in the logs which code is doing what for a given entry.
Just note the complete separation of concerns between the two files. One does logging. One does... "Guzzling". And they each focus on the job at hand, and that's it.

Here's the result. This is what PHP logged. I've trimmed-out some repetition & clutter, but have not changed any context or ordering of what got logged:

[14:15:53](makeRequests.php) Making requests...
[14:15:53](LoggedGuzzleAdapter.php) Requesting for 001
[14:15:53](LoggedGuzzleAdapter.php) Requesting for 002
[14:15:53](LoggedGuzzleAdapter.php) Requesting for 003
[14:15:53](LoggedGuzzleAdapter.php) Requesting for 004
[14:15:53](makeRequests.php) Requests made
[14:15:53](makeRequests.php) Getting bodies from requests...
[14:15:53](makeRequests.php) before calling wait()
[14:15:58](LoggedGuzzleAdapter.php) Response for 003{"retrieved":"14:15:58.450","id":"003"}
[14:15:58](LoggedGuzzleAdapter.php) Response for 002{"retrieved":"14:15:58.450","id":"002"}
[14:15:58](LoggedGuzzleAdapter.php) Response for 004{"retrieved":"14:15:58.450","id":"004"}
[14:15:58](LoggedGuzzleAdapter.php) Response for 001{"retrieved":"14:15:58.450","id":"001"}
[14:15:58](makeRequests.php) Response Body {"retrieved":"14:15:58.450","id":"001"}
[14:15:58](makeRequests.php) before calling wait()
[14:15:58](makeRequests.php) Response Body {"retrieved":"14:15:58.450","id":"002"}
[14:15:58](makeRequests.php) before calling wait()
[14:15:58](makeRequests.php) Response Body {"retrieved":"14:15:58.450","id":"003"}
[14:15:58](makeRequests.php) before calling wait()
[14:15:58](makeRequests.php) Response Body {"retrieved":"14:15:58.450","id":"004"}

Things to note:

first and foremost: the PHP code is indeed making the HTTP requests asynchronously. Remember the CFML code pauses for 5sec for each response, which means if we do four requests that's a total of 20sec paused time. But the whole process for all four requests takes only 5sec (CF is intrinsically multi-threaded, so from its side of things, it'll process all four requests simultaneously).
PHP makes all four requests within the same first second of execution.
After 5sec, the responses start to come in.
The first wait call bears the brunt of this 5sec wait. Bear in mind this is a blocking operation so PHP is now waiting for the 5sec-long response from CF to come back. But after that, all the responses start to come in before their equivalent wait calls, so those promises are resolved before the code even calls wait for them.
most importantly: the correct logging is taking place at the correct time. So the code actually works as intended.

One interesting thing here. it doesn't seem like the act of making the request actually triggers the request. It's the act of waiting that causes the requests to be sent (albeit it: all of them).

I stuck a ten second wait in my PHP code, between making the requests and asking about their responses:

$logger->logMessage(sprintf("(%s) Requests made", $thisFile));

sleep(10);

$logger->logMessage(sprintf("(%s) Getting bodies from requests...", $thisFile));

And here's the log for that run:

[14:47:58] (makeRequests.php) Making requests...
[14:47:58] (LoggedGuzzleAdapter.php) Requesting for 001
[14:47:58] (LoggedGuzzleAdapter.php) Requesting for 002
[14:47:58] (LoggedGuzzleAdapter.php) Requesting for 003
[14:47:58] (LoggedGuzzleAdapter.php) Requesting for 004
[14:47:58] (makeRequests.php) Requests made
[14:48:08] (makeRequests.php) Getting bodies from requests...
[14:48:08] (makeRequests.php) before calling wait()
[14:48:13] (LoggedGuzzleAdapter.php) Response for 004: {"retrieved":"14:48:13.704","id":"004"}
[14:48:13] (LoggedGuzzleAdapter.php) Response for 001: {"retrieved":"14:48:13.719","id":"001"}
[14:48:13] (LoggedGuzzleAdapter.php) Response for 003: {"retrieved":"14:48:13.719","id":"003"}
[14:48:13] (LoggedGuzzleAdapter.php) Response for 002: {"retrieved":"14:48:13.704","id":"002"}
[14:48:13] (makeRequests.php) Response Bodies: {"retrieved":"14:48:13.719","id":"001"}
[14:48:13] (makeRequests.php) before calling wait()
[14:48:13] (makeRequests.php) Response Bodies: {"retrieved":"14:48:13.704","id":"002"}
[14:48:13] (makeRequests.php) before calling wait()
[14:48:13] (makeRequests.php) Response Bodies: {"retrieved":"14:48:13.719","id":"003"}
[14:48:13] (makeRequests.php) before calling wait()
[14:48:13] (makeRequests.php) Response Bodies: {"retrieved":"14:48:13.704","id":"004"}
[14:48:13] Done

Notes:

the requests are all made @ 58.
you can see the 10sec sleep here. If the requests had actually been made, then they'd've all come back in that 10sec window.
but we can tell from the values coming back from ColdFusion that the requests weren't actually even made until 08 (as the 5sec delay is indicated in 13 in the returned timestamp
so it's the first call to wait that actually triggers the requests to be sent.

I'm not too sure I like that. Guzzle should crack on with it, as soon as it's asked, surely? Not wait around for me to ask "are we there yet?" Still... I don't think this will adversely impact what we're doing very often. In reality we're making a single call and then waiting for it immediately, almost all the time (this is to the extent I have actually questioned why we are doing this work, and why we're arsing about with Guzzle when we could just be using curl. But... oh well).

So that's the end of my most recent adventure with Guzzle and Promises and async calls and the like. I'm pleased with the resulting code. Not so pleased with how Guzzle works.

I've another two articles that cover the next challenge I faced with this stuff still to come. I've got the code for the second one written... but the topic for the third one only came to mind whilst writing this. So perhaps tomorrow I'll document my adventures of howTF to coerce Guzzle into raising the exceptions I choose for it to raise when requests come back with 400-500 responses, not its generic ones; and after that, how to implement a caching decorator for all this stuff. I've been off work sick for yesterday and today, but will be back in the office tomorrow. So I suspect the second article won't be out until Saturday now. I've had enough of being awake today, and am heading back to bed now. it is 3pm, after all ;-)

Righto.

--
Adam

Thursday 10 November 2016

PHP & CFML: I get thrown by how static variables work

G'day:
I like it when I demonstrate to myself I'm a bit thick. Today was one of those days. I've been working with PHP for a coupla years now, and pretty much have ported my OO knowledge over to it. Coming from a CFML background one thing I'm not so familiar with is static variables and methods. I "get" it, that's cool, but clearly I'm still mid-way up the learning curve, and some idiosyncrasies still elude me. I found one today.

I was doing some testing, and the situation I was in was that my test classes needed some subclassing... don't ask... it was legit but I cannot go into details and it'd bore you rigid anyhow. In my base class I had a static variable which I needed to have a different value for in the subclass (it was actually a sub-sub-class. Dumb tests). So I set it to a new value in my test method, then the test method called a method from the base class, and the test went splat cos it was still using the base class value. This confused the shit out of me at first. Here's a simplified example:

class GP {

    public static $test = "GP";
    
    public function fromGP(){
        return self::$test;
    }
    
    public function get(){
        return self::$test;
    }

}

class P extends GP {

    public static $test = "P";
    
    public function fromP(){
        return self::$test;
    }

}

class C extends P {

    public static $test = "C";
    
    public function fromC(){
        return self::$test;
    }

}

$c = new C();

echo "fromC: " . $c->fromC() . PHP_EOL;
echo "fromP: " . $c->fromP() . PHP_EOL;
echo "fromGP: " . $c->fromGP() . PHP_EOL;
echo "get: " . $c->get() . PHP_EOL;

So:

I've got a grandparent class (GP) which sets a static variable, and has two methods to get its value.
And a Parent class (P) that extends GP, with its own value for the static variable, and another different method to get its value.
And a Child class (C) which does the same thing, but to the Parent.
In my test rig I create a C instance, and call each of the methods.

Now cos I'm thick, I expected this output:

fromC: C
fromP: C
fromGP: C
get: C

Because C overrides the value for the static test variable, so the inherited methods should all use that value.

But what I did get is this:

fromC: C
fromP: P
fromGP: GP
get: GP

It took me a while to click. I changed my variable to be not static, and got the results I expected. Changed it back: back to the "wrong" (where "wrong" is "not what I wanted") answer. Then it dawned on me I guess. Static variables are bound to the class. I already knew this, but the ramifications didn't hit me initially. $c might be an object that inherits from P and GP, and obviously object variables will also follow inheritance rules, but static variables relate to the class, not the object. So in the class GP, in which the method fromGP is defined... self::$test refers to the static $test variable in that class. So: the value is GP. and on from there for fromP and down to fromC too (and get).

It made sense once I engaged my brain.

When I got home from work I decided to "ask" Java what it thought about this, and knocked together an analogous solution. Please pardon my shit Java skills.

package me.adamcameron.test;

public class GP {

    public static String test = "GP";
    
    public String fromGP(){
        return test;
    }
    
    public String get(){
        return test;
    }

}

package me.adamcameron.test;

public class P extends GP {

    public static String test = "P";
    
    public String fromP(){
        return test;
    }

}

package me.adamcameron.test;

public class C extends P {

    public static String test = "C";
    
    public String fromC(){
        return test;
    }

}

package me.adamcameron.test;

public class Test {

    public static void main(String[] args){
        C c = new C();
        
        System.out.println("fromC: " + c.fromC());
        System.out.println("fromP: " + c.fromP());
        System.out.println("fromGP: " + c.fromGP());
        System.out.println("get: " + c.get());
        
    }

}

And that - predictably - confirmed PHP's behaviour (not that I doubted PHP. Much):

>java me.adamcameron.test.Test
fromC: C
fromP: P
fromGP: GP
get: GP

Coolio. I'm also pleased I managed to write 20-odd lines of Java and get it right ~~first~~second time (it was a typo! ;-)

Last but ~~not~~ least, I remember that Lucee's flavour of CFML introduced static variables in Lucee 5, so I decided to see what it did:

component {

    static {
        test = "GP";
    }
    
    public function fromGP(){
        return static.test;
    }
    
    public function get(){
        return static.test;
    }
}

component extends=GP {

    static {
        test = "P";
    }
    
    public function fromP(){
        return static.test;
    }
    
    public function get(){
        return static.test;
    }
}

component extends=P {

    static {
        test = "C";
    }
    
    public function fromC(){
        return static.test;
    }
    
    public function get(){
        return static.test;
    }
}

<cfscript>
o = new C();

writeOutput("fromC: #o.fromC()#<br>");
writeOutput("fromP: #o.fromP()#<br>");
writeOutput("fromGP: #o.fromGP()#<br>");
writeOutput("get: #o.get()#<br>");
</cfscript>

When I ran this...

fromC: C fromP: C fromGP: C get: C

Hrm. So Lucee does what I originally wanted, but I now reckon this is not what it should be doing. I best go ask someone from LAS what the story is there.

Anyway that was that. Just a small lesson learned for me today.

Oh, btw... the solution was to just make the variable not static. Then me code worked fine. There really was no need for it to be static in the first place, on reflection.

Update:
It's important to read Kalle's comment (below) regarding static:: in PHP.

Righto.

--
Adam

Thursday 3 November 2016

PHP: different ways of defining one-off objects

G'day:
This one came from a Stack Overflow question I watched be raised, commented-on and summarily closed again because there are rules to follow, and by golly the community moderators on S/O looove following rules.

Anyway, the question got asked, but never answered.

'ere 'tis:

In PHP 7, what's the use of stdClass with the availability of anonymous classes?

As of PHP 7 we have access to anonymous classes. This means we could ditch stdClass when we need generic object as a return value... But which one would be cleaner?

What other differences are between those two guys?

Is there any performance drawback on creating an anonymous class? i.e. as each one has a different "class name", is there any additional impact?

Do both work "the same way" from the developer point of view, or are they actually two different beasts in some cases?
<?php
$std_obj = new stdClass(); //get_class($std_obj) == 'stdClass'
$anonymous = new class {}; //get_class($std_obj) == 'class@anonymous\0file.php0x758958a3s'

Good question. Well the bit about performance ain't, cos at this stage of one's exploration into these things - asking initial questions - one clearly doesn't have a performance issue to solve, so just don't worry about it.

OK, so I like messing around with this sort of thing, and I still need to learn a lot about PHP so I decided to refresh my memory of the various approaches here.

Firstly, the "anonymous classes" thing is a new feature in PHP 7, and allows one to model objects via a literal expression, much like how one can define a function via an expression (making a closure) rather than a statement. Here's an example:

$object = new class('Kate', 'Sheppard') {

    private $firstName;
    private $lastName;
    
    public function __construct($firstName, $lastName){
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }
    
    public function getFullName(){
        return "{$this->firstName} {$this->lastName}";
    }
};

See how go new class, and then proceed to define the class right there in the following block. Note this doesn't create a class... person1 does not contain a class; it creates an object. As evidenced by my subsequent usage of it:

echo $person1->getFullName();

Output:

Kate Sheppard

TBH, I dunno about the merits of this. In PHP one can already declare classes in the same file as other code, and more than one class in the same file. I guess it's handy for when one needs to define a one-off modelled object within another function or something? I dunno.

The other version mentioned was via stdclass, but let's not go there yet.

Now bear in mind the OP was asking about creating empty objects. One can do that with an anonymous class by just not giving the class definition any body:

$empty = new class(){};

Dumping that out gives:


object(class@anonymous)#1 (0) {
}

And with stdclass:

$empty = new stdClass();

Dumping that out gives:


object(stdClass)#1 (0) {
}

Not much to it. But I dunno if the intent in PHP is to use stdClass directly. The docs just say:

Created by typecasting to object.

It doesn't say "instantiate one of these to create an empty object". And that's where it'd say it, if that was the intent.

What it does allude to is this:

$empty = (object) [];

Dumping that out gives:


object(stdClass)#1 (0) {
}

(so same as with stdClass). To me the intended usage of stdClass is just one of inference. An array cast to an object needs to be of some type, so there's stdClass to accommodate that.

Also: how often does one want an empty object, and that's it? At least with casting an array to an object one can give it some properties off the bat:

$person = (object) [
    'firstName' => 'Kate,
    'lastName' => 'Sheppard'
];

One can add properties to the object afterwards with a new stdClass(), but that seems clunky.

Thinking about it, using new stdClass() seems about as right a way to do things as using array() to declare an array, as opposed to using a literal []. Bleah. Just use the literal.

But this all got me thinking though. Empty objects be damned. How can I implement the same as that anonymous class using these other syntaxes.

I managed to write some really crap code implementing analogues using stdclass and an object literal.

First the object literal, as it's not too awful:

$person2 = (object) [
    'firstName' => null,
    'lastName' => null,
    '__construct' => function ($firstName, $lastName) use (&$person2) {
        $person2->firstName = $firstName;
        $person2->lastName = $lastName;
    },
    'getFullName' => function () use (&$person2) {
        return "{$person2->firstName} {$person2->lastName}";
    }
];

It's not sooooo bad. Until one ones to try to use it. Check this out:

$person2->__construct('Emmeline', 'Pankhurst');
echo $person2->getFullName() . PHP_EOL;

This just gives:

PHP Fatal error:  Uncaught Error: Call to undefined method stdClass::__construct() in literalVsObjectCast.php:37
Stack trace:
#0 {main}
  thrown in literalVsObjectCast.php on line 37

Yikes. This turns out to be some weirdo order of operation things, and I can't quite remember what the story is, but it's down to how and what the function call is bound to the function, and the -> operator taking precedence over the () operator. Or something.

It's easily enough solved:

($person2->__construct)('Emmeline', 'Pankhurst');
echo ($person2->getFullName)() . PHP_EOL;

See how I've used parentheses there so the method calls are on the reference to the object/method. This works:

Emmeline Pankhurst

But it's unorthodox code, so I'll be avoiding that. Oh yeah, also notice the weirdness I had to do with the reference with the closure there. Plus having to reference the object's properties by this "external" reference. Still: good to be able to do it though.

We can go from "unorthodox" to "shit" now. There's the equiv using stdclass:

$person3 = new stdclass();
$person3->firstName = null;
$person3->initial = null;
$person3->lastName = null;
$person3->__construct = function ($firstName, $initial, $lastName) use (&$person3) {
    $person3->firstName = $firstName;
    $person3->initial = $initial;
    $person3->lastName = $lastName;
};
$person3->getFullName = function () use (&$person3) {
    return "{$person3->firstName} {$person3->initial}  {$person3->lastName}";
};

And again to run it, we need to use those extra parentheses:

($person3->__construct)('Susan', 'B', 'Anthony');
echo ($person3->getFullName)() . PHP_EOL;

Susan B Anthony

It's all a bit grim for words: the literal version was a bit awkward, but this as all the awkwardness as well as a lot of clutter (and, yes, this version also supports a middle initial, but that's not really the bulk of the bloat).

Another thing with these latter two versions is that - until the class literal version - firstName and lastName are public. I guess it's no big thing for throwaway objects, but it seems a bit "leaky" to me.

One of these JavaScript tricks we learn when trying to fit the oval peg into the round hole that is "doing OO in JS", is the Module Pattern: using an IIFE to emulate an object with private data. PHP 7 also added IIFEs, it already has closure, so we can do this in PHP too:

$person4 = (function($firstName, $lastName){
    return (object)[
        'getFullName' => function() use ($firstName, $lastName){
            return "$firstName $lastName";
        }
    ];
})('Vida', 'Goldstein');

And running it:

echo ($person4->getFullName)() . PHP_EOL;

Output:

Vida Goldstein

Now I reckon that's the tidiest syntax of all! Even beating the anonymous class version. But one is still stuck with those extra parentheses on the function call though. Pity.

In conclusion, I reckon this:

if you want an empty object... use (object)[]. Don't use a mechanism - anonymous classes - specifically intended to derive fully modelled objects, just to create empty objects.
If you want a throw-away object with a few public properties in it it: still use (object) []. It's tidier and less typing.
If you need some methods in there as well... go for the anonymous class literal; its usage syntax is more orthodox than using the IIFE.
But do remember PHP7 has IIFEs! They come in handy for things other than emulating JS design patterns.
I don't reckon one ought to directly use stdClass. It's clunky, leads to bloated code, and the docs kinda position it as a necessary implementation, not something that one's supposed to actively use.
But... generally speaking use a proper named class. These anonymous classes are not a replacement for them; they're for throw-away objects. You'll seldom need those.
Don't start an exercise worrying about performance. Write good clean code, using TDD, and worry about performance if it comes up. And differences like this are never gonna be where to make savings anyhow.

Thoughts? Obviously I'm just a relative n00b at PHP so I might be missing some important thing that seasoned PHPers all "get" but I've not come across yet. Lemme know.

Righto.

--
Adam

Saturday 27 August 2016

Friday puzzle: my PHP answer

G'day:
So as per my previous article - "Friday code puzzle" - here's my answer to the current puzzle.

I knocked the first version of this out over a beer whilst waiting for my flight to Galway, last night. Then scrapped that as being crap, and rewrote whilst in transit. And then... erm... back-filled my unit tests y/day evening. I feel guilty about not doing proper TDD, but... so be it. It's got test coverage now.

Just on TDD for a second. It was wrong of me not to do the testing first, and whilst I was just about to manufacture an excuse as to why not ("I didn't have PHPUnit installed on this machine", "I couldn't be arsed making a whole project out of it", "um [something about being offline on the flight]"), it was all bullshit laziness. I had to write tests to prove that my function would work outside of the immediate quiz requirements: the easiest way to do this is with unit tests and clear expectations etc. These tests were of limited benefit to me, given I already had the code written, and had already been through the trial-and-error process of getting the thing working correctly. The thing is, in a professional environment the tests aren't for you necessarily. They're for the next person who comes along to maintain your code. Or to troubleshoot your code when an edge-case crops up. It should not be down to the next person to write the tests for your code. That's lazy-arse, ego-centric shit. Do your job properly. I have encountered a coupla devs recently who think they're too good for tests, and refuse to do them properly. I don't want to work with people like that as they're all "me me me me".

But anyway... some code.

Well first a recap. Here's the "official" description from Jesse. But in summary:

We have this data set:

id,parentId,nodeText
1,,File
2,1,New
3,2,File
4,2,Folder
5,,Edit
6,5,Copy
7,5,Cut
8,5,Paste
9,,Help
10,9,About

This reflects hierarchical data (I would not represent a hierarchy that way, but that's a different story for another time), and we want to convert it to a hierarchical representation, eg:

[
  {
     nodeText : "File"
    ,children : [
      {
         nodeText : "New"
        ,children : [
           { nodeText: "File" }
          ,{ nodeText: "Folder" }
        ]
      }
    ]
  }
  ,{
     nodeText : "Edit"
    ,children : [
       {nodeText: "Copy"}
      ,{nodeText: "Cut"}
      ,{nodeText: "Paste"}
    ]
  }
  ,{
     nodeText : "Help"
    ,children : [
      { nodeText: "About" }
    ]
  }
]

There are a few "rules of engagement" at that link I mentioned, but that's the gist of it.

I decided I had better write some PHP code for a change, so knocked something together hastily. Initially I thought I might need to write some recursive monster to generate the hierarchy, but instead realised I did not need to do that, I just needed to track each "parent" node as I encountered it, and then append children to it as appropriate. Note that the data is sorted, so each record could be a parent of a subsequent node, and it will never be the case one will encounter a node before already having processed its parent. So all I needed to do is iterate over the data from top to bottom. Once. Nothing more tricky than that. Then by the time I had finished, the first parent would represent the entire tree. That was easy enough but then I had to convert it to JSON. Note the above representation is not JSON, but it's close enough, so that's what I decided to run with. As it turns out this was pretty easy to achieve in PHP as it has the ability to provide a custom serialiser for a class, and given I'd used a mix of associative and indexed arrays to build the data structure, it was simply a matter of returning the first parent node's children. Done.

Enough chatter, here's the code...

Update:

I have updated this from the first version I posted. This is slightly simpler, and also works even if the data is not pre-sorted. Thanks to Mingo for encouraging me to look into this.

<?php

namespace puzzle;

class Tree implements \JsonSerializable {

    private $parents = [];

    function __construct() {
        $tree = [
            "children" => []
        ];
        $this->parents[0] = $tree;
    }

    static function loadFromCsv($filePath) {
        $dataFile = fopen($filePath, "r");

        $tree = new Tree();
        while(list($id, $parent, $nodeText) = fgetcsv($dataFile)) {
            $tree->addNode($nodeText, $id, $parent);
        }

        return $tree;
    }

    private function addNode($nodeText, $id, $parent) {
        $parent = $parent === "" ? 0 : $parent;

        $this->parents[$id]["nodeText"] = $nodeText;
        $this->parents[$parent]["children"][] = &$this->parents[$id];
    }

    function jsonSerialize() {
        return $this->parents[0]["children"];
    }
}

The only other thing to note here is that the requirements indicated the data "should be in the form of an RDBMS resultset", but I could not be arsed horsing around with DB data for this, so I'm just reading a CSV file instead.

I also wrote the tests, as I said:

<?php

namespace test\puzzle;

use puzzle\Tree;

/** @coversDefaultClass \puzzle\Tree */
class TreeTest extends \PHPUnit_Framework_TestCase {

    private $testDir;

    function setup() {
        $this->testDir = realpath(__DIR__ . "/testfiles");
    }

    /**
     * @covers ::loadFromCsv
     * @covers ::__construct
     * @covers ::addNode
     * @covers ::jsonSerialize
     * @dataProvider provideCasesForLoadFromCsvTests
     */
    function testLoadFromCsv($baseFile){
        $files = $this->getFilesFromBase($baseFile);

        $result = Tree::loadFromCsv($files["src"]);
        $resultAsJson = json_encode($result);
        $resultAsArray = json_decode($resultAsJson);

        $expectedJson = file_get_contents($files["expectation"]);
        $expectedAsArray = json_decode($expectedJson);

        $this->assertEquals($expectedAsArray, $resultAsArray);
    }

    private function getFilesFromBase($base){
        return [
            "src" => sprintf("%s/%s/data.csv", $this->testDir, $base),
            "expectation" => sprintf("%s/%s/expected.json", $this->testDir, $base)
        ];
    }

    function provideCasesForLoadFromCsvTests(){
        return [
            "puzzle requirements" => ["testSet" => "puzzle"],
            "one element" => ["testSet" => "one"],
            "deep" => ["testSet" => "deep"],
            "flat" => ["testSet" => "flat"],
            "not ordered" =&gt ["testSet" =&gt "notOrdered"]
        ];
    }
}

There's no real surprise of discussion point there, beside highlighting the test cases I decided upon:

the puzzle requirements;
a single element;
a deep structure;
a flat structure.

I think those covered all the bases for a Friday Puzzle. An example of the data and expectation are thus (this is the "deep" example):

1,,Tahi
2,1,Rua
3,2,Toru
4,3,Wha
5,4,Rima
6,5,Ono
7,6,Whitu
8,7,Waru
9,8,Iwa
10,9,Tekau

[
  {
    "nodeText": "Tahi",
    "children": [
      {
        "nodeText": "Rua",
        "children": [
          {
            "nodeText": "Toru",
            "children": [
              {
                "nodeText": "Wha",
                "children": [
                  {
                    "nodeText": "Rima",
                    "children": [
                      {
                        "nodeText": "Ono",
                        "children": [
                          {
                            "nodeText": "Whitu",
                            "children": [
                              {
                                "nodeText": "Waru",
                                "children": [
                                  {
                                    "nodeText": "Iwa",
                                    "children": [
                                      {
                                        "nodeText": "Tekau"
                                      }
                                    ]
                                  }
                                ]
                              }
                            ]
                          }
                        ]
                      }
                    ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

All the rest are on GitHub.

I ran the code coverage report on the tests, and they're all green:

So that's that: surprisingly simple once I got into my head how to approach it.

Give it a blast. Reminder: the puzzle details are in this gist.

Righto.

--
Adam

Thursday 28 July 2016

PHP: I finally get around to seeing how PHP interacts with a DB

G'day:
This is "interesting" in that I've been a PHP dev for closing in on two years now, but as yet I've not had a requirement to connect to a DB and... like... query stuff. Crazy! This is all down to everything we consume coming from web services rather than a DB. We had other teams to do the DB stuff behind the web services. And now I find myself actually on one of those teams. So I better learn how to do my job.

A word of warning on this one. It's gonna be one of those "just me pootling about testing how stuff works", and not really offering much insight on things given this is my first "go" at any of this stuff. It's more a (delayed) stream of consciousness of my experimentation. Delayed cos I wrote the code last night, but am only getting a chance to write it up now.

OK, so first things first, one needs to enable a DB driver... and they're not enabled by default. This is in php.ini:


;extension=php_bz2.dll

extension=php_curl.dll

;extension=php_fileinfo.dll

;extension=php_gd2.dll

;extension=php_gettext.dll

;extension=php_gmp.dll

extension=php_intl.dll

;extension=php_imap.dll

;extension=php_interbase.dll

;extension=php_ldap.dll

extension=php_mbstring.dll

;extension=php_exif.dll      ; Must be after mbstring as it depends on it

;extension=php_mysqli.dll

;extension=php_oci8_12c.dll  ; Use with Oracle Database 12c Instant Client

extension=php_openssl.dll

;extension=php_pdo_firebird.dll

;extension=php_pdo_mysql.dll ; uncomment this

;extension=php_pdo_oci.dll

;extension=php_pdo_odbc.dll

;extension=php_pdo_pgsql.dll

;extension=php_pdo_sqlite.dll

;extension=php_pgsql.dll

;extension=php_shmop.dll

The second thing I needed to do was to give php.ini a fully-qualified path to my extensions directory:


; Directory in which the loadable extensions (modules) reside.

; http://php.net/extension-dir

; extension_dir = "./"

; On windows:

;extension_dir = "ext"

extension_dir = "C:\apps\php\7\ext"

Sometimes I need to do this on a PHP install on Windows; sometimes not. And it doesn't even seem to be uniform on a given machine! Last night I needed to give this machine the full path to the ext dir; I reversed this this evening whilst writing this, and it didn't give me problems just being "ext".

Anyway, if you get this error:


Fatal error: Uncaught PDOException: could not find driver

Then it's the driver not being enabled, or PHP not being able to find the extension.

Right. So some code.

First things first, to connect to a DB one needs a connection to do so. I've wrapped this up in a class for my purposes, to get it out of the way of the rest of my code:

class ConnectionFactory {

    private static $host = 'localhost';
    private static $port = '3306';
    private static $dbName = 'scratch';

    public static function createConnection() {
        $connectionString = sprintf('mysql:host=%s;port=%s;dbname=%s', self::$host, self::$port, self::$dbName);
        $dbConnection = new \PDO($connectionString, Credentials::$login, Credentials::$password);

        return $dbConnection;
    }
}

And to get my connection:

$dbConnection = ConnectionFactory::createConnection();

Oh yeah, my credentials are hidden away in another class still:

class Credentials {

    public static $login = 'scratch';
    public static $password = 'scratch';
}

I def don't want to be sharing that secret information around the place.

So the connection could easily be created with one line of code, inline:

$dbConnection = new \PDO("mysql:host=localhost;port=3306;dbname=scratch", "scratch", "scratch");

Phew. One way or another we got there.

OK, so to run a query from there is pretty easy:

$numbers = $dbConnection->query('SELECT * FROM numbers');

And from there I can do stuff with my numbers:

foreach ($numbers as $row) {
    printf('ID: %s: English: %s, Maori: %s%s', $row['id'], $row['en'], $row['mi'], PHP_EOL);
}

This outputs:


ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha
ID: 5: English: five, Maori: rima
ID: 6: English: six, Maori: ono
ID: 7: English: seven, Maori: whitu
ID: 8: English: eight, Maori: waru
ID: 9: English: nine, Maori: iwa
ID: 10: English: ten, Maori: tekau

It doesn't really get much easier than that.

And I'm sorry to my old CFMLer readers who still hang on to <cfquery> being a thing... no. Creating a connection and just running a query in two statements is easier than horsing around with admin config and tags and shit like that. And the code is nicer (a rare occasion of me saying something complimentary about PHP's "mise-en-scène").

That's a very overly simplistic example though. What about doing some filtering via a parameter in a WHERE clause?

The query method seems to just be very quick way of passing a static string to the DB and having it processed and returned. That's about it. To pass params, we need to do slightly more work. We need to prepare a statement and execute it:

$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->execute(['upperThreshold' => $argv[1]]);

$numbers = $preparedStatement->fetchAll();

(note that one doesn't need to fetchall: one can just get the result row-at-a-time with fetch, but that's something for another day).

And this runs thus, when I give it a "4" (using the same "view" code as I did last time):


>php C:\src\preparedStatement.php 4
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha

Process finished with exit code 0

That was easy.

Aside:

I thought passing data values in the SQL statement was a failing of CFML developers. I mean this sort of thing:

<cfquery>
    SELECT *
    FROM table
    WHERE someColumn = '#dataValue#'
</cfquery>

Unfortunately I am seeing this shit in PHP code too. The devs are saying "but we sanitise the value first so there's no SQLi risk". Completely missing the point that the SQL statement and its parameters are supposed to be kept separate so the statement can be compiled.

Sigh.

As well as just passing the parameter values to the execute call, one can also bind values or variables to a parameter. Binding a value isn't that interesting:

$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->bindValue(':upperThreshold', $upperThreshold, PDO::PARAM_INT);
$preparedStatement->execute();

The added bonus here is one can enforce a type on the value being bound too.

Binding a variable is more interesting. Check this out:

$upperThreshold = 1;

$dbConnection = ConnectionFactory::createConnection();
$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->bindParam(':upperThreshold', $upperThreshold, PDO::PARAM_INT);

$upperThreshold = 10;

$preparedStatement->execute();
$numbers = $preparedStatement->fetchAll();

Here I'm binding the variable $upperThreshold to the param. But it's binding a reference so by the time we run this query, the bound value is 10, not the 1 that it was when the bind was first made.

It occurs to me now (as I type) that I've not tested binding a not-yet-declared variable. Let's try a variation:

$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->bindParam(':upperThreshold', $upperThreshold, PDO::PARAM_INT);

$upperThreshold = 4;

$preparedStatement->execute();

Here I bind the $upperThreshold variable before I even declare it. And this all works fine!

>php.exe C:\src\preparedStatementBindParamWithNoVariable.php
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha

Process finished with exit code 0

So that's quite cool. Kinda.

The next thing I looked at was a simple transaction proof of concept:

$paramArray = ['id' => $argv[1]];

$dbConnection = ConnectionFactory::createConnection();

$dbConnection->beginTransaction();
$deleteStatement = $dbConnection->prepare('DELETE FROM numbers WHERE id=:id');
$deleteStatement->execute($paramArray);

$selectStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= (:id + 1)');
$selectStatement->execute($paramArray);
$numbers = $selectStatement->fetchAll();

echo 'In transaction:' . PHP_EOL;
include __DIR__ . '/../view/numbers.php';

$dbConnection->rollBack();

$selectStatement->execute($paramArray);
$numbers = $selectStatement->fetchAll();

echo 'After rollback:' . PHP_EOL;
include __DIR__ . '/../view/numbers.php';

Here we do this:

start a transaction;
delete a record;
run a query which shows the deletion having taken place;
rollback the transation;
perform the same query again, and the deletion has been rolled back.

Easy. I'd rather see some capability to explicitly hold onto the transaction until it was released. I don't think the act of rolling back should intrinsically mark the resolution of the transaction? One should be able to perform actions and rollback as often as one wants in one single transaction, shouldn't one? It's not something I've ever had to try, that said.

Oh, and the output:


>php.exe C:\src\transaction.php 4
In transaction:
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 5: English: five, Maori: rima
After rollback:
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha
ID: 5: English: five, Maori: rima

Process finished with exit code 0

Excellent.

Lastly I had to remind myself how to create a stored proc in MySQL, and I managed a pretty stupid one, but one which works for my example:

delimiter $$

CREATE DEFINER=`root`@`localhost` PROCEDURE `getThingsById`(id INT)
BEGIN
  SELECT * FROM numbers n WHERE n.id = id;
  SELECT * FROM colours c WHERE c.id = id;
  SELECT * FROM days d WHERE d.id = id;
END$$

This returns three record sets, each containing the number, colour or day of the week of the passed-in ID. It's nonsense, I know.

To call a proc, one still uses a prepared statement:

$id = $argv[1];

$dbConnection = ConnectionFactory::createConnection();

$preparedStatement = $dbConnection->prepare('call getThingsById(:id)');
$preparedStatement->execute(['id' => $id]);

$resultSet = $preparedStatement->fetchAll();
include __DIR__ . '/../view/general.php';

$preparedStatement->nextRowset();

$resultSet = $preparedStatement->fetchAll();
include __DIR__ . '/../view/general.php';

$preparedStatement->nextRowset();

$resultSet = $preparedStatement->fetchAll();
include __DIR__ . '/../view/general.php';

general.php:

foreach ($resultSet as $row) {
    printf('ID: %s: English: %s, Maori: %s%s', $row['id'], $row['en'], $row['mi'], PHP_EOL);
}

This is all pretty familiar, except the thing to note is the nextRowset call, which moves between the recordsets returned by the proc call.

Oh yeah... I keep forgetting... the output:


>php.exe C:\src\php\doctrine.local\scripts\callProc.php 3
ID: 3: English: three, Maori: toru
ID: 3: English: yellow, Maori: kowhai
ID: 3: English: Wednesday, Maori: Rāapa

That's all I've looked at so far. All in all it all seems uncharacteristically sensibtle for PHP, I must say.

And now I've been sitting in this hotel bar typing for the best (?) part of four hours. Time to focus on finishing this beer and going to bed I think.

Righto.

--
Adam

Wednesday 27 July 2016

PHP / Silex / Dependency injection: should I use a reference to a method?

G'day:
This is another example of where I want to put a question out there, but it needs more space that Twitter will allow. I guess I'll put it on Stack Overflow too.

We use Silex and Pimple's DI container. Generally in our service providers we expose references to objects, eg:

$app['service.something'] = $app->share(function ($app) {
    return new SomethingService();
});

Then usage of that is predictable:

$result = $app['service.something']::someMethod(1, 2, 3);

But here's the thing. As with the case above, the method is actually a static one. So it seems "odd" to be calling it on an instance of the SomethingService, rather than on the class. As coincidence would have it, this is the only public method in SomethingService too.

So we've done this sort of thing in our service provider:

$app['something'] = $app->protect(function ($arg1, $arg2, $etc) {
    return SomeClass::someMethod($arg1, $arg2, $etc);
});

Usage:

$result = $app['something'](1, 2, 3);

Now this works OK, but I have a coupla hesitations:

the name $app['something'] is a bit noun-y. As it's a method it ought to be verb-y IMO, eg: $app['doTheThing']. I guess that's small fish.
I'm just not sure if it's particularly "semantic" to be exposing just methods like that. All usage of Pimple I've seen has been to expose dependent objects.

I think we'll run with what we've got, but I kinda want to get more thoughts on if this is the right approach. And if so (or if not) why not. There might be some gotchas here I'm not thinking of, or something.

Cheers.

--
Adam

Monday 25 July 2016

PHP: re-call the same method for the same result, or use an intermediary variable?

G'day:
I could not fit this in a Twitter message, so I'll clutter up this joint instead. And, hey, it could use some content, I know.

I recently came across some PHP code (this happens to me a lot these day), which was along these lines:

$someObj->someMethod()['someKey']->doSomething();
$someObj->someMethod()['someOtherKey']->doSomethingElse();
$someObj->someMethod()['anotherKey']->performAnotherAction();
$someObj->someMethod()['lastKey']->doLastThing();

Obviously I've changed the names to protect the in~~nocent~~terloper, but this is pretty much what's going on in the code. On four consecutive statements the same method is being called four times, returning the same result, which is then used for further processing.

If this was my code, I'd see no point in repeating myself all over again, instead just ask the question once, and reuse the answer:

$descriptiveName = $someObj->someMethod();
$descriptiveName['someKey']->doSomething();
$descriptiveName['someOtherKey']->doSomethingElse();
$descriptiveName['anotherKey']->performAnotherAction();
$descriptiveName['lastKey']->doLastThing();

To me calling a method is work, and I'd like to avoid work if possible. Just as my colleagues.

However this is not the only occurrence of the former approach I've seen, so perhaps I am missing something? Is there some benefit to desirable behaviour from calling the same method for the same answer repeatedly?

Note I'm not so fussed about performance in situations like this, although I would have expect calling a method multiple times would have an inconsequential performance hit compared to an intermediary variable. Is this not the case? But like I said... whether or not it might be is not really a consideration here. It just seems like doing work for the sake of it?

Thoughts?

For the sake of conversation and broadening everyone's knowledge... in your own language of choice if it's not PHP: what would your answer to this be?

--
Adam

Wednesday 4 May 2016

PHP: constants vs private static variables and understanding the intent of code

G'day:
I was looking at some code the other day, and got into a conversation with the author about it. I can't show you the actual code (and, indeed, I'll be fictionalising the content of the conversation for editorial reasons), but here's a contrived reproduction of the situation.

There is a function:

function someFunction($arg1, $arg2) {
    $intermediaryValue = $this->firstDependency->prepare($arg1);
    return $this->secondDependency->process($intermediaryValue, $arg2);
}

Don't worry about that, what I was looking at were the tests for the function. Those dependencies needed to be mocked-out, and for the purposes of this function, it's important that what's returned from someFunction is exactly the result of the process method. So we need to specifically test for that.

class SomeClassTest extends \PHPUnit_Framework_TestCase {

    const MOCKED_PREPARE_RESULT = 'MOCKED_PREPARE_RESULT';
    const MOCKED_PROCESS_RESULT = 'MOCKED_PROCESS_RESULT';

    private $mockedFirstDependency;
    private $mockedSecondDependency;
    private $someObject;
    
    
    function setup(){
        $this->mockedFirstDependency = $this->getMockedFirstDepenedency();
        $this->mockedSecondDependency = $this->getMockedSecondDepenedency();
        $this->someObject = new SomeClass($mockedDependency);
    }
    
    /**
    * @dataProvider provideCasesForSomeFunctionTests
    */
    function testSomeFunction($testArg1, $testArg2){
        $this->mockedFirstDependency->method('prepare');
            ->with($testArg1, $testArg2)
            ->willReturn(self::MOCKED_PREPARE_RESULT);
            
        $result = $this->someObject->someFunction($testArg1, $testArg2);
        
        $this->assertEquals(self::MOCKED_PROCESS_RESULT, $result);
        
    }
    
    function provideCasesForSomeFunctionTests(){
        return [
            'first variation' => ['testArg1'=>'test value 1', 'testArg2'=>'test value 2'],
            'second variation' => ['testArg1'=>'test value 3', 'testArg2'=>'test value 4']
        ];
    }
    
    private function getMockedFirstDependency(){
        $mockedFirstDependency = $this->getMockBuilder('\some\namespace\FirstDependency')
            ->setMethods(['prepare'])
            ->getMock();
            
        return $mockedFirstDependency;
    }
    
    private function getMockedSecondDependency(){
        $mockedSecondDependency = $this->getMockBuilder('\some\namespace\SecondDependency')
            ->setMethods(['process'])
            ->getMock();
        
        $mockedSecondDependency->method('process')
            ->with(self::MOCKED_PREPARE_RESULT)
            ->willReturn(self::MOCKED_PROCESS_RESULT);
            
        return $mockedSecondDependency;
    }

}

What I was specifically looking at was the usage of constants here. Chiefly I was wondering "why are those constants?" I scanned the rest of the code and they were only ever used in this class (and remember it's a test class), and the names and the values are very specific to their usage in said class.

So I asked. The reason why they're constants is because they're unchanging values. Hmmm. Well... that doesn't wash, because a variable that one simply doesn't change also doesn't change value. So that's not valid. There is more to that necessary to determine something ought to be a constant.

Firstly, in PHP - until PHP7 anyhow - constants are intrinsically public. And according to the principle of encapsulation and the notion of information hiding, the only thing in a class that ought to be public is stuff that is specifically supposed to be public. IE: the API to the class. By inference, if you make something public, it's a flag saying "use this publicly... it's there specifically for you to do that". That is not valid here, so it's invalid for the value to be public, so - even if the value/usage was appropriate to be a constant - it kinda can't be one.

Secondly... the usage here simply isn't that of a constant. The rationalisation that its value is a constant/static/literal value therefore it should be a constant is specious. That's not how it works. Let's have a look at how we ended up with this constant being created.

Firstly we had this pre-refactoring version of the code:

function testSomeFunction($testArg1, $testArg2){
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn('MOCKED_PREPARE_RESULT');
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals('MOCKED_PROCESS_RESULT', $result);
    
}

// [...]

private function getMockedSecondDependency(){
    $mockedSecondDependency = $this->getMockBuilder('\some\namespace\SecondDependency')
        ->setMethods(['process'])
        ->getMock();
    
    $mockedSecondDependency->method('process')
        ->with('MOCKED_PREPARE_RESULT')
        ->willReturn('MOCKED_PROCESS_RESULT');
        
    return $mockedSecondDependency;
}

Note how we have a coupled duplicated values there: MOCKED_PREPARE_RESULT and MOCKED_PROCESS_RESULT. That's not DRY, so obviously there's some refactoring to do. Before we refactor, let's pause for a second. Imagine the code was like this:

function testSomeFunction($testArg1, $testArg2){
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn('GENERIC_DUMMY_VALUE');
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals('GENERIC_DUMMY_VALUE', $result);
    
}

We have the same duplication twice (he says, being deliberately tautological) in the same function. What do we do? We extract it to a variable:

function testSomeFunction($testArg1, $testArg2){
    $genericDummyValue = 'GENERIC_DUMMY_VALUE';
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn($genericDummyValue);
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals($genericDummyValue, $result);
    
}

Now all things being equal, that's a fine enough solution. If it was in a "real" class as opposed to just a test one might conclude that having a magic value in the middle of the code like that might be poor play, but in this case it's fine.

But in our case, the usage of the values is spread across two functions, so using function-local variables doesn't help it. What do we do when we need the same variable across more than one function? We hoist it up to be a class variable (well, two: $mockedPrepareResult and $mockedProcessResult):

private $mockedPrepareResult = 'MOCKED_PREPARE_RESULT';
private $mockedProcessResult = 'MOCKED_PROCESS_RESULT';

// [...]

function testSomeFunction($testArg1, $testArg2){
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn($this->mockedPrepareResult);
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals($this->mockedProcessResult, $result);
    
}

// [...]

private function getMockedSecondDependency(){
    $mockedSecondDependency = $this->getMockBuilder('\some\namespace\SecondDependency')
        ->setMethods(['process'])
        ->getMock();
    
    $mockedSecondDependency->method('process')
        ->with($this->mockedPrepareResult)
        ->willReturn($this->mockedProcessResult);
        
    return $mockedSecondDependency;
}

And one might go one step further and conclude these variables are indeed class variables and not instance variables, so one makes them static as well:

private static $mockedPrepareResult = 'MOCKED_PREPARE_RESULT';

// [...]

    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn(self::$mockedPrepareResult);

Bear in mind that in common English "constant" and "static" are analogous, but in programming usage they are for the most part unrelated. A constant is an immutable value. a static value is one that is specific to the class it's in, rather than an instance of that class. There's no overlap in the two notions in either of those contexts the terms have meaning.

So as to why one would not then go one step further and change them from private static variables to constants... it's because that's a complete non-sequitur. At no point is there any indication that the usage of these values here implies that they're supposed to be constants. They're just variables, and in the usage of these variable it would be a logic error to change the value between separate references to the variable. This is analogous to this code:


$theTotal = $this->getTheTotal();
$theTotal += 1;
return $theTotal;

It'd be a logic error to increment $theTotal by one, so don't bloody do it. It's not down to a language construct to prevent this, it's down to the programmer not being a div.

I doubt any developer would do this:


final $theTotal = $this->getTheTotal();
$theTotal += 1;
return $theTotal;

(pre-supposing PHP has the final keyword, which it doesn't). Thus putting it down to the compiler to catch the logic error.

What's a constant then?

Well whether or something ought to be a constant is down to two things. First the constant's value. It's not so much that "one oughtn't change it" (although preventing that is a side effect of something being a constant), it's that the value of the constant doesn't change.

This is a sensible usage of constants:

class Numbers {

    const BRACE = 2;
    const DOZEN = 12;
    const SCORE = 20;
    const GROSS = 144;
    
    // [...]
}

It's not so much that one doesn't want to change the value of brace, it's that the value of what it is to be a brace doesn't change. It's constant. That's when one would use a constant.

The second criterion as to why one might make a value a constant is that one actively wants to expose it publicly. A good real-world example of constants are in Symfony 2's Response class:

class Response
{
    const HTTP_CONTINUE = 100;
    const HTTP_SWITCHING_PROTOCOLS = 101;
    const HTTP_PROCESSING = 102;            // RFC2518
    const HTTP_OK = 200;
    const HTTP_CREATED = 201;
    const HTTP_ACCEPTED = 202;
    const HTTP_NON_AUTHORITATIVE_INFORMATION = 203;
    const HTTP_NO_CONTENT = 204;
    const HTTP_RESET_CONTENT = 205;
    // etc you get the idea

So when creating responses in our controllers, we don't do stuff like:

return Response::create($twig->render('clientError.twig'), 404);

Instead we do this:

return Response::create($twig->render('clientError.twig'), Response::HTTP_NOT_FOUND);

It just make the code more clear as to what's going on.

Also note that these example usages are all publicly exposed, and that's their intended usage. In that Numbers class if we only ever used the notion of a BRACE internally, and didn't have reason to expose it (it's a not oft-used word, after all), they we'd not make a constant, we'd just have a private static variable.

Realistically constants are something that one generally won't want to use. It's only in those rare situations where one has a literal value which has a specific constant meaning, and could benefit from being given a human-friendly (/code-clarity-friendly) label. Most values one will want to expose from a class or an object will not be simple literal values, so one would be exposing them via methods.

And it's a misunderstanding of the intended purpose of constant to conclude they're for literal values you're unlikely to want to change (in just a logical sense). They're for literal values which don't. That's a different thing.

Righto.

--
Adam

Sunday 1 May 2016

PHP: Scaling the decorator pattern

G'day:
I while back I had a look at using the Decorator Pattern to simplify some code:

One again me mate Brian has come to the fore with a tweak to make this tactic an even more appealing prospect: leveraging PHP's "magic" __call method to simplify decorator code.

Let's have a look at one of the examples I used earlier:

class User {

    public $id;
    public $firstName;
    public $lastName;

    function __construct($id, $firstName, $lastName){
        $this->id = $id;
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }

}

class DataSource {

    function getById($id){
        return json_encode([
            'id' => $id,
            'firstName' => 'Zachary',
            'lastName' => 'Cameron Lynch',
        ]);
    }
}

class LoggerService {

    function logText($text){
        echo "LOGGED: $text" . PHP_EOL;
    }
}


class UserRepository {

    private $dataSource;

    public function __construct($dataSource) {
        $this->dataSource = $dataSource;
    }

    public function getById($id) {
        $userAsJson = $this->dataSource->getById($id);

        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);

        return $user;
    }

}

class LoggedUserRepository {

    private $repository;
    private $loggerService;

    public function __construct($repository, $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

}

$dataSource = new DataSource();
$userRepository = new UserRepository($dataSource);

$loggerService = new LoggerService();
$loggedUserRepository = new LoggedUserRepository($userRepository, $loggerService);

$user = $loggedUserRepository->getById(5);
var_dump($user);

That looks like a chunk of code, but the User, DataSource and LoggerService are just dependencies. The code you really wannna look at are the two repo variations: the basic UserRepository, and the decorated LoggedUserRepository.

This code all works fine, and outputs:

C:\src>php baseline.php
LOGGED: 5 requested
object(User)#6 (3) {
["id"]=>
int(5)
["firstName"]=>
string(7) "Zachary"
["lastName"]=>
string(13) "Cameron Lynch"
}

C:\src>

Fine.

But this is a very simple example: the repo has only one one method. So the decorator only needs to implement one method. But what if the repo has ten public methods? Suddenly the decorator is getting rather busy, as it needs to implement wrappers for those methods too. Yikes. This is made even worse if the decorator is only really interested in decorating one method... it still needs those other nine wrappers. And it'd be getting very boiler-plate-ish to have to implement all these "empty" wrapper methods.

Let's add a coupla more methods to our repo:

class UserRepository {

    private $dataSource;

    public function __construct($dataSource) {
        $this->dataSource = $dataSource;
    }

    public function getById($id) {
        $userAsJson = $this->dataSource->getById($id);

        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);

        return $user;
    }

    public function getByFilters($filters) {
        $usersAsJson = $this->dataSource->getByFilters($filters);

        $rawUsers = json_decode($usersAsJson);
        $users = array_map(function($rawUser){
            return new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);
        }, $rawUsers);

        return $users;
    }

    public function create($firstName, $lastName){
        $rawNewUser = json_decode($this->dataSource->create($firstName, $lastName));
        return new User($rawNewUser->id, $rawNewUser->firstName, $rawNewUser->lastName);
    }
}

We have two new methods: getByFilters() and create(). Even if we don't want to log calls to those for some reason, we still need the LoggedUserRepository to implement "pass-through" methods for them:

class LoggedUserRepository {

    private $repository;
    private $loggerService;

    public function __construct($repository, $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

    public function getByFilters($filters) {
        return $this->repository->getByFilters($filters);
    }

    public function create($firstName, $lastName) {
        return $this->repository->create($firstName, $lastName);
    }

}

See how we still need methods for getByFilters and create? Suck. I mean it's not a huge amount of code, but given the LoggedUserRepository doesn't actually wanna log those methods, they're out of place.

About

I've been a web developer since 2001: 13yrs as a CFML developer; 6yrs as a PHP dev; and now back in the CFML community but focusing on code quality, design and testing. As of 2023, I am migrating my dev focus back to PHP again.

The code I write and discuss here is pretty much just looking at random conundrums I encounter in my day job.

I tend to be a bit "forthright" in my opinions, I am indelicate, and I tend to swear too much. This will come out occasionally here: I make no apology for it.

Everything said here is my own opinion. Feel free to disagree with me :-)