Wednesday, 4 May 2016

PHP: constants vs private static variables and understanding the intent of code

G'day:
I was looking at some code the other day, and got into a conversation with the author about it. I can't show you the actual code (and, indeed, I'll be fictionalising the content of the conversation for editorial reasons), but here's a contrived reproduction of the situation.

There is a function:

function someFunction($arg1, $arg2) {
    $intermediaryValue = $this->firstDependency->prepare($arg1);
    return $this->secondDependency->process($intermediaryValue, $arg2);
}

Don't worry about that, what I was looking at were the tests for the function. Those dependencies needed to be mocked-out, and for the purposes of this function, it's important that what's returned from someFunction is exactly the result of the process method. So we need to specifically test for that.

class SomeClassTest extends \PHPUnit_Framework_TestCase {

    const MOCKED_PREPARE_RESULT = 'MOCKED_PREPARE_RESULT';
    const MOCKED_PROCESS_RESULT = 'MOCKED_PROCESS_RESULT';

    private $mockedFirstDependency;
    private $mockedSecondDependency;
    private $someObject;
    
    
    function setup(){
        $this->mockedFirstDependency = $this->getMockedFirstDepenedency();
        $this->mockedSecondDependency = $this->getMockedSecondDepenedency();
        $this->someObject = new SomeClass($mockedDependency);
    }
    
    /**
    * @dataProvider provideCasesForSomeFunctionTests
    */
    function testSomeFunction($testArg1, $testArg2){
        $this->mockedFirstDependency->method('prepare');
            ->with($testArg1, $testArg2)
            ->willReturn(self::MOCKED_PREPARE_RESULT);
            
        $result = $this->someObject->someFunction($testArg1, $testArg2);
        
        $this->assertEquals(self::MOCKED_PROCESS_RESULT, $result);
        
    }
    
    function provideCasesForSomeFunctionTests(){
        return [
            'first variation' => ['testArg1'=>'test value 1', 'testArg2'=>'test value 2'],
            'second variation' => ['testArg1'=>'test value 3', 'testArg2'=>'test value 4']
        ];
    }
    
    private function getMockedFirstDependency(){
        $mockedFirstDependency = $this->getMockBuilder('\some\namespace\FirstDependency')
            ->setMethods(['prepare'])
            ->getMock();
            
        return $mockedFirstDependency;
    }
    
    private function getMockedSecondDependency(){
        $mockedSecondDependency = $this->getMockBuilder('\some\namespace\SecondDependency')
            ->setMethods(['process'])
            ->getMock();
        
        $mockedSecondDependency->method('process')
            ->with(self::MOCKED_PREPARE_RESULT)
            ->willReturn(self::MOCKED_PROCESS_RESULT);
            
        return $mockedSecondDependency;
    }

}

What I was specifically looking at was the usage of constants here. Chiefly I was wondering "why are those constants?" I scanned the rest of the code and they were only ever used in this class (and remember it's a test class), and the names and the values are very specific to their usage in said class.

So I asked. The reason why they're constants is because they're unchanging values. Hmmm. Well... that doesn't wash, because a variable that one simply doesn't change also doesn't change value. So that's not valid. There is more to that necessary to determine something ought to be a constant.

Firstly, in PHP - until PHP7 anyhow - constants are intrinsically public. And according to the principle of encapsulation and the notion of information hiding, the only thing in a class that ought to be public is stuff that is specifically supposed to be public. IE: the API to the class. By inference, if you make something public, it's a flag saying "use this publicly... it's there specifically for you to do that". That is not valid here, so it's invalid for the value to be public, so - even if the value/usage was appropriate to be a constant - it kinda can't be one.

Secondly... the usage here simply isn't that of a constant. The rationalisation that its value is a constant/static/literal value therefore it should be a constant is specious. That's not how it works. Let's have a look at how we ended up with this constant being created.

Firstly we had this pre-refactoring version of the code:

function testSomeFunction($testArg1, $testArg2){
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn('MOCKED_PREPARE_RESULT');
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals('MOCKED_PROCESS_RESULT', $result);
    
}

// [...]

private function getMockedSecondDependency(){
    $mockedSecondDependency = $this->getMockBuilder('\some\namespace\SecondDependency')
        ->setMethods(['process'])
        ->getMock();
    
    $mockedSecondDependency->method('process')
        ->with('MOCKED_PREPARE_RESULT')
        ->willReturn('MOCKED_PROCESS_RESULT');
        
    return $mockedSecondDependency;
}


Note how we have a coupled duplicated values there: MOCKED_PREPARE_RESULT and MOCKED_PROCESS_RESULT. That's not DRY, so obviously there's some refactoring to do. Before we refactor, let's pause for a second. Imagine the code was like this:

function testSomeFunction($testArg1, $testArg2){
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn('GENERIC_DUMMY_VALUE');
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals('GENERIC_DUMMY_VALUE', $result);
    
}

We have the same duplication twice (he says, being deliberately tautological) in the same function. What do we do? We extract it to a variable:

function testSomeFunction($testArg1, $testArg2){
    $genericDummyValue = 'GENERIC_DUMMY_VALUE';
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn($genericDummyValue);
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals($genericDummyValue, $result);
    
}

Now all things being equal, that's a fine enough solution. If it was in a "real" class as opposed to just a test one might conclude that having a magic value in the middle of the code like that might be poor play, but in this case it's fine.

But in our case, the usage of the values is spread across two functions, so using function-local variables doesn't help it. What do we do when we need the same variable across more than one function? We hoist it up to be a class variable (well, two: $mockedPrepareResult and $mockedProcessResult):

private $mockedPrepareResult = 'MOCKED_PREPARE_RESULT';
private $mockedProcessResult = 'MOCKED_PROCESS_RESULT';

// [...]

function testSomeFunction($testArg1, $testArg2){
    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn($this->mockedPrepareResult);
        
    $result = $this->someObject->someFunction($testArg1, $testArg2);
    
    $this->assertEquals($this->mockedProcessResult, $result);
    
}

// [...]

private function getMockedSecondDependency(){
    $mockedSecondDependency = $this->getMockBuilder('\some\namespace\SecondDependency')
        ->setMethods(['process'])
        ->getMock();
    
    $mockedSecondDependency->method('process')
        ->with($this->mockedPrepareResult)
        ->willReturn($this->mockedProcessResult);
        
    return $mockedSecondDependency;
}

And one might go one step further and conclude these variables are indeed class variables and not instance variables, so one makes them static as well:

private static $mockedPrepareResult = 'MOCKED_PREPARE_RESULT';

// [...]

    $this->mockedFirstDependency->method('prepare');
        ->with($testArg1, $testArg2)
        ->willReturn(self::$mockedPrepareResult);

Bear in mind that in common English "constant" and "static" are analogous, but in programming usage they are for the most part unrelated. A constant is an immutable value. a static value is one that is specific to the class it's in, rather than an instance of that class. There's no overlap in the two notions in either of those contexts the terms have meaning.

So as to why one would not then go one step further and change them from private static variables to constants... it's because that's a complete non-sequitur. At no point is there any indication that the usage of these values here implies that they're supposed to be constants. They're just variables, and in the usage of these variable it would be a logic error to change the value between separate references to the variable. This is analogous to this code:


$theTotal = $this->getTheTotal();
$theTotal += 1;
return $theTotal;

It'd be a logic error to increment $theTotal by one, so don't bloody do it. It's not down to a language construct to prevent this, it's down to the programmer not being a div.

I doubt any developer would do this:

final $theTotal = $this->getTheTotal();
$theTotal += 1;
return $theTotal;

(pre-supposing PHP has the final keyword, which it doesn't). Thus putting it down to the compiler to catch the logic error.

What's a constant then?

Well whether or something ought to be a constant is down to two things. First the constant's value. It's not so much that "one oughtn't change it" (although preventing that is a side effect of something being a constant), it's that the value of the constant doesn't change.

This is a sensible usage of constants:

class Numbers {

    const BRACE = 2;
    const DOZEN = 12;
    const SCORE = 20;
    const GROSS = 144;
    
    // [...]
}

It's not so much that one doesn't want to change the value of brace, it's that the value of what it is to be a brace doesn't change. It's constant. That's when one would use a constant.

The second criterion as to why one might make a value a constant is that one actively wants to expose it publicly. A good real-world example of constants are in Symfony 2's Response class:

class Response
{
    const HTTP_CONTINUE = 100;
    const HTTP_SWITCHING_PROTOCOLS = 101;
    const HTTP_PROCESSING = 102;            // RFC2518
    const HTTP_OK = 200;
    const HTTP_CREATED = 201;
    const HTTP_ACCEPTED = 202;
    const HTTP_NON_AUTHORITATIVE_INFORMATION = 203;
    const HTTP_NO_CONTENT = 204;
    const HTTP_RESET_CONTENT = 205;
    // etc you get the idea

So when creating responses in our controllers, we don't do stuff like:

return Response::create($twig->render('clientError.twig'), 404);

Instead we do this:

return Response::create($twig->render('clientError.twig'), Response::HTTP_NOT_FOUND);

It just make the code more clear as to what's going on.

Also note that these example usages are all publicly exposed, and that's their intended usage. In that Numbers class if we only ever used the notion of a BRACE internally, and didn't have reason to expose it (it's a not oft-used word, after all), they we'd not make a constant, we'd just have a private static variable.

Realistically constants are something that one generally won't want to use. It's only in those rare situations where one has a literal value which has a specific constant meaning, and could benefit from being given a human-friendly (/code-clarity-friendly) label. Most values one will want to expose from a class or an object will not be simple literal values, so one would be exposing them via methods.

And it's a misunderstanding of the intended purpose of constant to conclude they're for literal values you're unlikely to want to change (in just a logical sense). They're for literal values which don't. That's a different thing.

Righto.

--
Adam

Sunday, 1 May 2016

PHP: Scaling the decorator pattern

G'day:
I while back I had a look at using the Decorator Pattern to simplify some code:


One again me mate Brian has come to the fore with a tweak to make this tactic an even more appealing prospect: leveraging PHP's "magic" __call method to simplify decorator code.

Let's have a look at one of the examples I used earlier:

class User {

    public $id;
    public $firstName;
    public $lastName;

    function __construct($id, $firstName, $lastName){
        $this->id = $id;
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }

}

class DataSource {

    function getById($id){
        return json_encode([
            'id' => $id,
            'firstName' => 'Zachary',
            'lastName' => 'Cameron Lynch',
        ]);
    }
}

class LoggerService {

    function logText($text){
        echo "LOGGED: $text" . PHP_EOL;
    }
}


class UserRepository {

    private $dataSource;

    public function __construct($dataSource) {
        $this->dataSource = $dataSource;
    }

    public function getById($id) {
        $userAsJson = $this->dataSource->getById($id);

        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);

        return $user;
    }

}

class LoggedUserRepository {

    private $repository;
    private $loggerService;

    public function __construct($repository, $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

}

$dataSource = new DataSource();
$userRepository = new UserRepository($dataSource);

$loggerService = new LoggerService();
$loggedUserRepository = new LoggedUserRepository($userRepository, $loggerService);

$user = $loggedUserRepository->getById(5);
var_dump($user);

That looks like a chunk of code, but the User, DataSource and LoggerService are just dependencies. The code you really wannna look at are the two repo variations: the basic UserRepository, and the decorated LoggedUserRepository.

This code all works fine, and outputs:

C:\src>php baseline.php
LOGGED: 5 requested
object(User)#6 (3) {
  ["id"]=>
  int(5)
  ["firstName"]=>
  string(7) "Zachary"
  ["lastName"]=>
  string(13) "Cameron Lynch"
}

C:\src>

Fine.

But this is a very simple example: the repo has only one one method. So the decorator only needs to implement one method. But what if the repo has ten public methods? Suddenly the decorator is getting rather busy, as it needs to implement wrappers for those methods too. Yikes. This is made even worse if the decorator is only really interested in decorating one method... it still needs those other nine wrappers. And it'd be getting very boiler-plate-ish to have to implement all these "empty" wrapper methods.

Let's add a coupla more methods to our repo:

class UserRepository {

    private $dataSource;

    public function __construct($dataSource) {
        $this->dataSource = $dataSource;
    }

    public function getById($id) {
        $userAsJson = $this->dataSource->getById($id);

        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);

        return $user;
    }

    public function getByFilters($filters) {
        $usersAsJson = $this->dataSource->getByFilters($filters);

        $rawUsers = json_decode($usersAsJson);
        $users = array_map(function($rawUser){
            return new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);
        }, $rawUsers);

        return $users;
    }

    public function create($firstName, $lastName){
        $rawNewUser = json_decode($this->dataSource->create($firstName, $lastName));
        return new User($rawNewUser->id, $rawNewUser->firstName, $rawNewUser->lastName);
    }
}

We have two new methods: getByFilters() and create(). Even if we don't want to log calls to those for some reason, we still need the LoggedUserRepository to implement "pass-through" methods for them:

class LoggedUserRepository {

    private $repository;
    private $loggerService;

    public function __construct($repository, $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

    public function getByFilters($filters) {
        return $this->repository->getByFilters($filters);
    }

    public function create($firstName, $lastName) {
        return $this->repository->create($firstName, $lastName);
    }

}


See how we still need methods for getByFilters and create? Suck. I mean it's not a huge amount of code, but given the LoggedUserRepository doesn't actually wanna log those methods, they're out of place.

Saturday, 30 April 2016

PHP: Strategy pattern to defer implementation details

G'day:
Once again my Saturday evening is set in the local pub (well: the one in Galway I go to when I'm here anyhow. I go to this place more than I do my "local" at home), with a Guinness and the laptop in front of me. I've not actually done this for a while, coming to think of it.

Right, so today's random code exercise is brought to me via Stack Overflow, with some bod asking a question about how to decouple knowledge of the implementation of a task from the code that needs to execute the task. They ask it in a slightly different way, but that's the gist of it. Go read it anyhow.

This very thing has come up at work a bit recently... we have a tendency to have a single class which has various different implementations of the same task, but for different situational variations. We could all see this was a bit of a smell, but weren't initially sure how we should approach it. Enter my mate Brian and his ability to remember things that he's read (damn him), and at some stage he's read that - apparently turgid to the point of being close to unreadable - book "Design Patterns [etc]" which at least identifies a bunch of design patterns we should all be aware of when solving problems. I think I'll stick with the copy of "Head First Design Patterns" sitting on my desk. It's eminently readable.

Right, so this sounds like a case for the Strategy Pattern, which - according to Wikipedia - "… is a software design pattern that enables an algorithm's behavior to be selected at runtime". Basically you have a class which has a method to perform a task, but the implementation of the task is handed off to a collaborator which is implemented separately.

By way of example I'm gonna use a very contrived situation to demonstrate the technique. NB: this is different from my answer on Stack Overflow, but amounts to the same thing. Here we the notion of a Name, and a NameDecorator; and the requirement is to render the name in two ways: plain text as "[lastName], [firstName]" and with a mark-up template to effect something like "[firstName] [lastName]". A naïve implementation might be as follows:

First we just have a very generic Name class:

class Name {

    public $firstName;
    public $lastName;

    function __construct($firstName, $lastName){
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }
}

That's not really part of the problem here, but the code revolves around this. And now our Decorator class:

class NameDecorator {

    private $name;
    private $fullNameTemplate;

    function __construct($name, $fullNameTemplate){
        $this->name = $name;
        $this->fullNameTemplate = $fullNameTemplate;
    }

    function renderFullNameAsPlainText(){
        return sprintf('%s, %s', $this->name->lastName, $this->name->firstName);
    }

    function renderFullNameWithHtmlTemplate(){
        return sprintf($this->fullNameTemplate, $this->name->firstName, $this->name->lastName);
    }
}

And sample usage:

$name = new Name('Zachary', 'Cameron Lynch');
$template = '%s <strong>%s</strong>';

$nameDecorator = new NameDecorator($name, $template);

$rendered = $nameDecorator->renderFullNameAsPlainText();
echo "Simple rendering: $rendered<br>";

$rendered = $nameDecorator->renderFullNameWithHtmlTemplate();
echo "Rendering with template: $rendered<br>";

And result:
Simple rendering: Cameron Lynch, Zachary
Rendering with template: Zachary Cameron Lynch


Here's the problem. Sometimes we need to render the name using a template: sometimes we don't. Some entire usages of this decorator will never need the template, but we still need to initialise the decorator with it.

This could be easily solved by passing the template with the rendering call, instead of init-ing the whole object with it:

function renderFullNameWithHtmlTemplate($template){
    return sprintf($template, $this->name->firstName, $this->name->lastName);
}

And if that was the full requirement, that'd be fine. But the requirement is such that we need to call this code from other code which doesn't want to fuss about with templates and which method to call, it just wants to call a render method and dependent on how the decorator has been configured, will use whatever rendering implementation the decorator has decided upon.

We could solve this with an abstract notion of what the NameDecorator is, and have concrete implementations for each variation:

abstract class NameDecorator {

    protected $name;

    function __construct($name){
        $this->name = $name;
    }

    abstract public function render();
}

class PlainTextNameDecorator extends NameDecorator {

    function render(){
        return sprintf('%s, %s', $this->name->lastName, $this->name->firstName);
    }
}

class TemplatedNameDecorator extends NameDecorator {

    protected $fullNameTemplate;

    function __construct($name, $fullNameTemplate){
        parent::__construct($name);
        $this->fullNameTemplate = $fullNameTemplate;
    }

    function render(){
        return sprintf($this->fullNameTemplate, $this->name->firstName, $this->name->lastName);
    }
}

Now we just use separate Decorator classes: PlainTextNameDecorator and TemplatedNameDecorator, but we have the one unified render() method:

$name = new Name('Zachary', 'Cameron Lynch');

$nameDecorator = new PlainTextNameDecorator($name);
$rendered = $nameDecorator->render();
echo "Simple rendering: $rendered<br>";

$template = '%s <strong>%s</strong>';
$nameDecorator = new TemplatedNameDecorator($name, $template);
$rendered = $nameDecorator->render();
echo "Rendering with template: $rendered<br>";

TBH, if the full requirement was as I've stated here, that'd probably do. But... same as with my discussion "Decorator Pattern vs simple inheritance", it's not very scalable. This works for one variation, but the number of inherited classes gets out of hand very quickly with each method that potentially needs its implementation abstracted. For example if one had a Person class and two methods which need handling: renderName() and renderAddress()... to cover the bases with "plain text" and "templated" options, one would need classes like this:
  • PlainNamePlainAddressPersonDecorator
  • PlainNameTemplatedAddressPersonDecorator
  • TemplatedNamePlainAddressPersonDecorator
  • TemplatedNameTemplatedAddressPersonDecorator

And if we wanted to include a phone number? Eight classes. Basically it's m^n classes, where m is the ways to decorate things, and n is the number of items needing decoration. So if we also had a need to provide for JSON decoration for these three: up from eight to 27 possible variations. Yikes. Not. Scalable.

A more scalable approach is to use composition over inheritance (the more I program, the more I find this to be the case), and use the Strategy Pattern. This basically means you extract just the method functionality into a collaborator, and just tell your main class about the collaborator.

class NameDecorator {

    private $renderingHandler;
    private $name;

    function __construct($name, $renderingHandler){
        $this->name = $name;
        $this->renderingHandler = $renderingHandler;
    }

    function render(){
        return $this->renderingHandler->render($this->name->firstName, $this->name->lastName);
    }
}

Here NameDecorator knows it can render stuff, but it itself has no idea or care about the detail of how to render stuff. All it is doing is exposing a uniform API to the calling code: it can simply call render():

$rendered = $nameDecorator->render();
echo "Simple rendering: $rendered<br>";

Before doing that though, we need to initialise the decorator with a handler:

$nameDecorator = new NameDecorator($name, new DefaultRenderingHandler());

And it's the handler that knows its own various pre-requisites:

class DefaultRenderingHandler {

    function render($firstName, $lastName){
        return "$lastName, $firstName";
    }
}

class TemplatedRenderingHandler {

    private $template;

    function __construct($template){
        $this->template = $template;
    }

    function render($firstName, $lastName){
        return sprintf($this->template, $firstName, $lastName);
    }
}

These classes just focus on what it is to render a name. The default one just gets on with it, whereas the templated one needs to be initialised with a template. But these sort of conceits are limited to the element that needs them: the Decorator itself doesn't need to know the templated handler needs a template. All it needs is a collaborator which implements a render(firstName, lastName) method.

If we have further rendering requirements (say an address, as per earlier), we just need the collaboration for that. So the scaling for this model is m*n, not m^n. Obviously better. Of course one might not need every variation, but it's more the rapidity of things getting out of hand which is the important point here.

It also keeps things more focused: each collaborator simply needs to focus on what it needs to deal with, and not anything else. And the decorator itself just worries about providing its API.

It might seem like a lot of work to create a whole class for a single method, but - really - the class boilerplating isn't much is it? One could use stand-alone functions for this:

class NameDecorator {

    private $renderingHandler;
    private $name;

    function __construct($name, $renderingHandler){
        $this->name = $name;
        $this->renderingHandler = $renderingHandler;
    }

    function render(){
        return ($this->renderingHandler)($this->name->firstName, $this->name->lastName);
    }
}


$renderAsPlainText = function ($firstName, $lastName) {
    return "$lastName, $firstName";
};

$template = '%s <strong>%s</strong>';

$renderWithTemplate = function ($firstName, $lastName) use ($template) {
    return sprintf($template, $firstName, $lastName);
};


$name = new Name('Zachary', 'Cameron Lynch');

$nameDecorator = new NameDecorator($name, $renderAsPlainText);
$rendered = $nameDecorator->render();
echo "Simple rendering: $rendered<br>";


$nameDecorator = new NameDecorator($name, $renderWithTemplate);
$rendered = $nameDecorator->render();
echo "Rendering with template: $rendered<br>";

See how I'm just using functions instead of whole classes here? But that's pretty poor code organisation IMO, so just stick 'em in classes.

The original question on Stack Overflow has asked for a variation and I think the Factory Method Pattern might be the way to go there. But... I'll think about that over night and perhaps come up with something tomorrow. I've got a bunch of downtime (this is for you, Andy Myers) at the airport waiting for a flight tomorrow afternoon.

Righto.

--
Adam

PS: four pints, that one.

Friday, 29 April 2016

PHP: Getting PHPCS and PHPMD working within PHPStorm

G'day:
These are just some quick notes on installing PHP Code Sniffer and PHP Mess Detector to work within PHPStorm. There's nothing particularly elucidating in here, it's simply what I did to get it working, cos someone asked me so I sat down and worked it out. Also my flight's been delayed so I have time to kill until Aer Lingus gets its act together.

I'm running PHPStorm 10 on Windows (Windows 10, in this case, not that it matters).

First up I used composer to install both applications:

C:\Users\adam.cameron\AppData\Roaming\Composer>php c:\ProgramData\ComposerSetup\
bin\composer.phar global require phpmd/phpmd

[much stuff ensued, but it all installed OK]

C:\Users\adam.cameron\AppData\Roaming\Composer>php c:\ProgramData\ComposerSetup\
bin\composer.phar global require squizlabs/php_codesniffer


Note that I initially tried to install the fabpot/php-cs-fixer as suggested in the composer global installation examples (I can never remember how to do it, so I googled), but that version of PHPCS seems to be out of date or dead or something. It dun't work in PHPStorm, anyhow.  I removed that:

C:\Users\adam.cameron\AppData\Roaming\Composer>php c:\ProgramData\ComposerSetup\
bin\composer.phar remove fabpot/php-cs-fixer

Once in PHPStorm I then go to File > Settings > Languages & Frameworks > PHP > Code Sniffer, and use the UI to point it to C:\Users\adam.cameron\AppData\Roaming\Composer\vendor\bin\phpcs.bat And do similar for PHPMD: C:\Users\adam.cameron\AppData\Roaming\Composer\vendor\bin\phpmd.bat.

In there one can "Validate" the install, which is how I found out the fabpot version of PHPCS didn't work.

Once I've done that, I went into Settings > Editor > Code Style > Inspections, within which each option of PHPCS and PHPMD is listed, and having selected each of those, can select the rulesets to use.

I just went for PSR-2 for PHPCS, and all options checked for PHPMD. At some stage I'll probably create a decent ruleset for my work (I hate PSR-2, and and to apply some Clean-Code-esque rules too), but for the time being, that'll do.

Having done that, PHPStorm now applies the rules:


Whilst I think that particular rule is bloody stupid, it is indeed one of PSR-2's rules, and it's also identifying it's being checked by phpcs. Success.

That's it!

Righto... time for another pint...

--
Adam

Tuesday, 12 April 2016

JavaScript: getting my brain further around bind()

G'day:
I was refactoring some JS code today, and one of the refactorings was extracting an inline callback into its own separate function. This was the "success" callback for a jQuery AJAX request. The reason I wanted to pull it out of its inline situation was twofold:

  1. the function it was in was already way too long;
  2. it itself was a bit on the long side, and had a few "moving parts", and I wanted to extract it so it was testable.
I hasten to add we don't have any JS testing in place yet, but I like our code to be positioned to be testable for making our lives easier once we bring it in. It occurs to me now that this is an example of premature optimisation in a way, but the remit of the work I was doing was precisely this, so I figure it's OK. What's more... I've currently got time to do this, and later on when someone comes to need to do maintenance on this code they might not have time to wade through layers of nested complicated logic, so now is a good time to tidy up a bit.

I can't use our actual code to demo what's going on here, but I've done a reduced proof of concept which demonstrates the point without all the distracting baggage.

The existing situation was this sort of thing:

var usingInline = function(firstThree, fourth, type){
    var allFour = firstThree;
    allFour.push(fourth);
    var prefix = type;
    allFour.forEach(function(element){
        console.log(prefix + ": " + element);
    });
};

var oneTwoThreeInMaori = ["tahi", "rua", "toru"];
var fourInMaori = "wha";
usingInline(oneTwoThreeInMaori, fourInMaori, "Number");

That code will run as it stands, and outputs:

Number: tahi
Number: rua
Number: toru
Number: wha


No AJAX call here, but there's an inline callback buried in the middle of it. Truth be told with a callback this small, I'd just leave it there in real life, but that would make for a short, pointless blog article. My task here is to extract that callback.

As is my occasional wont, I did not pay attention to what I was doing and just ripped the callback out into its own function, and replaced the inline call to it with the name of the new function:

var usingExtracted = function(firstThree, fourth, type){
    var allFour = firstThree;
    allFour.push(fourth);
    var prefix = type;

    allFour.forEach(eachHandler);
};

var eachHandler = function(element){
    console.log(prefix + ": " + element);
};

var oneTwoThreeInMaori = ["tahi", "rua", "toru"];
var fourInMaori = "wha";
usingExtracted(oneTwoThreeInMaori, fourInMaori, "Number");

And that goes... splat:

Uncaught ReferenceError: prefix is not defined(…)

Why? Cos in the first example we were leveraging closure: the function expression was declared in the context of the usingInline function, so it enclosed the prefix variable, so it was available to use within the callback code. Closure in action. Whereas with the second example, eachHandler is declared in the global document context (well: it was within a class in the original code, but same thing applies: it was a different context from that which prefix was defined), so prefix was not there to close over, so it was not available.

Duh.

Prior to a week or so ago, my way of handling this would be to still declare the handler separately (so it's still testable), but instead of calling it directly, wrapping it in an inline function which does nothing but wraps a call to it, in the process enclosing prefix, and passing it into the handler. Like this:

var usingWrapper = function(firstThree, fourth, type){
    var allFour = firstThree;
    allFour.push(fourth);
    var prefix = type;

    allFour.forEach(function(element){
        eachHandlerToUseWithWrapper(prefix, element);
    });
};

var eachHandlerToUseWithWrapper = function(prefix, element){
    console.log(prefix + ": " + element);
};

Note that the handler now takes a prefix argument too, instead of just the argument forEach() passes to its callback.

That's all good. It's a bit "Heath Robinson", but it works.

A few weeks ago I wrote that article "JavaScript: clarifying in my head this and that and proxies" about proxying this in JavaScript callbacks. One of the techniques I looked at was using bind() to bind a different this to a function. One of the things I did not investigate but recalled reading about was that it can also be used to pass additional arguments into the function it returns. This is exactly what I want, and it was quick to knock together:


var usingExtractedWithBind = function(firstThree, fourth, type){
    var allFour = firstThree;
    allFour.push(fourth);
    var prefix = type;

    allFour.forEach(eachHandlerToUseWithBind.bind(undefined, prefix));
};

var eachHandlerToUseWithBind = function(prefix, element){
    console.log(prefix + ": " + element);
};

Here I'm not interested in proxying this as I don't need it in the callback, so I just leave it undefined (this was from the docs, and seems reasonable, but I would not have guessed to do this), and simply specify prefix as a value to pass into the function bind() creates. Then I use the same callback as before with both the prefix argument, and the element that forEach() passes in. What bind() is doing here is basically the same as in my previous example: it's making a new function, which receives the prefix value as its first argument, and then receives whatever other args are passed to it. Cool!

Perhaps not earth-shattering for you lot, but this was the first time I put this into action, so it's kinda the more interesting "TIL" from today (I had a few today actually... but I'll get to them one at a time).

There's a working JSFiddle of this code here.

Righto.

--
Adam

Thursday, 7 April 2016

Maintaining untested code, and mocking too much when backfilling test (with JavaScript & Jasmine)

G'day:
Here's another example of me doing something wrong. Well: there's a litany of wrongness in this exercise, and I'll show the whole thing from start to finish. But this all stems from me ballsing something up.

Caveat

Be warned: there's a lot of code in this one, but the main point is the volume of the code I found necessary to write to test and fix a tiny bug. You don't need to worry too much about the content of the code... the important part is the line number at the bottom of the file.

The background to this is that I was helping a former colleague with some CFML code ("just when I thought I was out..."), and it was whilst reconfiguring stuff, I got hit by a trivial bug... the resolution to which is the basis of this article. I could not use their actual code in the article here due to NDA reasons, but I also didn't want to be discussing CFML either. So I've undertaken to "refactor" the entire thing into JavaScript instead (I don't mean their code; I mean a version of their logic which I will discuss here). The JavaScript code closely resembles the actual code I was looking at, but obviously some of the approaches to doing things needed to change, and I've repurposed some of the elements of it to better suit what I'm discussing. The logic and "code distribution" is about a 95% match to the original case though, I think. Also I'm using Jasmine to do JavaScript testing in this example, and I was using a CFML-based solution for the original code. This is irrelevant to the situation as both work mostly the same way, and produce similar amounts of code for the equivalent testing and mocking tasks.

I will start by saying that the code is a product of its time and their coding policies at that time. I'm happy to say they've since revised their approach to things, and are fine-tuning it as they go. the code base - like all others - is an evolving beast. And indeed to be fair to them: it's a testament to their evolving coding process that I'm able to write this article.

Background

This code comes some a project which is built with an MVC framework: the details of which are irrelevant to this article. They also use a dependency injection container to handle their dependencies.

The code in this article is part of the TranslatorProvider, which creates a TranslationService, and loads it with translations originally stored in some (optionally cached) data repository. The implementation details of these are not relevant here. These translations are for on-screen labels and other text. The bit of code I'm focusing on in this article is the logic to load the translations from storage. They don't quite follow the service / repository model I have presented here, but it's the same beast by different names, and the service / repository model suits me better so that's how I've labelled things here.

My mate's organisation produces numerous small websites, all of which are multi-lingual, and all share this code.

New site

They're in the process of building a new site, which is where I came in. We'd been piggy-backing onto the translation dictionaries for one of the existing sites whilst we were protoyping. the onscreen-labels we need to translate follow the same general theme, so it's workable - during development - to use one site's translations for another site, to a point (some translations don't work, but that's OK). It took a while to prepare the dictionaries for the new site (a different team was doing that), but when they came back from the translator I switched the dictionary bundles over.

I did this, and the whole site broke. Switched the dictionary back to the older one: site worked again. Switch it forward: site broke. OK: there's something wrong with our translations. Oh, I switched the translations on another one of their test sites - one I knew in which all the work was sound (ie: not written by me) - and that site broke, Harrumph. What is it about some translation dictionaries that can break a site?

Bug

I dug into the translator code and quickly (well: it took a few hours) found the problem:

var translations;

// ...

for (var key in rawTranslations.primary){
    translations.primary = translations.primary || {};
    translations.primary[key] = rawTranslations.primary[key];
}

for (var key in rawTranslations.secondary){
    translations.secondary = translations.secondary || {};
    translations.secondary[key] = rawTranslations.secondary[key];
}

var ttl = config.ttl;

cacheService.put(cacheKeyPrimary, translations.primary, ttl);
cacheService.put(cacheKeySecondary, translations.secondary, ttl);


Can you see it?

What happens if - as it is in our case - one hasn't got any secondary translations? translations.secondary never gets created, so that last line will break.

The fix is simply to initialise that object outside of the loop:

translations = {primary : {}, secondary : {}};

Simple.

Well except for one thing. No test coverage. This code was written before they had a TDD policy, so it's got no tests at all. Note that this bug never would have happened had they used TDD when developing this code, because the very first test would have been with empty arrays. I made sure to point this out to them, but they rightfully pointed out when it was written and they now do have a full TDD policy. And that comes into play for my fix: I needed to use TDD for implementing my fix.

Back-fill

My policy with maintaining untested code using our new TDD approach is that I will not back-fill all missing tests. I will only implement the minimum testing needed for the case I'm fixing. if I have time I'll make shell tests with a "skip" in them so the missing coverage gets flagged in the test run, but I'll not bother too much other than that.

So in this case all I need to test is that if the secondary object is null, that null gets send to the cache. Simple.

Or is it?

Unclean code

Unfortunately this code was also written before the era of "write Clean Code", so the function itself is doing way too much. It's 50 lines long for a start, which is a red flag, but one can easily see different sections of processing in it too, so it violates the "functions should do one thing" rule. This becomes a problem because even to test my wee fix, I need to make sure the preceding 40-odd lines of code don't break or do the wrong thing before I get to my line of code. And as this code leverages a fair number of dependencies, there's a bunch of mocking work to do before I can even get to what I want to test. Note: the dependencies in the original code were handled differently in that they passed the DI container into this provider, and just plucked out whatever they need when they need it. I cannot bring myself to show code like that (even for a blog article), so I've broken them down into multiple dependencies with specific roles. The logic is the same, and it produces the same amount of code (both sample code and test code).

If the code was Clean and I was just looking at a function that map the source objects(s) into the target objects(s), then my test job would be easy. But no.

Let's look at what stands between me and where I can test the results of my line of code. I've highlighted what needs mocking:

if (!requestService.isTranslatorEnabled()){
    return;
}

var currentLocale = requestService.getLocale();

var cacheKeyPrimary = getCacheKey("primary", currentLocale);
var cacheKeySecondary = getCacheKey("secondary", currentLocale);

var okToGetFromCache = cacheService.isActive()
    && cacheService.exists(cacheKeyPrimary)
    && cacheService.exists(cacheKeySecondary);

var translations = {};
if (okToGetFromCache) {
    translations.primary = cacheService.get(cacheKeyPrimary);
    translations.secondary = cacheService.get(cacheKeySecondary);
}else{
    var rawTranslations = {
        primary : translationRepository.loadBundle("primary", currentLocale),
        secondary : translationRepository.loadBundle("secondary", currentLocale)
    };

    for (var key in rawTranslations.primary){
        translations.primary = translations.primary || {};
        translations.primary[key] = rawTranslations.primary[key];
    }

    for (var key in rawTranslations.secondary){
        translations.secondary = translations.secondary || {};
        translations.secondary[key] = rawTranslations.secondary[key];
    }

    var ttl = config.ttl;

    cacheService.put(cacheKeyPrimary, translations.primary, ttl);
    // [...]

  1. requestService.isTranslatorEnabled
  2. requestService.getLocale
  3. cacheService.isActive
  4. translationRepository.loadBundle
  5. config.ttl
  6. cacheService.put
But that's not all. The function has to complete without error. So I also need to mock this lot:

var translationService = translationFactory.getTranslator();
var arrayLoader = translationFactory.getArrayLoader();

translationService.load(arrayLoader.load(translations.primary), 'primary', currentLocale);
translationService.load(arrayLoader.load(translations.secondary), 'secondary', currentLocale);

  1. translationFactory.getTranslator
  2. translationFactory.getArrayLoader
  3. arrayLoader.load
  4. translationService.load

What a nightmare. If this code was written cleanly, I'd simply have to test a method that takes a coupla arguments and returns a value. Simple.

Sunday, 3 April 2016

Decorator Pattern vs simple inheritance

G'day:
This is a sequel to my earlier article "Using a decorator pattern to reduce inappropriate code complexity". You'd better go breeze through that before reading this one, otherwise this one won't make too much sense.

One of my colleagues read the above article and said "yeah good... but I wonder why one would do that over simple inheritance?" Good question. Especially as he is my boss, so all his questions are good (like I said: he also reads this blog ;-). My prepared answer was that the decorator pattern was an implementation representation of an interface, and accordingly it can kinda reflect multiple inheritance (albeit by composition) where an inheritance-based approach only allows a single inheritance chain: and that will get clumsy quickly. Note that I am not suggesting that the decorator pattern actually reflects a multiple inheritance pattern, I'm just using it as a comparative metaphor. I think it's reasonable.

In the previous article I showed how to simplify and focus the implementations of a UserRepository and a LoggedUserRepository (it writes to a log as well as making a given repo call), and a CachedUserRepository (it caches the repo call). And, indeed from there it was easy to decorate the UserRepository with both logging and caching, to effect either a LoggedCachedUserReppository, or a CachedLoggedUserRepository. If you see the subtle difference there: it's what order the ancillary operations take place in. This was in lieu of having all the caching and logging code baked into the UserRepository, which I think is less-than-ideal design, makes testing harder, and also presupposed an implemenation (which had already been demonstrated - in our case - to be a bad supposition, hence the origin of this investigation).

Code-wise, I had this:

class UserRepository implements RepositoryInterface {

    public function getById($id) {
        return (object) [
            "id" => $id,
            "firstName" => "Number $id",
            "recordAccessed" => new \DateTime()
        ];
    }

}

And the LoggedRepository used to decorate a UserRepository with logging might be like this:

class LoggedRepository implements RepositoryInterface {

    private $repository;
    private $loggerService;

    public function __construct(RepositoryInterface $repository, LoggerServiceInterface $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

}

The key points are that the UserRepository does all the data-getting, the LoggedRepository just decorates that with some logging. But each repo just plays to its strengths, and lets the other one get on with its own strength. That's the decorator pattern. The CachedRepository was analogous to the Logged~ one, just doing caching instead of logging. And to get both a logged & cached or a cached & logged UserRepository, it's just a matter of a further layer of decoration. Roughly like a matryoshka doll. Of sorts. Anyway: there's that whole other article about that.

So what about using inheritance instead? Let's swap out the repository type here to just a PersonRepository (the reason for this is solely cos all this code is in the same namespace, so I had already used the User~ metaphor). The PersonRespository is the same as the User one:

class PersonRepository implements RepositoryInterface {

    public function getById($id) {
        return (object) [
            "id" => $id,
            "firstName" => "Number $id",
            "recordAccessed" => new \DateTime()
        ];
    }

}

(I'm still just faking the actual data-fetch operation, as it's irrelevant here. But imagine getById() gets a user via its ID from some data store).

One could consider that a LoggedPersonRepository is a specialisation of UserRepository: it's for the same purpose, it just throws logging into the mix:

class LoggedPersonRepository extends PersonRepository {

    protected $loggerService;

    public function __construct(LoggerServiceInterface $loggerService) {
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = parent::getById($id);

        return $object;
    }

}

One can accurately say a LoggedPersonRepository IS A PersonRepository, so the inheritance seems sound, and if one thinks through scenarios, I think it passes a Liskov substitution principle test as well. So the design is - thusfar - sound. And looking at the code, this is actually more simple than the decorator pattern version because there's no need for this LoggedPersonRepository to take a PersonRepository argument, as it is a PersonRepository.

And it's samesame for a CachedPersonRepository, but I'll not bore you with the implementation detail. I'm guessing you can see what I mean.

But what about when we do add caching to our repositories? We'd need two more classes:

class CachedPersonRepository extends PersonRepository {

    protected $cacheService;

    public function __construct(CacheServiceInterface $cacheService) {
        $this->cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this->cacheService->isCached($id)) {
            return $this->cacheService->get($id);
        }
        $object = parent::getById($id);

        $this->cacheService->put($id, $object);
        return $object;
    }

}

And:
class CachedLoggedPersonRepository extends LoggedPersonRepository {

    protected $cacheService;

    public function __construct(LoggerServiceInterface $loggingService, CacheServiceInterface $cacheService) {
        parent::__construct($loggingService);
        $this-&gt;cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this-&gt;cacheService-&gt;isCached($id)) {
            return $this-&gt;cacheService-&gt;get($id);
        }
        $object = parent::getById($id);

        $this-&gt;cacheService-&gt;put($id, $object);
        return $object;
    }

}

IE: CachedPersonRepository is an entirely different class from a CachedLoggedPersonRepository.  This is still sound inheritance, and stands up to the LSP test too; so from that side of things, the design is still "valid". This doesn't make it optimal though. Also if we wanted to just have the caching, or the caching just higher in the hierarchy than the logging? Two more classes again.

And say you then want to add an encryption layer? That's potentially another five different class variations:

  • EncrytedPersonRepository
  • EncryptedLoggedPersonRepository
  • EncryptedCachedPersonRepository
  • EncryptedCachedLoggedPersonRepository
  • EncryptedLoggedCachedPersonRepository
And if we want the encryption done at a different level than just the outer one... they you see how many more possible permutations there might potentially be a use case for. Now I'm not saying any one given system would need all these permutations (that'd be weird), but it does demonstrate that the approach doesn't scale so well. Using the decorator pattern, one needs just one class per task, for all variations of sequencing the operations. In this example one would need a PersonRepository, a LoggedPersonRepository, a CachedPersonRepository, and an EncryptedPersonRepository. Those four classes can be used to make a EncryptedLoggedCachedPersonRepository, or a CachedLoggedEncryptedPersonRepository, or any other implementation of a subset of those four notions.

I think my example of a LoggedCachedEncryptedPersonRepository is slightly egregious, and I don't want to use an appeal to extremes to make my case here. However I think - all things being equal - using the decorator pattern instead of inheritance to solve this sort of thing is going to be a more robust way of preempting potential scaling requirements; whilst still being a simple and recognised solution; and an easy way to facilitate a possible future situation, whilst not in any way actually going down the rabbit-hole of actually writing any code for that potential future.

All the code and sanity checks of the behaviour can be found in my github account: decorator.local.

Righto.

--
Adam