Monday, 24 April 2017

A cyclomatic complexity challenge

G'day:
This one came up briefly in a Friday afternoon messenger conversation between meself and my colleague in Dublin, Carlos. He's doing some code maintenance, and come across a function in one of our classes for which PHPMD's cyclomatic complexity analyser is complaining at us, and he was wondering what - if anything - do do about it.

We usually heed PHPMD's advice: variable names too long or too short; methods too long; classes with too many methods etc. But I think our general feel for cyclomatic complexity issues is to treat things with a grain of salt.

Now I'm not gonna explain cyclomatic complexity if you don't know about it: you can just read about it on Wikipedia: Cyclomatic complexity. In short it's an approach to giving a numeric score the complexity of some logic based on how many different logic branches it contains.

The code I present in this article offers a passing resemblance to the code in our code base, but I have tweaked it to make it more clear as a stand-alone example than it would be if I just ripped the code lock-stock out of its context. This captures the intent and the complexity (in that "cyclomatic" sense) of the original code, but that's about it. "all the names have been changed to protect the innocent" as it were.

Anyway, here's an analogy of the code in question:

public function update(int $id, array $parameters): void
{
    $this->propertyRepository->update($id, $parameters);

    if (array_key_exists('media', $parameters) && is_array($parameters['media'])) {
        $this->mediaRepository->update($id, $parameters['media']);
    }
    if (array_key_exists('metadata', $parameters) && is_array($parameters['metadata'])) {
        $this->metadataRepository->update($id, $parameters['metadata']);
    }
    if (array_key_exists('social', $parameters) && is_array($parameters['social'])) {
        $this->socialRepository->update($id, $parameters['social']);
    }
    if (array_key_exists('personnel', $parameters) && is_array($parameters['personnel'])) {
        $this->personnelRepository->update($id, $parameters['personnel']);
    }
    if (array_key_exists('availability', $parameters) && is_array($parameters['availability'])) {
        $this->availabilityRepo->update($id, $parameters['availability']);
    }
}

It's innocuous enough. This is part of PropertyService which services the business logic for a REST endpoint that handles updates to a Property. I'm in the online accommodation booking business, remember, so here a "Property" is like a hotel or something.

The raw data has been validated as "not completely wrong, missing or dangerous" at this point, but is still just in a raw array - a transformation of the incoming JSON payload from the request.

An update has a required component: the basic property info, plus a number of optional components depending on the nature of the source request. I've completely changed these here from the original, but I'm pretending a property can also comprise separate media, metadata, social, personnel and availability components to the business object, all of which have different storage implementations, marshalled by their respective repositories.

Rightly or wrongly this is a generic update endpoint which is the staging ground for everything else going on, so we've got all this conditional processing based on the existence of the relevant sub-components of the object itself. This is not so extraordinary.

But because we have five conditional bits, and each condition has a compound expression, this tips the scales of our cyclomatic complexity threshold. if we only had four of those conditions, we'd be fine. Now: no-one is saying that "four is OK; five is bad", but what we are saying is "well... four is OK; five needs further assessment". We don't wanna get too dogmatic about this stuff.

Anyway, here's PHPStorm whining at me:


(I have no idea what NPath complexity is, but I assume it's just a different way of measuring the same thing).

As I indicated, we had a quick chat and thought "do you know what? We're probably OK with this code". Granted the complexity is there, but the code is easy enough to follow. Because the conditions are "the same" each time a lot fo the perceived complexity is just metric, not reality. We also thought any tactic to reduce the perceived complexity might result in more real-world complexity.

I decided to see what I could do with it over the weekend, as a basis for this article. And, obviosuly, if I came up with a good solution we could use it.

First things first, I made sure we had 100% logical test coverage on the existsing code. This means we're in a safe place to refactor should we decide to. The tests are long and boring, so I'll not include them here. They're all pretty straight fwd anyhow. Oh, I'll repet the test cases from the data provider anyhow:

return [
    'no optionals' => ['parameters' => []],
    'media is simple' => ['parameters' => ['media' => self::$mediaParameter]],
    'media is array' => ['parameters' => ['media' => [self::$mediaParameter]]],
    'metadata is simple' => ['parameters' => ['metadata' => self::$metadataParameter]],
    'metadata is array' => ['parameters' => ['metadata' => [self::$metadataParameter]]],
    'social is simple' => ['parameters' => ['social' => self::$socialParameter]],
    'social is array' => ['parameters' => ['social' => [self::$socialParameter]]],
    'personnel is simple' => ['parameters' => ['personnel' => self::$personnelParameter]],
    'personnel is array' => ['parameters' => ['personnel' => [self::$personnelParameter]]],
    'availability is simple' => ['parameters' => ['availability' => self::$availabilityParameter]],
    'availability is array' => ['parameters' => ['availability' => [self::$availabilityParameter]]],
    'all optionals' => ['parameters' => $allParameters]
];

The tests just make sure the appropriate repo is called given the presence/absence of the optional part(s) in the inputs.

BTW I was mindful that the complexity of the tests is also part of the complexity of the code. That's a lot of test variations for one method.

Anyhow, the first thing I spotted - pretty much immediately - is the conditional logic could be simplified if we had a function which was "key exists in array and its value is itself an array". This straight-away halves the conditional expressions in the function:

public function update(int $id, array $parameters): void
{
    $this->propertyRepository->update($id, $parameters);

    if ($this->keyExistsAndIsArray('media', $parameters)) {
        $this->mediaRepository->update($id, $parameters['media']);
    }
    if ($this->keyExistsAndIsArray('metadata', $parameters)) {
        $this->metadataRepository->update($id, $parameters['metadata']);
    }
    if ($this->keyExistsAndIsArray('social', $parameters)) {
        $this->socialRepository->update($id, $parameters['social']);
    }
    if ($this->keyExistsAndIsArray('personnel', $parameters)) {
        $this->personnelRepository->update($id, $parameters['personnel']);
    }
    if ($this->keyExistsAndIsArray('availability', $parameters)) {
        $this->availabilityRepo->update($id, $parameters['availability']);
    }
}

private function keyExistsAndIsArray(string $key, array $array) : bool
{
    return array_key_exists($key, $array) && is_array($array[$key]);
}

Because I've got full test coverage, I know this solution is analogous to the original one. And the tests also cover the private method 100% too.  And it is slightly clearer. Win.

Next I took a closer look again, and each conditional block is pretty much the same too. Just a different key, and a different repo to send the optional parameters too. So I could do this in another single private function too:

private function updateOptional($id, $parameters, $optional, $repository)
{
    if (array_key_exists($optional, $parameters) && is_array($parameters[$optional])) {
        $repository->update($id, $parameters[$optional]);
    }
}

This is just a homogenised version of each if statement from the original method. The original method is now greatly simplified:

public function update(int $id, array $parameters): void
{
    $this->propertyRepository->update($id, $parameters);

    $this->updateOptional($id, $parameters, 'media', $this->mediaRepository);
    $this->updateOptional($id, $parameters, 'metadata', $this->metadataRepository);
    $this->updateOptional($id, $parameters, 'social', $this->socialRepository);
    $this->updateOptional($id, $parameters, 'personnel', $this->personnelRepository);
    $this->updateOptional($id, $parameters, 'availability', $this->availabilityRepo);
}

I'm not that cool with having a method that takes four arguments: one should aim for zero arguments, after all; and each additional argument beyond that is reducing yer code cleanliness, but in this case it's not part of the public interface and it's only used in one place, so I think it's OK. Also I've got rid of eight of the ten boolean expressions the original version had, and that is definitely a clean-code win.

And, of course, all the tests still run fine, and I've 100% coverage still.

Looking over the three solutions here, I do actually think this last one is more clear than the first, and we've not added any real complexity for complexity's sake. So I think I'll probably suggest this to the parties concerned. We won't apply it straight away as the first version of the code is written and released and working, but we'll bear it in mind should we come to maintain that code further, and perhaps as a technique to consider next time a similar situation comes up.

Also important to note that I'd not take this approach from the outset, if say we had only 2-3 conditions in that method. This really is just an "edge-case" improvement, and like I said I don't think it's worth diving in and changing unless we have reason to touch that code again. And like if we were still at the code review stage of the original code, I'd perhaps make this observation, but I'd not suggest we revisit the code before approving it. The code that stands is, after all, pretty much A-OK. It's important to not get too dogmatic about these things, and slow proceedings down just to make some OK code slightly more OK. "Good enough" is, well: good enough.

I've got another article brewing about the unit tests I wrote for this. I'll hope to tackle that during the week, but I've beer to drink this evening, so I won't be doing it then.

Righto.

--
Adam

Friday, 14 April 2017

I actively consider what I think ought to go into a controller

G'day:
This is one of these ones where I state the obvious (probably), and expose myself as a fraud in that it's taken me this long to reach this conclusion. I can only take solace in that I'm clearly in "good" company. Where "good" == "a lot".

Yesterday I was going a code review, and came across a controller method which made one of my eyebrows raise as it was about 40 lines long. Now this isn't intrinsically bad in context. I have seen (and still see!) controller methods that are hundreds and hundreds of lines long. In the past I've seen controllers tipping the 1000-line mark. Those examples are clearly just wrong. And the developers who wrote them and/or continue to maintain them should be shot. I'm not even sure I'm over-stating it with that sentence. Hmmm... I probably am I guess. But they should be squarely and actively admonished, anyhow.

So in the scheme of things, 40 lines ain't bad. Especially when we try to keep our code wrapped within 80-100chars, so a few of the statements were multi-line. There was about say a dozen statements in it altogether. Not so bad.

Before I go on - and not only because the dev concerned will be reading this article (once I send them the link... pretty sure they don't know this blog exists) - I hasten to add that this code is not by any means bad. I'm probably gonna go "close enough" on the code review, once one bit of the code is refactored at least into a private method. I am not having a go at the dev concerned here. They are just my muse for the day. And I happen to think they're pretty good at their job.

There was one bit of the controller that stood out to me and I went "nuh-uh, that's gotta go". There was a foreach loop just before the response was returned which performed a calculation. That's business logic and it has no place in a controller. Fortunately we have an appropriate service class which we can lift that out and re-home in, and slap some tests on it, and our code is a bit better and a bit more maintainable.

(NB: yeah... no tests... we have a policy of not testing controller methods, but that's with the caveat there's no logic in them. We followed the "don't test controller methods" bit, but not the "provided there's no logic in them" bit here).

However once I spot a problem, I immediately start looking for more, so I assessed the rest of the code there. The controller did seem too fat. I wanted to know if all the logic did actually belong in the controller, and my instinct was: probably not.

At the same time as writing this article I'm chatting on Slack about it, and me mate Mingo offered the perennial answer to the question "should this go in the controller?"

Isn't the answer always "No"?
~ mjhagen 2017

(yes, he cited the attribution too, so I'm including that ;-)

It's pleasingly pithy, and almost ubiquitously agreed with; but in a casually dismissive way that is seldom heeded, I think. The problem is it's alluding to a mindset one should adopt as a guideline, but clearly the answer isn't "no", so I think people dismiss it. If one says "one should never do x", and it's clear that "never" is slight hyperbole, people will acknowledge the hyperbole and stretch it back the other way as far as they see fit to accommodate their current whim.

With this in mind (retroactively... Mingo just mentioned that now, and the code review was yesterday, and this article was drafted in my head last night), I stopped to think about a rule of thumb as to when something ought to go in a controller.

Broadly speaking I feel there are only three types of logic that belong in a controller:

  1. a call to validate / sanitise all incoming arguments so they're fit for purpose for fulfilling the request;
  2. one (preferably) or more (and increasingly less preferably the more there are) calls to get data - using those incoming arguments - to build the response;
  3. a call to assemble that data for the response;
  4. returning the response.

Yeah that's 4. I was not counting the return statement. An example of this might be:

class UserController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        user = userService.get(args.id)
        publishedContent = contentService.getForId(args.id, args.startDate, args.endDate)
        twitterContent = twitterService.getForId(args.id, args.startDate, args.endDate)
        facebookContent = facebookService.getForId(args.id, args.startDate, args.endDate)
        
        response = {
            user = user,
            publishedContent = publishedContent,
            socialContent = {
                twitter = twitterContent,
                facebook = facebookContent
            }
        }
        return new Response(response)
    }
}

I think that's OK, even though there's four service calls. All the logic is busying itself with what it is to fulfil that request. The controller simply dictates what data needs returning, and gets that data. there is no business logic in there. It's all controller logic.

A pseudo-code example where we're adding business logic into this might be:

// ...
publishedContent = contentService.getForId(args.id, args.startDate, args.endDate)
topFiveArticles = publishedContent.slice(5)

// ...

response = {
    // ...
    articles = topFiveArticles,
    // ...
}

Here the slice call - even though it's a native array method - is still business logic. If the controller method was getTopFiveArticles or something, I'd go back to considering it controller logic. So it's a fine line.

Back to the code i was reviewing, it was doing 1&2, then it was going off-piste with that foreach loop, then doing 3&4. Ignoring the foreach, I was still itchy, but wasn't seeing why. It seemed like it was roughly analogous to my first example above.

So what's the problem? Why was the code still making me furrow my brow?

Well there was just too much code. That set-off a warning bell. I wanted to reduce the number of calls to various services (or repositories as is the case here: data from slightly different sources), but initially I couldn't really justify it. Then I looked closer.

The code in question was more like this (I've got rid of the date args to focus more):

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        user = userService.get(args.id)
        publishedContent = contentService.getForId(args.id)
        
        twitterContent = twitterService.getForId(user.twitterId)
        facebookContent = facebookService.getForId(user.facebookId)
        
        response = {
            articles = publishedContent,
            socialContent = {
                twitter = twitterContent,
                facebook = facebookContent
            }
        }
        return new Response(response)
    }
}


I hasten to add the code I was reviewing is nothing to do with users and content and social media stuff, this is just an example which is analogous but would make more sense to people not in the context of our business.

The difference is subtle. The difference is that the decision that the IDs for the social content are stored in the user object is business logic. It's not controller logic to "know" that's how that data is stored. the user object here is not part of the data that composes the response; it's part of some business logic that dictates how to get the data.

And this does indeed break that second rule:

  • a call to validate / sanitise all incoming arguments so they're fit for purpose for fulfilling the request;
  • [...] calls to get data - using those incoming arguments - to build the response;

The call to get the social media content is not using those arguments. It's using a value from a separate call:
user = userService.get(args.id)
// ...
twitterContent = twitterService.getForId(user.twitterId)
facebookContent = facebookService.getForId(user.facebookId)


Another tell tale here is that user is not even part of the response. It's just used to get those "foreign keys".

My rule of thumb in those four points above is still correct. But one needs to pay special attention to the second point. Any time one's inclined to fetch data "indirectly", the indirection is business logic. It might not be a loop or some conditional or switch block or something more obvious that manipulates the data before it's returned, but that indirection is still business logic. And does not belong a the controller.

So how should that controller look? Move the content acquisition out into a service that has the explicit purpose of getting a user's content - from wherever - via some key:

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        content = contentService.getById(args.id)
        
        response = {
            articles = content.published,
            socialContent = {
                twitter = content.twitter,
                facebook = content.facebook
            }
        }
        return new Response(response)
    }
}


This follows the second point to the letter, and there's now no business logic in the controller. Also the controller legitimately does not really need unit testing now, but the logic in the ContentService can be. Also the controller method is down to a size that would not raise any eyebrows.

This is not an earth-shattering conclusion by any measure, but I'm quite pleased to fine-tune my position here, and hopefully it can help drive discussion about this in code reviews in future.

It'd be great to hear anyone else's thoughts on this?

--
Adam

Friday, 24 March 2017

Help explain closure to the Adobe ColdFusion Team

G'day:
OK, this is a CFML-centric article. But it's also a call to provide code analogies in different languages to a CFML example, to show the ColdFusion Team they dunno what they're on about.

Here's some CFML code:

function doit() {
    var a = ["a", "b", "c"];
    var b = ["x", "y", "z"];
    var counter = 0;
    
    arrayEach(a, function(foo) {
        counter = 0;
        arrayEach(b, function(bar) {
            counter++;
            // in dump counter is always 0
            writeDump({
                counter: counter, 
                foo: foo, 
                bar: bar
            });
        });
    });
}
doit();

This leverages closure to reference the outer counter variable within the innermost function expression. Ergo, it's the same variable. So the expected output of this would be:


Note how the counter is declared in the main function, reset in the outer arrayEach handlers and incremented for each iteration of the inner arrayEach call. So it'll cycle through 1,2,3 three times when output.

That was run on Lucee. Running it on ColdFusion yields:


See how the counter is messed up: it's always zero. It should increment for each iteration of the inner function expression.

Adobe - being their typical selves - is claiming this is "by design":

https://tracker.adobe.com/#/view/CF-4197194:


That's nonsense.

Other languages behave predictably:

JavaScript:

function doit() {
    var a = ["a", "b", "c"];
    var b = ["x", "y", "z"];
    var counter = 0;
    
    a.forEach(function(foo) {
        counter = 0;
        b.forEach(function(bar) {
            counter++;
            // in dump counter is always 0
            console.log({
                counter: counter, 
                foo: foo, 
                bar: bar
            });
        });
    });
}
doit();
VM128:11 Object {counter: 1, foo: "a", bar: "x"}
VM128:11 Object {counter: 2, foo: "a", bar: "y"}
VM128:11 Object {counter: 3, foo: "a", bar: "z"}
VM128:11 Object {counter: 1, foo: "b", bar: "x"}
VM128:11 Object {counter: 2, foo: "b", bar: "y"}
VM128:11 Object {counter: 3, foo: "b", bar: "z"}
VM128:11 Object {counter: 1, foo: "c", bar: "x"}
VM128:11 Object {counter: 2, foo: "c", bar: "y"}
VM128:11 Object {counter: 3, foo: "c", bar: "z"}


And PHP (apologies for the rubbish way PHP does closure):

function doit() {
    $a = ["a", "b", "c"];
    $b = ["x", "y", "z"];
    $counter = 0;
    
    array_walk($a, function($foo) use (&$counter, $b) {
        $counter = 0;
        array_walk($b, function($bar) use (&$counter, $foo) {
            $counter++;
            // in dump counter is always 0
            var_dump([
                "counter" => $counter, 
                "foo" => $foo, 
                "bar" => $bar
            ]);
        });
    });
}
doit();


array(3) {
  ["counter"]=>
  int(1)
  ["foo"]=>
  string(1) "a"
  ["bar"]=>
  string(1) "x"
}
array(3) {
  ["counter"]=>
  int(2)
  ["foo"]=>
  string(1) "a"
  ["bar"]=>
  string(1) "y"
}
array(3) {
  ["counter"]=>
  int(3)
  ["foo"]=>
  string(1) "a"
  ["bar"]=>
  string(1) "z"
}
array(3) {
  ["counter"]=>
  int(1)
  ["foo"]=>
  string(1) "b"
  ["bar"]=>
  string(1) "x"
}
array(3) {
  ["counter"]=>
  int(2)
  ["foo"]=>
  string(1) "b"
  ["bar"]=>
  string(1) "y"
}
array(3) {
  ["counter"]=>
  int(3)
  ["foo"]=>
  string(1) "b"
  ["bar"]=>
  string(1) "z"
}
array(3) {
  ["counter"]=>
  int(1)
  ["foo"]=>
  string(1) "c"
  ["bar"]=>
  string(1) "x"
}
array(3) {
  ["counter"]=>
  int(2)
  ["foo"]=>
  string(1) "c"
  ["bar"]=>
  string(1) "y"
}
array(3) {
  ["counter"]=>
  int(3)
  ["foo"]=>
  string(1) "c"
  ["bar"]=>
  string(1) "z"
}


But I'm quite keen to know if there's any language that behaves as ColdFusion does in this example? If you've got a mo', could you knock out an equivalent of this code in [your language of choice], and share the results?

I'm 99.999% sure Adobe are just not quite getting closure, but wanna make sure I'm not missing anything.

Righto.

--
Adam

Wednesday, 22 March 2017

Code puzzle: find any match from an array of regex patterns

G'day:
I'm being slightly cheeky as I'm hijacking someone else's question / puzzle from the CFML Slack channel that we were discussing last night. Don't worry, that'll be the last mention of CFML in this article, and indeed the question itself was asking for a JS solution. But it's a generic code puzzle.

Let's say one has a string, and an array of regex patterns one wants to check for in said string. A number of solutions were offered, using a number of techniques. And it even branched away from JS into Clojure (thanks Sean).

How would you go about doing it?

The input data is:

patterns = ["Savvy", "Smart", "Sweet"]
testForPositive = "Super Savvy"
testForNegative = "Super Svelt"

And in these cases we want true and false to be returned, respectively.

Note that the patterns there need to be considered regular expressions, not simple strings. The example data is just using simple strings for the sake of clarity.

Initially I was gonna do something like this:

function matchesOne(str, patterns) {
    var onePattern = new RegExp(patterns.join("|"));
    return str.search(onePattern) > -1;
}

Here I'm joining all the patterns into one big pattern, and searching for that. I was slightly hesitant about this as it's loading the complexity into the regex operation, which I felt perhaps wasn't ideal. Also it's not very clever: if any of the individual patterns had a | in them: this'd break. Scratch that.

Ryan Guill seeded the idea of approaching it from the other direction. Using the array as an array, and using one of JS's array higher-order functions to do the trick. We saw examples using find and filter, but I reckoned some was the best option here:

function matchesOne(str, patterns) {
    return patterns.some(pattern => str.search(pattern) > -1);
}


I'm happy with that.

The ante was raised next: write a function which returned the position of the first match (not just a boolean as to whether there was a match or not).

To me this was a reduction operation: take an array and - given some algorithm applied to each element - return a final single value.

I tried, but I could not make it very clean. This is what I got:

function firstMatch(str, patterns) {
    return patterns.reduce(function(first, pattern){
        var thisMatch = str.search(pattern);
        return first == -1 ? thisMatch : thisMatch == -1 ? first : Math.min(first, thisMatch);
    }, -1);
}


I'm happy with the reduce approach, but I'm really not happy with the expression to identify the lowest match (or continue to return -1 if no match is found):

first == -1 ? thisMatch : thisMatch == -1 ? first : Math.min(first, thisMatch)


To be clear, if we expanded that out, the logic is:

if (first == -1) {
    return thisMatch;
}
if (thisMatch == -1) {
    return first;
}
return Math.min(first, thisMatch);

That's actually "cleaner", but I still think I'm missing a trick.

So the puzzle to solve here is: what's a clever (but also Clean - in the RC Martin sense) way returning the lowest non -1 value of first and thisMatch, or fall back to -1 if all else fails.

Using this code (or some approximation thereof):

function firstMatch(str, patterns) {
    return patterns.reduce(function(first, pattern){
        var thisMatch = str.search(pattern);
        // return result of your code here
    }, -1);
}

console.log(firstMatch("Super Savvy", ["Savvy", "Smart", "Sweet"]));
console.log(firstMatch("Super Svelt", ["Savvy", "Smart", "Sweet"]));


One should get 6 and -1 for the respective "tests".

Use any language you like.

Oh and hey... if you have any interesting solutions for the two functions themselves, feel free to post 'em too.


Righto.

--
Adam

Thursday, 9 March 2017

CFML: that survey... sit. rep.

G'day:
A coupla people have asked me what's happening with my feedback from that survey you all did for me (Survey: CFML usage and migration strategies).

I have to simply apologise for the delay. I started to look at it, but could not work out how best to present the data. Then I buggered off to NZ to sit in the sun watching cricket and drinking beer with me mates; and "CFML survey data analysis" was not high on my list of priorities. I did start looking at the data some more though.

So not much progress yet, but I'm now back in London and trying to refocus on IT stuff, so I'll try to start feeding back soon. -ish.

Again: apologies for the delay.

Righto.

--
Adam

PHPUnit: setMethods and how I've been writing too much code

G'day:
Seems our technical architect is becoming my PHP muse. Good on ya, Pete.

Anyway, he just sent out what I thought was a controversial email in which he said (paraphrase) "we're calling setMethods way too often: in general we don't need to call it". To contextualise, I mean this:

$dependency = $this
    ->getMockBuilder(MyDependency::class)
    ->setMethods(['someMethod'])
    ->getMock();

$dependency
    ->method('someMethod')
    ->willReturn('MOCKED RESULT');


See here I'm explicitly saying "mock out the someMethod method". Pete's contention is that that's generally unnecessary, cos getMockBuilder mocks 'em all by default anyhow. This goes against my understanding of how it works, and I've written bloody thousands of unit tests with PHPUnit so it surprised me I was getting it wrong. I kinda believed him cos he said he tested it, but I didn't believe him so much as to not go test it myself.

So I knocked out a quick service and dependency for it:

namespace me\adamcameron\myApp;

class MyService
{

    private $myDecorator;

    public function __construct(MyDecorator $myDecorator)
    {
        $this->myDecorator = $myDecorator;
    }

    public function decorateMessage($message)
    {
        $message = $this->myDecorator->addPrefix($message);
        $message = $this->myDecorator->addSuffix($message);
        return $message;
    }

}

namespace me\adamcameron\myApp;

class MyDecorator
{

    public function addPrefix($message)
    {
        return "(ACTUAL PREFIX) $message";
    }

    public function addSuffix($message)
    {
        return "$message (ACTUAL SUFFIX)";
    }

}

Nothing surprising there: the service takes that decorator as a constructor arg, and then when calling decorateMessage the decorator's methods are called.

In our tests of decorateMessage, we don't want to use the real decorator methods, we want to use mocks.

First I have a test the way I'd normally mock things: explicitly calling setMethods with the method names:

public function testWithExplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(['addPrefix', 'addSuffix'])
        ->getMock();

 $decorator
  ->method('addPrefix')
  ->willReturn('(MOCKED PREFIX)');

 $decorator
  ->method('addSuffix')
  ->willReturn('(MOCKED SUFFIX)');

 $service = new MyService($decorator);

 $result = $service->decorateMessage('TEST MESSAGE');

 $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

And this mocks out the method correctly. This is my baseline.

Secondly, I still call setMethods, but I give it an empty array:

public function testWithImplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods([])
        ->getMock();

 $decorator
  ->method('addPrefix')
  ->willReturn('(MOCKED PREFIX)');

 $decorator
  ->method('addSuffix')
  ->willReturn('(MOCKED SUFFIX)');

 $service = new MyService($decorator);

 $result = $service->decorateMessage('TEST MESSAGE');

 $this->assertSame(
        '(MOCKED SUFFIX)',
        $result
    );
}

This also mocks out the method correctly. Passing an empty array mocks out all the methods in the mocked class.

Next I try passing null to setMethods:

public function testWithNull()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(null)
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $decorator
        ->method('addSuffix')
        ->willReturn('(MOCKED SUFFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(ACTUAL PREFIX) TEST MESSAGE (ACTUAL SUFFIX)',
        $result
    );
}

This creates the mocked object, but the methods are not mocked. So the result here is using the actual methods.

In the last example I mock one of the methods (addPrefix), but not the other (addSuffix):

public function testWithOneExplicitMock()
{
    $decorator = $this
        ->getMockBuilder(MyDecorator::class)
        ->setMethods(['addPrefix'])
        ->getMock();

    $decorator
        ->method('addPrefix')
        ->willReturn('(MOCKED PREFIX)');

    $service = new MyService($decorator);

    $result = $service->decorateMessage('TEST MESSAGE');

    $this->assertSame(
        '(MOCKED PREFIX) (ACTUAL SUFFIX)',
        $result
    );
}

Here we see the important bit: when one explicitly mocks one or a subset of methods, then the other methods are not mocked. Which is kinda obvious now I say it, but I guess we had been thinking one needed to call setMethods to be able to then mock the behaviour of the method. And, TBH, I don't think we thought through what was happening to the methods we were not mocking. I think we started calling setMethods all the time because we were slavishly following the PHPUnit docs, which always call it too, which is a less than ideal precedent to "recommend".

Generally speaking, one should want to mock all methods of a dependency, and it should really be the exception when explicitly not wanting to mock a method: ie, to leave it actually callable.

The win here is that we can now de-clutter our tests a bit. We really seldom need to call setMethods, and we sure want to make it clear when we do want to only mock some of a dependency's methods. So this will make our code cleaner and clearer, and easier to code review and maintain. Win!

Cheers Pete!

Righto.

--
Adam

Wednesday, 8 February 2017

PHP: more array dereferencing weirdness

G'day:
(Firstly: I hasten to add that the "weirdness" is "weird to me". It might be completely predictable to everyone else in the PHP world).

This is a follow-on from my earlier article "PHP: accessing an undefined array element yields a notice. Or does it? WTF, PHP?". Now before I start: Kalle Sommer Nielsen has come back to me in a comment on that article and mentions it's expected behaviour, and explained it... but I don't quite follow yet. I'm hoping some ensuing experimentation will cast some light on the scene.

My next variation of this has me scratching my head more. Well: it's demonstrating the same thing, but perhaps more clearly.

<?php

echo PHP_EOL . "Test with array" . PHP_EOL;

$a = [1, 2, 3];
var_dump($a[0]);
var_dump($a[0]['id']);
var_dump($a[0][0]['id']);
var_dump($a[0]['id'][0]);

echo PHP_EOL . "Test with int" . PHP_EOL;
$i = 1;
var_dump($i['id']);
var_dump($i[0]['id']);
var_dump($i['id'][0]);

echo PHP_EOL;

echo PHP_EOL . "Test with string" . PHP_EOL;
$s = '1';
var_dump($s['id']);
var_dump($s[0]['id']);
var_dump($s['id'][0]);

This yields:


C:\temp>php test.php

Test with array
C:\temp\test.php:6:
int(1)
C:\temp\test.php:7:
NULL
C:\temp\test.php:8:
NULL
C:\temp\test.php:9:
NULL

Test with int
C:\temp\test.php:13:
NULL
C:\temp\test.php:14:
NULL
C:\temp\test.php:15:
NULL


Test with string
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 21
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 21

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:21:
string(1) "1"
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 22
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 22

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:22:
string(1) "1"
PHP Warning:  Illegal string offset 'id' in C:\temp\test.php on line 23
PHP Stack trace:
PHP   1. {main}() C:\temp\test.php:0

Warning: Illegal string offset 'id' in C:\temp\test.php on line 23

Call Stack:
    0.0002     356384   1. {main}() C:\temp\test.php:0

C:\temp\test.php:23:
string(1) "1"

C:\temp>

OK, so here I factor-out the top-level array from the mix. That test was always referring to the first element of the array, which is just the integer 1, so we might as well just test with that. Needless to say, there's the same results as with the first array element. But I still don't get it. What is using array notation to dereference an integer supposedly doing? It's a non-sensical operation, as far as I understand things. And if it is a nonsensical operation: it should error.

For completeness I also try a variation with a string "1". This does error like I'd expect. Well: it gives a warning and then returns an inappropriate (IMO) result anyhow. This is probably PHP doing it's combo of telling me I did it wrong but then going ahead and guessing what I meant anyhow. Grrr.

I'm gonna mess around with de-referencing integers some more when I get a moment, and try to work out WTF.

Righto.

--
Adam

Tuesday, 7 February 2017

TDD: still testing too much

G'day:
A while back I wrote this article - Maintaining untested code, and mocking too much when backfilling test (with JavaScript & Jasmine) - wherein my colleague pulled me up for testing too much when I'm doing my TDD. I'm working with PHP this time, and I've just been pulled up on it again. For much the same reason. It created an interesting thought exercise... well interetsing ish... so I'll write it up here.

Firstly, I cannot use the actual code I am working with due to the usual over-baked "commercial sensitivity" considerations, so this is simply an analogy, but it captures the gist of what I was doing.

We have an adapter that handles HTTP calls. It's basically a generic wrapper that masks the API differences between Guzzle 5 and Guzzle 6, both of which we are still using, but we want a unified adapter to work with both. It's very generic, and basically has methods for GET, POST, PUT etc, and doesn't encompass any of our business logic.

Business reality is that we use this mostly to transmit JSONified objects, so we're adding a layer in front of it to take the raw text responses and rehydrate them. I'm writing the GET method.

When I start designing my solution, I know that the happy path will return me a Response object that has JSON as its content. I just need to deserialise the JSON and return it. This is easy enough:

public function testGetAsObject() {
    $expectedResult = ['result' => 'valid JSON'];
    $expectedContent = json_encode($expectedResult);

    $adapter = $this->mockAdapterWithResponse(200, $expectedContent);

    $client = new HttpClient($adapter);
    $result = $client->getWithJson($this->uri);

    $this->assertEquals($expectedResult, $result);
}

mockAdapterWithResponse just does this:

private function mockAdapterWithResponse($statusCode, $content){
    $response = $this->getMockBuilder(Response::class)
        ->setMethods(['getStatusCode', 'getContent'])
        ->getMock();
        
    $response->expects($this->once())
        ->method('getStatusCode')
        ->willReturn(statusCode);

    $response->expects($this->once())
        ->method('getContent')
        ->willReturn(content);

    $adapter = $this->getMockBuilder(Adapter::class)
        ->setMethods(['get'])
        ->getMock();
        
    $adapter->expects($this->once())
        ->method('get')
        ->willReturn($this->response);
        
    return $adapter;
}

(there are other tests that already needed this). All it's doing is enabling us to set what the content for the call to the HttpClient will return, because we don't care how the client goes about doing this; we just care what we do with the content afterwards.

So back to my test code:

public function testGetAsObject() {
    $expectedResult = ['result' => 'valid JSON'];
    $expectedContent = json_encode($expectedResult);

    $adapter = $this->mockAdapterWithResponse(200, $expectedContent);

    $client = new HttpClient($adapter);
    $result = $client->getWithJson($this->uri);

    $this->assertEquals($expectedResult, $result);
}

All I'm doing here is mocking the adapter to return a predetermined result, and I then test that what comes back from my function is the deserialisation of that value. That's it. I then implement some code to make that test pass:

public function getAsObject($uri, $params = [], $headers = []) {
    $response = $this->get($uri, $params, $headers);

    $object = json_decode($response->getContent(), true);

    return $object;
}

And that's green: all good.

I've already done too much in my test, but have not noticed (and not put it in for code review, so no-one's pointed it out).

The next requirement for this method is that if the content coming back from the adapter isn't JSON, then just throw an exception instead of trying to deserialise it. Again, pretty easy to TDD that:

/** @expectedException \Exception */
public function testGetAsObjectThrowsExceptionOnNonJsonContent() {
    $content = 'NOT_VALID_JSON';

    $adapter = $this->mockAdapterWithResponse(200, $content);

    $client = new HttpClient($adapter);
    $client->getWithJson($this->uri);
}

Note there's not even any explicit assertions in there: the @expectedException annotation takes care of it.

This test fails at present: no exception is thrown thanks to PHP's stupid-arse behaviour if json_decode is given non-JSON: it just returns null. Bravo, PHP. But that's what we want... we've designed our code and have tests to prove the design, now we need to implement the code:

public function getAsObject($uri, $params = [], $headers = []) {
    $response = $this->get($uri, $params, $headers);

    $object = json_decode($response->getContent(), true);
    
    if (is_null($object)) {
        throw new \Exception('Unable to decode JSON result');
    }
    return $object;
}

The tests are green, so all good.

But what have I done wrong? Nothing major, but once again I'm testing too much. Actually my tests are fine. It's my helper method that's "wrong":

private function mockAdapterWithResponse($statusCode, $content){
    $response = $this->getMockBuilder(Response::class)
        ->setMethods(['getStatusCode', 'getContent'])
        ->getMock();
        
    $response->expects($this->once())
        ->method('getStatusCode')
        ->willReturn(statusCode);

    $response->expects($this->once())
        ->method('getContent')
        ->willReturn(content);

    $adapter = $this->getMockBuilder(Adapter::class)
        ->setMethods(['get'])
        ->getMock();
        
    $adapter->expects($this->once())
        ->method('get')
        ->willReturn($this->response);
        
    return $adapter;
}

Why am I testing that those methods are called once? It is completely irrelevant to what I'm testing how many times they're called.

What got into my head is I was being too dogmatic when I was writing my code, and I also cheated a bit with the TDD. Instead of designing the test variation first, and then implemented the code to pass the test, I was doing both at the same time. And it shows from my test implementation. Basically I didn't design my solution first, I designed it as I went, so - whilst tricking myself into thinking I was doing TDD - I was asking myself "right, what does my code need to do? It needs to call get on the adapter. OK, so I'll test I call it. What next... I need to get the content from the response. Right: I'll make sure that getContent gets called". All very stolid and admirable, but: wrong. All this is doing is testing my code (almost line by line!). TDD is not about testing my code; it's about testing my design. And it's the design and behaviour of the API of the method I'm testing. As I've said before, it'd be excellent it TDD was retconned to mean "Test Driven Design", not "Test Driven Development". The former is closer to the intent of the exercise.

I don't need to test those methods are called once: there's no logic in my function that will vary how many times they might be called, so they will be called once. We can trust PHP to do what we tell it to do by giving it source code.

The logic flaw here was demonstrated by my code reviewer observed: if you think you need to check how many times that functionality is called, why are you also not concerned with how many times json_decode is called too? It's an important part of your functionality which definitely needs to be called. Ha... yeah "duh", I thought. This drove home I was testing too much.

Another thing is that I was getting too distracted by what my implementation was whilst writing my test. The requirements of that function are:

  • make a request, the response of which might be JSON:
  • if it is: return the deserialised content;
  • if it's not: raise an exception.

That first step is implementation detail from the POV of testing. What we're testing with TDD is the outcome of the function call. Which is: an object, or an exception. We're also only supposed to be testing the logic in the function the test is actually calling. I emphasise "logic" there because we're not testing the code. TDD is not about that. We're testing the logic. This is why we have two test variations: one wherein the logic to throw the exception is not true, so we get our object back; another when that if condition is true, so we get the exception.

We only need to test logic branching code (conditionals and loops, basically).

We do not need to prove every single statement is executed.

I was reminded a rule of thumb to apply when considering whether to use expects or not. Basically it boils down to if you have test variations which might result in a dependency being called a different amount of times, and that variation is germane to the result, then verify the call count.

Another good point made by my other colleague Brian was that when mocking stuff... only mock the bare minimum to get the test passing. Then stop. And that's probably the right place to leave it be.

An example of this might be if we had this logic in our method:

public function getAsObject($uri, $params = [], $headers = []) {
    $response = $this->get($uri, $params, $headers);

    $object = json_decode($response->getContent(), true);
    
    if (is_null($object)) {
        $this->logger("Bung content returned: $object");
        throw new \Exception('Unable to decode JSON result');
    }
    return $object;
}

Logging is an optional outcome of that method call, so we might want to test that the logger definitely gets called once in that second test. We could also be belt-and-braces about it and test it's not called at all in the happy-path test.

Or let's say the code was a caching proxy and one is testing the uncached and cached variations:

function get($id){
    if ($this->cache->exists($id)){
        return $this->cache->get($id);
    }
    return $this->dao->get($id);
}

In the uncached variation, the data is not found in the cache, so we want to make sure that the DAO is called to get the data from the DB. In the cached variation, we want to make sure the DAO isn't called. Even here, in the first example we're not really testing the DAO call was made specifically; we're testing the logic around the cache check didn't trigger the early return. We're testing the if statement, not the DAO call.

Bottom line, I think that if I find myself starting to do a call count on a mocked method, I need to stop to think about whether I will have cases where the expected call count will be different, and only then ought I even be bothering if the call-count variation is important to the outcome of the function call. If I'm only checking for one call count value... I'm doing it wrong.

And now rather than talking about my day job... I should actually be doing it...

Righto.

--
Adam

Monday, 30 January 2017

PHP: accessing an undefined array element yields a notice. Or does it? WTF, PHP?

G'day:
So we encountered this today:

<?php
$array = [1, 2, 3];

var_dump($array['id']); // this yields a warning
var_dump($array[0]['id']); // none of these do
var_dump($array[0][0]['id']);
var_dump($array[0]['id'][0]);

As indicated, this yields results:

C:\temp>php shite.php
PHP Notice:  Undefined index: id in C:\temp\shite.php on line 4
PHP Stack trace:
PHP   1. {main}() C:\temp\shite.php:0

Notice: Undefined index: id in C:\temp\shite.php on line 4

Call Stack:
    0.0002     353472   1. {main}() C:\temp\shite.php:0

C:\temp\shite.php:4:
NULL
C:\temp\shite.php:5:
NULL
C:\temp\shite.php:6:
NULL
C:\temp\shite.php:7:
NULL

C:\temp>

I think this behaviour is bloody daft. Accessing something that doesn't exist should be more than a notice, and it shouldn't then go ahead and return "null" anyhow. And giving a notice and returning null is possibly the worst way of combining possible actions here. What am I supposed to do with that? Other than bury my head in my hands a weep.

Now this is partially documented on the Arrays page of the docs:

Note:
Attempting to access an array key which has not been defined is the same as accessing any other undefined variable: an E_NOTICE-level error message will be issued, and the result will be NULL.
Note:
Array dereferencing a scalar value which is not a string silently yields NULL, i.e. without issuing an error message.

That's shit, but... well: it's not that surprising (I mean... it's PHP, right?). But how come

$array['id']

incurs the warning, but this doesn't:

$array[0]['id']

This just seems to make fairly unhelpful behaviour just that much less helpful.

I'm sure there's a reason for this. Not a good one, I hasten to add, but no doubt there's a reason.

Anyone know what it is?

Update:

Thanks to Dave (comment below) for pointing out there's a bug been raised for this: 68110.


Righto.

--
Adam

Thursday, 26 January 2017

Survey results: most useful response first

G'day:
I've got the results in from that survey, and started looking through them last night. I haven't really organised anything yet, so nothing useful to report, but this response stood out. The survey had 16 questions, but they left them all blank except question 15:

  1. What else do you think might be useful to share?
    You are toxic.
That's it. That's what they decided to write. That's what they figured was worth investing their time in.

Now I'll not contest whether or not I'm toxic, but I'm not sure that in a community survey aimed at understand how best to help people migrate their skills from CFML is really the best place for it. They coulda just emailed me (my email address is in the "Communications policy" link to the right there). But what I will conclude is if they read the survey questions, and decided that was a sensible, relevant and effective answer to make, they should perhaps be worrying less about how toxic I might or might not be, and more concerned about their own state of mind.

But still: at least I got a new quote for my banner line.

Yours toxically,

--
Adam

Tuesday, 24 January 2017

I'm a dick: part 4: why I'm a dick

G'day:
Sigh.

So other the page few days I've posted these articles:

This was all around the survey I was running (Survey: CFML usage and migration strategies), which freeonlinesurveys.com had closed access to unless I upgraded to a pay-for account. I assumed they were being dodgy, and reacted poorly to this.

A very patient user support person contacted me this evening, and pointed out this screen, when I first went to launch the survey:


And I needed to click on the "OK" button to proceed. I will admit I scanned the first coupla options and went "yeah yeah, same as the pricing page, where's the button to click?", but didn't get as far as the relevant bullet point, or it didn't sink in, or whatever.

However one spins it: they did tell me this information, I simply chose not to pay attention to it.

In freeonlinesurveys.com's favour, the offered to release the data anyhow. I have said to them "no, ballocks to that: it's my bad, I'll pay for it".

Sigh.

Let's join in a chorus of "Cameron's a dick".

But at least I can now start assessing those results, and try to work out how to feed back to you.

Righto.

--
Adam (who is a dick)

I'm a dick: part 3 Email sent to Bristol Trading Standards office re freeonlinesurveys.com

G'day:
What I said in this article turns out to be erroneous. I will post a follow-up to it shortly, but first things first, I'm taking the content down as I don't want it on Google.

But equally I don't want to hide what I said, so I'm gonna re-insert it as an image, so y'all can see the original.

I'll link to the follow-up once I've written it.

Righto.

--
Adam




I'm a dick: part 2 T&Cs located! partial correction on the freeonlinesurveys.com stitch-up

G'day:
What I said in this article turns out to be erroneous. I will post a follow-up to it shortly, but first things first, I'm taking the content down as I don't want it on Google.

But equally I don't want to hide what I said, so I'm gonna re-insert it as an image, so y'all can see the original.

I'll link to the follow-up once I've written it.

Righto.

--
Adam





Saturday, 21 January 2017

I'm a dick: part 1 Survey results: we've been stitched-up by freeonlinesurveys.com

G'day:
What I said in this article turns out to be erroneous. I will post a follow-up to it shortly, but first things first, I'm taking the content down as I don't want it on Google.

But equally I don't want to hide what I said, so I'm gonna re-insert it as an image, so y'all can see the original.

I'll link to the follow-up once I've written it.

Righto.

--
Adam



Saturday, 14 January 2017

Incredibly: a reader asks me for help with PHP & Nginx

G'day:
Jesus fuck, you lot. Don't ask me shit about systems support / server application config / all that godawful shit that should be consigned to the Systems Support Team (sorry to my mates in this role, but I fucking hate it, and became a dev so I didn't have to do it any more).

Right so ages ago I wrote an article "PHP: getting PHP 5 and PHP 7 running side by side on the same machine (a better way)", and someone recently asked me how do do the same on Nginx. Well it's "slow news night" here in my life: I'm just at the pub in Galway passing time by getting pissed on Guinness (I'm on me fifth pint, and even I can tell the writing here is reflecting that) and writing blog articles, so I had a look at it. It's pretty easy, as it turns out.

Firstly: I am no expert on Nginx. I don't like web servers. I try not to ever have to use them. I know enough about IIS to know it's a pain in the arse but I can do what I need if I have to; and I know enough about Apache to get things working. I don't even know why Nginx exists. I'm guessing the reason is "to annoy Adam, cos it's just one more bloody thing he'll need to know about at some point". Sigh. I got PHP working on Nginx once before ("PHP: getting my dev environment running on Nginx instead of Apache"), and other than that have converted some rewrite rules from Apache to Nginx (man: does Nginx suck compared to Apache for those!).

This is a different laptop from the one I did that other exercise on, so whilst I had Nginx on here (for the rewrite exercise, which was work-related and this is my work laptop), I decided to start from scratch. I deleted what I had installed (where for Nginx "installed" means "unzipped").

Here's what I did:
  1. grabbed the latest Windows Nginx download from their site. I just googled "nginx windows download" to find that.
  2. Unzipped it. Relocated it to my apps dir: c:\apps\nginx
  3. Stopped Apache (which listens on port 80), and instead started Nginx (just run nginx.exe from the dir above). By default it listens on localhost and port 80.
  4. Browse to http://localhost/ and verified the Nginx default index.html page was served. OK, so it works as a baseline. Always test the baseline before proceeding with any customisations of things.
  5. Set-up a coupla test hosts in my hosts file (C:\windows\system32\drivers\etc\hosts... make sure to start yer text editor as an admin, otherwise you won't be able to save it):
    127.0.0.1 php5.nginx.local
    127.0.0.1 php7.nginx.local

    I foresaw that to run both PHP5 and PHP7, Nginx is gonna need to differentiate between which one wants to use, and using different host names seemed to make sense and be easy.
  6. I edited my nginx.conf file (in the conf subdir of the one above) to know about these two hosts:
    http {
        # [...]
    
        server {
            listen       8800;
            server_name  php5.nginx.local;
            root   c:/src/php/php.local/www;
    
            # [...]
        }
    
        server {
            listen       8800;
            server_name  php7.nginx.local;
            root   c:/src/php/php.local/www;
    
            # [...]
        }
    }
    

    Where I've elided stuff, it's the same as it was before. The chief considerations here are:
    • having a server for both PHP5 and PHP7;
    • having them listen on each of those two new hosts I set up;
    • setting the root to be where my PHP code is. This is the same for both in this case, as I want to serve exactly the same code via both 5 and 7;
    • oh and I'm listening on port 8800 as I don't want Nginx to interfere with my normal Apache install (I'll be sticking with Apache after this experiment, thanks).
    Note that this config will still not serve PHP, but it'll at least run.
  7. When one runs Nginx from the console it hogs the prompt, so to stop it one needs to run another console and call nginx -s stop. We need to do this to test the config changes. So I did that, and used the other console to start it again.
  8. I browse to each of http://php5.nginx.local:8800 and http://php7.nginx.local:8800 to test they were working. They were running, but giving a 403 cos I didn't have an index.html in that directory, nor did I have directory browsing switched on (which for my test code I do, as it makes finding stuff easier).
  9. I told Nginx to allow directory browsing for each of the server configs:
    location / {
        index  index.html index.htm;
        autoindex on;
    }
    

    Do not do this in production. Well: don't do anything I say in production.
  10. I cycled Nginx again and tested both hosts:

    Index of /


    ../
    code-coverage-reports/                             15-Nov-2016 13:33                   -
    community/                                         30-Jun-2016 11:47                   -
    experiment/                                        15-Nov-2016 08:47                   -
    library/                                           30-Jun-2016 11:47                   -
    stackoverflow/                                     30-Jun-2016 11:47                   -
    deleteme.php                                       30-Jun-2016 11:47                 233
    gdayWorld.html                                     30-Jun-2016 11:47                  11
    gdayWorld.php                                      30-Jun-2016 11:47                  50
    phpinfo.php                                        30-Jun-2016 11:47                  21
    utf8.html                                          21-Dec-2016 08:21                 133
    


    So that's all good except for me not having deleted that file that perhaps I meant to, a while back ;-)
  11. Next I just followed the instructions from me other blog article, getting PHP to listen out to traffic coming in from Nginx (the code below is in a batch file):
    start C:\bin\RunHiddenConsole.exe C:\apps\php\5\5\php-cgi.exe -b 127.0.0.1:8550 start C:\bin\RunHiddenConsole.exe C:\apps\php\7\1\php-cgi.exe -b 127.0.0.1:8710
    Note that each of them is listening on a different port.
  12. And then tell Nginx to pass PHP requests across to PHP:
    server {
        # [...]
        location ~ \.php$ {
            fastcgi_pass   127.0.0.1:8710;
            fastcgi_index  index.php;
            fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
            include        fastcgi_params;
        }
        # [...]
    }
    

    That's the one for PHP7, but the PHP5 one is the same except for the port.
  13. I restart Nginx again, and this time hit phpinfo.php on each host:
    {i'd show you proof, but BlogSpot is refusing to let me put two images inline in an <ol> list, it seems. Trust me... it works).
  14. Hurrah!
  15. The last thing I tried to do is to wrap my batch file that starts Nginx into a service, using RunAsService.exe, but this is my work laptop and it's locked-down too tight for me to run that. But... well... the batch file works.
So, anyway... that's it. That's what one needs to do to get Nginx serving two different versions of PHP. My will to live has been sapped by even having to think about this sort of shit, so I'm going back to my Guinness (two pint further ahead than I was when I started this exercise).

Sorry this one is a bit scrappy but... well... I'm drunk.

Righto.

--
Adam

PHP: 2000 words on elseif (and PHPMD)

G'day:
This is another one that just lives up to the definitive notion of a blog: a web log; a log of what I'm doing. But, like... web-based. A blog. What I basically mean is that I'm not sure I'm showing any insight here (look: I basically know I'm not), but it's what I looked at today, so it's what I'm writing about today.

We're trying to lift our game when it comes to our code "Cleanliness" (in the RC Martin "Clean Code" sense), and we've started more rigorously using some code review tools recently, such as PHPCS and PHPMD. They're both pretty good, and I recommend them to anyone using PHP. Almost everyone probably already is: I'm always late to these games.

One PHPMD rule that tends to irk me is the one that declares "[one's code] uses an else expression. Else is never necessary and you can simplify the code to work without else". Now: generally this is true. One can usually refactor one's code to stick it in a helper method and early return in the if logic, then just the rest of the code after the if block is what happens in the else case. This is good because it helps one's code be "closer to the left-edge"; ie: it's not indented because it's not in a logic block any more. The more one needs to indent one's code: the more it probably smells. It's got to the stage with our code now that a double nesting of logic-control statements almost gets the same reaction from code reviewers as when they see a comment in our code; "hey, gather round everyone: Adam's made a comment. Let's look at it!" (Adam shrinks into the carpet and starts getting his excuses ready. Later he refactors the code to be clearer, and the comment goes; the code reviewers are happy). The problem for me is that I'm not clever enough sometimes, and it takes me a coupla takes before I see how to get rid of a wayward else. At least now I usually pick it up and work it out during my "self code review" phase, and sort it out. And PHPMD helps with this.

Today I was reviewing a colleague's code, and noted they had not an else but an elseif. And I triumphantly(*) pinged them with a comment reminding them to run PHPMD before submitting their pull request.

(*) because I'm petty.

It turns out that they had run PHPMD, and it hadn't pulled 'em up. Initially (again: cos I'm petty) I didn't believe them and knocked together my own SSCCE and looked at it.

This demonstrates the problem:

function takeAChance(){
    $firstChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true";
    } else {
        echo "firstChance was false";
    }
}

takeAChance();

And now running PHPMD over it:

C:\src\php\php.local\src\phpmd>phpmd else_block.php text phpmd.xml
C:\src\php\php.local\src\phpmd\else_block.php:8 The method takeAChance uses an else expression.
Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd>


OK, fair enough. Oh btw, the solution to this is to early return:

function takeAChance(){
    $firstChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true";
        return;
    }
    echo "firstChance was false";
}

(note: yeah yeah, in the real world I'd not be echoing from a function. I get that. Not the point here)

OK, now here's an SSCCE of the sort of thing we had in this pull request:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
    }elseif($secondChance){
        echo "well at least secondChance was true";
    }
}

And running PHPMD:

C:\src\php\php.local\src\phpmd>phpmd elseif_block.php text phpmd.xml

C:\src\php\php.local\src\phpmd>

No problems. Grrr. This does not make sense to me. If we add an actual else in there:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
    }elseif($secondChance){
        echo "well at least secondChance was true";
    }else{
        echo "both were false";
    }
}

And PHPMD says:
C:\src\php\php.local\src\phpmd>phpmd if_elseif_else_block.php text phpmd.xml
C:\src\php\php.local\src\phpmd\if_elseif_else_block.php:11
The method takeAChance uses an else expression.
Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd>


And, yeah, it's true... but the elseif seems to me to be as much an issue here as an else. We can short-circuit both easily enough with early returns:

function takeAChance(){
    $firstChance = (bool) rand(0,1);
    $secondChance = (bool) rand(0,1);

    if ($firstChance) {
        echo "firstChance was true; don't care what secondChance was";
        return;
    }
    if($secondChance){
        echo "well at least secondChance was true";
        return;
    }
    echo "both were false";
}

Equally, to me a PHP elseif (note this is an actual command, it's not a concatenation of else if like one might expect), really is, functionally, a "single-statement else", just with the statement itself being an if statement. What I mean is remember that the if / elseif / else statement doesn't need to use blocks (ie: a bunch of statements within braces), it could be a single statement:

function takeAChance() {
    $firstChance = (bool) rand(0,1);

    if ($firstChance)
        echo "firstChance was true";
    else
        echo "firstChance was false";

}

(what an abomination: almost looks like Python. Bleah).

Anyway... so yeah, one doesn't need to use braces if there's only one statement to execute based on the result of the condition. Don't do this. But it's possible.

What I'm getting at is this:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}elseif($secondChance){
    echo "well at least secondChance was true";
}

Is basically like this:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}else{
    if($secondChance){
        echo "well at least secondChance was true";
    }
}

Just without the braces on the else:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}else
    if ($secondChance){
        echo "well at least secondChance was true";
    }

Compare this again to the elseif version:

if ($firstChance) {
    echo "firstChance was true; don't care what secondChance was";
}elseif ($secondChance){
    echo "well at least secondChance was true";
}

The only difference there is the whitespace between the else and the if. And logically there is no difference, as far as I can tell.

I tested all variations I could think of of elseifs, elses, braces, no braces, and the like (see the samples on GitHub), and ran PHPMD over them:

C:\src\php\php.local\src\phpmd\samples>phpmd . text phpmd.xml
C:\src\php\php.local\src\phpmd\samples\else_block.php:8 The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.
C:\src\php\php.local\src\phpmd\samples\else_if_block.php:9      The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.
C:\src\php\php.local\src\phpmd\samples\if_elseif_else_block.php:11      The method takeAChance uses an else expression. Else is never necessary and you can simplify the code to work without else.

C:\src\php\php.local\src\phpmd\samples>

Hmmmm... it's not a case of else clauses being a problem it seems. It's a problem with else blocks being a problem.

It's complaining about this:

if ($firstChance) {
    echo "firstChance was true";
} else {
    echo "firstChance was false";
}


But not about this:

if ($firstChance) {
    echo "firstChance was true";
} else 
    echo "firstChance was false";
 


The only difference being the lack of braces on the second one.

Now I guess this makes an element of sense. The "untidiness" of large logic blocks is that they are more of a pain to navigate across when maintaining the code. With an if/else statement one needs to pay attention to where the if is, where the else is, which code is in which clause, and where the whole thing ends. This is not a problem if one keeps the code clean so each block has only 1-2 statements in it, but we've all seen if/elses that go on for miles and miles worth of scrolling down (and, it's worth pointing out: that is shit code if it's written like that. Don't do it). It's exacerbated by further nested logic within that. Fair enough.

But with a non-block else case - which intrinsically can only be one statement - the problem of unclean block sizes can simply not occur. Also fair enough.

But in that case, the warning should not be able else, intrinsically; it should be about the size of the blocks. And the size rule should probably apply to the if block too! If a single-statement else is OK, then a single-statement else block should be OK too.

I think the solution to this is that any else clause should be flagged: be it a single statement, or be it a block. So intrinsically an elseif should be flagged too. I'll raise this with the PHPMD bods.

Whilst on this: they should probably swing the other way and actually flag any instances of single-statement logic clauses: ifs and loops should always have braces. That's a widely accepted truism in curly-brace-language programming. I realise PHPCS does pick up on these (I only realise it just now cos I just tested it), but if PHPMD is busying itself with this sort of thing too; it should do a thorough job. In my view brace-less control logic is mess.


Lastly on this, I'm slightly irked by this, from the PHPMD Clean Code Rules page:

ElseExpression

[...]

An if expression with an else branch is never necessary.

It is terribly pedantic of me, but… it's not an expression. PHP flow control statements are…statements, not expressions. An expression is a specific thing: it is some code that represents a value. In PHP if/else statement does not represent a value. One cannot do this:
$message = if ($firstChance) {
    "firstChance was true";
} else {
    "firstChance was false";
}

If you cannot have some code on the right-hand side of an assignment (or anywhere else a value might be expected), then it's not an expression. A counterpoint to this, one can do this in Ruby:

message = if firstChance
    "firstChance was true";
else
    "firstChance was false";
end

puts message

This is because "everything is an object" in Ruby, I guess: even statements. In this case it seems the value of the if/else statement is that of the last statement in the relevant clause that gets executed by the logic. I'm buggered if I know whether Ruby best practices would encourage that (and by that I mean: I really actually don't know!), but I dunno that I'd be doing that sort of thing. Still: it demonstrates an if statement that is also an expression. Which a PHP one is not.

This sort of thing wouldn't bug me if it was in a casual conversation, but in an official implementation, I think it's a bit… ah… well let's saynit could just be done properly. But then again I guess this is PHP. And "doing it right" doesn't seem to be something PHP concerns itself with too much. It seems to be more "just get it done".

All this aside, I think PHPMD is a bloody handy tool, and whether or not I agree with the minutiae of some of its analysis or its turn of phrase is minor. It's helped my code a lot in the last coupla weeks. And I think our code in general is better of for using tools like this.

So… yeah. That was what I was doing yesterday (it was Fri when I started, it's now Sat evening and I'm in the pub). not scintillating stuff, but the examination of what was going on and the general experimentation was useful (to me, anyhow), I think. And it's enabled me to write a 2000-word blog article which is basically about elseif expressionsclauses.

Righto.

--
Adam

Monday, 9 January 2017

Testing: Too much detail, Cameron?

G'day:
Just really quickly here. Here's a question I posted on the Testing sub-channel of the #CFML Slack channel:

Question: I have a function which is basically a cron job which gets stuff from one data repository, monkeys with it, then sends the monkeying back to a different repository.

I log stuff along the way, for example:
  • "process started"
  • "got n records"
  • "need to update m records"
  • "job done"

Currently I am actively testing that "got n records" reflects the right value for n (where n in the context of the test is a test value coming back from a mock), and similarly with "need to update m records".

On one hand I think "well that's an important partial 'result' from the process, so - yes - test for it".

On the other hand I go "hmmm... is it? Or is that implementation detail?"

Thoughts?

That's it, really. Not an answer to anything today; just a question for you. What do you think?

--
Adam

Saturday, 7 January 2017

Survey: CFML usage and migration strategies

G'day:
This comes off the back of a few unrelated discussions I've had recently about people having moved on from CFML, and there being a reasonable desire to encourage and facilitate other people to do so too. There's a feeling that a lot of CFML devs might move away from CFML if they were helped to, but lack the confidence to make the leap themselves, or know where to start.

One suggestion floated was to start a project similar to ColdFusion UI the Right Way, but basically covering any part of CFML. Someone's suggested they might help giving examples of how to do stuff in Go; and I'd definitely help out giving examples in PHP (which might encourage people to stick with CFML, I dunno! ;-)

The motivation for this is that I think persisting with CFML is - for a lot of people in a lot of situations - career "assisted dying". I think for most CFML devs it's deleterious to stick with it. I'd like to help show these devs that other languages aren't scary, and indeed a bunch of them are similar in ease of use as CFML claims it is.

I've created a survey on freeonlinesurveys.com. I used to use Survey Monkey for this, but they now limit their free surveys to ten questions, and this one has 15. I've not tried freeonlinesurveys.com (either as a provider or a consumer), so I hope this works out. The UI for creating the surveys sure was easy to use though.

Anyway, the survey is here: "CFML usage and migration strategies". Go fill  it in if you can be arsed. To let you know what yer getting into, the survey is 15 free-flow questions, as follows. I've included my own answer for each question:
  1. Provide a brief comment about youself (don't worry about your CFML usage or dev work just yet: this is just about you). Don't worry if you'd rather not give too much detail, that's cool.
    This is my own answer: Adam Cameron. I'm London-based and (breaking my own rule slightly here) have been a dev for 16 years and in the IT industry for... blimey... 23 years.
  2. How did you come to be a developer, and are you primarily a developer or is it an adjunct to another role (like a sysop or designer or something like that)?
    I did programming at polytech, and loved it. I then took the first job that presented itself to me, which was as a sysop & desktop support bod on a NetWare network. I started using CFML there just before I went abroad for a few years. On return I got offered a job as a CFML dev with no professional programming experience at all.
  3. Summarise your CFML usage timeline (just timeline for this one). Include things like what year you started, when you moved on, if you have. (or what's the time timetable for moving on if it's just planned). Mention versions in this one.
    I started in 2001, and stuck with it until 2014. I shifted to PHP because the company I work for did. During that time I used all versions of CF from 4.5->9 professionally, but also have good knowledge of 10-12 as well, as I've been involved heavily in the CFML community and testing of ColdFusion's CFML implementation. I've never used any other CFML platform other than ColdFusion, other than experimentally & community participation.
  4. During that time was it your primary or sole dev language, do you think? Or was it always an adjunct to some other language? How did it fit within the mise en scene of your daily usage? For the purposes of this answer, let's consider "a wee bit of client-side JS and a bunch of HTML & CSS" as a given. Only mention those if they represent a significant part of your work.
    Yup. I did purely CFML from 2001-2014.
  5. Did you work on just in-house code bases for your employer, or did you also work on third party code bases - like open source projects - too.
    Just in house, and for client applications when I worked in a studio.
  6. If you're still primarily a CFML developer... why? That's not a loaded question, and I'm not suggesting that you're wrong for being where you are. It's just good framing information. Don't answer this if you've moved on from CFML: the next question is for you.
    I've moved on.
  7. If you've moved on from CFML: why? Did you just change jobs? Did other languages you were using just seem more appealing? Over time did you find yourself using CFML less and less? Did you actively change because of career-longevity considerations? What language(s) did you move to? Stuff like that.
    I was offered a job doing CFML, with no professional programming expertise at all, and it was a good way of getting away from doing network / hardware / desktop suppport work, which I didn't enjoy. That was my only reason for even starting with CFML: it was easy to pick up. I stuck with it cos I am lazy, and couldn't be arsed changing, and it paid the bills. By the time I decided I'd had enough I was tithed to my employer, so had to stick with them. We started a move to C# but that didn't take as we got bought out by a PHP shop. So now I do PHP. It was a good opportunity to shift languages, even if PHP would not have been my choice.

    My chief concern with changing languages is having to take a pay cut from "senior" dev to being a newbie in another language. I cannot sustain a pay cut.
  8. Do (/did) you use primarily or solely ColdFusion; or Lucee or Railo; or some variation of BlueDragon? In what proportions? If you migrated from one to another: why?
    ColdFusion. I never used any of the others professionally, but have messed around with all of them a fair bit.
  9. Do (/did) you participate in any CFML-based open source projects? To what degree (like you are the owner or primary committer of the project; or just a single commit; or raised some bugs but never actually coded anything; or wrote some docs, or whatever)?
    I raise bugs when I find them, and have helped a bit with cfdocs.org and luceedocs too. But never submitted a pull req.
  10. And what about in other languages?
    I've raised a bug in PHP 7 which got fixed. I have not directly contributed to any side projects though.
  11. What is or was - for you - the best feature of CFML which has you going "yeah, that's pretty cool actually". List more than one if you like. Importantly: had you compared similar functionality to how it's done in other languages, or was it just based on liking the CFML feature?
    That it's loosely and dynamically typed,which cuts down on pointless boilerplate code like one sees in Java. I didn't really compare to other languages too much.
  12. Are there any CFML features that would have fallen into that category for you when you were doing CFML, but ended up not being as cool as you thought when you looked at other languages?
    <cfquery> isn't really any easier to use than how it's done in any other language. That's a myth.
  13. What is it about CFML (or the underlying ColdFusion / Lucee / etc platform) you like the least? Explain why, if poss. Again, list as many or as few as you like. If this/these contributed to you moving on (if you have, I mean), mention that too.
    The vendor attitude. Adobe is slack at their job with ColdFusion, because they have too "enterprise" an approach to something that should be dynamic and agile. The community is very small and shrinking. Community expertise seems to migrate away, rather than steward the community (most of the community stalwarts still around aren't even CFML devs any more!)
  14. If there was a project similar to "ColdFusion UI the Right Way", but aimed at any part of CFML (like how to make a DB query in PHP instead of CFML for example), would you be keen to help on it? Would it be of interest to you to be a "user" of it?
    Well yeah. It was partly my idea.
  15. What else do you think might be useful to share?
    nothing really
So you could be writing a lot of stuff there: that's excellent if you do. Or you could write as little as you like: also cool, just slightly less so.

One of my Twitter contacts said that if I furnished the questions, they'd blog their thoughts. That's another thing you could do instead of doing the survey if you like: just let me know about it, as I don't follow any technical blogs these days (CFML or otherwise). Or you could reply in a comment here (although for me the survey would be easier to deal with, to be honest). Just whatever.

If you could also circulate word about this to your CFML contacts, that'd be cool too. I dunno how much notice the CFML community takes of me any more. Cheers.

That URL again: https://freeonlinesurveys.com/s/AuQ2WWKy#/0.

Or just RT it:



Righto.

--
Adam