Adam Cameron's Dev Blog: Iwan Dessers

Showing posts with label Iwan Dessers. Show all posts

Sunday, 3 April 2016

Decorator Pattern vs simple inheritance

G'day:
This is a sequel to my earlier article "Using a decorator pattern to reduce inappropriate code complexity". You'd better go breeze through that before reading this one, otherwise this one won't make too much sense.

One of my colleagues read the above article and said "yeah good... but I wonder why one would do that over simple inheritance?" Good question. Especially as he is my boss, so all his questions are good (like I said: he also reads this blog ;-). My prepared answer was that the decorator pattern was an implementation representation of an interface, and accordingly it can kinda reflect multiple inheritance (albeit by composition) where an inheritance-based approach only allows a single inheritance chain: and that will get clumsy quickly. Note that I am not suggesting that the decorator pattern actually reflects a multiple inheritance pattern, I'm just using it as a comparative metaphor. I think it's reasonable.

In the previous article I showed how to simplify and focus the implementations of a UserRepository and a LoggedUserRepository (it writes to a log as well as making a given repo call), and a CachedUserRepository (it caches the repo call). And, indeed from there it was easy to decorate the UserRepository with both logging and caching, to effect either a LoggedCachedUserReppository, or a CachedLoggedUserRepository. If you see the subtle difference there: it's what order the ancillary operations take place in. This was in lieu of having all the caching and logging code baked into the UserRepository, which I think is less-than-ideal design, makes testing harder, and also presupposed an implemenation (which had already been demonstrated - in our case - to be a bad supposition, hence the origin of this investigation).

Code-wise, I had this:

class UserRepository implements RepositoryInterface {

    public function getById($id) {
        return (object) [
            "id" => $id,
            "firstName" => "Number $id",
            "recordAccessed" => new \DateTime()
        ];
    }

}

And the LoggedRepository used to decorate a UserRepository with logging might be like this:

class LoggedRepository implements RepositoryInterface {

    private $repository;
    private $loggerService;

    public function __construct(RepositoryInterface $repository, LoggerServiceInterface $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

}

The key points are that the UserRepository does all the data-getting, the LoggedRepository just decorates that with some logging. But each repo just plays to its strengths, and lets the other one get on with its own strength. That's the decorator pattern. The CachedRepository was analogous to the Logged~ one, just doing caching instead of logging. And to get both a logged & cached or a cached & logged UserRepository, it's just a matter of a further layer of decoration. Roughly like a matryoshka doll. Of sorts. Anyway: there's that whole other article about that.

So what about using inheritance instead? Let's swap out the repository type here to just a PersonRepository (the reason for this is solely cos all this code is in the same namespace, so I had already used the User~ metaphor). The PersonRespository is the same as the User one:

class PersonRepository implements RepositoryInterface {

    public function getById($id) {
        return (object) [
            "id" => $id,
            "firstName" => "Number $id",
            "recordAccessed" => new \DateTime()
        ];
    }

}

(I'm still just faking the actual data-fetch operation, as it's irrelevant here. But imagine getById() gets a user via its ID from some data store).

One could consider that a LoggedPersonRepository is a specialisation of UserRepository: it's for the same purpose, it just throws logging into the mix:

class LoggedPersonRepository extends PersonRepository {

    protected $loggerService;

    public function __construct(LoggerServiceInterface $loggerService) {
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = parent::getById($id);

        return $object;
    }

}

One can accurately say a LoggedPersonRepository IS A PersonRepository, so the inheritance seems sound, and if one thinks through scenarios, I think it passes a Liskov substitution principle test as well. So the design is - thusfar - sound. And looking at the code, this is actually more simple than the decorator pattern version because there's no need for this LoggedPersonRepository to take a PersonRepository argument, as it is a PersonRepository.

And it's samesame for a CachedPersonRepository, but I'll not bore you with the implementation detail. I'm guessing you can see what I mean.

But what about when we do add caching to our repositories? We'd need two more classes:

class CachedPersonRepository extends PersonRepository {

    protected $cacheService;

    public function __construct(CacheServiceInterface $cacheService) {
        $this->cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this->cacheService->isCached($id)) {
            return $this->cacheService->get($id);
        }
        $object = parent::getById($id);

        $this->cacheService->put($id, $object);
        return $object;
    }

}

And:

class CachedLoggedPersonRepository extends LoggedPersonRepository {

    protected $cacheService;

    public function __construct(LoggerServiceInterface $loggingService, CacheServiceInterface $cacheService) {
        parent::__construct($loggingService);
        $this-&gt;cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this-&gt;cacheService-&gt;isCached($id)) {
            return $this-&gt;cacheService-&gt;get($id);
        }
        $object = parent::getById($id);

        $this-&gt;cacheService-&gt;put($id, $object);
        return $object;
    }

}

IE: CachedPersonRepository is an entirely different class from a CachedLoggedPersonRepository. This is still sound inheritance, and stands up to the LSP test too; so from that side of things, the design is still "valid". This doesn't make it optimal though. Also if we wanted to just have the caching, or the caching just higher in the hierarchy than the logging? Two more classes again.

And say you then want to add an encryption layer? That's potentially another five different class variations:

EncrytedPersonRepository
EncryptedLoggedPersonRepository
EncryptedCachedPersonRepository
EncryptedCachedLoggedPersonRepository
EncryptedLoggedCachedPersonRepository

And if we want the encryption done at a different level than just the outer one... they you see how many more possible permutations there might potentially be a use case for. Now I'm not saying any one given system would need all these permutations (that'd be weird), but it does demonstrate that the approach doesn't scale so well. Using the decorator pattern, one needs just one class per task, for all variations of sequencing the operations. In this example one would need a PersonRepository, a LoggedPersonRepository, a CachedPersonRepository, and an EncryptedPersonRepository. Those four classes can be used to make a EncryptedLoggedCachedPersonRepository, or a CachedLoggedEncryptedPersonRepository, or any other implementation of a subset of those four notions.

I think my example of a LoggedCachedEncryptedPersonRepository is slightly egregious, and I don't want to use an appeal to extremes to make my case here. However I think - all things being equal - using the decorator pattern instead of inheritance to solve this sort of thing is going to be a more robust way of preempting potential scaling requirements; whilst still being a simple and recognised solution; and an easy way to facilitate a possible future situation, whilst not in any way actually going down the rabbit-hole of actually writing any code for that potential future.

All the code and sanity checks of the behaviour can be found in my github account: decorator.local.

Righto.

--
Adam

Wednesday, 30 March 2016

PHPUnit: I question the merits of my work

G'day:
We take our TDD very seriously on our London-based team: all (PHP) code is developed using TDD, which means we spend a lot of time working with PHPUnit. And given the nature of testing, the test code can get pretty bloody repetitive at times. We also leverage dependency injection for all our dependencies, and as we take a Clean Code approach, we sometimes end up with a whole bunch of dependency methods to mock-out when we're testing the class depending on the... ah... dependency (I'll submit that one for "Clumsiest Sentence of the Day").

Furthermore, it seemed to me that our test classes were getting cluttered with a very high percentage of PHPUnit boiler plate code facilitating the mocking. It seemed to me that PHPUnit's fluent interface is just a mechanism for clutter.

Consider this contrived situation:

namespace me\adamcameron\someApp\service;

use \me\adamcameron\someApp\repository\SomeRepository;

class SomeService {

    private $someRepository;

    public function __construct(SomeRepository $someRepository){
        $this->someRepository = $someRepository;
    }

    public function getTheThing($idOfTheThing){
        return $this->someRepository->getTheThing($idOfTheThing);
    }

    public function setTheThing($idOfTheThing, $valueOfTheThing){
        return $this->someRepository->setTheThing($idOfTheThing, $valueOfTheThing);
    }

    public function doesTheThingExist($idOfTheThing){
        return $this->someRepository->doesTheThingExist($idOfTheThing);
    }

}

This is all very contrived (in case you had not noticed from the class / method naming), but here we have SomeService which has three methods which depend on a repository to handle the actual data read/writes. When testing SomeService, we want to mock SomeRepository out completely as it's not being tested. It's the getMockedSomeRepository() method which is my focus here. Every method needs the same boilerplate code to mock it out: how many times it's called, what its name is:


namespace me\adamcameron\someApp\test\service;

use \me\adamcameron\someApp\service\SomeService;

class SomeServiceTest {
public function setup(){
        $mockedSomeRepository = getMockedSomeRepository();
        $this->someService = new SomeService($mockedSomeRepository);
    }

 public function testGetTheThing(){
        $result = $this->someService->getTheThing($this->testId);

        $this->assertEquals($this->mockedGetTheThingResult, $result);
    }

 public function testSetTheThing(){
        $this->someService->setTheThing($this->testId, $this->testThing);
    }

 public function testDoesTheThingExist(){
        $result = $this->someService->doesTheThingExist($this->testId);
        $this->assertEquals($this->mockedDoesTheThingExistResult, $result);
    }

 private function getMockedSomeRepository(){
  $mockedSomeRepository = $this
   ->getMockFactory('\me\adamcameron\someApp\repository\SomeRepository')
   ->disableOriginalConstructor()
   ->setMethods(["getTheThing", "setTheThing", "doesTheThingExist"])
   ->getMock();

  $mockedSomeRepository
   ->expects($this->once())
   ->method("getTheThing")
   ->with($this->testId)
   ->willReturn($this->mockedGetTheThingResult);

  $mockedSomeRepository
   ->expects($this->once())
   ->method("setTheThing")
   ->with($this->testId, $this->testThing);

  $mockedSomeRepository
   ->expects($this->once())
   ->method("doesTheThingExist")
   ->with($this->testId)
   ->willReturn($this->mockedDoesTheThingExistResult);

  return mockedSomeRepository;
 }

}

As an aside, don't ask why PHPUnit does it in that order, but that's the recommendation. I guess it's a metaphor for "expect the foo method to be called with these arguments and it will return this value". That reads OK when it's English, but when looking at the code, it's a bit like Yoda wrote it. Still: shrug.

Also there's the same set of variations on mocking the class itself: one either disables the constructor or passes it some arguments, and it needs to have the methods specified at this point (as well as individually when mocking the methods).

Back to considering the method mocking: I've demonstrated that it's not always the same set of expectations being set: on the mock for setTheThing() I do not check the return value, cos it doesn't have one. Similarly sometimes one doesn't want to test how many times a method is called (no expects()) or what arguments it receives (no with()); or sometimes willReturn might actually need to be a will() call (which allows for more complicated results, like using a callback which contrives the result from the arguments, and other good stuff).

But it's always the same variation on the same theme. And I did not like having to type ->expects($something), ->with($somethingElse) over and over. The "sensible" approach seemed to be abstract out the boilerplate code, and just pass the values into some helper which then had the boilerplate once.

With a bit of horsing around, I came up with this sort of thing:

class MockHelper extends \PHPUnit_Framework_TestCase {

    function mockObject($class, $methods=null, $constructorArgs=null){
        $mockObjectBuilder = $this->getMockBuilder($class);

        if (is_null($constructorArgs)){
            $mockObjectBuilder->disableOriginalConstructor();
        }else{
            $mockObjectBuilder->setConstructorArgs($constructorArgs);
        }

        if (!is_null($methods)){
            $mockObjectBuilder->setMethods(array_keys($methods));
        }
        $mockedObject = $mockObjectBuilder->getMock();

        if (is_null($methods)){
            return $mockedObject;
        }

        foreach ($methods as $method=>$mock){
            if (array_key_exists("expects", $mock)){
                $methodMocker = $mockedObject->expects($mock["expects"]);
            }else{
                $methodMocker = $mockedObject;
            }
            $mockedMethod = $methodMocker->method($method);

            if (array_key_exists("with", $mock)){
                call_user_func_array([$mockedMethod, 'with'], $mock["with"]);
            }

            if (array_key_exists("will", $mock)){
                $mockedMethod->will($mock["will"]);
            }

            if (array_key_exists("willReturn", $mock)){
                $mockedMethod->willReturn($mock["willReturn"]);
            }
        }
        
        return $mockedObject;
    }

}

An example call of this might be:

private function getMockedSomeRepository()
{
    $mockHelper = new MockHelper();
    $mockedSomeRepository = $mockHelper->mockObject(
        '\me\adamcameron\someapp\repository\SomeRepository',
        [
            "getTheThing" => [
                "with" => [$this->testId],
                "willReturn" => $this->mockedGetTheThingResult
            ],
            "setTheThing" => [
                "with" => [$this->testId, $this->testThing]
            ],
            "doesTheThingExist" => [
                "with" => [$this->testId],
                "willReturn" => $this->mockedDoesTheThingExistResult
            ]
        ]
    );

    return $mockedSomeRepository;
}

The conceit is that - as I mentioned above - the data defines the behaviour here, not a bunch of repetitive boilerplate code. I just state the class I want to mock, and - in one fell swoop - a data structure of the methods I want to mock and their mocked behaviour. I don't need to worry about how the mocking takes place, this is all done within mockObject(). Lovely. And meself and me colleague started to use this on our team's work.

Come the Friday PHP Team meeting, I mentioned it in passing to the rest of the bods on the broader PHP team (we were only using this in one of our sub-teams), and they were appropriately hesitant of code they didn't have a hand in, but seemed to agree with the pain of repeated boilerplate needing a solution, but seemed ambivalent about this particular solution. Then after some reflection our team lead made a couple of very salient points (my paraphrasing of his original sentiment):

If we hire a new dev, there's a chance they might already know PHPUnit, but what they definitely won't know is how to use this approach. Is this a good thing?
Where's the actual gain here? Is it actually less code? Have you solved the problem you were trying to solve?

Point 1 is relevant, but in reality in my experience bugger-all devs know the first thing about unit testing, so it's slightly optimistic to think this will be an issue. In reaction to point 2 I was like "well of course it is! It must be!". Then I checked.

Let's look at the two approaches side by side:

Using PHPUnit's mock builder:

private function getMockedSomeRepository()
{
    $mockedSomeRepository = $this
        ->getMockBuilder('\me\adamcameron\someapp\repository\SomeRepository')
        ->disableOriginalConstructor()
        ->setMethods(["getTheThing", "setTheThing", "doesTheThingExist"])
        ->getMock();

    $mockedSomeRepository
        ->method("getTheThing")
        ->with($this->testId)
        ->willReturn($this->mockedGetTheThingResult);

    $mockedSomeRepository
        ->method("setTheThing")
        ->with($this->testId, $this->testThing);

    $mockedSomeRepository
        ->method("doesTheThingExist")
        ->with($this->testId)
        ->willReturn($this->mockedDoesTheThingExistResult);

    return $mockedSomeRepository;
}

Using this mock helper:

private function getMockedSomeRepository()
{
    $mockHelper = new MockHelper();
    $mockedSomeRepository = $mockHelper->mockObject(
        '\me\adamcameron\someapp\repository\SomeRepository',
        [
            "getTheThing" => [
                "with" => [$this->testId],
                "willReturn" => $this->mockedGetTheThingResult
            ],
            "setTheThing" => [
                "with" => [$this->testId, $this->testThing]
            ],
            "doesTheThingExist" => [
                "with" => [$this->testId],
                "willReturn" => $this->mockedDoesTheThingExistResult
            ]
        ]
    );

    return $mockedSomeRepository;
}

The former is 24 lines of code, and the latter 22. Not exactly a win there. If we were pedantic and went down to byte level, it's 756 vs 702. Until one thinks that one needs to import the class that the helper is in, so adding in that - in my instance, with my namespace - brings the helper version to 752 bytes.

Now I'm not interested in the number of bytes... that's daft. But I am interested in how many lines of code we're saving... and I'm sorry but 22 vs 24 doesn't cut it for me. It might be 8% or so, but... this is not the sort of code that's ever gonna be reflective of hundreds or even dozens of lines of code saved in a test class file. Over all it might save hundreds of lines of code in our code base, but it doesn't matter when it's spread through hundreds of files. It only really matters when looking at the one method in front of one at that point in time.

In the end I am rather disappointed that I don't think this solution is a good one, and I have reverted to using PHPUnit's mock builder approach.

Another of my colleagues pointed out another flaw - and this was a flaw in my approach to using PHPUnit's mock builder in general - one doesn't necessarily want to mock the object and the methods in one fell swoop anyhow. One might - and probably does - want to mock the dependency object itself in setup(), but then it's on a test-by-test basis that one needs to provide the method-mocking details. It's annoying that one needs to call the initial setMethods() at object level, and can't simply have that inferred from calling something like mockMethod() on the mocked object, but that's generally OK.

From a personal perspective: one thing I am taking away from all this is that my work - which I was quite pleased with - was challenged, and I didn't get all defensive about it, which is an easy trap to fall into. I went "hmmm... yeah, best I look at that...", and... I turned out to be wrong. And it actually felt good to find out I was wrong (disappointing, but good). Also: in the discovery process I now know a bit more about how PHPUnit works, and I am a better dev for it. Also it gave me a good topic to write about.

--
Adam

About

I've been a web developer since 2001: 13yrs as a CFML developer; 6yrs as a PHP dev; and now back adjacent to the CFML community but focusing on code quality, design and testing. As of 2023, I am migrating my dev focus back to PHP again.

The code I write and discuss here is pretty much just looking at random conundrums I encounter in my day job.

I tend to be a bit "forthright" in my opinions, I am indelicate, and I tend to swear too much. This will come out occasionally here: I make no apology for it.

Everything said here is my own opinion. Feel free to disagree with me :-)