Monday 21 March 2016

Using a decorator pattern to reduce inappropriate code complexity

G'day:
This one stemmed from a challenge we had with some code at work, and the thought exercise as to how it should have been resolved. Unfortunately the code in question is in a library our team needs to use, but didn't have any say in the design, plus it's out the door now so we're kinda stuck with it. However hopefully we can reduce the chance of this sort of thing happening again.

We use a variation of the Repository Pattern to interface between our application and whatever mechanism provides the data for it. Our philosophy is that an application's repositories with take data from [something] and model it into the domain objects our application wants to deal with. No storage considerations, or storage schema requirements should leak into the application: no table or column references, JSON (we use a bunch of web services for a lot of our data fetching), or the like. So our repositories are adapters. I dunno how closely this sticks to the letter of the pattern as per Microsoft's intent, but... shrug... it makes sense to us.

We'd typically have this sort of flow:
  • a request is matched to a route;
  • a route is handled by a controller;
  • a controller calls one or more services to do whatever's necessary to fulfil the request;
  • data is passed to a Twig template which creates the response mark-up.

Typical MVC sort of stuff. Sometimes we go straight from controller to repo if it's a situation that there's no real business logic, but this has proven dangerous because it's very rare there's no business logic, so even if our services just bridge between controller and repo: so be it. Note that a lot of our business logic is done by the remote services we call, as we are just one client application using these services; there are many others, so we keep the logic as close to the centre of shared operations as possible.

Also note as I alluded to above: our repos do not perform business logic. They simply transform data from the application-neutral format (or stoage-specific format) to the our-application-specific format.

A very simple use repository might be as follows:

class UserRepository {

    private $dataSource;

    public function __construct($dataSource) {
        $this->dataSource = $dataSource;
    }

    public function getById($id) {
        $userAsJson = $this->dataSource->getById($id);

        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);
    
        return $user;
    }

}

The repo is initialised with a connection to the storage mechanism, and all it does it ask the storage mechanism for some data, then rehydrate the returned raw data into a domain object for the application to use. This means the only part of the application which knows about how the data is exposed to the application is the repository. If we change our storage from being web-serviced based to being a direct connection to an SQL data store; we simply change the repository. The rest of our app stays the same. Nice.

A while back I encountered code similar to this (below). I hasten to add all the code in this article is purpose-written for this article, and any resemblance to our actual code is only so that I can demonstrate the point. Anyway, here's a user repository:

class UserRepository {

    private $dataSource;
    private $loggerService;
    private $cacheService;

    public function __construct($dataSource, $loggerService, $cacheService) {
        $this->dataSource = $dataSource;
        $this->loggerService = $loggerService;
        $this->cacheService = $cacheService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");

        if ($this->cacheService->isCached($id)) {
            return $this->cacheService->get($id);
        }

        $userAsJson = $this->dataSource->getById($id);
        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);

        $this->cacheService->put($id, $user);

        return $user;
    }

}

'ullo 'ullo 'ullo. What's going on here then? Now as well as going about its business as a user repository, our repo also seems to know about logging and caching. Logging and caching having nothing to do with the repo's job, yet now close to 50% of its code is dealing with these. And we've tightly coupled this functionality into the repo itself (at least its external services doing the actual work). Really now the UserRepository should be called a LoggedCachedUserRepository.

It also means that to have a working UserRepository, I also need to have working caching, and working logging Whether I actually want them or not. Soon after this code was introduced we had another situation to use the UserRepository, but it was in a a different context, and the logging and caching services weren't available (nor in this case needed or wanted). We could still negate them kinda by creating a null logger and a null cacher, along these lines:

class NullLoggerService {

    public function logText($text) {
        echo "NullLoggerService used<br>";
    }

}


class NullCacheService {

    public function __construct(){
    }
    
    public function isCached($key){
        return false;
    }
    
    public function get($key){
        throw new \Exception();
    }
    
    public function put($key, $value){
    }
        
}

That's "fine", and that "works", except it's even more code to maintain to actually do less stuff. All cos we tightly coupled caching and logging to the repo in the first place.

Also... logging and caching are business logic, so kinda violate our general usage pattern of repositories anyhow.

But in some situations we do actually want the caching tier, and we do actually want the logging tier. It was legit for those operations to be being run. It's just not legit for the code to be in the repository like that. So what ought we have done?

Use the decorator pattern.

Firstly, we leave our actual UserRepository to busy itself with doing UserRepository stuff.

class UserRepository implements UserRepositoryInterface {

    private $dataSource;

    public __construct($dataSource) {
        $this->dataSource = $dataSource;
    }

    public function getById($id) {
        $userAsJson = $this->dataSource->getById($id);

        $rawUser = json_decode($userAsJson);
        $user = new User($rawUser->id, $rawUser->firstName, $rawUser->lastName);
    
        return $user;
    }

}

Notice how I've added an interface requirement there. I'd normally not bother to do that, but it helps demonstrate the point. Our UserService now requires a RepositoryInterface object:

class UserService {

    private $userRepository;

    public function __construct(UserRepositoryInterface $userRepository) {
        $this->userRepository = $userRepository;
    }

    public function getById($id) {
        return $this->userRepository->getById($id);
    }
}


As long as it's a UserRepositoryInterface, it doesn't care how it goes about implementing it. Next we add a decorator to handle the logging, but to only handler the logging. It hands the user-getting task to a dependent UserRepository that knows how to do that. And of course we already have one of those! Here's the code:

class LoggedUserRepository implements UserRepositoryInterface {

    private $repository;
    private $loggerService;

    public function __construct(UserRepositoryInterface $repository, LoggerServiceInterface $loggerService) {
        $this->repository = $repository;
        $this->loggerService = $loggerService;
    }

    public function getById($id) {
        $this->loggerService->logText("$id requested");
        $object = $this->repository->getById($id);

        return $object;
    }

}

Also note it still fulfills the UserRepositoryInterface interface, so can be passed to the UserService no problem. The UserService just sees a UserRepository. It doesn't care what other things it does... it just knows it can use it to call getById() and get a user back.

And so on to implement a CachedUserRepository:

class CachedUserRepository implements UserRepositoryInterface {

    private $repository;
    private $cacheService;

    public function __construct(UserRepositoryInterface $repository, CacheServiceInterface $cacheService) {
        $this->repository = $repository;
        $this->cacheService = $cacheService;
    }

    public function getById($id) {
        if ($this->cacheService->isCached($id)) {
            return $this->cacheService->get($id);
        }
        $object = $this->repository->getById($id);

        $this->cacheService->put($id, $object);
        return $object;
    }

}

This one knows about caching, and knows how to call the repo its decorating to do the actual "repo-ing". Again it implements the appropriate interface. Also note that we can pass this constructor a LoggedUserRepository (which itself is passed a UserRepository), and we end up with a repository which handles caching, logging, and... erm... I need a better word... repo-ing. But now all the concerns are nicely separated.

Obviously this takes a bit of wiring up, however it's very easy with a DI container. Here's the relevant bits:

$app["service.user"] = $app->share(function($app) {
    return new UserService($app["repository.user"]);
});


That's it, as far as the UserService goes: give it a UserRepository. Or definition for $app["repository.user"] could just be a UserRepository:

$app["repository.user"] = $app->share(function($app) {
    return new UserRepository($app["datasource.user"]);
});


Or it could do logging:

$app["repository.loggedUser"] = $app->share(function($app) {
    return new LoggedUserRepository($app["repository.user"], $app["service.logger"]);
});


Or a cached one:

$app["repository.cachedUser"] = $app->share(function($app) {
    return new CachedUserRepository($app["repository.user"], $app["service.cache"]);
});


Or they could be wired up another way, to do caching and logging; or logging and caching.

It's just a matter of init-ing the UserService with a different repo:

$app["service.user"] = $app->share(function($app) {
    return new UserService($app["repository.cachedUser"]);
});


Or, hey, another requirement might come along at some point... dunno what it might me... an audit process separate to the logging process or something. Then it's just a matter of creating a new AuditedUserRepository, and all it has to do is audit stuff, then use its dependent UserRepositoryInterface object to do the actual repository stuff (via a logging and caching tier, completely unbeknownst to it).

It also goes without saying (hopefully) that the testing of all this can be far more focused. Testing of each component is completely discrete from the the other components, and they don't need to worry about one another. They're irrelevant. It keeps the code simple and focused, and on-point.

My take-away from this is that if I start piling code or more dependencies into an existing class, I need to stop and think about whether that code is actually anything to do with what the name of the class suggests it should be busying itself with, and if not... I'm perhaps putting the code in the wrong place.

Righto.

--
Adam