Showing posts with label Mingo Hagen. Show all posts

Sunday, 11 July 2021

Another thought on controllers and where the buck should stop

G'day:

I just wrote an article on what a controller ought to limit itself to: "What logic should be in a controller? (and a wee bit of testing commentary)". But then I read an old question from Mingo on the article that inspired that, and I have some thoughts on that too.

In most of my code examples around controller methods, I have this sort of method signature:

function handleGet(rawArgs)

What I mean by "raw args" is "all the stuff from an HTTP request", including: query string parameters and arguments (URL scope if yer a CFMLer), request body keys and values (form scope), general request metadata (CGI scope), cookies, and headers. Kind of like how Symfony would do it in the PHP world.

In CFML land things are seldom (never?) this organised. One seems to get a hotchpotch of things possibly put into a request context or something, or possibly - as in CFWheels - actually nothing(!!!) gets passed into the controller method; you just need to know the magical place to go look for them. And by "them" I mean a struct that has the form and URL scope munged into it. Ugh. Anyhow, there's all these elements of an HTTP request and the application lifecycle (application, session, request scopes) available to yer CFML code somehow.

And the controller is the only place one should ever access those. Your business logic should never be tightly coupled to the notion of "this stuff came from a specific sort of HTTP request", or from a specific stage in the application's lifecycle. Or even be aware of the ideas of HTTP requests and the like.

Your model tier should just get "values". Ideally by the time you are applying any actual application logic to them, the values will have been modelled into their own objects, not simply a bunch of primitive values.

I guess one could consider a controller to be similar in role to a repository class, just the other way around. A repository encapsulates the mapping between [for example] storage records - which it fetches - and collections of objects - which it returns - to the business-logic tier. A controller is slightly skinnier than that, it takes the values needed from that selection of HTTP request components (URL scope, form scope, what-have-you), and just passes them as independent values - distinct from how they arrived - to the model. I guess it's not a direct parallel because the controller doesn't do the "mapping" part that is intrinsic to a repository; it just removes the context from the values it receives, and leaves it up to the model tier to know what to do with them. But anyhow, there's a clear separation of concerns here, and the separation is that only the controller should ever deal with these things: CGI scope, the value returned from GetHttpRequestData(), form scope, URL scope, cookie scope, application scope, session scope, request scope.
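To make that separation concrete, here's a minimal sketch. It's in Python rather than CFML just to keep it neutral, and every name in it (RawRequest, find_articles, handle_get, the authorId and q keys) is made up for illustration. The controller is the only code that knows a value came from the URL or the form; the model function just receives plain values.

```python
from dataclasses import dataclass, field


@dataclass
class RawRequest:
    """Hypothetical stand-in for everything an HTTP request carries."""
    url: dict = field(default_factory=dict)      # query string values
    form: dict = field(default_factory=dict)     # request body values
    cookies: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)


def find_articles(author_id, search_term):
    """Model tier: receives plain values and knows nothing about HTTP."""
    return f"articles by author {author_id} matching '{search_term}'"


def handle_get(raw_request):
    """Controller: the only place that knows where each value arrived from."""
    author_id = raw_request.url.get("authorId")
    search_term = raw_request.form.get("q", "")
    return find_articles(author_id, search_term)


request = RawRequest(url={"authorId": "17"}, form={"q": "controllers"})
result = handle_get(request)
```

Swapping the request for a CLI call or a queue message would only change handle_get; find_articles wouldn't need to know.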

Righto.

--
Adam

What logic should be in a controller? (and a wee bit of testing commentary)

G'day:

This topic has come up for me twice from different directions in the last week or so, so I'm gonna dump some thoughts. I've actually discussed this before in "I actively consider what I think ought to go into a controller", and the conclusion I came to doesn't quite fit with how I'm writing code now so I'm gonna revise it a bit.

To start with though, I'll reiterate this question: "should this go in the controller?", and I'll repeat Mingo's answer that is still pithy but spot on:

Isn't the answer always "No"?
~ mjhagen 2017

This is a good starting point. 95% of the time if yer asking yerself that question, Mingo has answered it for you there.

The example of what I'd put in a controller from that article is along these lines:

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        content = contentService.getById(args.id)
        
        response = {
            articles = content.published,
            socialContent = {
                twitter = content.twitter,
                facebook = content.facebook
            }
        }
        return new Response(response)
    }
}

It's not the worst code I've written, but I now think this is wrong. The problem lies with how I had a habit of abusing the Anaemic Domain Model pattern in the past, where I had a bunch of really skinny service classes, and used them to apply behaviour to behaviourless model objects that were just bags of data. Not great.

Looking at that code now, I see these lines:

args = validationService.validate(rawArgs)
content = contentService.getById(args.id)

And I think "nah, those don't belong there. They belong in the model too". I'm doing too much "asking" when I should be "telling" (see "TellDontAsk" by Martin Fowler).

Basically the model here should know what it is to be "valid", so just give it the raw data and let it crack on with it.

My generic controller method these days would be formed along these lines:

function handleRequest(rawRequestValues) {
    try {
        dataFromModel = someModel.getSomeDataFromThisLot(rawRequestValues)
        
        renderedResponse = viewService.renderView("someView", dataFromModel)
        
        return new HtmlResponse(renderedResponse)
        
    } catch (ClientException e) {
        return new ClientErrorResponse(e)
    }
}

Here we clearly have a separation of controller, model and view. It's the controller's job to marshal getting values to a model, and getting the values from that to a view, deal with any error responses that might arise due to those two, or return what came back from the view tier as the response. That's it.

There's an assumption that the framework will deal with any unhandled exceptions there as a controlled 5xx type response. Also there could well be more catch statements, if different types of exception could bubble out of the model, for instance a ValidationException which returns details of validation failures in its response; or a 404 response being returned if a UserNotFoundException came back from some business-logic validation or whatever. But that's the pattern.

The key here is that the only time I'm using a value created by the model is to pass it to the view. I do not pass it to anything else in the interim, like some other model call. That action is not controller logic. It's business logic that we get an x and then pass it to a y. It should be encapsulated in the model.

On the other hand, if there was more than one piece of view data to be derived directly from the incoming request values, then to me that could still legitimately be in the controller, eg this is OK:

dataFromModel = someModel.getSomeDataFromThisLot(rawRequestValues)
moreDataFromDifferentModel = someModelOther.getSomeDifferentDataFromThisLot(rawRequestValues)

This would not be OK:

dataFromModel = someModel.getSomeDataFromThisLot(rawRequestValues)
moreDataFromDifferentModel = someModelOther.getSomeDifferentDataFromThisLot(dataFromModel.someValue)

It's a small distinction. But the thing to focus on more than that small example is just to be thinking "no" when you ask yerself "does this belong in the controller?". You're more likely to be right than wrong.


How do we apply that pattern to the example in the old article? Like this I think:

// UserContentController.cfc
component {

    function init(ViewService viewService, UserContentFactory userContentFactory) {
        variables.viewService = arguments.viewService
        variables.userContentFactory = arguments.userContentFactory
    }

    function getContent(rawArgs) {
        try {
            userContent = userContentFactory.getUserContent().loadContentByFilters(rawArgs)
            
            renderedResponse = viewService.renderView("userContentView", userContent)
            
            return new HtmlResponse(renderedResponse)
            
        } catch (ValidationException e) {
            return new ClientErrorResponse(400, e)
        } catch (UserNotFoundException e) {
            return new ClientErrorResponse(404, e)
        }
    }
}

(I've changed what the controller is returning so as to still integrate the view tier into the example).

I've done away with the controller handling the validation itself, and left that to the model. If things don't pan out: the model will let the controller know. That's its job. And it's just the controller's job to do something about it. Note that in this case I don't really need both catches. I could just group the exceptions into one ClientException, probably. But I wanted to demonstrate two potential failures from the logic in loadContentByFilters.


What's with this factory I'm using? It's just one of my idiosyncrasies. I like my models' constructors to take actual valid property values, like this:

// UserContent.cfc
component accessors=true invokeImplicitAccessor=true {

    property publishedContent;
    property twitterContent;
    property facebookContent;

    function init(publishedContent, twitterContent, facebookContent) {
        variables.publishedContent = arguments.publishedContent
        variables.twitterContent = arguments.twitterContent
        variables.facebookContent = arguments.facebookContent
    }
}

Our UserContent represents the data that are those content items. However we've not been given the content items, we've just been given a means to get them. So we can't just create a new object in our controller and slap the incoming values into it. We need to have another method on the UserContent model that works with what the controller can pass it:

function loadContentByFilters(required struct filters) {
    validFilters = validationService.validate(filters, getValidationRules()) // @throws ValidationException
    
    user = userFactory.getById(validFilters.id) // @throws UserNotFoundException
    
    variables.publishedContent = contentService.getUserContent(validFilters)
    variables.twitterContent = twitterService.getUserContent(validFilters)
    variables.facebookContent = facebookService.getUserContent(validFilters)

    return this // the controller assigns the result of this call, so hand the loaded object back
}

And this demonstrates that to do that work, UserContent needs a bunch of dependencies.

I'm not going to pass these in the constructor because they aren't 100% needed for the operation of a UserContent object, and I want the constructor focusing on its data. So instead these need to be injected as properties:

// UserContent.cfc
component accessors=true invokeImplicitAccessor=true {

    property publishedContent;
    property twitterContent;
    property facebookContent;

    function init(publishedContent, twitterContent, facebookContent) {
        variables.publishedContent = arguments.publishedContent
        variables.twitterContent = arguments.twitterContent
        variables.facebookContent = arguments.facebookContent
    }
    
    function setValidationService(ValidationService validationService) {
        variables.validationService = arguments.validationService
    }
    
    function setUserFactory(UserFactory userFactory) {
        variables.userFactory = arguments.userFactory
    }
    
    function setContentService(UserContentService contentService) {
        variables.contentService = arguments.contentService
    }
    
    function setTwitterService(TwitterService twitterService) {
        variables.twitterService = arguments.twitterService
    }
    
    function setFacebookService(FacebookService facebookService) {
        variables.facebookService = arguments.facebookService
    }
}

That's all a bit of a mouthful every time we want a UserContent object that needs to use alternative loading methods to get its data, so we hide all that away in our dependency injection set-up, and use a factory to create the object, set its properties, and then return the object:

// UserContentFactory.cfc
component {

    function init(
        ValidationService validationService,
        UserFactory userFactory,
        UserContentService contentService,
        TwitterService twitterService,
        FacebookService facebookService
    ) {
        variables.validationService = arguments.validationService
        variables.userFactory = arguments.userFactory
        variables.contentService = arguments.contentService
        variables.twitterService = arguments.twitterService
        variables.facebookService = arguments.facebookService
    }

    function getUserContent() {
        userContent = new UserContent()
        userContent.setValidationService(validationService)
        userContent.setUserFactory(userFactory)
        userContent.setContentService(contentService)
        userContent.setTwitterService(twitterService)
        userContent.setFacebookService(facebookService)
        
        return userContent
    }
}

The controller just needs to be able to ask the factory for a UserContent object, and then call the method it needs, passing its raw values:

userContent = userContentFactory.getUserContent().loadContentByFilters(rawArgs)

You'll notice I kept the validation separate from the UserContent model:

function loadContentByFilters(required struct filters) {
    validFilters = validationService.validate(filters, getValidationRules()) // @throws ValidationException

(And then there's also this private method with the rules):

private function getValidationRules() {
    return {
        id = [
            {required = true},
            {type = "integer"}
        ],
        startDate = [
            {required = true},
            {type = "date"},
            {
                range = {
                    max = now()
                }
            }
        ],
        endDate = [
            {required = true},
            {type = "date"},
            {
                range = {
                    max = now()
                }
            }
        ],
        collection = [
            {callback = (collection) => collection.startDate.compare(collection.endDate) < 0}
        ]
    }
}

Validation is fiddly and needs to be accurate, so I don't believe how to validate some values is the job of the UserContent class. I believe it's just perhaps its job to know "what it is to be valid". Hence that separation of concerns. I could see a case for that private method to be its own class, eg UserContentValidationRules or something. But for here, just a private method is OK. Wherever those rules are homed, and whatever the syntax of defining them is, we then pass those and the data to be validated to a specialist validation service that does the business. In this example the validation service itself throws an exception if the validation fails. In reality it'd more likely return a collection of rules violations, and it'd be up to the model making the call to throw the exception. That's implementation detail not so relevant to the code here.
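Here's a rough sketch of that split, in Python rather than CFML. The validate function, the shape of the rules and the ValidationException are all assumptions made for the example, not a real library. The model only declares what it is to be valid; a generic service knows how to check it, and (as in the example above) it is the service that throws.

```python
class ValidationException(Exception):
    def __init__(self, violations):
        super().__init__("; ".join(violations))
        self.violations = violations


def validate(values, rules):
    """Generic validation service: knows how to check rules, not which rules apply."""
    violations = []
    for key, checks in rules.items():
        for check in checks:
            if check.get("required") and key not in values:
                violations.append(f"{key} is required")
            expected_type = check.get("type")
            if expected_type and key in values and not isinstance(values[key], expected_type):
                violations.append(f"{key} must be of type {expected_type.__name__}")
    if violations:
        raise ValidationException(violations)
    # Only hand back the values the rules know about
    return {key: values[key] for key in rules if key in values}


class UserContent:
    """Model: knows what it is to be valid, and delegates the checking."""

    @staticmethod
    def validation_rules():
        return {"id": [{"required": True}, {"type": int}]}

    def load_content_by_filters(self, filters):
        self.valid_filters = validate(filters, self.validation_rules())
        return self


loaded = UserContent().load_content_by_filters({"id": 17, "ignored": "x"})

try:
    UserContent().load_content_by_filters({"id": "seventeen"})
    violations = []
except ValidationException as e:
    violations = e.violations
```

Moving the rules into their own class later would only touch validation_rules; the service and the controller would not change.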


There's probably more off-piste code in this than on-~, but I think it shows how to keep yer domain / business logic out of your controllers, which should be very very light, and simply marshal the incoming request values to the places that need them to be able to come up with a response. That's all a controller ought to do.


Oh before I go. There's an attitude from some testing quarters that one doesn't test one's controllers. I don't actually agree with that, but even if I did: that whole notion is predicated on controllers being very very simple, like I show above. If you pile all (or any of ~) yer logic into yer controller methods: you do actually need to test them! Even in this case I'd still be testing the flow control around the try/catch stuff. If I didn't have that, I'd probably almost be OK if someone didn't test it. Almost.
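For what it's worth, testing that flow control needn't be much work. This is a sketch only, in Python with unittest.mock, using a cut-down stand-in for the controller above; the tuple return value and the helper name are invented for the example. The model is mocked to either behave or throw, and all we check is which response comes back.

```python
from unittest.mock import Mock


class ValidationException(Exception):
    pass


class UserNotFoundException(Exception):
    pass


class UserContentController:
    def __init__(self, view_service, user_content_factory):
        self.view_service = view_service
        self.user_content_factory = user_content_factory

    def get_content(self, raw_args):
        try:
            user_content = self.user_content_factory.get_user_content()
            user_content.load_content_by_filters(raw_args)
            body = self.view_service.render_view("userContentView", user_content)
            return (200, body)
        except ValidationException as e:
            return (400, str(e))
        except UserNotFoundException as e:
            return (404, str(e))


def controller_with_model_raising(exception=None):
    """Build a controller whose model either loads fine or raises the given exception."""
    model = Mock()
    if exception is not None:
        model.load_content_by_filters.side_effect = exception
    factory = Mock()
    factory.get_user_content.return_value = model
    view_service = Mock()
    view_service.render_view.return_value = "<p>content</p>"
    return UserContentController(view_service, factory)


ok = controller_with_model_raising().get_content({"id": 1})
invalid = controller_with_model_raising(ValidationException("bad id")).get_content({"id": "x"})
missing = controller_with_model_raising(UserNotFoundException("no user 99")).get_content({"id": 99})
```

There are three paths through the method, so three cases. If the controller had no try/catch, there would be very little left here to assert on.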

Righto.

--
Adam

Thursday, 8 April 2021

TDD and external services

G'day:

You might have noticed I spend a bit of my time encouraging people to use TDD, or at the very least making sure yer code is tested somehow. But use TDD ;-)

As an interesting aside, I recently failed a technical interview because the interviewer didn't feel I was strong enough at the testing side of things. Given what I see around the industry… that seems to be a moderately high bar yer setting for yerselves there, peeps. Or perhaps I'm just shit at articulating myself. Hrm. But anyway.

OK, so I rattled out a quick article a few days ago - "TDD & professionalism: a brief follow-up to Thoughts on Working Code podcast's Testing episode" - which revisits some existing ground and by-and-large is not relevant to what I'm going to say here, other than the "TDD & professionalism" being why I bang on about it so much. And you might think I bang on about it here, but I also bang on about it at work (when I have work I mean), and in my background conversations too. I try to limit it to only my technical associates, that said.

Right so Mingo hit me up in a comment on that article, asking this question:

Something I ran into was needing to access the external API for the tests and I understand that one usually uses mocking for that, right? But, my question is then: how do you then **know** that you're actually calling the API correctly? Should I build the error handling they have in their API into my mocked up API as well (so I can test my handling of invalid inputs)? This feels like way too much work. I chose to just call the API and use a test account on there, which has it's own issues, because that test account could be setup differently than the multiple different live ones we have. I guess I should just verify my side of things, it's just that it's nice when it's testing everything together.

Yep, good question. With new code, my approach to the TDD is based on the public interface doing what's been asked of it. One can see me working through this process in my earlier article "Symfony & TDD: adding endpoints to provide data for front-end workshop / registration requirements". Here I'm building a web service end point - by definition the public interface to some code - and I am always hitting the controller (via the routing). And whatever I start testing, I just "fake it until I make it". My first test case here is "It needs to return a 200-OK status for GET requests on the /workshops endpoint", and the test is this:

/**
 * @testdox it needs to return a 200-OK status for successful GET requests
 * @covers \adamCameron\fullStackExercise\Controller\WorkshopsController
 */
public function testDoGetReturns200()
{
    $this->client->request('GET', '/workshops/');

    $this->assertEquals(Response::HTTP_OK, $this->client->getResponse()->getStatusCode());
}

To get this to pass, the first iteration of the implementation code is just this:

public function doGet() : JsonResponse
{
    return new JsonResponse(null);
}

The next case is "It returns a collection of workshop objects, as JSON", implemented thus:

/**
 * @testdox it returns a collection of workshop objects, as JSON
 * @covers \adamCameron\fullStackExercise\Controller\WorkshopsController
 */
public function testDoGetReturnsJson()
{
    $workshops = [
        new Workshop(1, 'Workshop 1'),
        new Workshop(2, 'Workshop 2')
    ];

    $this->client->request('GET', '/workshops/');

    $resultJson = $this->client->getResponse()->getContent();
    $result = json_decode($resultJson, false);

    $this->assertCount(count($workshops), $result);
    array_walk($result, function ($workshopValues, $i) use ($workshops) {
        $workshop = new Workshop($workshopValues->id, $workshopValues->name);
        $this->assertEquals($workshops[$i], $workshop);
    });
}

And the code to make it work shows I've pushed the mocking one level back into the application:

class WorkshopsController extends AbstractController
{

    private WorkshopCollection $workshops;

    public function __construct(WorkshopCollection $workshops)
    {
        $this->workshops = $workshops;
    }

    public function doGet() : JsonResponse
    {
        $this->workshops->loadAll();

        return new JsonResponse($this->workshops);
    }
}

class WorkshopCollection implements \JsonSerializable
{
    /** @var Workshop[] */
    private $workshops;

    public function loadAll()
    {
        $this->workshops = [
            new Workshop(1, 'Workshop 1'),
            new Workshop(2, 'Workshop 2')
        ];
    }

    public function jsonSerialize()
    {
        return $this->workshops;
    }
}

(I've skipped a step here… the first iteration could/should be to mock the data right there in the controller, and then refactor it into the model, but this isn't about refactoring, it's about mocking).

From here I refactor further, so that instead of having the data itself in loadAll, the WorkshopCollection calls a repository, and the repository calls a DAO, which for now ends up being:

class WorkshopsDAO
{
    public function selectAll() : array
    {
        return [
            ['id' => 1, 'name' => 'Workshop 1'],
            ['id' => 2, 'name' => 'Workshop 2']
        ];
    }
}

The next step is where Mingo's question comes in. The next refactor is to swap out the mocked data for a DB call. We'll end up with this:

class WorkshopsDAO
{
    private Connection $connection;

    public function __construct(Connection $connection)
    {
        $this->connection = $connection;
    }

    public function selectAll() : array
    {
        $sql = "
            SELECT
                id, name
            FROM
                workshops
            ORDER BY
                id ASC
        ";
        $statement = $this->connection->executeQuery($sql);

        return $statement->fetchAllAssociative();
    }
}

But wait. If we do that, our unit tests will be hitting the DB. Which we are not gonna do. We've run out of things to directly mock as we're at the lower-boundary of our application, and the connection object is "someone else's code" (Doctrine/DBAL in this case). We can't mock that, but fortunately this is why I have the DAO tier. It acts as the frontier between our app and the external service provider, and we still mock that:

public function testDoGetReturnsJson()
{
    $workshopDbValues = [
        ['id' => 1, 'name' => 'Workshop 1'],
        ['id' => 2, 'name' => 'Workshop 2']
    ];

    $this->mockWorkshopDaoInServiceContainer($workshopDbValues);

    // ... unchanged ...

    array_walk($result, function ($workshopValues, $i) use ($workshopDbValues) {
        $this->assertEquals($workshopDbValues[$i], $workshopValues);
    });
}

private function mockWorkshopDaoInServiceContainer($returnValue = []): void
{
    $mockedDao = $this->createMock(WorkshopsDAO::class);
    $mockedDao->method('selectAll')->willReturn($returnValue);

    $container = $this->client->getContainer();
    $workshopRepository = $container->get('test.WorkshopsRepository');

    $reflection = new \ReflectionClass($workshopRepository);
    $property = $reflection->getProperty('dao');
    $property->setAccessible(true);
    $property->setValue($workshopRepository, $mockedDao);
}

We just use a mocking library (baked into PHPUnit in this case) to create a runtime mock, and we put that into our repository.

The tests pass, the DB is left alone, and the code is "complete" so we can push it to production perhaps. But we are not - as Mingo observed - actually testing that what we are asking the DB to do is being done. Because all our tests mock the DB part of things out.

The solution is easy, but it's not done via a unit test. It's done via an integration test (or end-to-end test, or acceptance test or whatever you wanna call it), which hits the real endpoint which queries the real database, and gets the real data. Adjacent to that in the test we hit the DB directly to fetch the records we're expecting, and then we check that the JSON the end point returns represents the same data we manually fetched from the DB. This tests the SQL statement in the DAO, that the data fetched models OK in the repo, and that the model (WorkshopCollection here) applies whatever business logic is necessary to the data from the repo before passing it back to the controller to return with the response, which was requested via the external URL. IE: it tests end-to-end.

public function testDoGetExternally()
{
    $client = new Client([
        'base_uri' => 'http://fullstackexercise.backend/'
    ]);

    $response = $client->get('workshops/');
    $this->assertEquals(Response::HTTP_OK, $response->getStatusCode());
    $workshops = json_decode($response->getBody(), false);

    /** @var Connection */
    $connection = static::$container->get('database_connection');
    $expectedRecords = $connection->executeQuery("SELECT id, name FROM workshops ORDER BY id ASC")->fetchAllAssociative();

    $this->assertCount(count($expectedRecords), $workshops);
    array_walk($expectedRecords, function ($record, $i) use ($workshops) {
        $this->assertEquals($record['id'], $workshops[$i]->id);
        $this->assertSame($record['name'], $workshops[$i]->name);
    });
}

Note that although I'm saying "it's not a unit test, it's an integration test", I'm still implementing it via PHPUnit. The testing framework should just provide testing functionality: it should not dictate what kind of testing you implement with it. And similarly not all tests written with PHPUnit are unit tests. They are xUnit style tests, eg: in a class called SomethingTest, and the methods are prefixed with test and use assertion methods to implement the test constraints.

Also: why don't I just use end-to-end tests then? They seem more reliable? Yep they are. However they are also more fiddly to write as they have more set-up / tear-down overhead, so they take longer to write. Also they generally take longer to run, and given TDD is supposed to be a very quick cadence of test / run / code / run / refactor / run, the less overhead the better. The slower your tests are, the more likely you are to switch to writing code and testing later once you need to clear your head. In the mean time your code design has gone out the window. Also unit tests are more focused - addressing only a small part of the codebase overall - and that has merit in itself. Also I used a really really trivial example here, but some end-to-end tests are really very tricky to write, given the overall complexity of the functionality being tested. I've been in the lucky place that at my last gig we had a dedicated QA development team, and they wrote the end-to-end tests for us, but this also meant that those tests were executed after the dev considered the tasks "code complete", and QA ran the tests to verify this. There is no definitive way of doing this stuff, that said.

To round this out, I'm gonna digress into another chat I had with Mingo yesterday:

Normally I'd say this:

Unit tests
Test logic of one small part of the code (maybe a public method in one class). If you were doing TDD and about to add a condition into your logic, you'd write a unit test to cover the new expectations that the condition brings to the mix.
Functional tests
These are a subset of unit tests which might test a broader section of the application, eg from the public frontier of the application (so like an endpoint) down to where the code leaves the system (to a logger, or a DB, or whatever). The difference between unit tests and functional tests - to me - is just how distributed the logic being tested is throughout the system.
Integration tests
Test that the external connections all work fine. So if you use the app's DB configuration, the correct database is usable. I'd personally consider a test an integration test if it only focused on a single integration.
Acceptance tests (or end-to-end tests)
Are to integration tests what functional tests are to unit tests: a broader subset. That test above is an end-to-end test, it tests the web server, the application and the DB.

And yes I know the usages of these terms vary a bit.
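As a small illustration of the difference between the first and third of those (a sketch only, in Python, with an in-memory SQLite database standing in for the real DB and a made-up workshop_names function standing in for the application logic): the same logic is exercised once with the DAO mocked out, and once with a real DAO running real SQL.

```python
import sqlite3
from unittest.mock import Mock


class WorkshopsDAO:
    def __init__(self, connection):
        self.connection = connection

    def select_all(self):
        rows = self.connection.execute("SELECT id, name FROM workshops ORDER BY id ASC")
        return [{"id": row[0], "name": row[1]} for row in rows]


def workshop_names(dao):
    """Stand-in for the application logic that sits above the DAO."""
    return [record["name"] for record in dao.select_all()]


# Unit-style test: the DAO is mocked, so no database is touched and no SQL is run
mocked_dao = Mock()
mocked_dao.select_all.return_value = [{"id": 1, "name": "Workshop 1"}]
unit_result = workshop_names(mocked_dao)

# Integration-style test: a real DAO runs its real SQL against a real (in-memory) database
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE workshops (id INTEGER PRIMARY KEY, name TEXT)")
connection.executemany(
    "INSERT INTO workshops (id, name) VALUES (?, ?)",
    [(2, "Workshop 2"), (1, "Workshop 1")],
)
integration_result = workshop_names(WorkshopsDAO(connection))
```

Only the second case would notice a typo in the SQL or a wrong ORDER BY; the first is quicker and tells you about workshop_names and nothing else.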

Furthermore, considering the distinction between BDD and TDD:

  • The BDD part is the nicely-worded case labels, which in theory (but seldom in practice, I find) are written in direct collaboration with the client user.
  • The TDD part is when in the design-phase they are created: with TDD it's before the implementation is written; I am not sure whether in BDD it matters or is stipulated.
  • But both of them are design / development strategies, not testing strategies.
  • The tests can be implemented as any sort of test, not specifically unit tests or functional tests or end-to-end tests. The point is the test defines the design of the piece of code being written: it codifies the expectations of the behaviour of the code.
  • BDD and TDD tests are generally implemented via some unit testing framework, be it xUnit (testMyMethodDoesSomethingRight), or Jasmine-esque (it("does something right", function (){})).

One can also do testing that is not TDD or BDD, but it's a less than ideal way of going about things, and I would imagine results in subpar tests, fragmented test coverage, and tests that don't really help understand the application, so are harder to maintain in a meaningful way. But they are still better than no tests at all.

When I am designing my code, I use TDD, and I consider my test cases in a BDD-ish fashion (except I do it on the client's behalf generally, and sadly), and I use PHPUnit (xUnit) to do so on PHP, and Mocha (Jasmine-esque) to do so on JavaScript.

Hopefully that clarifies some things for people. Or people will leap at me and tell me where I'm wrong, and I can learn the error of my ways.

Righto.

--
Adam

Friday, 23 October 2020

ColdFusion: looking at an issue Mingo had with ehcache and cachePut and cacheGet

G'day:

Bloody ColdFusion. OK so why am I writing about a ColdFusion issue? Well about 80% of it is "not having a great deal else to do today", about 10% is being interested in this issue Mingo found, and 10% is it being an excuse to mess around with Docker a bit. I am currently - and slowly - teaching myself about Docker, so there's some practice for me in getting a ColdFusion instance up and running on this PC (which I no longer have any type of CFML engine installed on).

OK so what's the issue. Mingo posted this on Twitter:


Just in case you wanna run that code, here it is for copy and paste:
foo = { bar = 1 };
cachePut( 'foobar', foo );

foo.bar = 2;

writeDump( cacheGet( 'foobar' ) );


Obviously (?) what one would expect here is {bar:1}. What gets put into cache would be a copy of the struct right?

Well... um... here goes...

/opt/coldfusion/cfusion/bin $ ./cf.sh
ColdFusion started in interactive mode. Type 'q' to quit.
cf-cli>foo = { bar = 1 };
struct
BAR: 1

cf-cli>cachePut( 'foobar', foo );
cf-cli>foo.bar = 2;
2
cf-cli>writeDump( cacheGet( 'foobar' ) );
struct
BAR: 2

cf-cli>

... errr... what?

It looks like ColdFusion is like only putting a reference to the struct into cache. So any code changing the data in the struct is changing it in CFML as well as changing it in cache. This does not seem right.

I did a quick google and found a ticket in Adobe's system about this: CF-3989480 - cacheGet returns values by reference. The important bit to note is that it's closed with


Not a great explanation from Adobe there; fortunately Rob Bilson had commented further up with a link to a cached version of an old blog article of his, explaining what's going on: "Issue with Ehcache and ColdFusion Query Objects". It's a slightly different situation, but it's the same underlying "issue". Just to copy and paste the relevant bit from his article:

Update: It looks like this is actually expected behavior in Ehcache. Unfortunately, it's not documented in the ColdFusion documentation anywhere, but Ehcache actually has two configurable parameters (as of v. 2.10) called copyOnRead and copyOnWrite that determine whether values returned from the cache are by reference or copies of the original values. By default, items are returned by reference. Unfortunately we can't take advantage of these parameters right now as CF 9.0.1 implements Ehcache 2.0.
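To show what those two settings change in principle, here's a toy cache in Python. It is not Ehcache and doesn't claim to mirror its internals: ToyCache and its two flags are invented purely to illustrate "by reference" versus "copy", replaying Mingo's example.

```python
import copy


class ToyCache:
    """Toy in-memory cache (NOT Ehcache): illustrates reference vs copy semantics only."""

    def __init__(self, copy_on_write=False, copy_on_read=False):
        self.copy_on_write = copy_on_write
        self.copy_on_read = copy_on_read
        self._store = {}

    def put(self, key, value):
        # With copy-on-write the cache keeps its own copy, decoupled from the caller's object
        self._store[key] = copy.deepcopy(value) if self.copy_on_write else value

    def get(self, key):
        # With copy-on-read the caller gets a copy, so changing it can't alter the cached value
        value = self._store[key]
        return copy.deepcopy(value) if self.copy_on_read else value


foo = {"bar": 1}

by_reference = ToyCache()
by_reference.put("foobar", foo)

copied = ToyCache(copy_on_write=True)
copied.put("foobar", foo)

foo["bar"] = 2  # mutate the original after caching it
```

After that last line, by_reference.get("foobar") sees the change (bar is 2), whereas copied.get("foobar") still has bar as 1.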


I decided to have a look what we could do about this on ColdFusion 2018, hoping that its embedded implementation of Ehcache has been updated since Rob wrote that in 2010.

Firstly I checked the Ehcache docs for these two settings: copyOnWrite and copyOnRead. This is straightforward (from "copyOnRead and copyOnWrite cache configuration"):


<cache name="copyCache"
    maxEntriesLocalHeap="10"
    eternal="false"
    timeToIdleSeconds="5"
    timeToLiveSeconds="10"
    copyOnRead="true"
    copyOnWrite="true">
  <persistence strategy="none"/>
  <copyStrategy class="com.company.ehcache.MyCopyStrategy"/>
</cache>


The docs also confirm these are off by default.

Next: where's the file?

/opt/coldfusion $ find . -name ehcache.xml
./cfusion/lib/ehcache.xml
/opt/coldfusion $

Cool. BTW I just guessed at that file name.

So in there we have this (way down at line 471):


<!--
Mandatory Default Cache configuration. These settings will be applied to caches
created programmtically using CacheManager.add(String cacheName).

The defaultCache has an implicit name "default" which is a reserved cache name.
-->
<defaultCache
    maxElementsInMemory="10000"
    eternal="false"
    timeToIdleSeconds="86400"
    timeToLiveSeconds="86400"
    overflowToDisk="false"
    diskSpoolBufferSizeMB="30"
    maxElementsOnDisk="10000000"
    diskPersistent="false"
    diskExpiryThreadIntervalSeconds="3600"
    memoryStoreEvictionPolicy="LRU"
    clearOnFlush="true"
    statistics="true"
/>


That looked promising, so I updated it to use copyOnWrite:



<defaultCache
    maxElementsInMemory="10000"
    eternal="false"
    timeToIdleSeconds="86400"
    timeToLiveSeconds="86400"
    overflowToDisk="false"
    diskSpoolBufferSizeMB="30"
    maxElementsOnDisk="10000000"
    diskPersistent="false"
    diskExpiryThreadIntervalSeconds="3600"
    memoryStoreEvictionPolicy="LRU"
    clearOnFlush="true"
    statistics="true"
    copyOnWrite="true"
/>


Whatever I put into cache, I want it to be decoupled from the code immediately, hence doing the copy-on-write.

I restarted CF and ran the code again:

/opt/coldfusion/cfusion/bin $ ./cf.sh
ColdFusion started in interactive mode. Type 'q' to quit.
cf-cli>foo = { bar = 1 };
struct

BAR: 1

cf-cli>cachePut( 'foobar', foo );
cf-cli>foo.bar = 2;
2

cf-cli>writeDump( cacheGet( 'foobar' ) );

struct
BAR: 1

cf-cli>


Yay! We are getting the "expected" result now: 1
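If editing ehcache.xml is not an option (shared hosting, say), roughly the same decoupling can be had in code by copying values at the cache boundary. This is only a sketch using CFML's built-in duplicate() function, which makes a deep copy of a struct; it reuses the 'foobar' key from the example above, and the variable name cached is just for illustration. Copying on the way out as well is my assumption about what one would usually want, given the ticket is about cacheGet returning values by reference.

```cfml
// Copy on the way in, so later changes to foo do not reach the cached value.
foo = { bar = 1 };
cachePut( 'foobar', duplicate( foo ) );

foo.bar = 2;

// Copy on the way out too, so changes to the result do not reach the cache.
cached = duplicate( cacheGet( 'foobar' ) );
cached.bar = 3;

// The cached struct is untouched by either change: bar is still 1.
writeDump( cacheGet( 'foobar' ) );
```

The trade-off is that every call site has to remember to do the copying, whereas the config change does it in one place.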

Don't really have much else to say about this. I'm mulling over writing down what I did to get ColdFusion 2018 up and running via Docker instead of installing it. Let's see if I can be arsed...

Righto.

-- 

Adam

Friday, 14 April 2017

I actively consider what I think ought to go into a controller

G'day:
This is one of these ones where I state the obvious (probably), and expose myself as a fraud in that it's taken me this long to reach this conclusion. I can only take solace in that I'm clearly in "good" company. Where "good" == "a lot".

Yesterday I was doing a code review, and came across a controller method which made one of my eyebrows raise as it was about 40 lines long. Now this isn't intrinsically bad in context. I have seen (and still see!) controller methods that are hundreds and hundreds of lines long. In the past I've seen controllers tipping the 1000-line mark. Those examples are clearly just wrong. And the developers who wrote them and/or continue to maintain them should be shot. I'm not even sure I'm over-stating it with that sentence. Hmmm... I probably am I guess. But they should be squarely and actively admonished, anyhow.

So in the scheme of things, 40 lines ain't bad. Especially when we try to keep our code wrapped within 80-100 chars, so a few of the statements were multi-line. There were, say, about a dozen statements in it altogether. Not so bad.

Before I go on - and not only because the dev concerned will be reading this article (once I send them the link... pretty sure they don't know this blog exists) - I hasten to add that this code is not by any means bad. I'm probably gonna go "close enough" on the code review, once one bit of the code is refactored at least into a private method. I am not having a go at the dev concerned here. They are just my muse for the day. And I happen to think they're pretty good at their job.

There was one bit of the controller that stood out to me and I went "nuh-uh, that's gotta go". There was a foreach loop just before the response was returned which performed a calculation. That's business logic and it has no place in a controller. Fortunately we have an appropriate service class which we can lift that out and re-home in, and slap some tests on it, and our code is a bit better and a bit more maintainable.

(NB: yeah... no tests... we have a policy of not testing controller methods, but that's with the caveat there's no logic in them. We followed the "don't test controller methods" bit, but not the "provided there's no logic in them" bit here).

However once I spot a problem, I immediately start looking for more, so I assessed the rest of the code there. The controller did seem too fat. I wanted to know if all the logic did actually belong in the controller, and my instinct was: probably not.

At the same time as writing this article I'm chatting on Slack about it, and me mate Mingo offered the perennial answer to the question "should this go in the controller?"

Isn't the answer always "No"?
~ mjhagen 2017

(yes, he cited the attribution too, so I'm including that ;-)

It's pleasingly pithy, and almost ubiquitously agreed with; but in a casually dismissive way that is seldom heeded, I think. The problem is it's alluding to a mindset one should adopt as a guideline, but clearly the answer isn't "no", so I think people dismiss it. If one says "one should never do x", and it's clear that "never" is slight hyperbole, people will acknowledge the hyperbole and stretch it back the other way as far as they see fit to accommodate their current whim.

With this in mind (retroactively... Mingo just mentioned that now, and the code review was yesterday, and this article was drafted in my head last night), I stopped to think about a rule of thumb as to when something ought to go in a controller.

Broadly speaking I feel there are only three types of logic that belong in a controller:

  1. a call to validate / sanitise all incoming arguments so they're fit for purpose for fulfilling the request;
  2. one (preferably) or more (and increasingly less preferably the more there are) calls to get data - using those incoming arguments - to build the response;
  3. a call to assemble that data for the response;
  4. returning the response.

Yeah that's 4. I was not counting the return statement. An example of this might be:

class UserController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        user = userService.get(args.id)
        publishedContent = contentService.getForId(args.id, args.startDate, args.endDate)
        twitterContent = twitterService.getForId(args.id, args.startDate, args.endDate)
        facebookContent = facebookService.getForId(args.id, args.startDate, args.endDate)
        
        response = {
            user = user,
            publishedContent = publishedContent,
            socialContent = {
                twitter = twitterContent,
                facebook = facebookContent
            }
        }
        return new Response(response)
    }
}

I think that's OK, even though there's four service calls. All the logic is busying itself with what it is to fulfil that request. The controller simply dictates what data needs returning, and gets that data. There is no business logic in there. It's all controller logic.

A pseudo-code example where we're adding business logic into this might be:

// ...
publishedContent = contentService.getForId(args.id, args.startDate, args.endDate)
topFiveArticles = publishedContent.slice(0, 5)

// ...

response = {
    // ...
    articles = topFiveArticles,
    // ...
}

Here the slice call - even though it's a native array method - is still business logic. If the controller method was getTopFiveArticles or something, I'd go back to considering it controller logic. So it's a fine line.

Back to the code I was reviewing, it was doing 1&2, then it was going off-piste with that foreach loop, then doing 3&4. Ignoring the foreach, I was still itchy, but wasn't seeing why. It seemed like it was roughly analogous to my first example above.

So what's the problem? Why was the code still making me furrow my brow?

Well there was just too much code. That set off a warning bell. I wanted to reduce the number of calls to various services (or repositories as is the case here: data from slightly different sources), but initially I couldn't really justify it. Then I looked closer.

The code in question was more like this (I've got rid of the date args to focus more):

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        user = userService.get(args.id)
        publishedContent = contentService.getForId(args.id)
        
        twitterContent = twitterService.getForId(user.twitterId)
        facebookContent = facebookService.getForId(user.facebookId)
        
        response = {
            articles = publishedContent,
            socialContent = {
                twitter = twitterContent,
                facebook = facebookContent
            }
        }
        return new Response(response)
    }
}


I hasten to add the code I was reviewing is nothing to do with users and content and social media stuff, this is just an example which is analogous but would make more sense to people not in the context of our business.

The difference is subtle. The difference is that the decision that the IDs for the social content are stored in the user object is business logic. It's not controller logic to "know" that's how that data is stored. The user object here is not part of the data that composes the response; it's part of some business logic that dictates how to get the data.

And this does indeed break that second rule:

  • a call to validate / sanitise all incoming arguments so they're fit for purpose for fulfilling the request;
  • [...] calls to get data - using those incoming arguments - to build the response;

The call to get the social media content is not using those arguments. It's using a value from a separate call:
user = userService.get(args.id)
// ...
twitterContent = twitterService.getForId(user.twitterId)
facebookContent = facebookService.getForId(user.facebookId)


Another tell-tale here is that user is not even part of the response. It's just used to get those "foreign keys".

My rule of thumb in those four points above is still correct. But one needs to pay special attention to the second point. Any time one's inclined to fetch data "indirectly", the indirection is business logic. It might not be a loop or some conditional or switch block or something more obvious that manipulates the data before it's returned, but that indirection is still business logic. And it does not belong in the controller.

So how should that controller look? Move the content acquisition out into a service that has the explicit purpose of getting a user's content - from wherever - via some key:

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        content = contentService.getById(args.id)
        
        response = {
            articles = content.published,
            socialContent = {
                twitter = content.twitter,
                facebook = content.facebook
            }
        }
        return new Response(response)
    }
}


This follows the second point to the letter, and there's now no business logic in the controller. Also the controller legitimately does not really need unit testing now, but the logic in the ContentService can be tested. Also the controller method is down to a size that would not raise any eyebrows.
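For completeness, here is a rough sketch of where that indirection might end up, in the same pseudo-code as the controller examples above. It is an illustration only: publishedContentRepository is a made-up name for wherever the published articles come from, and all four collaborators are assumed to be injected into the service somehow.

```text
class ContentService {
    // userService, publishedContentRepository, twitterService and facebookService
    // are assumed to be injected; publishedContentRepository is a hypothetical name
    function getById(id){
        // the "social IDs live on the user" knowledge now sits in the model tier
        user = userService.get(id)

        return {
            published = publishedContentRepository.getForId(id),
            twitter = twitterService.getForId(user.twitterId),
            facebook = facebookService.getForId(user.facebookId)
        }
    }
}
```

The method is small, but it is where the business logic lives, so it is the thing to put unit tests on: the lookup of the user, and the use of its twitterId and facebookId to fetch the social content.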

This is not an earth-shattering conclusion by any measure, but I'm quite pleased to fine-tune my position here, and hopefully it can help drive discussion about this in code reviews in future.

It'd be great to hear anyone else's thoughts on this?

--
Adam

Thursday, 28 January 2016

ColdFusion: help with clarifying if an issue exists with CF's ORM implementation

G'day:
If you know something about Hibernate and/or using ColdFusion's wrapper around same, perhaps you could cast your analytical eye over this, and help determine whether Mingo has - as it seems - found a bug in it, or whether Himavanth has some basis to what he's saying, and it's not just a case of him (willfully choosing to ~ ?) not understanding what Mingo is saying, and there's actually no problem. Can you guess which side I come down on?

Here's the detail:

4024472: ORM entities are incorrectly created when using single table inheritance with mappedSuperClass

Problem Description:

When you have a base component (base.cfc) with MappedSuperClass set to true, a
persisted component (animal.cfc) inheriting from that and another (dog.cfc)
inheriting from that one ColdFusion throws an exception (a Hibernate one).
To make this bug more fun, CF doesn't always throw this exception.

The reason why this happens becomes clear when you set
this.ormsettings.savemapping=true and take a look at the hbmxml files.

The XML in dog.hbmxml contains the fields from base.cfc:
<property name="name" type="string"><column name="name"/></property>


Steps to Reproduce:

Use the files included with this bug [I'll include 'em inline here].


// base.cfc
component mappedSuperClass="true"
{
  property fieldType="id" name="id" generator="increment";
  property fieldType="column" name="name";
}

// animal.cfc
component extends="base" persistent="true" table="animal" discriminatorColumn="type"
{
  property fieldType="column" name="isAlive" type="boolean" default=true;
}

// dog.cfc
component extends="animal" persistent="true" table="animal" discriminatorValue="dog"
{
  property fieldType="column" name="furColor" default="brown";
}

Actual Result:

An exception: Repeated column in mapping for entity: {entity-name} column: {column-name} (should be mapped with insert="false" update="false")

Expected Result:

Correctly generated hibernate entities.

Any Workarounds:

You can write your own .hbmxml files to work around the problem.

Further information


For clarity: the dog entity gets the following HBM XML generated:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping default-cascade="none" default-access="property" default-lazy="true" auto-import="true">
 <subclass discriminator-value="dog" entity-name="dog" extends="cfc:18502eb2916b8041cf1e.animal" lazy="true" name="cfc:18502eb2916b8041cf1e.dog" dynamic-update="false" dynamic-insert="false" select-before-update="false">
  <property name="name" type="string" unique="false" optimistic-lock="true" lazy="false" generated="never">
   <column name="name" />
  </property>
  <property name="furColor" type="string" unique="false" optimistic-lock="true" lazy="false" generated="never">
   <column name="furColor" />
  </property>
 </subclass>
</hibernate-mapping>


Notice the extra name property, that's not supposed to be there, because that should be inherited from the animal class.

My knowledge of ColdFusion's ORM is pretty theoretical as I've never used it beyond testing it, and my knowledge of Hibernate is scant (I've scanned a Hibernate book whilst testing the initial ColdFusion 9 implementation).

The most telling thing is that Lucee does what Mingo expects, and it works. That kinda suggests ColdFusion has got it wrong. It would seem too coincidental that Mingo's expectations are off, and it just so happens Lucee's implemented things in such a way that match Mingo's expectations. Occam's Razor 'n' all.

So perhaps you lot can eyeball this and the rest of the detail in Mingo's ticket, and offer your input?

Cheers.

--
Adam

Wednesday, 5 August 2015

ColdFusion: it's hard to maintain the CF docs when the language is so buggy

G'day:
I was baking an idea for an article about PHP 7's generator return values - which admittedly had got sidetracked in a thought experiment - but just as I pulled up a pew at the pub, someone said something about ColdFusion and scopes and wrong docs and I decided I need to look at it and fix the docs if they were wrong. FFS.

Here's a simple statement in the ColdFusion docs, regarding scopes ("About scopes"):

Evaluating unscoped variables

If you use a variable name without a scope prefix, ColdFusion checks the scopes in the following order to find the variable:
  1. Local (function-local, UDFs and CFCs only)
  2. Arguments
  3. Thread local (inside threads only)
  4. Query (not a true scope; variables in query loops)
  5. Thread
  6. Variables
  7. CGI
  8. Cffile
  9. URL
  10. Form
  11. Cookie
  12. Client

So that's all easy enough, and given it's a list of only a dozen items, one would think it would be easy enough to test to make sure the docs matched ColdFusion's actual behaviour. And that ColdFusion's behaviour is actually correct.