Monday 7 May 2018

In defence of public properties

G'day:
Recently I was re-writing one of our cron job scripts. It's the sort of thing that runs at 2am, reads some stuff from the DB, analyses it, and emails some crap to interested parties. We usually try to avoid complete rewrites, but this code was written... erm... [cough insert some period of time ago when it was apparently OK to write really shit code with no code quality checks and no tests]... and it's unmaintainable, so we're biting the bullet and rewriting coherently whilst adding some business logic changes into it.

The details for the email - to, from, subject, data to build the body from - come from three different sources in total, before sending to the mailer service to be queued-up for sending. So I knocked together a simple class to aggregate that information together:

class SpecialInterestingReportEmail {

    public $from;
    public $to;
    public $subject;
    public $body;

    public static $bodyTemplate = "specialInterestingReportEmail.html.twig";
    
    function __construct($from, $to, $subject, $body) {
        $this->from = $from;
        $this->to = $to;
        $this->subject = $subject;
        $this->body = $body;
    }
}

I then have a factory method in an EmailMessageService (not entirely sure that's the best name for it, but [shrug]) which chucks all the bits and pieces together, and returns an email object:

function createSpecialInterestingReportEmail($special, $interesting) {
    $body = $this-twigService->render(
        SpecialInterestingReportEmail::bodyTemplate,
        [
            "special" => $special,
            "interesting" => $interesting
        ]
    )
    return new SpecialInterestingReportEmail(
        $this->addressesForReport->from,
        $this->addressesForReport->to,
        $this->reportSubject,
        $body
    );
}

I don't think this design is perfect, but in the context of what we're doing it's OK.

Notice:
All interaction I mention below with my colleagues is paraphrased, embellished, or outright changed to make my point. I am not quoting anyone, and take this as a work of fiction. It's also drawn from previous similar conversations I've had on this topic.

One of my code reviewers was horrified:

Code Reviewer: "You can't have public properties! That's bad OOP!"
Me: "Why?"
CR: "You're breaking encapsulation. You need to do it like this [proceeds to tutor me on how to write accessor methods, because, like, I needed a tutorial on design anti-patterns]".
Me: "Why? What are we gaining by making those properties private, just to force us to write a bunch of boilerplate code to then access them? And how is that a good change to make to this code that now exists and is fulfilling the requirement? This is code review... we've already done the planning for this. This class is just a single container to aggregate some data into a conceptual 'email message' object. It doesn't need any behaviour - it doesn't need any methods - it's just for moving data between services".
CR: "But you can't break encapsulation!"
Me: "Why not? Other than parroting dogma read from some OOP 101 book, why is this a problem? And how's it solved by accessor methods?"
CR: "Well someone could directly change the values to be wrong!"
Me: "Well... they probably shouldn't do that then. That'd be dumb. They could equally put the wrong values straight into the constructor too. It'd still be just as wrong. Look... we don't need to validate the data - that seems to be your concern here? - as it's just a script reading known-good values from the DB, and sending them to our email queue. The code's right there. There's no scope for the data to accidentally be wrongified. And if it was... the tests would pick it up anyhow".
CR: "But what if other code starts using this code?"
Me: "What... yer saying 'what it some other application starts using this cronjob as some sort of library?' Why would that happen? This is not a public API. It makes no pretence of being a public API. If anyone started using this as an API, they deserve everything they get".
CR: "But they might".
ME: "OK, let's say some of this code is gonna be needed somewhere else, and we need to make it into a library. At that point in time, we'd extract the relevant code, consider stuff like encapsulation, data hiding, providing a public interface etc. But that would be different code from this lot".
CR: "But you should write all code like it will be used that way".
Me: "Again: why? This code is not going to be used this way. It just isn't. And anyhow, what yer suggesting is YAGNI / coding-for-an-unknown-future anyhow. We don't gain anything for the purposes of the current requirement chucking pointless boilerplate code into these classes. That's not an improvement".

And it went on.

One problem I encounter fairly frequently - both at work and in the wild - is people who will read about some concept, and thenceforth That Concept Is Law. Someone has Said It, therefore it applies to every single situation thereafter. They don't ever seem to bother trying to reason why the "law" was stated in the first place, what about it makes it a good rule, and when perhaps it's not necessary.

I look at these things and think "will this benefit my current task?". Generally it does because these things don't acquire acceptance without scrutiny and sanity-checking. But sometimes, however, it doesn't make sense to follow the dogma.

In this situation: it does not help.

In a different situation, if I was writing a separate API which handled the Email object creation, and other apps were gonna use it, I'd've possibly tightened-up the property access. But only possibly. My position on such things is to be liberal with what one permits to be done with code. If all my theoretical accessor method was gonna achieve was to return a private value... I'd really consider just leaving it public instead, and allow direct access. Why not?

There's a risk that later I might need to better control access to those properties, but... I'd deal with that at the time: these things can be handled as/when. It's even easy enough to have transitionary code from direct-access to accessor-access using __get and __set. I know these are frowned-upon, but in transitionary situations: they're fine. So one could seamlessly patch the API for code already consuming it via direct access with that transitionary approach, and then in the next (or some subsequent ~) version, make the breaking change to require using accessors, eg:

v1.0 - direct access only.
v1.1 - add in transitionary code using __get and __set. Advise that direct access is deprecated and will be removed in the next release. Also add in accessor methods.
v2.0 - remove direct access.

It doesn't even need to be v2.0. Just "some later version". But for the sake of making sure the transitionary code is temporary, better to do sooner rather than later. The thing is that software versioning is there for a reason, so it's OK to only introduce necessary coding overhead when it's needed.

Another thing that occurred to me when I was thinking about this. Consider this code:

$email = [
    "from" => "from@example.com",
    "to" => "to@example.com",
    "subject" => "Example subject",
    "body" => "Example body"
];

$fromAddress = $email["from"];

Perfectly fine. So how come this code is not fine:

$email = new Email(
    "from@example.com",
    "to@example.com",
    "Example subject",
    "Example body"
);

$fromAddress = $email->from;

Why's it OK to access an array directly, but it's not - apparently - OK to give the collection of email bits a name (Email), and otherwise use it the same way?

I can't think of one.

Rules are great. Rules apply sensibly 95% of the time. But when one reads about a rule... don't just stop there... understand the rule. The rules are not there simply for the sake of enabling one to not then think about things. Always think about things.

Righto.

--
Adam

PS: completely happy to be schooled as to how I am completely wrong here. This is half the reason I wrote this.