Adam Cameron's Dev Blog: Code Review

Showing posts with label Code Review. Show all posts

Sunday, 25 April 2021

On code review

G'day:

I'm pretty big on code review; I see it as a critical part of the process of developing solution for our clients, and making our own and our colleagues' lives easier. It's a useful communications exercise, and it's a useful technical exercise.

I've used a variation of these notes with a variety of teams in the past, but I've never - until recently - got around to writing a decent (semi-) generic document about it. I've polished things a bit here, and thought I'd get it down in a public-facing fashion. There are references in here to systems I am familiar working with like Bitbucket and Jira and PHP. But I think the guidance carries across, irrespective of what combination of tooling one uses; if not in the precise mechanics, then in the processes the mechanics facilitate.

The CSS for the <h4> through <h6> is not as good as it could be. It's annoying that Blogspot gets me down as far as an <h3> even before we get to the title of the article (the date gets the <h2> for some reason), but: so be it. Sorry about that (you'll see what I mean as you get further down).

Introduction

Before code can be considered ready for release, the dev team needs to give the work a once-over or a "second set of eyes" to make sure it meets code quality guidelines, is appropriately tested, and that it fulfils the requirements of the user story that the code addresses, using the domain design / architecture / approach originally agreed before coding started.

What code review is

It is part of the collaborative team approach to delivering value to the client, just like the initial planning stage is before any code is even written, and any pair-programming that takes place during the initial development cycle. Given a lot of the physical coding work is done in separately from the rest of the team, code review is a "re-union" exercise where the devs get back together to see how their solution has ended up. It's important to remember that any given code is our code, not an individual developer's code.

Code review gives the coding dev a stage to show how they've implemented the solution, draw attention to anything noteworthy or needs flagging for any other reason to the rest of the team. And it gives the reviewers a chance to check things out, and potentially spot anything in the work that might need addressing still, or tweaking, or done slightly differently maybe, or whatever.

Another thing code review is is a learning exercise for everyone concerned. Some nuance of the code, or a suggestion to amend the code could be new knowledge to one or all people on the review. Or it might just clarify how some less-often-trod part of the codebase works. And if nothing else, it'll add to everyone's familiarity with the code in question, and it might be you working on it next. Win.

What code review is not

I have experienced some code reviews that have taken an adversarial approach: pitting the coding developer against the reviewing developers, with the reviewers going "you got this wrong", "this doesn't work", "nup", etc, and trying to seem more clever than their colleagues, with arguments going back and forth in the review comments. This is not what code review is about.

It's also not about people checking up on each other. Nor is it about more senior people pronouncing to less senior people.

We are on the same team, working on the same stories, with the same end-goal in mind.

Where the disconnect happens

Still. If I've done some work and someone else adds a comment saying "this could be done differently because [reasons]", it's easy to get a wee bit defensive / protective of our work (for a moment you're claiming the team's work as your own because it happened to be you who typed it in). This is pretty normal. I mean when I'm doing my own hobby-work and something goes unexpected wrong, I'm usually pretty quick to blame the language or the framework or some library, or something other than just myself for getting something slightly wrong, and forgetting that it's OK to be wrong. I like to think that the blame-casting phase of this is a pretty short cycle these days, and I'm quickly back to going "OK, so what do I need to do to sort this out?". Code review is the same.

I have a good quote here:

An amateur is defensive. The professional finds learning (and even, occasionally, being shown up) to be enjoyable; they like being challenged and humbled, and engage in education as an ongoing and endless process.

Ryan Holiday, Ego is the Enemy

Anyway, bottom line: code review is participants on the same team working together.

Code review mindset

A pull request is a way to communicate application changes to the team. The comments that are added to the code in the pull request have good intentions with the intent of improving our work, and the application. Everyone should be open for a discussion like that. However because written communication can lack some nuance of a spoken in-person conversation, remember to over-compensate with the text and make sure it reads as polite, constructive, and respectful. I've found that sometimes even being too terse in a comment ("needs a test") can be seen as being abrupt, rather than just being to-the-point. Also remember to address the code, not the person. If something's not clear: perhaps ask about it rather than telling about. There's no harm in asking for clarification in something. Code is a communications mechanism - it's designed to articulate concepts between people - so if something needs clarification, this could indeed be something that needs to be addressed in the code. Or you might just've not clocked what the code was doing initially and just need a quick pointer from the dev, before going "duh, of course".

By the same token it's important for the recipient of a comment to frame things the same way. If you read a comment as being adversarial or confrontational, you are probably just be using the wrong internal voice to read it. Read it again with a neutral or collaborative voice. It's hardly ever the case that the comment writer was committing a personal attack to writing, and in a code review comment. It's more likely the person chose the wrong wording or didn't take time to polish-up the comment before pressing send. Devs are notoriously bad at articulating themselves in writing, after all, right?

Comments

If you're reviewing and making a comment, read it back to yourself before pressing submit. Would you like to be on the receiving end of it?

If you're replying to a comment, act likewise.

Before adding a reply to a reply (… to a reply… to a reply…), remember it's a code review not a discussion thread. Consider getting on a call and actually talking to the person. This can really quickly solve the misunderstanding, but it is a really good way of defusing any tensions that could possibly be building.

That said, if some call to action or resolution arose from the voice-conversation, make sure to follow up in the code review with a summary comment. That information is important.

Compromise

Some back-and-forth discussions are really useful and interesting! Some aren't. Either way, they're probably not what you all ought to be doing right at that moment. Give some thought as to whether a) the discussion is actually moving us closer to getting the work approved; b) needs to happen right then and there; c) via a text-based discussion. Possibly just go "yeah actually let's just run with it" (whether "it" is how the code already is, or if it's the fix being suggested). Or possibly go "OK, but let's consider revisiting this in another story later… it might be worth it".

Code review logistics

All work going into the codebase must be reviewed and approved by at least one other person (not the dev who wrote the code!). Bitbucket can be configured to disabled the Merge button until there are n approvals.
If a review yields any rework, this should be undertaken before any other new work (ie: a different new story) is commenced. One should focus on completing existing story before starting the next story. "Stop starting and start finishing".
From a reviewer's perspective: reviewing our colleagues existing work ought to generally take priority over continuing to develop our own new code. At any given moment work that is closer to be completed should be the focus, over starting or continuing newer work. This is not to say that one must drop everything when a code review comes in. What it does mean is that when re-commencing from a break (arriving at work, after or lunch or some other intermediary break), and before refocusing on new code: check what code review needs to be done, and deal with that first.
That said, it's important to minimise task-switching, so be pragmatic about this. But bear in mind that it is also part of our job to work as a team as well as work as an individual. This is not a situatin of "my work" and "their work": it doesn't matter happened to type-in the code, it is all our work.

Handling code review tasks

Tasks are a feature in BitBucket that allows a code review comment to have a "task" attached to it. This indicates work that needs to be done before the code review can be completed.

(Inadvertently an example of a comment that is probably a bit abrupt-sounding!)

Bitbucket can be configured to disable the Merge button whilst there are outstanding tasks.

As a reviewer, if the nature of an observation you make seems to require action, add a task to the comment. This signifies that you're not just making the observation, you mean it needs action.

As the developer, when addressing a task, mark it "done" in one of two situations:

whatever was requested has been done;
after further discussion, it's concluded there's no action to take. But add a comment to that effect in reply to the task comment.

Do not mark a task done unless one of those two resolutions have been made. IE: don't simply decide to not address a task because [reasons]. Get agreement from the reviewer as to the legitimacy of those reasons first.

Code review guidelines

Before the code gets there

The branch name, commit comments (and they should all have comments!), and pull request should be prefixed with the ticket number for the story the work is for. This is just to make it easier to tie everything together later. The important one of these is the pull request name should have the ticket reference in it so the reviewers can check what they're reviewing. In the past I've used a rule that the pull request title should be "JRA-1234 The Title Of The Ticket" (ie: exactly the title of the ticket). Hopefully Bitbucket has integrations with the ticketing system to it can automatically like through to the ticket just from the ID being there.
If there's no ticket, the work should not have been even started, let alone submitted for code review.
The ticket needs to have a story in it. The reviewers need to be able to to know what the work is setting out to do, otherwise how can they know it's fulfilled all its cases? For that matter, how did the dev who did the work?
Consider squashing your commits when doing a pull request (or configure Bitbucket to do it automatically). By the time the code is going into master, it's no longer useful to log the sequence the work was done it. And we still have the commit log in your own repo anyhow.

Before the review starts

There ought to be an automated build that ran code quality checks and unit tests. Hopefully there's integration between the CI tool and Bitbucket so the pull request automatically indicates a green status, or the dev should include a link to the green status report.
If that's not green: don't review anything. However the dev possibly did not notice the build is broken, so give them a (gentle) nudge.
As the developer, go through the pull request and put your own comments in, where necessary. Sometimes the code can legitimately not be obvious, or it just helps to contextualise which part of the story the code is addressing.

What to look for in a review

General

Does the work fulfil all of the requirements of the story on the ticket. This might seem obvious, but sometimes a dev can forget something. Or perhaps their reading of the requirement is different from yours, so what's been delivered might not meet actual expectations.
Does the work go off-piste anywhere? It should only address the requirement of the ticket. Not "oh, whilst I was in this file I saw something I could fix in this other bit of code". That can be done later on another ticket. Or: not at all. So check that the work has maintained focus.
As a reviewer do you understand the code. This is vital. Remember code is a tool for communicating between developers, and it's not doing its job if you don't understand it. Code can be as fancy as you like, but if no-one can understand it, it should get flagged in code review.
All code needs to adhere to any prescribed guidelines:
- In-house guidelines (whatever they are, but all teams should have them).
- Probably by implication, external community-recognised guidelines such as PHP's PSR-12. People who are not currently in the team will need to understand this code later on, so let's write code that newcomers have more of a chance of understanding.

Specific

Are all the test cases in place and implemented?
Some code-sizing boundaries in coding guidelines docs won't literally be checked by the automated quality tests, it's up to the reviewer to think about it. A method will be rejected by the automated tests if it's - for example - over 100 lines long. But if the target is 20, warning bells should be going off at 40. If it's 80 lines long, this probably warrants a closer look. The size restrictions are not there to be gamed (101 bad; 99 good), they're just something we can get code analysis tooling (eg: PHPMD) to flag up. It's down to the humans to agree whether the method is A-OK at 80 lines.
Equally if new work takes a method that is 50 lines long and takes it up to 60 lines… it was already too big so the dev ought to have extracted the piece of logic they needed to work on into a separate method, and done the work there. Or if they new work definitely belongs in the mainline of the method, then extract a different section of the method (so employing the boy scout rule). There is seldom any good reason for a method that is already oversized to get even bigger.
Do class / method / variable names make sense? Are they spelt right? Yeah, it matters.
Are classes / methods doing just one thing?
- A controller should not be getting stuff from the database or implementing business logic.
- A method called getAccount should be getting the account. Not creating it if it's missing and then returning it.
- A repository class should not be implementing logging, etc.
- A class that hits the DB and also sets constants for other classes to use is doing two things. One or other of them does not belong in that class.
- Any method that takes a Boolean parameter is almost certainly doing two things: one if the flag is true; another if it's false. This is definitely a warning sign.
- Look at the levels of indentation. Every level suggests that the method is doing too much.

The value of code review

The codebase will be better.
You might have learned some new coding techniques. You have improved as a developer.
You know more about the codebase, which reduces the bus factor for the company.
You help your colleagues stay on track with their work.
You get to practise your written communication skills.
We get a better historical record of why work was done.
Team cohesion. It's a team-building exercise.
Reading code. It's good practise reading other people's code.
Writing code. It's good practise writing code other people can read.

Let me know what you think about code review. Any considerations I've missed?

Righto.

--
Adam

Saturday, 5 March 2016

CFML: go have a look at a Code Review

G'day:
I've already put more effort into this than a Saturday morning warrants, so I'll keep this short and an exercise in copy and past.

Perennial CFMLer James Mohler has posted some code on Code Review: "Filtering the attributes for a custom tag". I'll just reproduce my bit of the review, with his code by way of framing.

Go have a look, and assess our code as an exercise.

James's functions:

Original:

string function passThrough(required struct attr)   output="false"  {

    local.result = "";

    for(local.myKey in arguments.attr)    {
        if (variables.myKey.left(5) == "data-" || variables.myKey.left(2) == "on" || variables.myKey.left(3) == "ng-")  {

            local.result &= ' #local.myKey.lcase()#="#arguments.attr[local.myKey].encodeForHTMLAttribute()#"';
        } // end if 
    }   // end for

    return local.result;
}

His reworking:

string function passThrough(required struct attr)   output="false"  {

    arguments.attr.filter(
        function(key, value) { 
            return (key.left(5) == "data-" || key.left(2) == "on" || key.left(3) == "ng-");
        }
    ).each(
        function(key, value)    {
            local.result &= ' #key.lcase()#="#value.encodeForHTMLAttribute()#"';
        }
    );  

    return local.result;
}

Now for my code review (this is a straight copy and paste from my answer):

OK, now that I've had coffee, here's my refactoring:
function extractCodeCentricAttributesToMarkupSafeAttributes(attributes){
    var relevantAttributePattern = "^(?:data-|ng-|on)(?=\S)";

    return attributes.filter(function(attribute){
        return attribute.reFindNoCase(relevantAttributePattern);
    }).reduce(function(attributeString, attributeName, attributeValue){
        return attributeString& ' #attributeName#="#attributeValue#"';
    }, "");
}
Notes on my implementation:

I could not get TestBox working on my ColdFusion 2016 install (since fixed), so I needed to use CF11 for this, hence using the function version of encodeForHTMLAttribute(). The method version is new to 2016.

we could probably argue over the best pattern to use for the regex all day. I specifically wanted to take a different approach to Dom's one, for the sake of comparison. I'm not suggesting mine is better. Just different. The key point we both demonstrate is don't use a raw regex pattern, always give it a meaningful name.

looking at the "single-expression-solution" I have here, the code is quite dense, and I can't help but think Dom's approach to separating them out has merit.

Code review notes:

your original function doesn't work. It specifies variables.myKey when it should be local.myKey. It's clear you're not testing your original code, let alone using TDD during the refactoring process. You must have test coverage before refactoring.

the function name is unhelpfully non-descriptive, as demonstrated by Dom not getting what it was doing. I don't think my function name is ideal, but it's an improvement. I guess if we knew *why

you were doing this, the function name could be improved to reflect that.

lose the scoping. It's clutter.

lose the comments. They're clutter.

don't abbrev. variable names. It makes the code slightly hard to follow.

don't have compound if conditions like that. It makes the code hard to read. Even if the condition couldn't be simplified back to one function call and you still needed multiple subconditions: extract them out into meaningful variable names, eg: isDataAttribute || isOnAttribute || isNgAttribute

key and value are unhelpful argument names

slightly controversial: but unless it's an API intended to be used by third-parties: lose the type checking. It's clutter in one's own code.

there's no need for the output modifier for the function in CFScript.

don't quote boolean values. It's just false not "false".

Unit tests for this:
component extends="testbox.system.BaseSpec" {

    function beforeAll(){
        include "original.cfm";
        include "refactored.cfm";
        return this;
    }

    function run(testResults, testBox){
        describe("Testing for regressions", function(){
            it("works with an empty struct", function(){
                var testStruct = {};
                var resultFromOriginal = passThrough(testStruct);
                var resultFromRefactored = extractCodeCentricAttributesToMarkupSafeAttributes(testStruct);
                expect(resultFromRefactored).toBe(resultFromOriginal);
            });
            it("works with an irrelevant attribute", function(){
                var testStruct = {notRelevant=3};
                var resultFromOriginal = passThrough(testStruct);
                var resultFromRefactored = extractCodeCentricAttributesToMarkupSafeAttributes(testStruct);
                expect(resultFromRefactored).toBe(resultFromOriginal);
            });
            it("works with each of the relevant attributes", function(){
                var relevantAttributes = ["data-relevant", "onRelevant", "ng-relevant"];
                for (relevantAttribute in relevantAttributes) {
                    var testStruct = {"#relevantAttribute#"=5};
                    var resultFromOriginal = passThrough(testStruct);
                    var resultFromRefactored = extractCodeCentricAttributesToMarkupSafeAttributes(testStruct);
                    expect(resultFromRefactored).toBe(resultFromOriginal);
                }
            });
            it("works with a mix of attribute relevance", function(){
                var testStruct = {notRelevant=7, onRelevant=11};
                var resultFromOriginal = passThrough(testStruct);
                var resultFromRefactored = extractCodeCentricAttributesToMarkupSafeAttributes(testStruct);
                expect(resultFromRefactored).toBe(resultFromOriginal);
            });
            it("works with multiple relevant attributes", function(){
                var testStruct = {"data-relevant"=13, onRelevant=17, "ng-relevant"=19};
                var resultFromOriginal = passThrough(testStruct);
                var resultFromRefactored = extractCodeCentricAttributesToMarkupSafeAttributes(testStruct);
                expect(resultFromRefactored).toBe(resultFromOriginal);
            });
        });

    }

}
Use them. NB: the includes in beforeAll() simply contain each version of the function.

One thing I didn't put as I don't think it's relevant to this code review is that one should be careful with inline function expressions. They're not themselves testable, so if they get more than half a dozen statements or have any branching or conditional logic: extract them into their own functions, and test them separately.

Oh, and in case yer wondering... yes I did write the tests before I wrote any of my own code. And I found a coupla bugs with my planned solution (and assumptions about the requirement) in doing so. Always TDD. Always.

Anyway, all of this is getting in the way of my Saturday.

Righto.

--
Adam

About

I've been a web developer since 2001: 13yrs as a CFML developer; 6yrs as a PHP dev; and now back adjacent to the CFML community but focusing on code quality, design and testing. As of 2023, I am migrating my dev focus back to PHP again.

The code I write and discuss here is pretty much just looking at random conundrums I encounter in my day job.

I tend to be a bit "forthright" in my opinions, I am indelicate, and I tend to swear too much. This will come out occasionally here: I make no apology for it.

Everything said here is my own opinion. Feel free to disagree with me :-)