Saturday, 5 May 2018

PHP: perpetual misuse of array type declarations

G'day:
This has been irking me enough recently to cause me to dust off this old blog.

For a while PHP has had limited argument type checking on function arguments, and this was improved in PHP 7. The docs cover it - Function arguments › Type declarations - so I'll not repeat them here. But we can have this code:

function takesArray(array $a){
    return $a;
}

var_dump(takesArray(["tahi", "rua", "toru", "wha"]));

Output:

array(4) {
  [0]=>
  string(4) "tahi"
  [1]=>
  string(3) "rua"
  [2]=>
  string(4) "toru"
  [3]=>
  string(3) "wha"
}

And if we try to pass something other than an array:

<?php
function takesArray(array $a){
    return $a;
}

try {
    $a = takesArray("NOT_AN_ARRAY");
} catch (\Throwable $t) {
    echo $t;
}

We get this:

TypeError:
Argument 1 passed to takesArray() must be of the type array, string given,
called in [...][...] on line 7 and defined in [...][...]:2
Stack trace:
#0 [...][...](7): takesArray('NOT_AN_ARRAY')
#1 {main}

Fair enough. No complaint there.

What I have an issue with is that now I see a lot of devs slapping "array" as an parameter type in every method signature that takes an array. What's wrong with that you probably ask? Well... in most situations they're doing this, it's not simply an array that is the requirement. It's generally one of these two situations:

function getFirstName(array $name){
    return $name['firstName'];
}

$firstName = getFirstName(["lastName" => "Cameron", "firstName" => "Adam"]);

echo $firstName;

OK, this is a really contrived example, but it's simple and demonstrates the point. Here the function takes an array, and returns the firstName from the array. But note that the requirement of the argument is not "it's an array", as stated. It very specifically needs to be "an associative array that has a key firstName". "array" is not correct here. It's not at all helpful. If one is needing to do this sort of thing, then create a class Name, and specify that. Because that's what you actually want here:

class Name {
    public $lastName;
    public $firstName;
    
    function __construct($lastName, $firstName) {
        $this->lastName = $lastName;
        $this->firstName = $firstName;
    }
}

function getFirstName(Name $name){
    return $name->firstName;
}

The second bogus usage of array as a type declaration is with indexed arrays. Consider this:

function filterOutEarlierDates(array $dates, int $year){
    $cutOff = new \DateTime("$year-01-01");

    $filtered = array_filter($dates, function (\DateTimeInterface $date) use ($cutOff) {
        $diff = $date->diff($cutOff);
        return $diff->invert === 1;
    });
    
    return array_values($filtered);
}

$dobs = [
    new \DateTime("1968-12-20"),
    new \DateTime("1970-02-17"),
    new \DateTime("2011-03-24"),
    new \DateTime("2016-08-17")
];
$cutOff = 2011;

$youngins = filterOutEarlierDates($dobs, $cutOff);

var_dump($youngins);


Here filterOutEarlierDates claims it takes an array. Well it does. Again this lacks the necessary precision for the requirement. It doesn't take an array. It takes - specifically - an array of DateTimeInterface objects. If you give it anything else than that, the code breaks. As type checking is specifically to guard against that, specifying array is simply neither correct or even helpful here.

Specifying array as an argument type check does have a place. When what the function needs actually is an array. And just an array. Any old array.

function array_some(array $array, callable $callback) {
    foreach ($array as $key => $value) {
        $result = $callback($value, $key);
        if ($result) {
            return true;
        }
    }
    return false;
}

$colours = ["whero", "karaka", "kowhai", "kakariki", "kikorangi", "poropango", "papura"];

$containsSixLetterWords = array_some($colours, function ($colour) {
   return strlen($colour) === 6; 
});

var_dump($containsSixLetterWords);

$numbers = [
    "one" => "tahi",
    "two" => "rua",
    "three" => "toru",
    "four" => "wha",
    "five" => "rima",
    "six" => "ono",
    "seven" => "whitu",
    "eight" => "ware",
    "nine" => "iwa",
    "ten" => "tekau"
];

$hasShortKeys = array_some($numbers, function ($number, $key) {
   return strlen($key) <= 3; 
});

var_dump($hasShortKeys);

Here's a PHP equivalent of JS's some method. It takes an array (any array) and a callback, and applies the callback to each array entry until the callback returns true, then it exits. This function correctly claims the first argument value needs to be an array. Without any other sort of qualification: it just needs to be an array.

Actually there's a similar issue with my type checking of the $callback parameter there. It can't be any callable: it needs to be a callable that takes a coupla arguments and returns a boolean. In C# I guess we'd use a Delegate for that, but there's no such construct in PHP. So I should probably settle for giving the argument a clear name ($callback probably isn't good enough here), and dispense with the type check.

But even then I question the merits of type-checking array here. What if instead of just an array, we had a collection object that implements \Iterator, eg:

class ColourCollection implements \Iterator {
    private $position = 0;
    private $colours;  

    public function __construct($colours) {
        $this->position = 0;
        $this->colours = $colours;
    }

    public function rewind() {
        $this->position = 0;
    }

    public function current() {
        return $this->colours[$this->position];
    }

    public function key() {
        return $this->position;
    }

    public function next() {
        ++$this->position;
    }

    public function valid() {
        return isset($this->colours[$this->position]);
    }
}

$rainbow = new ColourCollection($colours);

If we run this variation of the code, we get:

Fatal error:
Uncaught TypeError: Argument 1 passed to array_some() must be of the type array, object given,
 called in [...][...] on line 48 and defined in [...][...]:2
Stack trace:
#0 [...][...](48): array_some(Object(ColourCollection), Object(Closure))
#1 {main}
  thrown in [...][...] on line 2

If we take the type check out: the code works. So how is the array type check really helping us here? It's not.

To be honest I think I'm being a bit edge-case-y with that last example. I don't see a lot of objects that implement the iterator class: we just stick with arrays instead. But it is a legit consideration I guess.

I would use array as a type check in that array_some situation - where the argument actually just needs to be an array, with no special other qualities about it. But this is very rare. In other cases where the argument value needs to be "some specific type or array", then I think it's dead wrong to have an array type check there.

Righto.

--
Adam