Wednesday 24 June 2015

PHP 7: Three-way comparison operator (<=>)

G'day:
Here's another new feature of PHP 7. There's not much to it, but it seems to work, and is almost sensibly implemented.

The three-way comparison operator (you can piss off if you think I'm going to call it a "spaceship operator") - <=> - compares two values. It returns <=-1, 0 or >=1 depending on whether the first operand is "less than", "equal to" or "greater than" the second operand.

For strings we already have strcmp(), but there's no expedient way of doing this sort of comparison with other data types.
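
To make those return values concrete, here's a minimal sketch of my own (not lifted from the PHP docs). I only compare the sign of strcmp()'s result, because I'm not assuming anything about its exact magnitude. The last line is a caveat worth knowing: <=> treats numeric strings as numbers, so it isn't a drop-in replacement for strcmp().

```php
<?php
// Integers and floats: no built-in function covered these before PHP 7.
var_dump((1 <=> 2) < 0);     // bool(true): left operand is smaller
var_dump((2 <=> 2) === 0);   // bool(true): operands are equal
var_dump((2.5 <=> 1.5) > 0); // bool(true): left operand is larger

// Non-numeric strings: <=> and strcmp() agree on the sign of the result.
$viaOperator = 'mmm' <=> 'nnn';
$viaStrcmp   = strcmp('mmm', 'nnn');
var_dump(($viaOperator < 0) === ($viaStrcmp < 0)); // bool(true)

// Numeric strings: <=> compares them as numbers, strcmp() as bytes.
$numericViaOperator = '10' <=> '9';      // positive: 10 is greater than 9
$numericViaStrcmp   = strcmp('10', '9'); // negative: "1" sorts before "9"
var_dump($numericViaOperator > 0 && $numericViaStrcmp < 0); // bool(true)
```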

First things first, let's put it through its paces to see what sort of results we get with various tests:

// basic.php

function compareSets($sets, $heading=''){
    echo "<h2>$heading</h2>";
    echo '<pre>';
    foreach($sets as $set){
        foreach($set as $op1){
            foreach($set as $op2){
                printf('%s <=> %s = %d<br>', json_encode($op1), json_encode($op2), $op1 <=> $op2);
            }
        }
        echo '<hr>';
    }
    echo '</pre>';
}


// simples
$strings = ['mmm', 'nnn'];
$integers = [17, 19];
$floats = [M_E, pi()];
$booleans = [false,true];

$simple = [$strings, $integers, $floats, $booleans];
compareSets($simple, 'Simple values');


// arrays
$empty = [];
$short = [2];
$sequence = [1,2,4];
$order = [1,4,2];
$value = [4,1,1];

$emptyCheck = [$empty, $short];
$lengthVsFirst = [$short, $sequence];
$sameLengthDiffOrder = [$sequence, $order];
$sameLengthHigherValueFirst = [$sequence, $value];

$arrays = [$emptyCheck, $lengthVsFirst, $sameLengthDiffOrder, $sameLengthHigherValueFirst];

compareSets($arrays, 'Arrays');


// objects
class C {
    public $p;
    function __construct($p){
        $this->p = $p;
    }
}
class D {
    public $p;
    function __construct($p){
        $this->p = $p;
    }
}

$low = new C('a');
$high = new C('z');
$c = new C('instance of C');
$d = new D('instance of D');

$sameClass = [$low, $high];
$differentClasses = [$c, $d];

$objects = [$sameClass, $differentClasses];
compareSets($objects, 'Objects');

That seems like a mouthful, but it's mostly support code for the tests:

  • a function to do a "truth table" for each combination of operands in a set;
  • some general comparisons of simple values;
  • and some to check how arrays are handled;
  • and some with objects.
The results have only a coupla wee surprises:

Simple values

"mmm" <=> "mmm" = 0
"mmm" <=> "nnn" = -1
"nnn" <=> "mmm" = 1
"nnn" <=> "nnn" = 0

17 <=> 17 = 0
17 <=> 19 = -1
19 <=> 17 = 1
19 <=> 19 = 0


2.718281828459 <=> 2.718281828459 = 0
2.718281828459 <=> 3.1415926535898 = -1
3.1415926535898 <=> 2.718281828459 = 1
3.1415926535898 <=> 3.1415926535898 = 0


false <=> false = 0
false <=> true = -1
true <=> false = 1
true <=> true = 0



Arrays

[] <=> [] = 0
[] <=> [2] = -1
[2] <=> [] = 1
[2] <=> [2] = 0


[2] <=> [2] = 0
[2] <=> [1,2,4] = -2
[1,2,4] <=> [2] = 2
[1,2,4] <=> [1,2,4] = 0



[1,2,4] <=> [1,2,4] = 0
[1,2,4] <=> [1,4,2] = -1
[1,4,2] <=> [1,2,4] = 1
[1,4,2] <=> [1,4,2] = 0



[1,2,4] <=> [1,2,4] = 0
[1,2,4] <=> [4,1,1] = -1
[4,1,1] <=> [1,2,4] = 1
[4,1,1] <=> [4,1,1] = 0



Objects

{"p":"a"} <=> {"p":"a"} = 0
{"p":"a"} <=> {"p":"z"} = -1
{"p":"z"} <=> {"p":"a"} = 1
{"p":"z"} <=> {"p":"z"} = 0


{"p":"instance of C"} <=> {"p":"instance of C"} = 0
{"p":"instance of C"} <=> {"p":"instance of D"} = 1
{"p":"instance of D"} <=> {"p":"instance of C"} = 1
{"p":"instance of D"} <=> {"p":"instance of D"} = 0



  • There's nothing surprising when it comes to the simple values.
  • With arrays it seems like longer arrays beat higher values at the same index, which is a bit odd. I'd perhaps expect each array element to be compared in turn until there's a "winner": this is how words are sorted when applying alphabetical ordering, so I think that's a reasonable real-world precedent. I don't think a numerical-based ordering makes sense here, as that only works on numerical values.
  • There's a slight "eyebrow raise" at the result where it returns as -2 / 2, and I've not quite got to the bottom of that yet. But it demonstrates the return values are not simply -1 / 0 / 1, but - as I said above - <=-1, 0 or >=1. I'll look a bit further into what might result in non ±1 / 0 values later.
  • And I'm not quite sure how two objects - unless they're both superpositional cats - can be both greater-than and less-than each other at the same time. I'm calling "bug" on that, but I'll look into that further too.
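
For what it's worth, the comparison tables in the PHP manual describe arrays as being ordered by member count first, and only then value by value, with arrays treated as "uncomparable" if a key from the first is missing in the second. Below is a rough userland sketch of that rule, assuming I've read it correctly. sketchArrayCompare() is a made-up name, returning 1 for the uncomparable case is my assumption, and the sketch only ever returns -1 / 0 / 1, so it mirrors the sign of the results above rather than explaining the -2 / 2.

```php
<?php
// Hypothetical helper: a userland approximation of how arrays are ordered,
// based on the rule stated in the PHP manual's comparison tables.
function sketchArrayCompare(array $left, array $right): int {
    // 1. The array with fewer members is "smaller", whatever it contains.
    if (count($left) !== count($right)) {
        return count($left) <=> count($right);
    }

    // 2. Same size: walk the left-hand keys in order.
    foreach ($left as $key => $value) {
        if (!array_key_exists($key, $right)) {
            return 1; // "uncomparable"; returning 1 here is an assumption
        }
        $result = $value <=> $right[$key];
        if ($result !== 0) {
            return $result; // first differing value decides
        }
    }

    return 0; // same size, same keys, all values equal
}

var_dump(sketchArrayCompare([2], [1, 2, 4]));       // int(-1): shorter loses
var_dump(sketchArrayCompare([1, 2, 4], [1, 4, 2])); // int(-1): 2 < 4 at index 1
var_dump(sketchArrayCompare([4, 1, 1], [1, 2, 4])); // int(1): 4 > 1 at index 0
```
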
One omission PHP 7 has in this implementation is that it should have also shipped a Comparable interface which one can implement on a class, giving it a compare() method which is called when two objects are compared. Even if it only works for same-type objects, that'd be something. PHP second-guessing (and making dubious decisions) as to how to sort objects is a bit short-sighted IMO.
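
To show the sort of thing I mean, here's a userland sketch. The Comparable interface, its compare() method and the Temperature class are all hypothetical: PHP 7 ships nothing of the sort, and because the engine knows nothing about it, <=> won't call compare() for you. You have to call it yourself, for example from a usort() callback.

```php
<?php
// Hypothetical interface: not part of PHP, just what I'd like to have had.
interface Comparable {
    // Should return <0, 0 or >0, mirroring what <=> itself returns.
    public function compare($other);
}

// Made-up example class that knows how to order its own instances.
class Temperature implements Comparable {
    private $degrees;

    public function __construct($degrees){
        $this->degrees = $degrees;
    }

    public function getDegrees(){
        return $this->degrees;
    }

    public function compare($other){
        // Only meaningful for same-type objects, as discussed above.
        return $this->degrees <=> $other->degrees;
    }
}

$readings = [new Temperature(21), new Temperature(-4), new Temperature(13)];

// Without engine support, compare() has to be invoked explicitly.
usort($readings, function(Comparable $a, Comparable $b){
    return $a->compare($b);
});

$sorted = array_map(function($t){return $t->getDegrees();}, $readings);
echo implode(',', $sorted), PHP_EOL; // -4,13,21
```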



So when do we use this? The most useful situation is when providing a comparator function for a sorting operation.

The usort() function is used to apply a custom comparator to an array for sorting purposes. One needs to use a custom comparator when sorting stuff like objects. Example:

First, we have a class, and an array of objects:

// usort.php

class Score {
    private $value;

    function __construct($value){
        $this->value = $value;
    }

    function getValue(){
        return $this->value;
    }
}

function getScores(){
    return array_map(function($x){return new Score($x);}, [97,2,47,73,29]);
}

$scores = getScores();

Next we have a comparator function which compares the value we wish to sort on:

$oldScoreSorter = function($e1, $e2){
    $e1Value = $e1->getValue();
    $e2Value = $e2->getValue();

    if ($e1Value == $e2Value) return 0;
    if ($e1Value > $e2Value) return 1;
    return -1;
};

Then we use it to sort the $scores array:

echo '<h4>Before</h4><pre>';
var_dump($scores);
echo '</pre>';

usort($scores, $oldScoreSorter);
echo '<h4>After (using old approach)</h4><pre>';
var_dump($scores);
echo '</pre>';

The result of this is everything sorted A-OK:

Before

array(5) {
  [0]=>
  object(Score)#3 (1) {
    ["value":"Score":private]=>
    int(97)
  }
  [1]=>
  object(Score)#4 (1) {
    ["value":"Score":private]=>
    int(2)
  }
  [2]=>
  object(Score)#5 (1) {
    ["value":"Score":private]=>
    int(47)
  }
  [3]=>
  object(Score)#6 (1) {
    ["value":"Score":private]=>
    int(73)
  }
  [4]=>
  object(Score)#7 (1) {
    ["value":"Score":private]=>
    int(29)
  }
}

After (using old approach)

array(5) {
  [0]=>
  object(Score)#4 (1) {
    ["value":"Score":private]=>
    int(2)
  }
  [1]=>
  object(Score)#7 (1) {
    ["value":"Score":private]=>
    int(29)
  }
  [2]=>
  object(Score)#5 (1) {
    ["value":"Score":private]=>
    int(47)
  }
  [3]=>
  object(Score)#6 (1) {
    ["value":"Score":private]=>
    int(73)
  }
  [4]=>
  object(Score)#3 (1) {
    ["value":"Score":private]=>
    int(97)
  }
}

However, using the new operator, the comparator is much simpler:

$newScoreSorter = function($e1, $e2){
    return $e1->getValue() <=> $e2->getValue();
};

(the results are the same using this one, so I'll spare you).
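
One more thing the short form makes easy is reversing the sort: just swap the operands. Here's a self-contained sketch of that. DemoScore is a stand-in for the Score class above so the snippet runs on its own, and the two helper closures are made up purely for the demo.

```php
<?php
// Stand-in for the Score class above, so this snippet is runnable by itself.
class DemoScore {
    private $value;

    function __construct($value){
        $this->value = $value;
    }

    function getValue(){
        return $this->value;
    }
}

// Demo helpers: build a fresh unsorted array, and extract the raw values.
$makeScores = function(){
    return array_map(function($x){return new DemoScore($x);}, [97,2,47,73,29]);
};
$toValues = function(array $scores){
    return array_map(function($s){return $s->getValue();}, $scores);
};

// Ascending: first operand on the left.
$ascending = $makeScores();
usort($ascending, function($e1, $e2){
    return $e1->getValue() <=> $e2->getValue();
});

// Descending: same comparator with the operands swapped.
$descending = $makeScores();
usort($descending, function($e1, $e2){
    return $e2->getValue() <=> $e1->getValue();
});

echo implode(',', $toValues($ascending)), PHP_EOL;  // 2,29,47,73,97
echo implode(',', $toValues($descending)), PHP_EOL; // 97,73,47,29,2
```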

So it's not exactly something you'd be using every day, but I think the syntax looks sensible, and it does a simple job well. -ish.

If I work out what's going on with the comparing arrays by length rather than value, or how two disparate objects are both greater than and less than each other at the same time, I'll report back. Or if you know the answers already: lemme know!

Righto.

--
Adam