Friday, 16 December 2016

PHP: adolescent weirdness with arrays

G'day:
This was put on my radar the other day, by Alexander Lisachenko:


Well put. The code concerned is thus:

<?php
$A = [1 => 1, 2 => 0, 3 => 1];
$B = [1 => 1, 3 => 0, 2 => 1];

var_dump($A > $B);
var_dump($B > $A);


var_dump($A < $B);
var_dump($B < $A);

And it outputs...

C:\temp>php array.php
bool(true)
bool(true)
bool(true)
bool(true)

C:\temp>

Well that's quite... contrary.

PHP Team member Kalle Sommer Nielsen explained that this sort of comparison (between arrays) is "unsupported" in PHP.

Further conversation ensued, which you can read on Twitter, rather than me repeating it here.

My thoughts on this are that it's completely fine for PHP to not support this sort of comparison, but in that case the way it should deal with it is actively not support it: raise an IllegalOperationException or some such (some extension of LogicException, anyhow). It should not just issue forth illogical and misleading results. Kalle explains some of the background to the situation, which was helpful.
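To illustrate what "actively not supporting it" could look like, here's a minimal userland sketch. The function name compareArraysStrictly() is made up for this example and is not anything PHP provides. It also assumes that "comparable" means "same keys in the same order", which is deliberately stricter than PHP's own rules (PHP will order arrays of different lengths, for instance).

```php
<?php
// Hypothetical helper (not part of PHP): refuse to order arrays
// unless both have exactly the same keys in exactly the same order.
function compareArraysStrictly(array $left, array $right): int
{
    // array_keys() returns the keys in insertion order, so === here
    // checks both the set of keys and their order.
    if (array_keys($left) !== array_keys($right)) {
        throw new LogicException("Arrays with differing keys or key order cannot be ordered");
    }
    // Safe to defer to the built-in comparison now.
    return $left <=> $right;
}

$a = [1 => 1, 2 => 0, 3 => 1];
$b = [1 => 1, 3 => 0, 2 => 1];

try {
    compareArraysStrictly($a, $b);
} catch (LogicException $e) {
    // The nonsensical question gets an error, not an answer.
    echo $e->getMessage() . PHP_EOL;
}
```

Using LogicException directly is just a stand-in here; a dedicated subclass would make the failure easier to catch selectively.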

Another issue I have is that if we conclude that asking "is this array greater than that array" doesn't simply result in an error, and it must answer "true" or "false", then the answer ought to be false. Kalle contended it didn't matter, but my parallel is "cheese is greater than blue (true or false)?". It's a nonsensical question, but the answer is not "true".

The conclusion was: well it's done now... one shouldn't try to do these things, and whilst PHP might not handle it in an ideal fashion, it's a sufficient edge-case to not warrant remedial action. All fair enough.

I had a quick look at what PHP does with various comparisons:

<?php

$i1 = 5;
$i2 = 7;
echo "$i1 > $i2: " . booleanAsString($i1 > $i2) . PHP_EOL;
echo "$i1 < $i2: " . booleanAsString($i1 < $i2) . PHP_EOL;

$f1 = 5.5;
$f2 = 7.7;
echo "$f1 > $f2: " . booleanAsString($f1 > $f2) . PHP_EOL;
echo "$f1 < $f2: " . booleanAsString($f1 < $f2) . PHP_EOL;

$s1 = "a";
$s2 = "z";
echo "$s1 > $s2: " . booleanAsString($s1 > $s2) . PHP_EOL;
echo "$s1 < $s2: " . booleanAsString($s1 < $s2) . PHP_EOL;

$s1 = "A";
$s2 = "a";
echo "$s1 > $s2: " . booleanAsString($s1 > $s2) . PHP_EOL;
echo "$s1 < $s2: " . booleanAsString($s1 < $s2) . PHP_EOL;

$b1 = true;
$b2 = false;
echo sprintf("%s > %s: ", booleanAsString($b1), booleanAsString($b2)) . booleanAsString($b1 > $b2) . PHP_EOL;
echo sprintf("%s < %s: ", booleanAsString($b1), booleanAsString($b2)) . booleanAsString($b1 < $b2) . PHP_EOL;

$o1 = (object) ["i"=>5];
$o2 = (object) ["i"=>7];
echo sprintf("%s > %s: ", json_encode($o1), json_encode($o2)) . booleanAsString($o1 > $o2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($o1), json_encode($o2)) . booleanAsString($o1 < $o2) . PHP_EOL;

$a1 = ["i"=>5];
$a2 = ["i"=>7];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;


$a1 = [5];
$a2 = [7];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

$a1 = [5=>5];
$a2 = [7=>7];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

$a1 = [1 => 1, 2 => 0, 3 => 1];
$a2 = [1 => 1, 3 => 0, 2 => 1];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

$a1 = [1 => 1, 2 => 0, 3 => 1];
$a2 = [1 => 1, 3 => 1, 2 => 0];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;


function booleanAsString($expression){
    return $expression ? "true" : "false";
}

I don't think any of that lot needs explaining: it's straightforward enough. There are a few baseline control expressions which apply the operator to numeric operands, which is clearly gonna have predictable results. Then I check with strings (including case differences), booleans, objects, and then arrays (indexed and associative). The output:

5 > 7: false
5 < 7: true
5.5 > 7.7: false
5.5 < 7.7: true
a > z: false
a < z: true
A > a: false
A < a: true
true > false: true
true < false: false
{"i":5} > {"i":7}: false
{"i":5} < {"i":7}: true
{"i":5} > {"i":7}: false
{"i":5} < {"i":7}: true
[5] > [7]: false
[5] < [7]: true
{"5":5} > {"7":7}: false
{"5":5} < {"7":7}: false
{"1":1,"2":0,"3":1} > {"1":1,"3":0,"2":1}: true
{"1":1,"2":0,"3":1} < {"1":1,"3":0,"2":1}: true
{"1":1,"2":0,"3":1} > {"1":1,"3":1,"2":0}: false
{"1":1,"2":0,"3":1} < {"1":1,"3":1,"2":0}: false

Things work reasonably sensibly until we get to arrays whose keys differ, or whose keys are in a different order, at which point PHP starts spouting nonsense.

I had a look around for something in the docs saying "don't do this", but what I did find kinda indicates there might indeed be a problem here (at least in the docs, if not in PHP's behaviour).

I've extracted this from "Comparison with Various Types":


Array with fewer members is smaller, if key from operand 1 is not found in operand 2 then arrays are uncomparable, otherwise - compare value by value (see following example)
(obviously that's my emphasis)

But this isn't so. See this example:

$a1 = ["a" => 1, "b" => 0, "c" => 1];
$a2 = ["a" => 1, "c" => 0, "b" => 1];
echo sprintf("%s > %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 > $a2) . PHP_EOL;
echo sprintf("%s < %s: ", json_encode($a1), json_encode($a2)) . booleanAsString($a1 < $a2) . PHP_EOL;

All the keys are present in both: "a", "b", "c", but the comparison doesn't work. I suspect it's a case of poor wording in the docs, and it's down to whether the keys are in the same order too.
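The key-order suspicion can be sketched in code. What follows is a rough model of the comparison, loosely based on the algorithm the docs describe; walkLeftOperand() is a hypothetical name, and the assumption that only the left operand's key order is consulted is inferred from the output above rather than confirmed from PHP's source.

```php
<?php
// Hypothetical model of array comparison: the name and the details
// are assumptions for illustration, not PHP's actual implementation.
function walkLeftOperand(array $left, array $right)
{
    // The array with fewer members is smaller.
    if (count($left) !== count($right)) {
        return count($left) <=> count($right);
    }
    // Walk the LEFT operand in its own key order, looking each key up
    // in the right operand. The right operand's order is never used.
    foreach ($left as $key => $value) {
        if (!array_key_exists($key, $right)) {
            return null; // "uncomparable"
        }
        if ($value != $right[$key]) {
            return $value <=> $right[$key];
        }
    }
    return 0;
}

$a1 = ["a" => 1, "b" => 0, "c" => 1];
$a2 = ["a" => 1, "c" => 0, "b" => 1];

var_dump(walkLeftOperand($a1, $a2)); // int(-1): decided at "b", 0 vs 1
var_dump(walkLeftOperand($a2, $a1)); // int(-1): decided at "c", 0 vs 1

// Sorting both arrays by key first means both walks visit the keys
// in the same order, so the two answers can no longer contradict.
ksort($a1);
ksort($a2);
var_dump($a1 < $a2); // bool(true)
var_dump($a1 > $a2); // bool(false)
```

Each operand reports itself as "smaller" because each walk hits a different key first. Whether normalising with ksort() is an appropriate fix depends on whether key order is meaningful in the data being compared.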

Also I'm not sure that suggesting they're "uncomparable" is helpful wording here. To me that indicates one would expect an exception if one tried. It should say "then the result is unreliable, and the operation should not be used".

I also note there's a big, highlighted warning about comparing floats:



Perhaps the same treatment should be given to array comparisons. If it tripped-up Mr Lisachenko - who seems to be a pretty bloody good & experienced PHP dev - then it's gonna trip-up a lot of people.

Bottom line: I agree that the handling here is "unfortunate", but it is pretty predictable given how PHP tends to handle not-obvious things, so one kinda oughta expect it. Equally it could be remediated by a documentation improvement, and I don't think it's worth anyone putting any effort into "fixing" it.

Righto.

--
Adam