Sunday 5 April 2015

PHP: new to me: the ... "operator"

G'day:
OK, so I'm back in PHPland, and as luck would have it I encountered two new interesting things this week. New to me, anyhow. This is actually the one I found second, but as it's part of the solution for the other one, I'll write this one up first.

Now people who are au fait with PHP need to remember I'm a newbie @ PHP, so this article will be shy of any real insight... it's just me teaching myself stuff.



PHP has an operator "..." (yeah, three dots, like an ellipsis, and kinda meaning the same...). For some reason some of the writing I've read on this refers to it as the "splat" operator, which is a bit daft. The reason why it's called that is that Ruby has an operator which does much the same thing, but uses the * sign, and they call that the splat operator. But that kinda makes sense: it looks like something's gone splat. Well, as much as this looks like Elvis, anyhow: ?: (that is to say: not at all. But people are... odd like that). Anyhow, it's dumb to refer to ... as the splat operator, so I shall not.

Actually it's wrong to call it an operator too, IMO, because it only works in very limited situations, so it's really more like syntactic sugar in the parser, I think. To me, an operator is what relates one or more operands together as part of an expression. However this construct can only be used to indicate behaviour; it cannot be used in an expression. Still, that's splitting hairs.

What's it for? Well it's part of PHP's attempt at implementing variadic functions. CFMLers will probably be wont to say "a what?", because variadic functions aren't special in CFML. That's just how functions work out of the gate in CFML. Well: user-defined ones, anyhow. A variadic function is one that can be defined as taking a variable number of arguments, not simply the list of arguments predefined in the function signature. In CFML one can define what arguments a function can take, but one can also specify them as being optional (and have defaults), one can pass the arguments by name in any order, and one can pass arguments which are in no way specified in the function signature:

function f(required numeric x, SomeClass y, boolean z=true){
    dump(arguments)
}

f(z=false, a="something else", x=1)

And that would output:

[Dump of the arguments scope: x, z and a are present; y is not]

That seems really mundane to a CFML dev, but it's actually pretty special. Things to note here for the non-CFMLers are that the y argument is nowhere to be seen, as it wasn't passed in (which is OK cos it was not marked as required), and the completely additional a argument gets passed into the function no problem. Also note that I'm passing the arguments here by name, so I don't need to worry about the order they're provided. In CFML one can either pass them in order (so: no names), or named. Unfortunately it's one or the other, not a mix of the two; but the need for a mix seldom comes up, so that's not such a problem.

Also note we could have passed the arguments as one struct:

f(argumentCollection={z=false, a="something else", x=1})

(yielding the same output as the example above). Obviously the struct can be done inline like that - which has little merit, really - or can be prepared separately. CFML's pretty flexible there.

Note that CFML (at least in Lucee) also allows for ordered arguments to be passed as an argument collection, using an array. I wrote this up yesterday: "CFML: using an array as an argument collection". ColdFusion allows for mapping positional arguments in a struct with appropriately numeric keys. I cover this in that article too.

Now note that PHP does already allow one to pass arguments which are in addition to the ones specified in the method signature:

// passingUnspecifiedArguments.php

function acceptsAdditionalArgs($arg1, $arg2){
    echo "Defined arguments:\r\n";
    var_dump(get_defined_vars());

    echo "\r\nAll arguments:\r\n";
    var_dump(func_get_args());
}

acceptsAdditionalArgs("arg1 value", "arg2 value", "arg3 value", "arg4 value");

This outputs:

Defined arguments:
array(2) {
  ["arg1"]=>
  string(10) "arg1 value"
  ["arg2"]=>
  string(10) "arg2 value"
}

All arguments:
array(4) {
  [0]=>
  string(10) "arg1 value"
  [1]=>
  string(10) "arg2 value"
  [2]=>
  string(10) "arg3 value"
  [3]=>
  string(10) "arg4 value"
}



So that's cool. The only thing missing is the ability to name these additional arguments, rather than just having them availed by numeric index.

The ... operator goes some way to effecting this argument naming. Well: in that one can specify an additional argument in the function signature that accepts all additional arguments passed to the function as an array. Less than ideal, but quite a common handling of this particular dilemma. And it has merits of its own, I guess. Here's an example:

// functionDefinitions.php

function namedArgsThenParams($arg1, $arg2, ...$params){
    echo "Defined arguments:\r\n";
    var_dump(get_defined_vars());

    echo "\r\nAll arguments:\r\n";
    var_dump(func_get_args());
}


namedArgsThenParams('arg1 value', 'arg2 value', 'arg3 value', 'arg4 value');

Output:

Defined arguments:
array(3) {
  ["arg1"]=>
  string(10) "arg1 value"
  ["arg2"]=>
  string(10) "arg2 value"
  ["params"]=>
  array(2) {
    [0]=>
    string(10) "arg3 value"
    [1]=>
    string(10) "arg4 value"
  }
}

All arguments:
array(4) {
  [0]=>
  string(10) "arg1 value"
  [1]=>
  string(10) "arg2 value"
  [2]=>
  string(10) "arg3 value"
  [3]=>
  string(10) "arg4 value"
}


The difference is that in the third argument, we prefix the argument name with .... This indicates that all additional arguments are placed into that argument name. And indeed this is borne out by the dump there: the third and fourth values are in the $params argument value.

Interesting, looking at the second dump: all the args are still considered to have been passed separately. IE: the third and fourth are discrete arguments, not an array as the third argument. This is quite good flexibility here I reckon.
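
As a minimal sketch of what a function might do with that collected array (joinWith() is a hypothetical name made up for illustration, and this assumes PHP 5.6 or later, where the ... syntax exists):

```php
<?php
// joinWith() is a hypothetical helper, not anything built into PHP.
// $separator is an ordinary named argument; every argument after it
// is collected into the $words array by the ... prefix.
function joinWith($separator, ...$words){
    return implode($separator, $words);
}

echo joinWith('-', 'a', 'b', 'c') . "\r\n"; // a-b-c

// With no additional arguments, $words is simply an empty array
var_dump(joinWith('-'));
```

The point here is just that $words behaves like any other array inside the function, so the usual array functions work on it.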

Note that the ... usage must be in the last argument position. This does not compile:

function paramsThenNamedArgs(...$params, $arg1, $arg2){
}

It really should, and it's a bit leaden of PHP to not support that. Ruby supports it just fine, as I discuss when doing some Ruby tutorials: "Ruby: doing a second tutorial @ codeschool.com".

The other situation in which one can use the ... operator is when passing arguments to a function. This is actually what I found to be bloody useful the other day (this'll be tomorrow or Tuesday's article, probably).

function usingFixedArgSet($arg1, $arg2, $arg3){
    echo "Defined arguments:\r\n";
    var_dump(get_defined_vars());

    echo "\r\nAll arguments:\r\n";
    var_dump(func_get_args());
}

echo "Passing correct number\r\n";
$args = ['arg1 value', 'arg2 value', 'arg3 value'];
usingFixedArgSet(...$args);

Here we are using the ... in the function call. It's used to pass a predefined array of arguments through to a function which has explicitly predefined arguments.

The output is:

Passing correct number
Defined arguments:
array(3) {
  ["arg1"]=>
  string(10) "arg1 value"
  ["arg2"]=>
  string(10) "arg2 value"
  ["arg3"]=>
  string(10) "arg3 value"
}

All arguments:
array(3) {
  [0]=>
  string(10) "arg1 value"
  [1]=>
  string(10) "arg2 value"
  [2]=>
  string(10) "arg3 value"
}


Note that each of the elements of the array has been correctly passed as the appropriate named argument. Cool.
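
The unpacked array need not supply every argument. Here's a small sketch, assuming PHP 5.6 or later and a made-up describe() function, in which the first argument is passed normally and the remaining ones come from an array:

```php
<?php
// describe() is a hypothetical function used only for this illustration.
function describe($arg1, $arg2, $arg3){
    return "$arg1/$arg2/$arg3";
}

$rest = ['two', 'three'];

// 'one' is passed positionally; the elements of $rest fill $arg2 and $arg3
$result = describe('one', ...$rest);

echo $result . "\r\n"; // one/two/three
```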

One can also pass too many arguments this way:

echo "Passing too many\r\n";
$args = ['arg1 value', 'arg2 value', 'arg3 value', 'arg4 value'];
usingFixedArgSet(...$args);

Output:

Passing too many
Defined arguments:
array(3) {
  ["arg1"]=>
  string(10) "arg1 value"
  ["arg2"]=>
  string(10) "arg2 value"
  ["arg3"]=>
  string(10) "arg3 value"
}

All arguments:
array(4) {
  [0]=>
  string(10) "arg1 value"
  [1]=>
  string(10) "arg2 value"
  [2]=>
  string(10) "arg3 value"
  [3]=>
  string(10) "arg4 value"
}


This behaves the same as any situation in which one passes too many arguments to a UDF.

Similarly - and I'll spare you the code and the output - if one passes too few arguments, one gets an error just as if one had used explicit arguments.

This goes a long way towards behaving like an argument collection in CFML, except for one fatal, inconvenient and pretty bloody stupid shortcoming. One can only use a numerically-indexed array for the value passed into the ... operator. One cannot pass an associative array (which would make a lot of sense). This doesn't work:

echo "Passing named args\r\n";
$args = ['arg1'=>'arg1 value', 'arg2'=>'arg2 value', 'arg3'=>'arg3 value'];
usingFixedArgSet(...$args);

That just results in an error:

PHP Catchable fatal error:  Cannot unpack array with string keys in C:\github-scratch\php\www\experiment\operators\ellipsis\passingNamed.php on line 14


It would have been so easy for PHP to take an understanding of this and map key names to arg names, but... no. Yet another example of PHP trying to do something slightly tricky and... getting it wrong.
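
For what it's worth, that key-to-argument mapping can be approximated in userland with reflection. This is only a sketch under assumptions: callWithNamedArgs() and greet() are hypothetical names invented for this example, it only handles plain functions and closures (whatever ReflectionFunction accepts), and it makes no attempt to deal with variadic or by-reference parameters.

```php
<?php
// callWithNamedArgs() is a hypothetical helper, not part of PHP.
// It reorders an associative array to match the function's declared
// parameter order, then calls the function with those values.
function callWithNamedArgs($function, array $namedArgs){
    $reflection = new ReflectionFunction($function);
    $ordered = [];

    foreach ($reflection->getParameters() as $parameter) {
        $name = $parameter->getName();

        if (array_key_exists($name, $namedArgs)) {
            $ordered[] = $namedArgs[$name];
        } elseif ($parameter->isDefaultValueAvailable()) {
            $ordered[] = $parameter->getDefaultValue();
        } else {
            // no matching key and no default: nothing sensible to pass
            throw new InvalidArgumentException("Missing argument: $name");
        }
    }

    return $reflection->invokeArgs($ordered);
}

// greet() is likewise made up for the demonstration
function greet($greeting, $name, $punctuation = '!'){
    return "$greeting, $name$punctuation";
}

// keys are given out of order, and the optional one is omitted
echo callWithNamedArgs('greet', ['name' => 'Adam', 'greeting' => "G'day"]) . "\r\n"; // G'day, Adam!
```

Keys that don't match any parameter name are silently ignored here, which differs from CFML, where an unspecified argument such as a in the earlier example still reaches the function.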

Oh well.

As to why I said this construct was not really an operator: well these are the only two situations I can find wherein it'll work. I cannot come up with a scenario where it works (or could be made to work) in an expression, which - to me - is a fundamental requirement for something to be considered an operator. It's just syntactic sugar in my book. This is not meant to detract from its merit: I'm just quibbling about the terminology. I now expect Sean to correct me on this. Ahem.

All in all I think PHP's implementation of ... is a good first cut: somewhere between a proof-of-concept and a request-for-discussion implementation. However it strikes me as being a bit witlessly/unaspirationally implemented.

The next PHP article will have me putting this into practice, and I'm moderately pleased with the results.

Righto.

--
Adam