Sunday, 21 September 2014

PHP: iterators

G'day:
I'm now at the pub (see "PHP: generators"), and have just started my third Guinness. I've not yet worked out what this does to my writing style, so that could be a good or a bad thing. [shrug]: it's my blog.

Anyway, in the article linked to above, I talked about generators, and in the process I mentioned PHP has the notion of iterators, but didn't really have a close look at what those are in the context of PHP. Obviously we all know what an iterator is, but I figured part of my learning process had better be to get a good handle on PHP's implementation.



Firstly... this is where PHP demonstrates itself as being a bit more of a mature language than CFML is. Yes, I know they "came up" at about the same time, but CFML spent a lot of time going out of its way to dumb itself down beyond all belief (thanks for that, Macromedia and Adobe), before Railo came along and jollied things along a bit. So whilst PHP was progressing like a maturing language should, CFML acted like one of those adolescents who didn't see any point in living beyond the age of 18.

(OK, I just detected the difference in my writing when I am doing it at the pub ;-)

My point being that both CFML and PHP have the notion of interfaces, but PHP is slightly more grown up in that it provides interfaces that one's code can implement so as to better interact with other language features. ColdFusion (and Railo, admittedly) seem to be lacklustre here: they provided interfaces and then went "yeah, but... um... dunno why you'd want that". Interfaces are simply something we'd use in our own code, not something that CFML itself really kind of "got". I've actually mentioned this in the past to Adobe in the context of their custom serialisers, but it never got any distance at all.

PHP provides an Iterator interface: if one implements it on a class, then objects of that class can be used where PHP itself would normally expect something iterable. Let me demonstrate.

The Iterator interface demands five methods be implemented:
  • current(): returns the current element in the collection
  • key(): returns the current key in the collection
  • next(): moves to the next element in the collection
  • rewind(): moves back to the first element in the collection
  • valid(): returns whether there's an element at the current position in the collection (eg: next() could move past the end of the collection).
If one implements Iterator and those methods (and one will get a fatal error if one doesn't implement the methods, having stated the class does implement Iterator), then one can use an object in situations wherein PHP needs an iterable object. Like a foreach() loop (which normally one would give an array).

Here's a demo.

<?php
// Numbers.class.php

class Numbers implements Iterator {
    
    private $numbers;
    private $index;

    public function __construct(){
        $this->numbers = [];
        $this->index = null;
    }

    public function push($english,$maori){
        array_push($this->numbers, ["english"=>$english, "maori"=>$maori]);
        $this->index = sizeof($this->numbers) - 1; //stoopid zero-indexed arrays :-|
        return $this;
    }

    public function current(){
        return $this->numbers[$this->index];
    }

    public function next(){
        $this->index++;
    }

    public function key(){
        return $this->index;
    }

    public function valid(){
        return array_key_exists($this->index, $this->numbers);
    }

    public function rewind(){
        $this->index = 0;
    }

}

NB: I really hate zero-indexed arrays. It's so leaden after using human-natural one-indexed arrays in CFML. And, yes, I know plenty of modern languages get it wrong and use zero-indexed arrays, but they're all wrong too.

Anyway, here I have a class which holds an array of numbers in both English and Maori. I've implemented the most simple versions of current() / key() / next() / rewind() / valid() possible. And here's some basic tests:

<?php
// numbers.php

require "Numbers.class.php";
require "../../debug/dBug.php";

echo "<h3>before populated</h3>";
$numbers = new Numbers();
echo sprintf("valid(): %b<br>", $numbers->valid());

echo "<hr><h3>after adding one</h3>";
$numbers->push("one", "tahi");
echo sprintf("valid(): %b<br>", $numbers->valid());
echo sprintf("key(): %d<br>", $numbers->key());

echo "<hr><h3>after adding 2-4</h3>";
$numbers    -> push("two", "rua")
            -> push("three", "toru")
            -> push("four", "wha");
echo sprintf("valid(): %b<br>", $numbers->valid());
echo sprintf("key(): %d<br>", $numbers->key());
echo "current()";
new dBug($numbers->current());

echo "<hr><h3>after rewind()</h3>";
$numbers->rewind();
echo sprintf("key(): %d<br>", $numbers->key());
echo sprintf("valid(): %b<br>", $numbers->valid());

There are no surprises here, so I won't dwell on it. Just proof of concept stuff. Output:

before populated

valid(): 0



after adding one

valid(): 1
key(): 0


after adding 2-4

valid(): 1
key(): 3
current()
$numbers->current() (array)
english    four
maori      wha

after rewind()

key(): 0
valid(): 1


But here's an example of the actual iteration in action:

echo "<hr><h3>while loop</h3>";
while ($numbers->valid()){
    $number = $numbers->current();
    echo sprintf("English: %s; Maori: %s<br>", $number["english"], $number["maori"]);
    $numbers->next();
}

Output:

while loop

English: one; Maori: tahi
English: two; Maori: rua
English: three; Maori: toru
English: four; Maori: wha


This demonstrates the fact that the iteration works. Here's the good bit though: foreach() usually takes an array - when I'm using it anyhow - but one can iterate over an object as well and it will output the public properties, like this similar situation:

<?php
// Days.class.php

class Days {
    
    public $days;
    public $index;

    public function __construct($days){
        $this->days = $days;
    }

}


<?php
// days.php
require "Days.class.php";

$days = new Days(["Rāhina","Rātū","Rāapa","Rāpare","Rāmere","Rāhoroi","Rātapu"]);

echo "<h3>foreach loop</h3>";
foreach ($days as $key=>$value){
    echo "$key ";
}

Output:

foreach loop

days index


So it does the usual thing of iterating over the public properties.

But because our Numbers class implements Iterator, foreach() knows to use the Iterator methods to iterate over our Numbers object:

echo "<hr><h3>foreach loop</h3>";
foreach ($numbers as $number){
    echo sprintf("English: %s; Maori: %s<br>", $number["english"], $number["maori"]);
}

(The output is the same as the previous iteration loop, so I'll spare you the repetition).
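To make it a bit more obvious what foreach() is doing behind the scenes, here's a minimal sketch that just records which Iterator methods get called, and in what order. CallLogger is a hypothetical class made up for this illustration (it's not part of the demo above), and I'm assuming PHP 8.0 or later so the return types can be declared.

```php
<?php
// CallLogger.php (hypothetical class, for illustration only)

class CallLogger implements Iterator {

    private $items;
    private $index = 0;
    public $calls = []; // names of the Iterator methods, in the order they were called

    public function __construct(array $items){
        $this->items = array_values($items);
    }

    public function current(): mixed {
        $this->calls[] = "current";
        return $this->items[$this->index];
    }

    public function key(): mixed {
        $this->calls[] = "key";
        return $this->index;
    }

    public function next(): void {
        $this->calls[] = "next";
        $this->index++;
    }

    public function rewind(): void {
        $this->calls[] = "rewind";
        $this->index = 0;
    }

    public function valid(): bool {
        $this->calls[] = "valid";
        return array_key_exists($this->index, $this->items);
    }
}

$logger = new CallLogger(["tahi", "rua"]);
foreach ($logger as $key => $value){
    $logger->calls[] = "[body: {$key}=>{$value}]";
}
echo implode(" ", $logger->calls);
// rewind valid current key [body: 0=>tahi] next valid current key [body: 1=>rua] next valid
```

So foreach() is much the same as the hand-rolled while loop earlier, except it calls rewind() first. It only calls key() because the loop asked for the key as well as the value.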

That's cool. And there's a direct analogy back to CFML here. If you give an object to a for/in loop, you'll get much the same as with the Days example above. Not much use to anyone. But in CFML there's no sense of an Iterator interface to implement to change this behaviour. Someone should raise tickets for this sort of thing for Railo and ColdFusion (normally I'd do this, but I need to start moving on, and leave it to people still in the CFML community to chase this stuff up).

I tried to go one step further and see if array_walk() would accept a Numbers object and iterate over it properly:

<?php
// array_walk.php
require "Numbers.class.php";
require "../../debug/dBug.php";

$numbers = new Numbers();
$numbers    -> push("one", "tahi")
            -> push("two", "rua")
            -> push("three", "toru")
            -> push("four", "wha");


array_walk($numbers, function(){
    new dBug(func_get_args());
});


Result:

func_get_args() (array)
0    array
       0    array
              english    one
              maori      tahi
       1    array
              english    two
              maori      rua
       2    array
              english    three
              maori      toru
       3    array
              english    four
              maori      wha
1    Numbersnumbers

func_get_args() (array)
0    3
1    Numbersindex

So this did not work: array_walk() ignored the Iterator methods and just walked the object's two properties (numbers and index) instead. That's fair enough: the function is designed for taking an actual array, after all. There's probably a case for a generic "walk" function though. Equally, this would be another requirement of the Iterator interface if CFML was to implement it: provision for each() (and perhaps go the whole hog: each(), sort(), filter(), map(), reduce() etc... perhaps there's more than one interface in there: Iterable, Sortable, etc).
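As a rough idea of what such a generic "walk" might look like, here's a sketch built on foreach(), which does respect Iterator. The walk() function is hypothetical (it's not a PHP built-in); I've borrowed array_walk()'s callback argument order of value then key, and I'm assuming PHP 7.1 or later for the iterable type hint. To keep the example self-contained it uses PHP's built-in ArrayIterator rather than the Numbers class.

```php
<?php
// walk.php (hypothetical helper, not a built-in PHP function)

// Calls $callback($value, $key) for each element of an array
// or of any object that foreach() can traverse via Iterator
function walk(iterable $collection, callable $callback): void {
    foreach ($collection as $key => $value){
        $callback($value, $key);
    }
}

$seen = [];
$collect = function($maori, $english) use (&$seen){
    $seen[] = "$english: $maori";
};

// works with an Iterator object...
walk(new ArrayIterator(["one"=>"tahi", "two"=>"rua"]), $collect);

// ...and with a plain array
walk(["three"=>"toru"], $collect);

echo implode("; ", $seen);
// one: tahi; two: rua; three: toru
```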

The last thing I checked was to see how serious PHP was about its return types for the interface methods. I don't like the way the while loop needed to work: calling valid() and next() separately. It'd be great if next() returned a boolean or something, to compress the logic a bit. As PHP is less typeful than CFML, I wondered if I could get next() to at least return $this, so I could chain the valid() call. I made a quick variation of the Numbers class, instead implementing Colours:

<?php
// Colours.class.php

class Colours implements Iterator {
    
    private $colours;
    private $index;

    // [__construct() & current() removed for clarity]

    public function next(){
        $this->index++;
        return $this;
    }

    // [key() and valid() removed for clarity]

    public function rewind(){
        $this->index = 0;
        return $this;
    }

}


and the test:

<?php
// colours.php
require "Colours.class.php";

$colours = new Colours(["Whero","Karaka","Kowhai","Kakariki","Kikorangi","Poropango","Papura"]);

echo "<h3>while loop</h3>";
do {
    $colour = $colours->current();
    echo "$colour ";
} while ($colours->next()->valid());


And that worked fine! Output:


while loop

Whero Karaka Kowhai Kakariki Kikorangi Poropango Papura

For completeness I checked if foreach() still worked:

echo "<hr><h3>foreach loop</h3>";
foreach ($colours as $colour){
    echo "$colour ";
}


And it did (same output as above).
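One hedged footnote for anyone trying this on a newer PHP: as far as I'm aware, from PHP 8.1 the Iterator methods carry return types (next() is declared as returning void), and an implementation without compatible return types triggers a deprecation notice unless it's marked with the #[\ReturnTypeWillChange] attribute. Here's a minimal sketch of the chaining trick under that assumption. ChainableList is a hypothetical, complete class written for this example; it is not the Colours class above with its blanks filled in.

```php
<?php
// ChainableList.php (hypothetical class, for illustration only; assumes PHP 8.0+)

class ChainableList implements Iterator {

    private $items;
    private $index = 0;

    public function __construct(array $items){
        $this->items = array_values($items);
    }

    public function current(): mixed {
        return $this->items[$this->index];
    }

    public function key(): mixed {
        return $this->index;
    }

    // Iterator::next() is declared as returning void; the attribute suppresses
    // the deprecation notice so this version can return $this for chaining
    #[\ReturnTypeWillChange]
    public function next(){
        $this->index++;
        return $this;
    }

    public function rewind(): void {
        $this->index = 0;
    }

    public function valid(): bool {
        return array_key_exists($this->index, $this->items);
    }
}

$colours = new ChainableList(["Whero", "Karaka", "Kowhai"]);

// chained next()->valid(), as in the do/while test above
$viaDoWhile = [];
$colours->rewind();
do {
    $viaDoWhile[] = $colours->current();
} while ($colours->next()->valid());

// foreach() simply ignores whatever next() returns
$viaForeach = [];
foreach ($colours as $colour){
    $viaForeach[] = $colour;
}

echo implode(" ", $viaDoWhile); // Whero Karaka Kowhai
```

Note that the do/while form calls current() before anything has checked valid(), so it assumes the collection has at least one element.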

I think that covers all the superficial bases of PHP's handling of the Iterator interface. I'm pleasantly surprised at their implementation here, as it seems to have kept the stupidity to a minimum.

Sláinte

--
Adam
(who is now on his fifth Guinness, and starting to write his next article...)