## Saturday, 11 October 2014

### PHP: Inspired by Duncan: palindromes via generators

G'day:
My mate Dunc is making a habit of being my muse @ the moment. I guess it's cos we're both learning PHP at the same time, and he's blogging about it as well. This morning he has a new article following his foray into translating his CFML (and Python occasionally) knowledge into PHP: "Project Euler: problem 36 (PHP)".

The Project Euler exercises are just a series of algorithmic problems which one can work through. I'm not signed up to the site, but I guess one can also submit answers or something. I'm just basically following along as Duncan does the hard yards.

Problem 36 is as follows:

The decimal number, 585 = 10010010012 (binary), is palindromic in both bases.

Find the sum of all numbers, less than one million, which are palindromic in base 10 and base 2.

(Please note that the palindromic number, in either base, may not include leading zeros.)

Duncan's taken a fairly brute force approach: looping to 1000000 and cherry-picking the results that are a) decimal palindromes, and then also b) binary ones. He outputs them, and tallies them up. Hopefully he doesn't mind me reproducing his code:

``````<?php
\$sum = 0;

function isPalindrome(\$x)
{
return \$x == strrev(\$x);
}

for (\$i = 1;\$i <= 999999; \$i++) {
if (isPalindrome(\$i) && isPalindrome(decbin(\$i))) {
echo \$i . " " . decbin(\$i) . "<br>";
\$sum += \$i;
}
}

echo \$sum;
``````

This solution is short, and it works. So that's a tick in the box.

However I have a real problem with loops that actually don't do anything for most of their iterations. For example here, there's only 19 of the iterations we're interested in: the rest we exclude. So 99.99% - literally! (to two decimal places anyhow) - of that loop is a waste of time.

We had a more egregious example of this at work the other day. We had an array of potential matches, and we looped over them to find which one we wanted, then exited. That is not an exercise for a loop. If one is doing a loop, then I feel the code in the loop ought to apply to say 80% (80/20 rule picked arbitrarily here) of the iterations. For the work exercise the answer was to apply a filter to the array to get the one we wanted, then just use it.

Anyway, I had a think about the exercise at hand. We don't need to check all million numbers: we already know that the only numbers that match are palindromic ones. And we know how a palindrome is formed, so we can automate this. I also thought this is a dead sitter for a generator, so I decided to write a palindrome generator.

My approach to creating palindromes is as follows:

• count from zero
• for each number, two palindromes can be inferred:
• simply reflect the number itself, eg: `123 => 123321`
• similar, but pivot around the last digit, eg: `123 => 12321`
• but those two options will not be adjacent to each other in the sequence (eg: `12421` and many others will come between `12321` and `123321`), so whilst the shorter pivoted one could be the next in the sequence, the reflected ones likely won't be, so we store those ones in an overflow array
• we check to see if the current pivoted palindrome is actually greater than any in the overflow array, and if it is, we drain the overflow of any palindromes that are less than the pivoted one
• then we use the pivoted one
Here's a desk check of that logic to demonstrate what I mean. I'm going to start from 10, as it's clearer sooner. In this example the columns are current digit: pivoted palindrome; overflow array:

``` 10: 101; 1001 11: 111; 1001, 1111 12: 121; 1001, 1111, 1221 13: 131; 1001, 1111, 1221, 1331 14: 141; 1001, 1111, 1221, 1331, 1441 15: 151; 1001, 1111, 1221, 1331, 1441, 1551 16: 161; 1001, 1111, 1221, 1331, 1441, 1551, 1661 17: 171; 1001, 1111, 1221, 1331, 1441, 1551, 1661, 1771 18: 181; 1001, 1111, 1221, 1331, 1441, 1551, 1661, 1771, 1881 19: 191; 1001, 1111, 1221, 1331, 1441, 1551, 1661, 1771, 1881, 1991 [...] 99: 999: [lots], 9889, 9999 100: 10001 which is greater than everything in the overflow, so we need to drain those first 101: 10101; 101101 [etc] ```

So here's the code for the generator:

``````// Palindrome.class.php

class Palindrome {
public static function createPalindromeSequence(\$max=-1){
\$n = 0;
\$reserveStack = [];
while (true){
\$shortPalindrome = \$n . substr(strrev(\$n),1);

while (count(\$reserveStack) && \$shortPalindrome > \$reserveStack){
yield array_shift(\$reserveStack);
}
if (\$max != -1 && \$shortPalindrome > \$max) return;
yield \$shortPalindrome;
\$n++;
\$reserveStack[] = \$n . strrev(\$n);
}
}

}
``````

Now we just use that as the source of the loop, and away we go:

``````// euler36.php
\$sum = 0;
foreach(Palindrome::createPalindromeSequence(1000000) as \$x){
\$asBinary = decbin(\$x);
if (Palindrome::isPalindrome(\$asBinary)){
echo \$x . " " . \$asBinary . "<br>";
\$sum += \$x;
}
}
echo \$sum . "<hr>";
``````

BTW, `isPalindrome()` is just this:

``````public static function isPalindrome(\$x){
return \$x == strrev(\$x);
}
``````

And that's fine, that works:

``` 0 01 13 115 1017 1119 100133 10000199 1100011313 100111001585 1001001001717 10110011017447 11101000101119009 1000110011000115351 1110111111011132223 11111011101111139993 100111000011100153235 110011111111001153835 110100100100101173737 10010000000001001585585 10001110111101110001872187```

But I'm not happy with it. The question here is to take the series of palindromes, and derive a sum of them (based on a secondary rule). To me this is a reduction process (or a filter and a sum, but I chose the reduction approach). I like to avoid generic loops where there's a more appropriate kind of iterative process to use. So I refactored this code to use `array_reduce()`:

``````// euler36_array_reduce.php
\$sum = array_reduce(\$decPalindromes, function(\$reduction, \$current){
\$asBinary = decbin(\$current);
if (Palindrome::isPalindrome(\$asBinary)){
echo \$current . " " . \$asBinary . "<br>";
\$reduction += \$current;
}
return \$reduction;
}, 0);
``````

I think that logic is more clear.

But... performance? Dunc's code is far more simple, and that often results in better performance. I don't have much of an idea how generators perform, and I also suspect `array_reduce()` is not the fastest approach to things. I added a test harness around each example and had a loop at the timing:

``````\$start = microtime(true);

// code here

\$end = microtime(true);

printf("Execution time: %dms", (\$end - \$start) * 1000);

``````
Typical performance is as follows:

Duncan's code: 871ms
My code, first version: 17ms
My code with `array_reduce()`: 19ms

(I did a lot of runs, and they were always like that or thereabouts).

OK, so I think Duncan's being penalised for his brute force approach here. Running that condition a million times adds up. I think if the situation was such that there was more like 999981 valid items in the iteration, his approach would make sense. But given there's only utlimately 19... it's too much work to be doing to get the result.

Also it's cool there's not much overhead in using `array_reduce()`, so there's no reason to not use that approach.

Oh, one footnote: Duncan excludes 0 from his version, and I don't: zero is as much a palindrome as digits one, two three, etc are. I think he is being too literal about the "leading zeros" in the instructions. I read that as an indicator that something like "020" is not a valid answer. The 0 in zero is not a "leading zero", IMO. It does not impact the bottom line though.

And another one. I'm not that happy with the logic in my palindrome generator: it's needing to back-up far too many entries in the "overflow" for my liking. I think there must be a better approach here. Lemme know if you come up with a better way!

Is it odd that I found that quite a fun exercise for a Saturday morning, all before I had a coffee, or even got out of bed?

Righto.

--