Thursday, 18 August 2016

Breaking: Groovy and Clojure answers for that array-look-up code quiz

G'day:
Well it's not really that breaking really... what I mean is a coupla other people posted some answers to last week's quiz after I wrote up the results ("Looking at the code from that quiz last week"). I was gonna just append them to the bottom of that earlier article, but hen no-one would see them, and that seemed a bit disrespectful. Also for this blog which is still mostly read by CFML devs, it's some exposure to other languages you might want to take the time to look at, and these are good examples why.

Sean

I was wondering what happened to Sean's Clojure example, but he was off leading his life instead of clearing away the cobwebs from my blog, so didn't notice the quiz initially. But here's his Clojure code (fully annotated for us non-Clojurians):

;; simplest solution to find first match -- note that `filter` returns
;; a lazy chunked sequence so it will not search the entire vector
;; however, the chunks are 32 elements in size so it will search up
;; to 31 elements beyond the first match
(first (filter #(re-find #".+at" %) ["at" "cat" "scat" "scratch"]))

;; Clojure has `some` but it returns the result of applying the predicate
;; not the original element so we need to write a "smarter" predicate:

;; will not work in all cases:
(some #(re-find #".+at" %) ["at" "cat" "scat" "scratch"])
;; this: (some #(re-find #".+at" %) ["scratch" "at" "cat" "scat"])
;; produces this: "scrat" -- oops!

;; will work with extended predicate:
(some #(when (re-find #".+at" %) %) ["at" "cat" "scat" "scratch"])

;; or we can use reduce with an early return -- the `reduced` value:
(reduce (fn [_ s] (when (re-find #".+at" s) (reduced s))) nil ["at" "cat" "scat" "scratch"])

;; a note about notation: #(.. % ..) is shorthand for (fn [x] (.. x ..))
;; i.e., an anonymous function with one argument

That looks like a bunch of code, but it's also four examples:

(first (filter #(re-find #".+at" %) ["at" "cat" "scat" "scratch"]))

(some #(re-find #".+at" %) ["at" "cat" "scat" "scratch"])

(some #(when (re-find #".+at" %) %) ["at" "cat" "scat" "scratch"])

(reduce (fn [_ s] (when (re-find #".+at" s) (reduced s))) nil ["at" "cat" "scat" "scratch"])


Now I've only had the most superficial look at Clojure, but even I can just read what's going on in that code. So that's cool. I've been off my game recently with my out-of-hours tech stuff - in case you hadn't noticed - and I really want to finish finding my motivation to get back to it, and look at more Clojure. I think it's a good thing to look at for a perennial CFMLer or PHPer as its quite the paradigm shift, but still seems pretty easy to get at least a superficial handle on, and then work from there.

Tony

Tony's done a Groovy example. Every time I see Groovy, it just seems cool. Check this out:

print( ['a', 'at', 'cat', 'scat', 'catch'].find { it ==~ '.+at' } )

That's it. Done. 67 characters, most of it data. 25 characters of actually "doing stuff", including more whitespace than I'd usually use for this sort of thing. Doesn't that make you want to use Groovy?

Anyway, that's that. I just wanted to share that code with y'all.

Righto.

--
Adam

Tuesday, 16 August 2016

Looking at the code from that quiz last week

G'day:
So last Fri I asked this:

[...] here's a code quiz.

Rules:
  • You have an array of strings, eg: ['a', 'at', 'cat', 'scat', 'catch'].
  • Return the first value that matches a regex pattern, eg: '.+at' would match cat, scat, catch; but we want cat returned.
  • Do not use any looping statements (eg: do/for/while etc).
  • Bear in mind functions are not statements ;-)
  • The array could be very long, the strings could be very long, and the pattern could be complex. But the desired value could be early in the array.
  • Use any language.

I didn't have much of a reason for asking. I had to do something similar at work, except the match criteria were more complicated, and I tried a few ways and didn't like any of them. When processing a "plural" data structure - and array or a struct or some other sort of collection - I like to avoid generic looping statements if I can, as I don't think they're terribly declarative. If one needs to somehow transform a collection into something else, I prefer to use collection-iteration methods or functions like map, reduce, filter etc. I initially did this sort of thing (JavaScript):

var words = ["a", "at", "cat", "scat", "catch"];

var match = words.reduce(function(match, word){
    if (match !== undefined) return match;
    if (word.match(/.+at/)){
        return word;
    }
}, undefined);

console.log(match);

That's all good in theory. If one is taking a collection and the data processing of it returns a single (different) value, then reduce makes sense. The problem is that reduce iterates over the entire collection whether one needs to or not, so after I've found "cat" I'm just wasting effort looking at "scat" and "catch". Now I'm not one to worry about wasting cycles and that sort of micro-optimisation, but it still didn't sit well with me.

So next I considered using the "early exit" iteration method, some. I can stop that when I've finished. The problem is that some returns a boolean. And I needed a cat. But I could solve that with some closure:

var match;
words.some(function(word){
    if (word.match(/.+at/)){
        return match = word;
    }
});
console.log(match);

Note that return statement contains an assignment, and that evaluates to true, thus exiting the collection iteration (some ends as soon as the callback returns true).

That's all well and good, except I actually needed this done in PHP and it doesn't have a some function.

In the end I just used a foreach loop and was done with it:

$words = ["a", "at", "cat", "scat", "catch"];

$match = null;
foreach($words as $word) {
    if (preg_match("/.+at/", $word) === 1) {
        $match = $word;
        break;
    }
}

echo $match;

Still: all this got me thinking it was an interesting exercise. Well: kinda. So posted the quiz to see what people came up with.

I added the constraint of no looping statements to push people towards using something more interesting. It was an artificial constraint.

Anyway, enough about me. Let's have a look at other people's code.

Isaiah

First up was Isaiah with a Ruby answer. This is short and sweet:

puts %w(a at cat scat catch).detect{|w| w =~ /.+at/ }

(I added the puts so I could check it output something)

detect is exactly what I needed for this. As per the detect docs:

Passes each entry in enum to block. Returns the first for which block is not false.
So like a combination of some and reduce, really. Cool. This answer is gonna be hard to beat (NB: I am only looking at the answers for the first time now!).

Brad

Next was Brad with a CFML answer:

echo( ['a', 'at', 'cat', 'scat', 'catch'].filter( function( i ) { return reFind( '.+at', i ) } ).first() )

This one only runs on Lucee. This slight revision works on CF2016 as well:

words = ['a', 'at', 'cat', 'scat', 'catch'];
writeOutput( words.filter( function( i ) { return reFind( '.+at', i ); } )[1] );

Brad's answer falls into the same trap my reduce version did: it keeps iterating after it could stop.

I do like how terse Brad's answer is here. Although it's borderline (borderline) unreadable. It demonstrates CFML can do some good stuff though.

Jesse

Jesse's JavaScript solution is next, once again going for the terseness prize:

console.log(['a', 'at', 'cat', 'scat', 'catch'].find((e)=>/.+at/.test(e)));

(again... the console.log is my addition).

I didn't know about the find method I have to admit, so that's cool. This is the equivalent of Isaiah's Ruby answer.

Bonus points for using an arrow function there!

Tyler

Tyler has done another JavaScript solution, this time using filter:

function getFirstArrayRegexMatch(array, regex){
  return (
    array.filter(function(element){
      if(element.match(regex)){ return element; }
    })[0]
  );
}

var a = ["a","at","cat","scat","catch"]
,   r = /.+at/;

document.write(getFirstArrayRegexMatch(a, r));

Bonus point for putting it in a function.

Ryan

Ryan's given me two JavaScript answers: one short-hand, one long-hand with a bunch of testing.

var firstMatch = (input, pattern) => input.find(item => item.match(pattern));

console.log(firstMatch(['a', 'at', 'cat', 'scat', 'catch'], '.+at'));

Similar to Jesse's answer.

function f (input, pattern) {
    var reverse = str => str.split('').reverse().join('');
    var reversedIndex = (l, i) => l - (i + 1);
    
    var j = '\n' + input.join('\n') + '\n',
        i = j.search(pattern);

    if (i == -1) return undefined;

    var end = j.indexOf('\n', i);
    var start = reversedIndex(j.length, reverse(j).indexOf('\n', reversedIndex(j.length, end) + 1));
    return j.substr(start + 1, end - start - 1);
}

console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.+at'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.+cat'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.t'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], 'a'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.+xat'));

This shows that Ryan paid attention to the requirements. I specifically said the answer might be late in the array, and he's catering for that here.

Plus tests! Including unhappy path tests!

Ryan knows how to get bonus points from me, eh?

Choop

Choop's used CFML. It's interesting how most of my readership is still made up of CFMLers, but most of them here have chosen not to use it. Well: a bit interesting.

/* lucee or adobe CF 11+  */
mystrings = ['a', 'at', 'cat', 'scat', 'catch'];
mytest = '.+at';
function firstMatch( strings, regex ) {
    var test = arguments.regex;
    var passing = arguments.strings.filter( function ( check ) {
        var matches = REMatch( test, check );
        return matches.Len() > 0;
    });
    return passing[ 1 ];
}
WriteOutput( firstMatch( mystrings, mytest ) );

Choop's eschewed terseness in favour of writing good clear code.

Mingo

Next up is Mingo's answer. I clearly spoke too soon before as he's stuck with CFML too. He's taken the reduce route, similar to my approach:

  arrayOfStrings = [ 'a', 'at', 'cat', 'scat', 'catch' ];
  regex = '.+at';
  writeOutput( arrayOfStrings.reduce( function( result='', item ){ return len( result ) ? result : item.reFind( regex ) ? item : ''; } ) );

Adam Tuttle made a knowing comment against this one:

I thought about nested ternaries too but figured Adam would chastise me for it. ;)
And yer bloody right! ;-)

I'm all good for a single ternary expression, but as soon as there's more than one... I think everyone's getting confused pretty quickly?

Adam

And indeed Adam's own JavaScript entry is up next. Props for using ES2015 constructs. I especially like how it can now do interpolated strings. Like CFML has been doing for about 20yrs.

"use strict"

function firstMatch ( pattern, data ) {
  return data.reduce( ( prev, curr ) => {
      if ( prev.length > 0 ) { return prev }
      if ( pattern.test( curr ) ) { return curr }
      return prev
  }, '')
}

const pattern = /.+at/
let data = ['a', 'at', 'cat', 'scat', 'catch']

console.log( `the answer is: ${firstMatch( pattern, data )}` )

His logic is very similar to my own reduce example too. (Oh, I needed to add "use strict" to that code to get it to run on my node install. Not sure why... it might be old).

As I am writing this up, I have noticed two entries stuck in my moderation queue. Sorry about that.

Eric

Here's Eric's CFML answer:

    function first( arr, predicate, defaultValue ) {
        if ( arrayIsEmpty( arr ) ) {
            if ( ! isNull( defaultValue ) ) {
                return defaultValue;
            }

            throw( "Cannot return the result because the array is either empty or no value matched the predicate with no default value provided." );
        }

        if ( isNull( predicate ) ) {
            return arr[ 1 ];
        } else {
            arguments.arr = arr.filter( predicate );
            structDelete( arguments, "predicate" );
            return first( argumentCollection = arguments );
        }
    }
    
    answer = first( ['a', 'at', 'cat', 'scat', 'catch'], function( str ) {
        return reFind( '.+at', str );
    } );
    
    writeOutput( answer );

Good to see some validation going on in there. I also really like how his own function takes a callback to do the check for what constitutes a match. Bloody nice that.

Quan Tran

Coincidentally the most interesting answer is the last one (it would not have been last had it not got stuck in the moderation queue, that said). Here's Quan Tran's recursive CFML answer:

// http://blog.adamcameron.me/2016/08/code-quiz.html
function regexSearchArray(regexString,arrStrings){
    var localArrStrings = arrStrings;
    
    if (not arrayLen(localArrStrings))
        return;
    else if (refind(regexString, localArrStrings[1]))
        return localArrStrings[1];
    else{
        ArrayDeleteAt(localArrStrings,1);
        return regexSearchArray(regexString,localArrStrings);
    }
}
arrStrings = ['a', 'at', 'cat', 'scat', 'catch'];
writeoutput(regexSearchArray('.+at',arrStrings));

That is a very inventive way of getting around my "no loop statements" rule. Very impressed.



That's quite a few entries for a wee quiz on this blog (the readership of which has pretty much died since I moved from CFML, even allowing for the distinct lack of content recently). So thanks for that.

I think all these answers had merit and discussion points and had something interesting about them. I like how terse Jesse and others managed to get their answers. I liked how Isaiah used Ruby instead of the usual suspects (for this blog, anyhow... not much Ruby going on around here). I especially like how Ryan provided more tests.

But the winner is Quan Tran with his recursive solution. It might not be the most performant, but it's def the most interesting.

Cheers all. I have a few other dumb-arse quiz questions I might continue to ask. We'll see.

Righto.

--
Adam

Friday, 12 August 2016

Code quiz

G'day:
Apropos of nothing, here's a code quiz.

Rules:
  • You have an array of strings, eg: ['a', 'at', 'cat', 'scat', 'catch'].
  • Return the first value that matches a regex pattern, eg: '.+at' would match cat, scat, catch; but we want cat returned.
  • Do not use any looping statements (eg: do/for/while etc).
  • Bear in mind functions are not statements ;-)
  • The array could be very long, the strings could be very long, and the pattern could be complex. But the desired value could be early in the array.
  • Use any language.

The prize:
Nothing really. Well: nothing at all.

Put yer answer in a Gist (or similar) and the URL to the Gist in a comment. IE: do not post code in a comment.

Righto.

--
Adam

Thursday, 28 July 2016

PHP: I finally get around to seeing how PHP interacts with a DB

G'day:
This is "interesting" in that I've been a PHP dev for closing in on two years now, but as yet I've not had a requirement to connect to a DB and... like... query stuff. Crazy! This is all down to everything we consume coming from web services rather than a DB. We had other teams to do the DB stuff behind the web services. And now I find myself actually on one of those teams. So I better learn how to do my job.

A word of warning on this one. It's gonna be one of those "just me pootling about testing how stuff works", and not really offering much insight on things given this is my first "go" at any of this stuff. It's more a (delayed) stream of consciousness of my experimentation. Delayed cos I wrote the code last night, but am only getting a chance to write it up now.

OK, so first things first, one needs to enable a DB driver... and they're not enabled by default. This is in php.ini:

;extension=php_bz2.dll
extension=php_curl.dll
;extension=php_fileinfo.dll
;extension=php_gd2.dll
;extension=php_gettext.dll
;extension=php_gmp.dll
extension=php_intl.dll
;extension=php_imap.dll
;extension=php_interbase.dll
;extension=php_ldap.dll
extension=php_mbstring.dll
;extension=php_exif.dll      ; Must be after mbstring as it depends on it
;extension=php_mysqli.dll
;extension=php_oci8_12c.dll  ; Use with Oracle Database 12c Instant Client
extension=php_openssl.dll
;extension=php_pdo_firebird.dll
;extension=php_pdo_mysql.dll ; uncomment this
;extension=php_pdo_oci.dll
;extension=php_pdo_odbc.dll
;extension=php_pdo_pgsql.dll
;extension=php_pdo_sqlite.dll
;extension=php_pgsql.dll
;extension=php_shmop.dll



The second thing I needed to do was to give php.ini a fully-qualified path to my extensions directory:

; Directory in which the loadable extensions (modules) reside.
; http://php.net/extension-dir
; extension_dir = "./"
; On windows:
;extension_dir = "ext"
extension_dir = "C:\apps\php\7\ext"


Sometimes I need to do this on a PHP install on Windows; sometimes not. And it doesn't even seem to be uniform on a given machine! Last night I needed to give this machine the full path to the ext dir; I reversed this this evening whilst writing this, and it didn't give me problems just being "ext".

Anyway, if you get this error:

Fatal error: Uncaught PDOException: could not find driver

Then it's the driver not being enabled, or PHP not being able to find the extension.

Right. So some code.

First things first, to connect to a DB one needs a connection to do so. I've wrapped this up in a class for my purposes, to get it out of the way of the rest of my code:

class ConnectionFactory {

    private static $host = 'localhost';
    private static $port = '3306';
    private static $dbName = 'scratch';

    public static function createConnection() {
        $connectionString = sprintf('mysql:host=%s;port=%s;dbname=%s', self::$host, self::$port, self::$dbName);
        $dbConnection = new \PDO($connectionString, Credentials::$login, Credentials::$password);

        return $dbConnection;
    }
}

And to get my connection:

$dbConnection = ConnectionFactory::createConnection();

Oh yeah, my credentials are hidden away in another class still:

class Credentials {

    public static $login = 'scratch';
    public static $password = 'scratch';
}

I def don't want to be sharing that secret information around the place.

So the connection could easily be created with one line of code, inline:

$dbConnection = new \PDO("mysql:host=localhost;port=3306;dbname=scratch", "scratch", "scratch");

Phew. One way or another we got there.

OK, so to run a query from there is pretty easy:

$numbers = $dbConnection->query('SELECT * FROM numbers');

And from there I can do stuff with my numbers:

foreach ($numbers as $row) {
    printf('ID: %s: English: %s, Maori: %s%s', $row['id'], $row['en'], $row['mi'], PHP_EOL);
}

This outputs:

ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha
ID: 5: English: five, Maori: rima
ID: 6: English: six, Maori: ono
ID: 7: English: seven, Maori: whitu
ID: 8: English: eight, Maori: waru
ID: 9: English: nine, Maori: iwa
ID: 10: English: ten, Maori: tekau

It doesn't really get much easier than that.

And I'm sorry to my old CFMLer readers who still hang on to <cfquery> being a thing... no. Creating a connection and just running a query in two statements is easier than horsing around with admin config and tags and shit like that. And the code is nicer (a rare occasion of me saying something complimentary about PHP's "mise-en-scène").

That's a very overly simplistic example though. What about doing some filtering via a parameter in a WHERE clause?

The query method seems to just be very quick way of passing a static string to the DB and having it processed and returned. That's about it. To pass params, we need to do slightly more work. We need to prepare a statement and execute it:

$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->execute(['upperThreshold' => $argv[1]]);

$numbers = $preparedStatement->fetchAll();

(note that one doesn't need to fetchall: one can just get the result row-at-a-time with fetch, but that's something for another day).

And this runs thus, when I give it a "4" (using the same "view" code as I did last time):


>php C:\src\preparedStatement.php 4
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha

Process finished with exit code 0

That was easy.

Aside:

I thought passing data values in the SQL statement was a failing of CFML developers. I mean this sort of thing:

<cfquery>
    SELECT *
    FROM table
    WHERE someColumn = '#dataValue#'
</cfquery>

Unfortunately I am seeing this shit in PHP code too. The devs are saying "but we sanitise the value first so there's no SQLi risk". Completely missing the point that the SQL statement and its parameters are supposed to be kept separate so the statement can be compiled.

Sigh.


As well as just passing the parameter values to the execute call, one can also bind values or variables to a parameter. Binding a value isn't that interesting:

$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->bindValue(':upperThreshold', $upperThreshold, PDO::PARAM_INT);
$preparedStatement->execute();

The added bonus here is one can enforce a type on the value being bound too.

Binding a variable is more interesting. Check this out:
$upperThreshold = 1;

$dbConnection = ConnectionFactory::createConnection();
$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->bindParam(':upperThreshold', $upperThreshold, PDO::PARAM_INT);

$upperThreshold = 10;

$preparedStatement->execute();
$numbers = $preparedStatement->fetchAll();

Here I'm binding the variable $upperThreshold to the param. But it's binding a reference so by the time we run this query, the bound value is 10, not the 1 that it was when the bind was first made.

It occurs to me now (as I type) that I've not tested binding a not-yet-declared variable. Let's try a variation:
$preparedStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= :upperThreshold');
$preparedStatement->bindParam(':upperThreshold', $upperThreshold, PDO::PARAM_INT);

$upperThreshold = 4;

$preparedStatement->execute();

Here I bind the $upperThreshold variable before I even declare it. And this all works fine!

>php.exe C:\src\preparedStatementBindParamWithNoVariable.php
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha

Process finished with exit code 0

So that's quite cool. Kinda.

The next thing I looked at was a simple transaction proof of concept:

$paramArray = ['id' => $argv[1]];

$dbConnection = ConnectionFactory::createConnection();

$dbConnection->beginTransaction();
$deleteStatement = $dbConnection->prepare('DELETE FROM numbers WHERE id=:id');
$deleteStatement->execute($paramArray);

$selectStatement = $dbConnection->prepare('SELECT * FROM numbers WHERE id <= (:id + 1)');
$selectStatement->execute($paramArray);
$numbers = $selectStatement->fetchAll();

echo 'In transaction:' . PHP_EOL;
include __DIR__ . '/../view/numbers.php';

$dbConnection->rollBack();

$selectStatement->execute($paramArray);
$numbers = $selectStatement->fetchAll();

echo 'After rollback:' . PHP_EOL;
include __DIR__ . '/../view/numbers.php';

Here we do this:

  • start a transaction;
  • delete a record;
  • run a query which shows the deletion having taken place;
  • rollback the transation;
  • perform the same query again, and the deletion has been rolled back.

Easy. I'd rather see some capability to explicitly hold onto the transaction until it was released. I don't think the act of rolling back should intrinsically mark the resolution of the transaction? One should be able to perform actions and rollback as often as one wants in one single transaction, shouldn't one? It's not something I've ever had to try, that said.

Oh, and the output:


>php.exe C:\src\transaction.php 4
In transaction:
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 5: English: five, Maori: rima
After rollback:
ID: 1: English: one, Maori: tahi
ID: 2: English: two, Maori: rua
ID: 3: English: three, Maori: toru
ID: 4: English: four, Maori: wha
ID: 5: English: five, Maori: rima

Process finished with exit code 0

Excellent.

Lastly I had to remind myself how to create a stored proc in MySQL, and I managed a pretty stupid one, but one which works for my example:

delimiter $$

CREATE DEFINER=`root`@`localhost` PROCEDURE `getThingsById`(id INT)
BEGIN
  SELECT * FROM numbers n WHERE n.id = id;
  SELECT * FROM colours c WHERE c.id = id;
  SELECT * FROM days d WHERE d.id = id;
END$$

This returns three record sets, each containing the number, colour or day of the week of the passed-in ID. It's nonsense, I know.

To call a proc, one still uses a prepared statement:

$id = $argv[1];

$dbConnection = ConnectionFactory::createConnection();

$preparedStatement = $dbConnection->prepare('call getThingsById(:id)');
$preparedStatement->execute(['id' => $id]);

$resultSet = $preparedStatement->fetchAll();
include __DIR__ . '/../view/general.php';

$preparedStatement->nextRowset();

$resultSet = $preparedStatement->fetchAll();
include __DIR__ . '/../view/general.php';

$preparedStatement->nextRowset();

$resultSet = $preparedStatement->fetchAll();
include __DIR__ . '/../view/general.php';

general.php:

foreach ($resultSet as $row) {
    printf('ID: %s: English: %s, Maori: %s%s', $row['id'], $row['en'], $row['mi'], PHP_EOL);
}


This is all pretty familiar, except the thing to note is the nextRowset call, which moves between the recordsets returned by the proc call.

Oh yeah... I keep forgetting... the output:


>php.exe C:\src\php\doctrine.local\scripts\callProc.php 3
ID: 3: English: three, Maori: toru
ID: 3: English: yellow, Maori: kowhai
ID: 3: English: Wednesday, Maori: Rāapa


That's all I've looked at so far. All in all it all seems uncharacteristically sensibtle for PHP, I must say.

And now I've been sitting in this hotel bar typing for the best (?) part of four hours. Time to focus on finishing this beer and going to bed I think.

Righto.

--
Adam

Wednesday, 27 July 2016

PHP / Silex / Dependency injection: should I use a reference to a method?

G'day:
This is another example of where I want to put a question out there, but it needs more space that Twitter will allow. I guess I'll put it on Stack Overflow too.

We use Silex and Pimple's DI container. Generally in our service providers we expose references to objects, eg:

$app['service.something'] = $app->share(function ($app) {
    return new SomethingService();
});

Then usage of that is predictable:

$result = $app['service.something']::someMethod(1, 2, 3);

But here's the thing. As with the case above, the method is actually a static one. So it seems "odd" to be calling it on an instance of the SomethingService, rather than on the class. As coincidence would have it, this is the only public method in SomethingService too.

So we've done this sort of thing in our service provider:

$app['something'] = $app->protect(function ($arg1, $arg2, $etc) {
    return SomeClass::someMethod($arg1, $arg2, $etc);
});

Usage:

$result = $app['something'](1, 2, 3);

Now this works OK, but I have a coupla hesitations:
  • the name $app['something'] is a bit noun-y. As it's a method it ought to be verb-y IMO, eg: $app['doTheThing']. I guess that's small fish.
  • I'm just not sure if it's particularly "semantic" to be exposing just methods like that. All usage of Pimple I've seen has been to expose dependent objects.
I think we'll run with what we've got, but I kinda want to get more thoughts on if this is the right approach. And if so (or if not) why not. There might be some gotchas here I'm not thinking of, or something.

Cheers.

--
Adam

Monday, 25 July 2016

PHP: re-call the same method for the same result, or use an intermediary variable?

G'day:
I could not fit this in a Twitter message, so I'll clutter up this joint instead. And, hey, it could use some content, I know.

I recently came across some PHP code (this happens to me a lot these day), which was along these lines:

$someObj->someMethod()['someKey']->doSomething();
$someObj->someMethod()['someOtherKey']->doSomethingElse();
$someObj->someMethod()['anotherKey']->performAnotherAction();
$someObj->someMethod()['lastKey']->doLastThing();

Obviously I've changed the names to protect the innocentterloper, but this is pretty much what's going on in the code. On four consecutive statements the same method is being called four times, returning the same result, which is then used for further processing.

If this was my code, I'd see no point in repeating myself all over again, instead just ask the question once, and reuse the answer:

$descriptiveName = $someObj->someMethod();
$descriptiveName['someKey']->doSomething();
$descriptiveName['someOtherKey']->doSomethingElse();
$descriptiveName['anotherKey']->performAnotherAction();
$descriptiveName['lastKey']->doLastThing();

To me calling a method is work, and I'd like to avoid work if possible. Just as my colleagues.

However this is not the only occurrence of the former approach I've seen, so perhaps I am missing something? Is there some benefit to desirable behaviour from calling the same method for the same answer repeatedly?

Note I'm not so fussed about performance in situations like this, although I would have expect calling a method multiple times would have an inconsequential performance hit compared to an intermediary variable. Is this not the case? But like I said... whether or not it might be is not really a consideration here. It just seems like doing work for the sake of it?

Thoughts?

For the sake of conversation and broadening everyone's knowledge... in your own language of choice if it's not PHP: what would your answer to this be?

--
Adam

Monday, 18 July 2016

REST & nouns & verbs: analysing the right problem

G'day:
I've recently changed teams at work, and have shifted from "safe" code bases that I've either been working on daily for over a year, or new applications that I've had a hand in from the ground up. Now I'm doing maintenance and adding features to existing code-bases written by different teams for different requirements, developed to different coding guidelines. So that's all a bit of a rude awakening that I am still settling into. I'm definitely out of my comfort-zone, and having to ask different questions than I usually might. That said: it's a good move and I kinda like that sort of thing, so cool!

For better of for worse, we use a lot of REST web services even for our internal processes. I personally believe this is architecturally questionable, but as no-one thought to ask me the question, I didn't get to partake in that particular discussion. To be fair to the situation, it was a decision made before I was part of the department; and - equally - I'm not entirely convinced of my position that it's a daft idea. Just mostly convinced ;-) Anyway, as the over-used truism declares: we are where we are.

Anyway, I find myself doing some maintenance work on one of our web services. This particular service handles the back-end processing for our web applications which make image file uploads. As part of the file upload we need to do a few things:


  • upload the file and stick it in the appropriate place;
  • store some user-provided data about the file (name, description, etc);
  • create a few different resizings of the image for various purposes;
  • contrive some metadata on all the file variations based on [stuff].
  • distribute all the files across our CDN.


The first two of those steps are fast and can be done in real time. The file operations (resizing and distribution) take some time, and obviously the metadata extraction cannot be peformed until the files actually exist.

We have these split into two processes: one that always occurs immediately that uploads the master file and stores it and its user-enter data; then a second process which handles all the slow stuff.  This is just so we can provide fast feedback to the UI. Note that all this is still an atomic operation: both processes need to run.

On the first iteration of the implementation of this the two processes were fired off by the UI. Basically the upload form was submitted and the main request would handle the upload and the data-write, and then a second request was fired off by AJAX. This is not a robust solution as it isn't atomic. It's possible that the first call can be made but the second one doesn't for some reason. From an architectural point of view it's just not up to the view layer to make this call either; the versioning and metadata processing is an adjunct to the main file-handling process on the server: it's more the second half of the same process, not a separate process in its own right. Realistically the two are only separated at all because of performance considerations. From the perspective of the consumer of the web service: it should be one call to the API.

This was driven home to us by a second web app needing to do the same operations with the same web service, and the bods on that team knew about the first call but not the second.

So my task was to remediate this.

The solution we adopted was to have the front-end web app only call the one process, POSTing to http://api.example.com/file, passing along the file-upload form information as the POST body: this is what the UI-driven process was already doing. I simply augmented that to put an async job into our queuing system. The queue job receives the ID of the created file, and then it simply makes a second REST call, also POSTing to another end point: http://api.example.com/process. All good.

A first observation here is that "process" really isn't a great name for a REST end point. An end point should have meaning unto itself, and "process" is meaningless unless one knows it's tightly coupled to an initial end point relating to a /file: "Ah: so it's processing the file". One shouldn't need this sort of context when it comes to a web service end point. Sadly this end point is now out in the wild. Well: here "the wild" is just internal apps, but we can't simply change the URL to be something better without scheduling updates to those apps too. I'll be back-logging work to get that sorted out.

Part of my work here was to boyscout-rule the controllers of the "process" end point to move business logic out of the controller (yeah, don't ask), and into a service tier, which called a repository tier etc.

Some of the business logic I moved was to extract the decision as to which request parameter to use for the ID. Apparently we have some legacy code somewhere which passes the file ID as another argument name - I do not have the code in front of me right now so I can't remember what it is - so there was some logic to determine whether to use the legacy argument or the new one. Arguably this is perhaps still the job of the controller - it's dealing with stuff from the request object, which I can sell to myself as being controlller logic - but we're pretty stringent that we have as little logic in the controller as possible. And as I already needed a service method to do "Other Stuff", so I pushed this back into helper in the service too: something like extractDistributionCriteria(array $allCriteria). I don't want to be passing the entire request object into the service, so I extracted the POST params and passed those, using Silex's $request->request->all() method, and that returns the POST params as an array. Other than that, I was just shuffling logic around, so everything should be grand.