Wednesday 26 March 2014

Using a sort iteration function to implement a custom ranking

G'day:
It occurred to me last night that - in all my examples of using a callback-based sort on a collection - I am always just using the natural ordering of a field; eg I sort an array of structs on a key in the struct which needs either alphabetical or numeric sorting. EG:

``````numbers = [
{maori="wha", digit=4},
{maori="toru", digit=3},
{maori="rua", digit=2},
{maori="tahi", digit=1}
]

dump(numbers.sort(function(v1,v2){
return sgn(v1.digit - v2.digit)
}))
``````

This is selling the functionality short, somewhat.

Here's an example of using a callback sort to effect sorting on a custom sequence: not numeric or alphabetical, but any custom sequence one might want. I achieve this by using closure to create a custom sort callback, then using the callback in the sort operation:

``````function customRankings(required array options, required string field, boolean throwOnNoMatch=false){
return function(v1,v2){
var v1Place = options.findNoCase(v1[field])
var v2Place = options.findNoCase(v2[field])

if (v1Place && v2Place){
return sgn(v1Place - v2Place)
}
if (throwOnNoMatch){
throw(type="IllegalValueException", message="Incorrect value in sort field", detail="Sort field must contain only #options.toString()#")
}

if (v1Place){
return -1
}

if (v2Place){
return 1
}

if (isNumeric(v1[field]) && isNumeric(v2[field])) {
return sgn(v1[field] - v2[field])
}

return compareNoCase(v1[field], v2[field])
}
}

medalSorter = customRankings(["gold","silver","bronze"], "place")

raceResults = [
{lane=1, name="Angus", place="silver"},
{lane=2, name="Barbara", place="gold"},
{lane=3, name="Colin", place="5th"},
{lane=4, name="Ethel", place="4th"},
{lane=5, name="Faisal", place="bronze"}
]

dump(raceResults.sort(medalSorter))
``````

There are three sections to this:
1. a function which returns a sort callback based on the rankings provided;
2. creating a sorter for the given rankings;
3. sorting data by the sorter.
Looking closer at the function:

``````function customRankings(required array options, required string field, boolean throwOnNoMatch=false){
return function(v1,v2){
var v1Place = options.findNoCase(v1[field])
var v2Place = options.findNoCase(v2[field])

if (v1Place && v2Place){
return sgn(v1Place - v2Place)
}
if (throwOnNoMatch){
throw(type="IllegalValueException", message="Incorrect value in sort field", detail="Sort field must contain only #options.toString()#")
}

if (v1Place){
return -1
}

if (v2Place){
return 1
}

if (isNumeric(v1[field]) && isNumeric(v2[field])) {
return sgn(v1[field] - v2[field])
}

return compareNoCase(v1[field], v2[field])
}
}
``````

It takes up to three arguments:
1. an array of options to sort on. The order in the array represents highest to lowest (eg: ["gold","silver","bronze"]).
2. A field in each structure to use for sorting. This example only works on an array of structs, obviously.
3. Optionally whether to allow any value in that field, or to throw an error if there's a different value.
The logic then is fairly simple:
• find the values in the array of options
• if they're both found, sort on those two as per the normal sorting rules
• if they're not both found, and if we're flagged to raise an error if this happens, do so
• otherwise if either item contains one of the sort options, it is ranked higher than the other value
• finally - so neither value is one of the options - either treat them numerically if possible, otherwise as strings, and sort on that.
Once we've done that, it's just a matter of calling that function to create a custom sorter, then using it.

We use closure in `customRankings()` to bind references to the options and field values when creating `medalSorter()`. Note that we only need to specify those when we call `customRankings()`... they're enclosed within `medalSort()` at that point, so `medalSorter()` "remembers" what the options and field were. Hence it's specifically a `medalSorter()` not simply some generic sorter. So it pretty much binds the input data and the code to process the data together in one. Also note that the function returned also takes two arguments: `v1` and `v2` which are what the callback passed to an array sort will receive.

Let's revisit the actual `sort()` call:

``````raceResults.sort(medalSorter)
``````

`sort()` takes a function. Often we see the function being passed inline as a function expression, eg:

``````numbers.sort(function(v1,v2){
return sgn(v2.digit - v1.digit)
})
``````

But `sort()` doesn't take a function expression, per se, it just takes a function. A function expression returns a function, but we can use an already-existing function if we want. In fact we could also have done without creating the `medalSorter()` function at all, and simply do this:

``````raceResults.sort(customRankings(["gold","silver","bronze"], "place"))
``````

Here the value passed to `sort()` is neither an existing function nor a function expression, but we call a function which returns a function. So `sort()` still receives the function it needs.

One thing that's not immediately clear with my example above is that `customRankings()` can be reused to provide more custom sorters. We can build on that code above:

``````raceResults = [
{lane=1, name="Angus", place="silver"},
{lane=2, name="Barbara", place="gold"},
{lane=3, name="Colin", place="5th"},
{lane=4, name="Ethel", place="4th"},
{lane=5, name="Faisal", place="bronze"}
]
medalSorter = customRankings(["gold","silver","bronze"], "place")
dump(raceResults.sort(medalSorter))

rugbyRankings = [
{country="Australia", ranking="3rd"},
{country="England", ranking="4th"},
{country="New Zealand", ranking="1st"},
{country="South Africa", ranking="2nd"}
]
ordinalSorter = customRankings(["1st","2nd","3rd","4th"], "ranking")
dump(rugbyRankings.sort(ordinalSorter))
``````

So here we're using `customRankings()` a second time to create another custom sorter. The code for `customRankings()` might be a bit long, but once it's there: using it is a simple one liner. Which is pretty cool, I reckon.

Oh, and the output for all this is:

Array
1
Struct
LANE
 number 2
NAME
 string Barbara
PLACE
 string gold
2
Struct
LANE
 number 1
NAME
 string Angus
PLACE
 string silver
3
Struct
LANE
 number 5
NAME
 string Faisal
PLACE
 string bronze
4
Struct
LANE
 number 4
NAME
 string Ethel
PLACE
 string 4th
5
Struct
LANE
 number 3
NAME
 string Colin
PLACE
 string 5th

Array
1
Struct
COUNTRY
 string New Zealand
RANKING
 string 1st
2
Struct
COUNTRY
 string South Africa
RANKING
 string 2nd
3
Struct
COUNTRY
 string Australia
RANKING
 string 3rd
4
Struct
COUNTRY
 string England
RANKING
 string 4th

I hope this goes some way to better demonstrate how cool these iteration functions are. And - for a change - actually using closure with them too.

--