Sunday, 9 August 2015

CFML: an exercise in converting a nested set tree into an adjacency list tree

G'day:
One of the bods on the CFML Slack channel - who coincidentally is a fellow Kiwi - Marty Pine got me interested in this whilst we were chatting about some code last night. I tried to get my brain around the problem at the time, but I was 3-4 pints into a pub session at the time, and it was a bit too complex. I dunno how much use I was to him solving his problem.

However this got me thinking today, during my fortnightly "four hours to kill @ Shannon Airport" stints. As an exercise for myself, I decided to knock together some code to convert a nested set tree into an adjacency list tree.

I couldn't quite solve this one in my head... it requires recursion and my brain doesn't recurse very well (just "curse": f*ckin' fine. Just not recurse). So I've been trying out various versions of my solution for a coupla hours now, but I've finally decided I can't take any more code out, and I've only got one wee niggle with my solution.

So here's the data:

tree = [
    {id=1, left=1, right=28},
        {id=2, left=2, right=17},
            {id=3, left=3, right=10},
                {id=4, left=4, right=5},
                {id=5, left=6, right=7},
                {id=6, left=8, right=9},
            {id=7, left=11, right=16},
                {id=8, left=12, right=13},
                {id=9, left=14, right=15},
        {id=10, left=18, right=27},
            {id=11, left=19, right=22},
                {id=12, left=20, right=21},
            {id=13, left=23, right=26},
                {id=14, left=24, right=25}
];

I've just indented it like that to make the hierarchy clearer.



The problem with a vanilla nested set is that whilst it's dead easy to work out a node's full ancestory or its full descendant subtree, but it's not so easy to just find a node's parent or immediate children. On the other hand, an adjacency list is excellent at finding what's adjacent (oddly enough), so the parent and children of a node; but it's not so excellent at finding the hierarchy in either direction, without using recursion. To this end, I usually use a hybrid approach: have a nested set, but as well as the left/right, I also maintain a parent. But... anyhow.

So the deal is here I want to convert this to a nested structure that is based around the parent/child relationship. Well: a parent node knows its children in this case.

The rule I'm using here is that a node's children (if any) are the nodes that "stretch" across the left/right gap of the parent. So the first child has a child.left of parent.left+1; the next immediate child has a nextChild.left which is previousChild.right+1 (if that makes sense... perhaps just look at the data). And for each child... I get its children via exact the same mechanism (calling it recursively). And if I then just "remember" everything at the right time as I traverse the tree... I'll get a nice recursive data structure.

Leveraging CFML's iteration functions, this is surprisingly easy:

function convertNestedSetToAdjacencyList(tree, node=tree[1]){
    var nextChildLeft = node.left + 1;
    return {
        id = node.id,
        children = tree.filter(function(node){
            if (node.left != nextChildLeft){
                return false;
            }
            nextChildLeft = node.right + 1;
            return true;
        }).map(function(child){
            return convertNestedSetToAdjacencyList(tree,child);
        })
    };
}

That's a total of two statements in the function:

  1. setting nextChildLeft;
  2. returning the result.

Obviously the "returning the result" bit has a coupla iteration method calls, but they're still pretty basic!

  1. run a filter to get the children.
  2. remap the children  calling the function again for each one.

The "tricky" thing here - and something that I've never attempted before - is that the filter method progressively changes what it filters on, to just get the "next" child, based on the fact that the next child is immediately to the right of the previous one. And that's a simple calculation.

Dumping-out the result of this gives the following:


This does indeed create an adjacency list hierarchy. Job done.

One thing I thought about is that for a given child, I do not need to filter the entire tree to find its children; just the descendants of said child. I added in a filter in on the recursive bit thus:

convertNestedSetToAdjacencyList(tree.filter(function(potentialDescendant){
    return node.left < potentialDescendant.left && node.right > potentialDescendant.right;
}),child)

So that only passes the descendant subtree into the recursive call. But that was actually doing slightly more work than just using the full tree. I suspect it depends on the size of the tree (or its depth or its width?) as to whether there's a gain to be had here.

I also figured I should put a reference back to the parent into each child too, so the hierarchy is traversable in both directions, which is easily achievable doing this:

function convertNestedSetToAdjacencyList(tree, node=tree[1],parent=0){
    var nextChildLeft = node.left + 1;
    return {
        id = node.id,
        parent = parent,
        children = tree.filter(function(node){
            if (node.left != nextChildLeft){
                return false;
            }
            nextChildLeft = node.right + 1;
            return true;
        }).map(function(child){
            return convertNestedSetToAdjacencyList(tree,child, node.id);
        })
    };
}

Or a variation on that theme in which one passes the entire struct for this level into the recursion as the parent, depending on what one needs. Still: this is all variations on the same theme, so I won't witter on.

Well that wiled away the time waiting for my flight. I might go find my gate now, and sit down there.

Righto.

--
Adam