Showing posts with label Code Puzzle. Show all posts
Showing posts with label Code Puzzle. Show all posts

Tuesday 6 July 2021

TDD, code puzzle, recursing reductions

G'day:

I couldn't think of a title for this one, so: that'll do.

A few days ago Tom King posted this on the CFML Slack channel:

I'm sure I've asked this before, but has anyone got a good function to take complex values and parse out to a curl style string? i.e, https://trycf.com/gist/0c2672ac4a02d4c52d1e3ee1c9a408e7/lucee5?theme=monokai
The code in the gist is thus:
original = {
    'payment_method_types' = ['card'],
    'line_items' = [
        {
            "price_data" = {
                "product_data" = {
                    "name" = "Something"
                },
                "currency" = 'gbp',
                "unit_amount" = 123
            },
            "quantity" = 1
        }
    ]
};

writeDump(var=original, label="Original");

wantedResult["payment_method_types[0]"] = 'card';
wantedResult["line_items[0][price_data][product_data][name]"] ='Something';
wantedResult["line_items[0][price_data][currency]"] = 'gbp';
wantedResult["line_items[0][price_data][unit_amount]"] = 123;
wantedResult["line_items[0][quantity]"] = 1;

writeDump(var=wantedResult, label="Want this:");

So he wants to flatten out the key-path to the values. Cool. I'll give that a bash. Note: we've actually already solved this, but I'm now going back over the evolution of my solution, following my TDD steps. This is more a demo of using TDD than it is solving the problem: don't worry about how much code their is in this article, it's mostly incremental repetition, or just largely copy-and-pasted-and-tweaked tests. I will admit that for my first take on this I just wrote the one test that compared my result to his final expectation, and it took me far more time to sort it out than it ought to. I did the whole thing again last night to prep the code for this article, and using more focused TDD I solved it faster. Obviously I was benefiting from already kinda remembering how I did it the first time around, but even the first time around I could see how to do it before I even wrote any code, I just pantsed around making mistakes the first time due to not focusing. TDD on my second attempt helped me focus.

Last things first: the final test

I've got the "client's" final requirement already, so I'll create a test for that now, and watch my code error because it doesn't exist (my code, that is). I'll link the test case description through to the current state of the code in Github. I'll tag each step.

it("fulfils the requirement of the puzzle", () => {
    input = {
        "payment_method_types" = ["card"],
        "line_items" = [
            {
                "price_data" = {
                    "product_data" = {
                        "name" = "Something"
                    },
                    "currency" = "gbp",
                    "unit_amount" = 123
                },
                "quantity" = 1
            }
        ]
    }
    expected = {
        "payment_method_types[0]" = "card",
        "line_items[0][price_data][product_data][name]" = "Something",
        "line_items[0][price_data][currency]" = "gbp",
        "line_items[0][price_data][unit_amount]" = 123,
        "line_items[0][quantity]" = 1
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

This errors with No matching function [flattenStruct] found, because I ain't even started yet! Note: for this exercise, I'm just building the function in the test class itself, given it's just one function, and it only needs to exist to fulfil this blogging/TDD exercise.

This test will keep erroring until I've finished. This is fine. I am not going to look at that test code again until I have finished the task though.

It handles simple value key/value pairs

I'm not going to be crazily pendantic with the TDD here. For example in this case I'm rolling in "creating the function and solving the first step of the challenge" into one step. I've already got a test failing due to the lack of the function existing, so I can watch it's progress for some of the micro-iterations I am missing out. So for this step I'm gonna iterate over the struct, and deal with the really easy cases: top level key/value pairs that are simple values and do not require any recursion to work. This is also the "bottom" level of the recursion I'll move on to, so know I already need it anyhow. Maybe making the decision that I need to use recursion to sort this out is pre-emptive, but I'm buggered if I know any other way of doing it. And taking recursion as a given, I always like to start with the lowest level solution: the bit that doesn't actually recurse.

Here's my test for this:

it("handles top-level key/values with simple values", () => {
    input = {
        "one" = "tahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        }
    }
    expected = {
        "[one]" = "tahi"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And the implementation to make it pass:

function flattenStruct(required struct struct) {
    return struct.reduce((flattened, key, value) => {
        if (isSimpleValue(value)) {
            return {"[#key#]" = value}
        }
        return flattened
    }, {})
}

That's lovely, and the test passes, but there's a bug in it which I didn't notice the first time around.

It handles multiple top-level key/values with simple values

it("handles multiple top-level key/values with simple values", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        }
    }
    expected = {
        "[one]" = "tahi",
        "[first]" = "tuatahi"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

That fails with Expected [{[one]={tahi}, [first]={tuatahi}}] but received [{[one]={tahi}}]. It's cos I'm a dropkick and not appending each iteration to the previous:

return {"[#key#]" = value}

// should be

return flattened.append({"[#key#]" = value})

Once I fixed that: the two TDD tests are green. Obviously the initial one is still broken cos we ain't done.

It handles complex values

Now all I needed was to add some structs to the sample data for the test:

it("handles complex values", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        }
    }
    expected = {
        "[one]" = "tahi",
        "[first]" = "tuatahi",
        "[two][second]" = "rua",
        "[three][third][thrice]" = "toru"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And write a quick implementation:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, _, prefix="") => {
        var prefixedKey = "#prefix#[#key#]"
        if (isSimpleValue(value)) {
            return flattened.append({"#prefixedKey#" = value})
        }

        return flattened.append(value.reduce(
            function (flattened, key, value, _, prefix=prefixedKey) {
                return flattened.append(flatten(flattened, key, value, _, prefix))
            },
            {}
        ))
    }

    return struct.reduce(flatten, {})
}

This warrants some explanation:

  • We need to extract all the logic we currently have within the outer function into its own function. We need to do this because when we recurse, we need it to call itself, so it needs a name.
  • The recursion works by saying to each level of the recursion "the previous level had this key, so append your keys to this", so we implement this, and default the prefix to be an empty string to start with.
  • We're leveraging a nice feature of CFML here in that one can pass additional arguments to a function, and "it'll just work", so we pass that after all the other arguments. We're not using the fourth parameter to the reduce callback which contains the original collection being iterated over, so we reflect that by calling the parameter _ (this is a notional standard for this sort of thing).
  • When we get to the "bottom" of the recursion we will always have a simple value for the value (otherwise we're not done recursing), so we prepend its key with the parent key (which will be all the ancestor keys combined).

I also now needed to update my previous test case expectations to include the structs coming out in the result too. In hindsight, I messed up there: I should only have had test data for the case at hand, rather than having some more data that I knew was specifically going to collapse in a heap.

But it also has a bug. The first test completely broke now: Can't cast Complex Object Type Struct to StringUse Built-In-Function "serialize(Struct):String" to create a String from Struct

I find the situation here a bit frustrating. The signature for the callback function for Struct.reduce is this:

function(previous, key, value, collection)

And that's what I have in the code above.

However. I had overlooked that the signature for the callback function for Array.reduce is this:

function(previous, value, key, collection)

It had never occurred to me that key, value and value, key were reversed between the two. Sigh. So my single handling of complex values there doesn't work because I'm passing Array.reduce the key and the value in the wrong order.

It's reasonably easy to fix though.

It handles arrays

I've added an array into the test data:

it("handles arrays values", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        },
        "four" = ["fourth", "quaternary", "wha", "tuawha"]
    }
    expected = {
        "[one]" = "tahi",
        "[first]" = "tuatahi",
        "[two][second]" = "rua",
        "[three][third][thrice]" = "toru",
        "[four][1]" = "fourth",
        "[four][2]" = "quaternary",
        "[four][3]" = "wha",
        "[four][4]" = "tuawha"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And added logic to check if the value is a struct or an array before recursing. The variation just being to expect the callback parameters to be in a different order:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, _, prefix="") => {
        var prefixedKey = "#prefix#[#key#]"
        if (isSimpleValue(value)) {
            return flattened.append({"#prefixedKey#" = value})
        }

        if (isStruct(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, key, value, _, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, key, value, _, prefix))
                },
                {}
            ))
        }
        if (isArray(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, value, index, _, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, index, value, _, prefix))
                },
                {}
            ))
        }
    }

    return struct.reduce(flatten, {})
}

With this amendment all the TDD tests pass, but the original test - which I was kinda expecting to also pass now - wasn't. I had a coupla bugs in my implementation still, basically from me "not reading the requirements"

It qualifies the keys correctly and zero-indexes the arrays

Yes yes, very bad to fix two things at once, I know. I didn't notice that Tom's requirements were to present the flattened keys like this (taken from the previous test): four[0], whereas I'm doing the same one like this: [four][1]. So I'm not supposed to be putting the brackets around the top-level item, plus I need to present the arrays with a zero index. This is easily done, so I did both at once. I figure I'm kinda OK doing this cos I am starting with a failing test afterall. And I also added a more complex test case in (mixing arrays, structs, simple values some more) to make sure I had nailed it:

it("qualifies the keys correctly and zero-indexes the arrays", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        },
        "four" = ["fourth", "quaternary", "wha", "tuawha"],
        "five" = [
            {"fifth" = "rima"},
            ["tuarima"],
            "quinary"
        ]
    }
    expected = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two[second]" = "rua",
        "three[third][thrice]" = "toru",
        "four[0]" = "fourth",
        "four[1]" = "quaternary",
        "four[2]" = "wha",
        "four[3]" = "tuawha",
        "five[0][fifth]" = "rima",
        "five[1][0]" = "tuarima",
        "five[2]" = "quinary"
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

And the fix:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, actual, prefix="") => {
        var offsetKey = isArray(actual) ? key - 1 : key
        var qualifiedKey = prefix.len() > 0 ? "[#offsetKey#]" : offsetKey
        var prefixedKey = "#prefix##qualifiedKey#"

        if (isSimpleValue(value)) {
            return flattened.append({"#prefixedKey#" = value})
        }

        if (isStruct(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, key, value, actual, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, key, value, actual, prefix))
                },
                {}
            ))
        }
        if (isArray(value)) {
            return flattened.append(value.reduce(
                function (flattened={}, value, index, actual, prefix=prefixedKey) {
                    return flattened.append(flatten(flattened, index, value, actual, prefix))
                },
                {}
            ))
        }
    }

    return struct.reduce(flatten, {})
}

Quite simply:

  • If the data we are reducing is an array, we reduce the index by one;
  • and we only put the brackets on if we've already got a prefix value.
  • We're using that _ parameter we were not using before, so I've given it a name.

Now everything works:

Or does it?

It handles the collection being an arguments scope object

I handed this implementation (or something very similar) to Tom with a flourish, and a coupla hours later he was puzzled about what sort of object the arguments scope is. It behaves like both a struct and an array. For some things. And it's ambiguous in others. When Tom was passing an arguments scope to this function, it was breaking again:

The arguments scope passes an isArray check, but its reduce method handles it like a struct, so they key being passed is the parameter name, not the positional index like we'd be expecting with an array that it's just claimed to us that it is.

Let's have a test for this:

it("handles arguments objects", () => {
    input = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two" = {"second" = "rua"},
        "three" = {
            "third" = {"thrice" = "toru"}
        },
        "four" = ["fourth", "quaternary", "wha", "tuawha"],
        "five" = [
            {"fifth" = "rima"},
            ["tuarima"],
            "quinary"
        ],
        "six" = ((sixth, senary) => arguments)("ono", "tuaono", [6])
    }
    expected = {
        "one" = "tahi",
        "first" = "tuatahi",
        "two[second]" = "rua",
        "three[third][thrice]" = "toru",
        "four[0]" = "fourth",
        "four[1]" = "quaternary",
        "four[2]" = "wha",
        "four[3]" = "tuawha",
        "five[0][fifth]" = "rima",
        "five[1][0]" = "tuarima",
        "five[2]" = "quinary",
        "six[sixth]" = "ono",
        "six[senary]" = "tuaono",
        "six[2][0]" = 6
    }

    actual = flattenStruct(input)

    expect(actual).toBe(expected)
})

Note I'm using CFML's IIFE feature to call an inline function expression there.

I am not entirely happy with my fix for this, but I've gaffer-taped it up:

function flattenStruct(required struct struct) {
    flatten = (flattened, key, value, actual, prefix="") => {
        var offsetKey = isArray(actual) && isNumeric(key) ? key - 1 : key
        var qualifiedKey = prefix.len() > 0 ? "[#offsetKey#]" : offsetKey

I'm just checking if the "key" is numeric before I decrement it.


There's perhaps one more case I should put in there. This will silently ignore anything other than simple values, arrays or structs. I had a mind to throw an exception at the end if we'd not already returned, but I didn't get around to it. This solved the need we had, so that's all good.

Hopefully you found another incremental TDD exercise useful, and also found the problem we were trying to solve an interesting one as well. I did.

Righto.

--
Adam

Wednesday 22 March 2017

Code puzzle: find any match from an array of regex patterns

G'day:
I'm being slightly cheeky as I'm hijacking someone else's question / puzzle from the CFML Slack channel that we were discussing last night. Don't worry, that'll be the last mention of CFML in this article, and indeed the question itself was asking for a JS solution. But it's a generic code puzzle.

Let's say one has a string, and an array of regex patterns one wants to check for in said string. A number of solutions were offered, using a number of techniques. And it even branched away from JS into Clojure (thanks Sean).

How would you go about doing it?

The input data is:

patterns = ["Savvy", "Smart", "Sweet"]
testForPositive = "Super Savvy"
testForNegative = "Super Svelt"

And in these cases we want true and false to be returned, respectively.

Note that the patterns there need to be considered regular expressions, not simple strings. The example data is just using simple strings for the sake of clarity.

Initially I was gonna do something like this:

function matchesOne(str, patterns) {
    var onePattern = new RegExp(patterns.join("|"));
    return str.search(onePattern) > -1;
}

Here I'm joining all the patterns into one big pattern, and searching for that. I was slightly hesitant about this as it's loading the complexity into the regex operation, which I felt perhaps wasn't ideal. Also it's not very clever: if any of the individual patterns had a | in them: this'd break. Scratch that.

Ryan Guill seeded the idea of approaching it from the other direction. Using the array as an array, and using one of JS's array higher-order functions to do the trick. We saw examples using find and filter, but I reckoned some was the best option here:

function matchesOne(str, patterns) {
    return patterns.some(pattern => str.search(pattern) > -1);
}


I'm happy with that.

The ante was raised next: write a function which returned the position of the first match (not just a boolean as to whether there was a match or not).

To me this was a reduction operation: take an array and - given some algorithm applied to each element - return a final single value.

I tried, but I could not make it very clean. This is what I got:

function firstMatch(str, patterns) {
    return patterns.reduce(function(first, pattern){
        var thisMatch = str.search(pattern);
        return first == -1 ? thisMatch : thisMatch == -1 ? first : Math.min(first, thisMatch);
    }, -1);
}


I'm happy with the reduce approach, but I'm really not happy with the expression to identify the lowest match (or continue to return -1 if no match is found):

first == -1 ? thisMatch : thisMatch == -1 ? first : Math.min(first, thisMatch)


To be clear, if we expanded that out, the logic is:

if (first == -1) {
    return thisMatch;
}
if (thisMatch == -1) {
    return first;
}
return Math.min(first, thisMatch);

That's actually "cleaner", but I still think I'm missing a trick.

So the puzzle to solve here is: what's a clever (but also Clean - in the RC Martin sense) way returning the lowest non -1 value of first and thisMatch, or fall back to -1 if all else fails.

Using this code (or some approximation thereof):

function firstMatch(str, patterns) {
    return patterns.reduce(function(first, pattern){
        var thisMatch = str.search(pattern);
        // return result of your code here
    }, -1);
}

console.log(firstMatch("Super Savvy", ["Savvy", "Smart", "Sweet"]));
console.log(firstMatch("Super Svelt", ["Savvy", "Smart", "Sweet"]));


One should get 6 and -1 for the respective "tests".

Use any language you like.

Oh and hey... if you have any interesting solutions for the two functions themselves, feel free to post 'em too.


Righto.

--
Adam

Friday 4 November 2016

Another Friday code puzzle on the Slack Channel

G'day:
Another perennial CFML community participant, Ryan Guill, has posted this week's Friday code puzzle, over on the #friday-puzzle subchannel of the #CFML Slack channel.

Don't worry about the fact it's on the #CFML channel: it's open to anyone who wants to participate (you will need to join the channel though! I'll check with Ryan if it's OK to post the direct link to the Gist for it... update... yes it is OK. Here it is: friday-puzzle-leaderboard.md (gist)).

Here's a summary of the challenge:

Challenge:

Create a function that takes an array of objects that contain a name and a date counts instances of the name for a score. Output a leaderboard of the top 10 scores. Combine the same scores together so that if, for example, first place is 20 points and two people have 20 points, show them both in first place. The order of the names in the same place are not important. Next would still be second place. For this challenge, the dates do not matter.

(but see the actual Gist for more expectations: that's just the summary).

I for one would like to see how various different languages might approach this. On the other hand, I'm pretty rammed at the moment so dunno if I will find time to actually participate, meself.

Anyway, go on... give it a blast.

Righto.

--
Adam



Friday 23 September 2016

TIL: Cameron is bloody wrong again

G'day:
This is just a short one. I ballsed-up a coupla things in my previous article: "Another Friday puzzle; and the importance of tests that try to break code". It should have also read "and the importance of following the spec". Ahem.

So the bit of the question I read was this:

The maximum sum subarray problem consists in finding the maximum sum of a contiguous subsequence in an array or list of integers:

maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4])
// should be 6: [4, -1, 2, 1]
And I admonished John and Isaiah for not dealing with sets with only negative numbers in them. My solution "worked" in that it returned the highest-total subset (basically a subset containing just the single highest negative number). They were returning 0, which I figured was an oversight / edge-case-failure.

However elsewhere, if I was to continue reading (or following links, or something), it did indeed say something about the minimum total being 0, even if the set contained only negative numbers. Oops.

Another contentious point was that I figured if the set was empty, then the result should be undefined or null. This is showing-up my lack of mathematical background, in that there's well-established rules for this, which Ryan pointed out to me: "Operations on the empty set". Basically it's taken as written that the sum of an empty set is zero. So it's official.

Now... whilst my solution was - as a result of these facts - wrong, I still think the exercise was a useful one, so I'll update that article to point to this, but otherwise leave it as is.

And that's it. Just thanks to John, Isaiah and Ryan for setting me straight there.

Righto.

--
Adam

Monday 19 September 2016

Another Friday puzzle; and the importance of tests that try to break code

G'day:
There was another Friday Puzzle on the CFML Slack channel this week. The question was:

The maximum sum subarray problem consists in finding the maximum sum of a contiguous subsequence in an array or list of integers:

maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4])
// should be 6: [4, -1, 2, 1]

Apparently the word "contiguous" is something that will flummox some people (judging by the discussion on the Slack channel). In this sense a contiguous sub-sequence is one that is made from adjacent array elements. So you can see they highlighted sub-sequence is a subset of adjacent array elements.

Eyeballing the example series here reveals the tricky bit one needs to watch for... negative values. Take a sub-sequence of 3,4. That has a sum of 7. Now if the next number is negative: 3,4,-1 (sum 6), the sum of this longer sub-sequence is less than that of the preceding shorter one, so the shorter subsequence is still the "winner". However if the next number (or numbers) sum to greater than the negative number, eg; 3,4,-1,2 (sum 8) then we've got the highest sum again. And of course there could be more negative numbers followed by positive numbers with a higher sum repeatedly.

Update:

I did not thoroughly read the requirement here, so got some of this wrong. See "TIL: Cameron is bloody wrong again".

A couple of the bods on the Slack channel quickly came up with solutions. Here they are, respectively: Isaiah's and John's. I'll only repeat one of them here because both used pretty much the same algorithm, just one used procedural code, and the other a more functional approach (in that they used higher-order functions). But they've both got the same logic error in them, because the approach is flawed. Let's have a look at John's one:

numeric function maxSequence(required array arr){
    var currentSum = 0;
    return arr.reduce(function(maxSum, number){
        currentSum = max(currentSum+number, 0);
        return max(currentSum, maxSum);
    }, 0);
}

WriteDump(maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4]));

WriteDump(maxSequence([-2, 1, -3, 4, -1, 2, 1, -5, 4, 3]));

WriteDump(maxSequence([-2, -3, -1, -5]));

WriteDump(maxSequence([1, 4, 2, 1, 4, 3]));

This is nice and simple, and it took me a while to get what he's doing here. But we're keeping a running sum of the sequence outside the callback, and the callback itself returns the greater of the running sum or the whatever has previously been the maximum sum. So taking the first series there, we get the following iterations:

maxSumnumbercurrentSumnewMax
0-200
0111
1-301
1444
4-134
4255
5166
6-516
6456

Superficially this seems OK, except for one thing: it considers the minimum possible sum to be 0. Which is not correct. In a sequence with only negative numbers... the maximum sum will be the highest negative number: in [-1,-2,-3] the highest sum there is -1, but sums less than zero are ignored by this algorithm, so it incorrectly returns 0. Equally for an empty sequence, the answer is null or undefined, so this algorithm will fail there too, as it returns 0 for this too (as it defaults to a maximum sum of zero, whereas we don't know there'll be a sum when we start the process. So I guess that's two slightly different logic errors there.

I'll come back to John's solution in a bit, but first here's my solution. It's not as elegant as John's, and I'm not entirely happy with it, but it does work.

Here's the code (I've opted to do mine in ES2015 rather than CFML):

let sumLongestContiguousSubsequence = function (array) {
    let subSequences = array.map((_,i,a)=>a.slice(i));

    return subSequences.reduce(function(max, subsequence){
        let runningSum = subsequence[0];
        let maximumSubSequenceSum = subsequence.reduce(function(max, element){
            return (runningSum += element) > max ? runningSum : max;
        });
        return Math.max(max||maximumSubSequenceSum, maximumSubSequenceSum);
    }, null);
};

My conceit is that one cannot make assumptions about any earlier maximums, so I don't default the answer to anything.

Also I don't default the running sum to anything, I just take its starting point as the first element of the sequence I'm checking.

The general approach here is slightly more ham-fisted than John's approach.
  • I figure I need to scan each separate sub-sequence of the main sequence, starting each sub-sequence from each subsequent value in the main sequence. EG: if I have a sequence of [1,2,3,4], then subSequences will be [[1,2,3,4],[2,3,4] [3,4],[4]].
  • For each of those I do much the same thing as John: just run a cumulative sum, and remember whatever the highest value for that was.
  • So that'll bring me back to an array of maximums (in my [1,2,3,4] example, this'd be [10,9,7,4].
  • And as I reduce that lot, I simply return whichever is the higher of the previous ones and the current one.

I also wrote actual tests here. And this is where my approach differs from John's (or Isaiah's for that matter). When I test things I don't try to demonstrate to myself it works, I try to break the thing.

Here are my tests:

"use strict";

let assert = require("chai").assert;

let sumLongestContiguousSubsequence = require("./sumLongestContiguousSubsequence.js");

describe("Test of puzzle requirement", function(){
    it("returns the highest contiguous subseries sum for baseline requirement", function(){
        let sequence = [-2, 1, -3, 4, -1, 2, 1, -5, 4];
        let expectation = 4 + -1 + 2 + 1; // 6
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

describe("Tests of other puzzle submission sequences", function(){
    it("returns the highest contiguous subseries sum for variation 1", function(){
        let sequence = [-2, 1, -3, 4, -1, 2, 1, -5, 4, 3];
        let expectation = 4 + -1 + 2 + 1 + -5 + 4 + 3; // 8
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum for variation 2", function(){
        let sequence = [-2, -1, -3, -5];
        let expectation = -1;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum for variation 3", function(){
        let sequence = [1, 4, 2, 1, 4, 3];
        let expectation = 1 + 4 + 2 + 1 + 4 + 3; // 15
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

describe("Test edge cases", function(){
    it("returns the highest contiguous subseries sum with an empty array", function(){
        let sequence = [];
        let expectation = null;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum with just zero", function(){
        let sequence = [0];
        let expectation = 0;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum with just -1", function(){
        let sequence = [-1];
        let expectation = -1;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum with just 1", function(){
        let sequence = [1];
        let expectation = 1;
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

describe("Better described tests", function(){
    it("returns the highest contiguous subseries sum when the sequence has negative values", function(){
        let sequence = [1,2,3,-11];
        let expectation = 1 + 2 + 3; // 6
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum when the sequence has negative values followed by a greater positive value", function(){
        let sequence = [1,2,3,-4,5];
        let expectation = 1 + 2 + 3 + -4 + 5; // 7
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum when the sequence has negative values followed by a subseries positive values that are net greater than the negative one", function(){
        let sequence = [2,4,6,-8,3,7];
        let expectation = 2 + 4 + 6 + -8 + 3 + 7; // 14
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
    it("returns the highest contiguous subseries sum when the sequence has repeated negative values followed by a subseries positive values that are net greater than the negative one", function(){
        let sequence = [12,14,16,-8,3,7,-12,5,9];
        let expectation = 12 + 14 + 16 + -8 + 3 +7 + -12 + 5 + 9; // 46
        let result = sumLongestContiguousSubsequence(sequence);
        assert.equal(expectation, result);
    });
});

These are fairly deliberate in what they test, but don't try to do anything clever (so there's a lot of repeated code). I make a point of testing edge cases like empty sequences, and sequences with only one element, and all negative values and all positive values etc. My philosophy with testing is that I'm not trying to demonstrate the code works, I'm trying to demonstrate it doesn't break. This amounts to the same thing, but it starts from a slightly different mindset. Testing is one situation where "glass half empty" rather than "glass half full" is the better way to look at things.

So this version of the solution works, but I'm not happy about it. For one thing, I don't think one can look at it and go "right, it's clear what's going on there". I looked for refactoring opportunities, but am not seeing any. Maybe subSequences could have a better name? I looked at extracting the callbacks into named function expressions, but that didn't look any clearer to me either.

let sumLongestContiguousSubsequence = function (array) {
    let runningSum;
    
    let getCurrentMaxSum = function(max, element){
        return (runningSum += element) > max ? runningSum : max;
    };
    
    let findSumOfLongestSub = function(max, subsequence){
        runningSum = subsequence[0];
        let maximumSubSequenceSum = subsequence.reduce(getCurrentMaxSum);
        return Math.max(max||maximumSubSequenceSum, maximumSubSequenceSum);
    };

    let subSequencesSlicedAtEachIndex = array.map((_,i,a)=>a.slice(i));

    return subSequencesSlicedAtEachIndex.reduce(findSumOfLongestSub, null);
};

What do you think: is that clearer? I guess it's a bit better, innit? It's straying away from "elegance in simplicity" though.

Ha, just for a laugh I took the code clarity in the opposite direction. How about this mess:

let f=a=>a.map((_,i,a)=>a.slice(i)).reduce((m1,s)=>{
    let t=s[0],m2=s.reduce((m,v)=>(t+=v)>m?t:m)
    return Math.max(m1||m2,m2)
},null)

It's the same logic as my original version.

The other issue that Sean and Ryan are likely to pull me up on is that one of my reductions relies on side-effects spilling out into the intermediary calling code:

let sumLongestContiguousSubsequence = function (array) {
    let subSequences = array.map((_,i,a)=>a.slice(i));

    return subSequences.reduce(function(max, subsequence){
        let runningSum = subsequence[0];
        let maximumSubSequenceSum = subsequence.reduce(function(max, element){
            return (runningSum += element) > max ? runningSum : max;
        });
        return Math.max(max||maximumSubSequenceSum, maximumSubSequenceSum);
    }, null);
};


This didn't actually bother me until I did the PHP version of this, where the side-effects are more glaring:

$sumLongestContiguousSubsequence = function ($array) {
    $subSequences = array_map(function($i) use ($array) {
        return array_slice($array, $i);
    }, array_keys($array));

    return array_reduce($subSequences, function($max, $subSequence){
        $runningSum = 0;
        $maximumSubSequenceSum = array_reduce($subSequence, function($max, $element) use (&$runningSum){
            return ($runningSum += $element) > $max ? $runningSum : $max;
        }, $subSequence[0]);
        return max($max?:$maximumSubSequenceSum, $maximumSubSequenceSum);
    }, null);
};

PHP sux a bit because it doesn't do closure implicitly, one needs to tell it what to enclose, which is clumsy IMO. But it's also a bit of an orange flag. This orange flag becomes a red flag when I need to actually use a reference here, cos I'm changing the original value, not simply using it.

That guilt-tripped me into revising my JS version so as to not need the side effects:

let sumLongestContiguousSubsequence = function (array) {
    let subSequences = array.map((_,i,a)=>a.slice(i));

    return subSequences.reduce(function(max, subsequence){
        let maximumSubSequence = subsequence.reduce(function(working, element){
            working.runningSum += element;
            working.max = working.max || working.runningSum;
            working.max = working.runningSum > working.max ? working.runningSum : working.max
            return {max:working.max, runningSum:working.runningSum};
        }, {runningSum:0});
        return Math.max(max||maximumSubSequence.max, maximumSubSequence.max);
    }, null);
};


Now I'm carrying both the runningSum and the max value through the first argument in the reduction, by using an object rather than just the simple value. It's a small change but gets rid of the smell. To be completely honest though... I prefer my original version. I realise it's frowned-upon to bleed side-effects (which I then leverage), but this refactoring seems like an exercise in pedantic/dogmatic busy-work, and makes the code slightly less clear in the process. I dunno.

Oh, for completeness I did a CFML conversion too:

sumLongestContiguousSubsequence = function (array) {
    var subSequences = array.map((_,i,a) => a.slice(i));
    return subSequences.reduce(function(maxSum, subSequence){
        var runningSum = 0;
        var maximumSubSequenceSum = subSequence.reduce(
            (maxSum, element) => (runningSum += element) > maxSum ? runningSum : maxSum,
            subSequence[1]
        );
        return max(maxSum ?: maximumSubSequenceSum, maximumSubSequenceSum);
    }, null);
};

Note that this solution only works on Lucee, not ColdFusion, for three reasons:
  • ColdFusion does not have arrow functions;
  • ColdFusion has a bug in its parser which breaks on this expression (see 4190163);
  • Lucee has a null keyword whereas ColdFusion does not.

The Lucee version is OK though. Same as the JS version for the most part: just 1-based arrays, not 0-based; and Lucee has the ?: operator whereas JS uses ||.

That's about all I can think to say about this. I might see if I can mess with John's approach to make it work for those two edge-cases it fails on. I much prefer his way compared to my way... but only provided it can be made to work! I'd also be keen to see treatments of this puzzle in other languages, or better/different solutions for JS, PHP or CFML.

Righto.

--
Adam

Saturday 3 September 2016

JS: next Friday puzzle: displaying bytes in "nearest unit"

G'day:
Here's my answer for the next Friday puzzle that was posted on the CFML Slack channel. I skipped last week's one as... well... I couldn't be arsed, but this was an easy one and I could sort it out whilst cursing my jet lag (I've just done London -> Auckland this week).

The details of the puzzle are in this gist, but basically it's to take a number of bytes and "round" it to the appropriate closest unit of bytes (eg: kB, MB, GB etc), up to PB. There's more detail than that, but that's the bit I'm paying attention do. Doing CFML is a waste of time, so this week I've decided to do it in JavaScript instead, and re-polish my piss-poor ES2015 / Node.js / Mocha skills. Although not too much.

First I tried a solution which treated the number as a string and just reduced it down to the most significant figures and slapped a unit onto the end of it. Whilst this did what I set out to do, it only dealt with decimal groupings, not 1024-based ones. Here's the code anyhow. It's not polished up cos I abandoned it half-way through:

f = function(x) {
    var units = ["B", "kB", "MB", "GB", "TB", "PB"];
    var ordersOfMagnitude = 3;

    var numberAsString = x.toString();
    var lengthOfNumber = numberAsString.length; 

    var numberOfDigits = lengthOfNumber % ordersOfMagnitude;
    var unitIndex = Math.floor(lengthOfNumber / ordersOfMagnitude);

    if (numberOfDigits == 0) {
        numberOfDigits = 3;
        unitIndex--;
    }
    if (unitIndex+1 > units.length){
        unitIndex = units.length - 1;
        numberOfDigits = lengthOfNumber - (unitIndex * ordersOfMagnitude);
    }
    
    var digits = numberAsString.substring(0, numberOfDigits);
    var unitToUse = units[unitIndex];

    var result = digits + unitToUse;
    
    return result;
}

This was more just a spike to get me thinking.

Then I tried another version where I did a reduction on the units array, but that was crap as it was quite side-effect-y, and I don't really think it was a good use of a reduce operation. I did not keep the code for that one, so I can't show you.

Next I decided to stop messing around and come up with an answer that actually worked and wasn't daft, so I knocked this one together:

"use strict";
 
let numberToMemoryUnits = function(bytes) {
    let units = ["kB", "MB", "GB", "TB", "PB"];
    let binaryDivisor = 1024;
    let numberOfBytesAsUnit = bytes;
    let unit = "B";
    while (numberOfBytesAsUnit >= binaryDivisor && units.length){
        numberOfBytesAsUnit /= binaryDivisor;
        unit = units.shift();
    }
    let roundedValue = Math.floor(numberOfBytesAsUnit);

    return `${roundedValue}${unit}`;
}

module.exports = numberToMemoryUnits;

That's OK-ish I guess. The only thing I don't like is how I have that first assignment of the units separate to the loop. It seems like dodgy code doubling-up to me, and I should get rid of that somehow, but can't be bothered thinking it through (I'm slightly hungover, which doesn't help).

Update:

I've tweaked this slightly since I first posted it:
  • Improved the argument name from a rather lazy-arsed x to the more descriptive bytes.
  • Likewise improved the name of the variable which was digits to be numberOfBytesAsUnit.
  • Used the intermediary variable roundedValue,
  • and used an interpolated string instead of string concatenation with the return value.
  • improved the way I initialised the units object in the tests, from being individual assignment statements to being a reduction of the units array.
This was off the back of a discussion about my implementation that I had with Brendan on the Slack channel. He asked me why I had the separate variable digits (now numberOfBytesAsUnit) instead of just using the argument x (bytes). This was because whilst the value passed to the function is indeed a number of bytes, once I start using it - specifically in that loop - it's no longer a value in bytes, so I use a new - more descriptively accurate - variable name instead. We also discussed my separate handling of bytes and the other units coming from the array, and I'm still not 100% happy with it, but in using the better variable names around it I mind it less than I did before.

Any other code review input is welcomed, btw.

I was a bit naughty as I tested it by hand whilst developing it, but then felt guilty and formalised a bunch of tests after the fact. I guess I still did TDD whilst throwing this together, but not as deliberately as perhaps I shoulda. Anyhow, here are the tests too:

"use strict";

let assert = require("chai").assert;

let numberToMemoryUnits = require("../src/numberToMemoryUnits.js");

let binaryFactor = 1024;

let units = ["kB", "MB", "GB", "TB", "PB"].reduce(function(units, unit, index){
    units[unit] = Math.pow(binaryFactor, index + 1);
    return units;
}, {});

describe("Tests for each unit", function(){
    it("should work for bytes", function(){
        let result = numberToMemoryUnits(123);
        let expectation = "123B";
        assert.equal(expectation, result);
    });
    it("should work for kB", function(){
        let result = numberToMemoryUnits(2345);
        let expectation = "2kB";
        assert.equal(expectation, result);
    });
    it("should work for MB", function(){
        let result = numberToMemoryUnits(3456789);
        let expectation = "3MB";
        assert.equal(expectation, result);
    });
    it("should work for GB", function(){
        let result = numberToMemoryUnits(4567890123);
        let expectation = "4GB";
        assert.equal(expectation, result);
    });
    it("should work for TB", function(){
        let result = numberToMemoryUnits(5678901234567);
        let expectation = "5TB";
        assert.equal(expectation, result);
    });
    it("should work for PB", function(){
        let result = numberToMemoryUnits(6789012345678901);
        let expectation = "6PB";
        assert.equal(expectation, result);
    });
    
});

describe("Test exact units", function(){
    it("should work for 1kB", function(){
        let result = numberToMemoryUnits(units.kB);
        let expectation = "1kB";
        assert.equal(expectation, result);
    });
    it("should work for 1MB", function(){
        let result = numberToMemoryUnits(units.MB);
        let expectation = "1MB";
        assert.equal(expectation, result);
    });
    it("should work for 1GB", function(){
        let result = numberToMemoryUnits(units.GB);
        let expectation = "1GB";
        assert.equal(expectation, result);
    });
    it("should work for 1TB", function(){
        let result = numberToMemoryUnits(units.TB);
        let expectation = "1TB";
        assert.equal(expectation, result);
    });
    it("should work for 1PB", function(){
        let result = numberToMemoryUnits(units.PB);
        let expectation = "1PB";
        assert.equal(expectation, result);
    });
});
describe("Test off by one", function(){
    it("should work for <1kB", function(){
        let result = numberToMemoryUnits(units.kB-1);
        let expectation = "1023B";
        assert.equal(expectation, result);
    });
    it("should work for >1kB", function(){
        let result = numberToMemoryUnits(units.kB+1);
        let expectation = "1kB";
        assert.equal(expectation, result);
    });
    it("should work for <1MB", function(){
        let result = numberToMemoryUnits(units.MB-1);
        let expectation = "1023kB";
        assert.equal(expectation, result);
    });
    it("should work for >1MB", function(){
        let result = numberToMemoryUnits(units.MB+1);
        let expectation = "1MB";
        assert.equal(expectation, result);
    });
    it("should work for <1GB", function(){
        let result = numberToMemoryUnits(units.GB-1);
        let expectation = "1023MB";
        assert.equal(expectation, result);
    });
    it("should work for >1GB", function(){
        let result = numberToMemoryUnits(units.GB+1);
        let expectation = "1GB";
        assert.equal(expectation, result);
    });
    it("should work for <1TB", function(){
        let result = numberToMemoryUnits(units.TB-1);
        let expectation = "1023GB";
        assert.equal(expectation, result);
    });
    it("should work for >1TB", function(){
        let result = numberToMemoryUnits(units.TB+1);
        let expectation = "1TB";
        assert.equal(expectation, result);
    });
    it("should work for <1PB", function(){
        let result = numberToMemoryUnits(units.PB-1);
        let expectation = "1023TB";
        assert.equal(expectation, result);
    });
    it("should work for >1PB", function(){
        let result = numberToMemoryUnits(units.PB+1);
        let expectation = "1PB";
        assert.equal(expectation, result);
    });
});

describe("Test boundaries", function(){
    it("should work for 0bytes", function(){
        let result = numberToMemoryUnits(0);
        let expectation = "0B";
        assert.equal(expectation, result);
    });
    it("should work for 1024PB", function(){
        let result = numberToMemoryUnits(units.PB*units.kB);
        let expectation = "1024PB";
        assert.equal(expectation, result);
    });
    it("should work for 1048576PB", function(){
        let result = numberToMemoryUnits(units.PB*units.MB);
        let expectation = "1048576PB";
        assert.equal(expectation, result);
    });
});


They're a bit long-winded in total, but the individual tests are simple enough. The key here is I test either side of each each unit, eg: 1023 is presented in bytes, 1024 is 1kB and so is 1025 etc. Also a test of zero for good measure, as well that it handles numbers beyond PB, and just uses PB thereafter (eg: 1025PB displays as such).

And they all pass:
C:\src\js\puzzle\20160903>mocha test\numberToMemoryUnitsTest.js


  Tests for each unit
    should work for bytes
    should work for kB
    should work for MB
    should work for GB
    should work for TB
    should work for PB

  Test exact units
    should work for 1kB
    should work for 1MB
    should work for 1GB
    should work for 1TB
    should work for 1PB

  Test off by one
    should work for <1kB
    should work for >1kB
    should work for <1MB
    should work for >1MB
    should work for <1GB
    should work for >1GB
    should work for <1TB
    should work for >1TB
    should work for <1PB
    should work for >1PB

  Test boundaries
    should work for 0bytes
    should work for 1024PB
    should work for 1048576PB


  24 passing (31ms)


C:\src\js\puzzle\20160903>


That's it. Nothing special today, but still required some thought, and also reminding myself how Mocha works.

Give it a go. Do it in some other language than CFML!

Righto.

--
Adam

Sunday 28 August 2016

Blimey, a CFML-centric article

G'day:
But it's nothing that interesting. I submitted my answer ("Friday puzzle: my PHP answer") for the Friday Code Puzzle yesterday, deciding to use PHP, given it's what I do for a crust these days, and I could best visualise the solution in PHP.

But the puzzle was from a CFML chat group, so I figured I might blow the dust of my CFML skills and see how closely a CFML answer would approximate my PHP one. The answer is "quite closely!". Here's a CFML transliteration of the PHP answer:

component {

    parents = {};

    function init() {
        parents[0]["children"] = [];

        return this;
    }

    static function loadFromCsv(filePath) {
        var dataFile = fileOpen(filePath, "read");

        var tree = new Tree();
        while(!fileIsEof(dataFile)){
            var line = fileReadLine(dataFile);
            tree.addNode(
                nodeText = line.listLast(),
                id = line.listFirst(),
                parent = line.listGetAt(2, ",", true)
            );
        }

        return tree;
    }

    private function addNode(nodeText, id, parent) {
        parent = parent == "" ? 0 : parent;

        parents[id].nodeText = nodeText;
        parents[parent].children = parents[parent].children ?: [];
        parents[parent].children.append(parents[id]);
    }

    function serializeJson() {
        return serializeJson(parents[0]["children"]);
    }
}

Note that this is using Lucee 5's CFML vernacular, and this will not run on ColdFusion for a coupla reasons:

  • ColdFusion still doesn't support static methods;
  • It has a new bug with the ?: operator which breaks some of my shortcuts there.

I did not write the full test suite, but I wrote a small test rig and ran all the test variations manually. Here's the trickiest one:

dataFile = expandPath("../../test/testfiles/notOrdered/data.csv");

tree = Tree::loadFromCsv(dataFile);
treeAsJson = tree.serializeJson();

jsonAsArray = deserializeJson(treeAsJson);
writeDump(jsonAsArray);

And it yields the goods:




For completeness, here's the variation I had to do to make this work on ColdFusion:

static function loadFromCsv(filePath) {
    var dataFile = fileOpen(filePath, "read");

    var tree = new Tree();
    while(!fileIsEof(dataFile)){
        var line = fileReadLine(dataFile);
        tree.addNode(
            nodeText = line.listLast(),
            id = line.listFirst(),
            parent = line.listGetAt(2, ",", true)
        );
    }
    return this;
}

private function addNode(nodeText, id, parent) {
    parent = parent == "" ? 0 : parent;

    parents[id] = parents.keyExists(id) ? parents[id] : {};
    parents[id].nodeText = nodeText;
    
    parents[parent] = parents.keyExists(parent) ? parents[parent] : {};
    parents[parent].children = parents[parent].keyExists("children") ? parents[parent].children : [];
    parents[parent].children.append(parents[id]);
}

  • ColdFusion does not yet support static methods, so I cannot use a static factory method to create the tree, I need to instantiate an empty tree and then get it to load itself. Not ideal, but not the end of the world either.
  • Because of various bugs in ColdFusion's implementation of implicitly creating "deep" data structures, and also with the ?: operator, I had to be more explicit with stepping through creation of intermediary structs, and also use keyExists instead of ?:

This isn't bad, and I don't fault ColdFusion for its lack of static yet, but I'm annoyed by finding yet more bugs in ColdFusion when all I want is to run some code.

Back to the Lucee version: I think it demonstrates CFML syntax is a bit tidier than PHP's, as it's not quite so in love with the punctuation, and punctuation is just noise to achieving good Clean Code. But in this example it's much of a muchness, innit?

Anyway, I have little else to add other than that. The approach is identical to the PHP one I did: I just wanted to compare the two languages.

Job done.

Righto.

--
Adam

Saturday 27 August 2016

Friday puzzle: my PHP answer

G'day:
So as per my previous article - "Friday code puzzle" - here's my answer to the current puzzle.

I knocked the first version of this out over a beer whilst waiting for my flight to Galway, last night. Then scrapped that as being crap, and rewrote whilst in transit. And then... erm... back-filled my unit tests y/day evening. I feel guilty about not doing proper TDD,  but... so be it. It's got test coverage now.

Just on TDD for a second. It was wrong of me not to do the testing first, and whilst I was just about to manufacture an excuse as to why not ("I didn't have PHPUnit installed on this machine", "I couldn't be arsed making a whole project out of it", "um [something about being offline on the flight]"), it was all bullshit laziness. I had to write tests to prove that my function would work outside of the immediate quiz requirements: the easiest way to do this is with unit tests and clear expectations etc. These tests were of limited benefit to me, given I already had the code written, and had already been through the trial-and-error process of getting the thing working correctly.  The thing is, in a professional environment the tests aren't for you necessarily. They're for the next person who comes along to maintain your code. Or to troubleshoot your code when an edge-case crops up. It should not be down to the next person to write the tests for your code. That's lazy-arse, ego-centric shit. Do your job properly. I have encountered a coupla devs recently who think they're too good for tests, and refuse to do them properly. I don't want to work with people like that as they're all "me me me me".

But anyway... some code.

Well first a recap. Here's the "official" description from Jesse. But in summary:

We have this data set:

id,parentId,nodeText
1,,File
2,1,New
3,2,File
4,2,Folder
5,,Edit
6,5,Copy
7,5,Cut
8,5,Paste
9,,Help
10,9,About

This reflects hierarchical data (I would not represent a hierarchy that way, but that's a different story for another time), and we want to convert it to a hierarchical representation, eg:

[
  {
     nodeText : "File"
    ,children : [
      {
         nodeText : "New"
        ,children : [
           { nodeText: "File" }
          ,{ nodeText: "Folder" }
        ]
      }
    ]
  }
  ,{
     nodeText : "Edit"
    ,children : [
       {nodeText: "Copy"}
      ,{nodeText: "Cut"}
      ,{nodeText: "Paste"}
    ]
  }
  ,{
     nodeText : "Help"
    ,children : [
      { nodeText: "About" }
    ]
  }
]

There are a few "rules of engagement" at that link I mentioned, but that's the gist of it.

I decided I had better write some PHP code for a change, so knocked something together hastily. Initially I thought I might need to write some recursive monster to generate the hierarchy, but instead realised I did not need to do that, I just needed to track each "parent" node as I encountered it, and then append children to it as appropriate. Note that the data is sorted, so each record could be a parent of a subsequent node, and it will never be the case one will encounter a node before already having processed its parent. So all I needed to do is iterate over the data from top to bottom. Once. Nothing more tricky than that. Then by the time I had finished, the first parent would represent the entire tree. That was easy enough but then I had to convert it to JSON. Note the above representation is not JSON, but it's close enough, so that's what I decided to run with. As it turns out this was pretty easy to achieve in PHP as it has the ability to provide a custom serialiser for a class, and given I'd used a mix of associative and indexed arrays to build the data structure, it was simply a matter of returning the first parent node's children. Done.

Enough chatter, here's the code...

Update:

I have updated this from the first version I posted. This is slightly simpler, and also works even if the data is not pre-sorted. Thanks to Mingo for encouraging me to look into this.


<?php

namespace puzzle;

class Tree implements \JsonSerializable {

    private $parents = [];

    function __construct() {
        $tree = [
            "children" => []
        ];
        $this->parents[0] = $tree;
    }

    static function loadFromCsv($filePath) {
        $dataFile = fopen($filePath, "r");

        $tree = new Tree();
        while(list($id, $parent, $nodeText) = fgetcsv($dataFile)) {
            $tree->addNode($nodeText, $id, $parent);
        }

        return $tree;
    }

    private function addNode($nodeText, $id, $parent) {
        $parent = $parent === "" ? 0 : $parent;

        $this->parents[$id]["nodeText"] = $nodeText;
        $this->parents[$parent]["children"][] = &$this->parents[$id];
    }

    function jsonSerialize() {
        return $this->parents[0]["children"];
    }
}

The only other thing to note here is that the requirements indicated the data "should be in the form of an RDBMS resultset", but I could not be arsed horsing around with DB data for this, so I'm just reading a CSV file instead.

I also wrote the tests, as I said:

<?php

namespace test\puzzle;

use puzzle\Tree;

/** @coversDefaultClass \puzzle\Tree */
class TreeTest extends \PHPUnit_Framework_TestCase {

    private $testDir;

    function setup() {
        $this->testDir = realpath(__DIR__ . "/testfiles");
    }

    /**
     * @covers ::loadFromCsv
     * @covers ::__construct
     * @covers ::addNode
     * @covers ::jsonSerialize
     * @dataProvider provideCasesForLoadFromCsvTests
     */
    function testLoadFromCsv($baseFile){
        $files = $this->getFilesFromBase($baseFile);

        $result = Tree::loadFromCsv($files["src"]);
        $resultAsJson = json_encode($result);
        $resultAsArray = json_decode($resultAsJson);

        $expectedJson = file_get_contents($files["expectation"]);
        $expectedAsArray = json_decode($expectedJson);

        $this->assertEquals($expectedAsArray, $resultAsArray);
    }

    private function getFilesFromBase($base){
        return [
            "src" => sprintf("%s/%s/data.csv", $this->testDir, $base),
            "expectation" => sprintf("%s/%s/expected.json", $this->testDir, $base)
        ];
    }

    function provideCasesForLoadFromCsvTests(){
        return [
            "puzzle requirements" => ["testSet" => "puzzle"],
            "one element" => ["testSet" => "one"],
            "deep" => ["testSet" => "deep"],
            "flat" => ["testSet" => "flat"],
            "not ordered" =&gt ["testSet" =&gt "notOrdered"]
        ];
    }
}

There's no real surprise of discussion point there, beside highlighting the test cases I decided upon:

  • the puzzle requirements;
  • a single element;
  • a deep structure;
  • a flat structure.

I think those covered all the bases for a Friday Puzzle. An example of the data and expectation are thus (this is the "deep" example):

1,,Tahi
2,1,Rua
3,2,Toru
4,3,Wha
5,4,Rima
6,5,Ono
7,6,Whitu
8,7,Waru
9,8,Iwa
10,9,Tekau


[
  {
    "nodeText": "Tahi",
    "children": [
      {
        "nodeText": "Rua",
        "children": [
          {
            "nodeText": "Toru",
            "children": [
              {
                "nodeText": "Wha",
                "children": [
                  {
                    "nodeText": "Rima",
                    "children": [
                      {
                        "nodeText": "Ono",
                        "children": [
                          {
                            "nodeText": "Whitu",
                            "children": [
                              {
                                "nodeText": "Waru",
                                "children": [
                                  {
                                    "nodeText": "Iwa",
                                    "children": [
                                      {
                                        "nodeText": "Tekau"
                                      }
                                    ]
                                  }
                                ]
                              }
                            ]
                          }
                        ]
                      }
                    ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

All the rest are on GitHub.

I ran the code coverage report on the tests, and they're all green:


So that's that: surprisingly simple once I got into my head how to approach it.

Give it a blast. Reminder: the puzzle details are in this gist.

Righto.

--
Adam

Friday 26 August 2016

Friday code puzzle

G'day:

I'm waiting for a blood test and I'm "customer number 649", and they are currently processing "customer number 541". I have dubbed this place Franz Kafka Memorial Hospital.

Anyway, I have an hour or two to kill, so writing this quicky on my phone.

Last week on the CFML Slack Channel we piloted the notion of a Friday Puzzle. It went quite well, so we're continuing it. Don't be put off by the fact it's the CFML Slack Channel: we actively encourage our community to use any language they like, or different languages from usual, as it's a good way to improve one's skills as a programmer. I'll be doing this week's one in PHP. And maybe diversifying from there into something else.

This week's puzzle is as follows (in summary):

Challenge:

Convert a flat representation of a hierarchical data into a nested tree structure.

[...]

dataset:

idparentIdnodeText
1nullFile
2 1New
32File
42Folder
5 nullEdit
65Copy
75Cut
85Paste
9nullHelp
109About


Expected result:


[ 
    { 
        nodeText : "File",
        children : [ 
            { 
                nodeText : "New",
                children : [ 
                    {nodeText: "File"},
                    {nodeText: "Folder"}
                ] 
            }
        ] 
    },{ 
        nodeText : "Edit",
            children : [ 
                {nodeText: "Copy"},
                {nodeText: "Cut"},
                {nodeText: "Paste"}
            ]
    },{ 
        nodeText: "Help",
            children : [
                { nodeText: "About"}
            ]
    } 
]


I've omitted the rules and expectations and stuff. Full details are on the Slack Channel. You can sign-up at: http://cfml-slack.herokuapp.com/;

Give it a go!

Oh... And they're now processing customer 585.

Righto.

--
Adam

Thursday 18 August 2016

Breaking: Groovy and Clojure answers for that array-look-up code puzzle

G'day:
Well it's not really that breaking really... what I mean is a coupla other people posted some answers to last week's code puzzle after I wrote up the results ("Looking at the code from that code puzzle last week"). I was gonna just append them to the bottom of that earlier article, but hen no-one would see them, and that seemed a bit disrespectful. Also for this blog which is still mostly read by CFML devs, it's some exposure to other languages you might want to take the time to look at, and these are good examples why.

Sean

I was wondering what happened to Sean's Clojure example, but he was off leading his life instead of clearing away the cobwebs from my blog, so didn't notice the code puzzle initially. But here's his Clojure code (fully annotated for us non-Clojurians):

;; simplest solution to find first match -- note that `filter` returns
;; a lazy chunked sequence so it will not search the entire vector
;; however, the chunks are 32 elements in size so it will search up
;; to 31 elements beyond the first match
(first (filter #(re-find #".+at" %) ["at" "cat" "scat" "scratch"]))

;; Clojure has `some` but it returns the result of applying the predicate
;; not the original element so we need to write a "smarter" predicate:

;; will not work in all cases:
(some #(re-find #".+at" %) ["at" "cat" "scat" "scratch"])
;; this: (some #(re-find #".+at" %) ["scratch" "at" "cat" "scat"])
;; produces this: "scrat" -- oops!

;; will work with extended predicate:
(some #(when (re-find #".+at" %) %) ["at" "cat" "scat" "scratch"])

;; or we can use reduce with an early return -- the `reduced` value:
(reduce (fn [_ s] (when (re-find #".+at" s) (reduced s))) nil ["at" "cat" "scat" "scratch"])

;; a note about notation: #(.. % ..) is shorthand for (fn [x] (.. x ..))
;; i.e., an anonymous function with one argument

That looks like a bunch of code, but it's also four examples:

(first (filter #(re-find #".+at" %) ["at" "cat" "scat" "scratch"]))

(some #(re-find #".+at" %) ["at" "cat" "scat" "scratch"])

(some #(when (re-find #".+at" %) %) ["at" "cat" "scat" "scratch"])

(reduce (fn [_ s] (when (re-find #".+at" s) (reduced s))) nil ["at" "cat" "scat" "scratch"])


Now I've only had the most superficial look at Clojure, but even I can just read what's going on in that code. So that's cool. I've been off my game recently with my out-of-hours tech stuff - in case you hadn't noticed - and I really want to finish finding my motivation to get back to it, and look at more Clojure. I think it's a good thing to look at for a perennial CFMLer or PHPer as its quite the paradigm shift, but still seems pretty easy to get at least a superficial handle on, and then work from there.

Tony

Tony's done a Groovy example. Every time I see Groovy, it just seems cool. Check this out:

print( ['a', 'at', 'cat', 'scat', 'catch'].find { it ==~ '.+at' } )

That's it. Done. 67 characters, most of it data. 25 characters of actually "doing stuff", including more whitespace than I'd usually use for this sort of thing. Doesn't that make you want to use Groovy?

Anyway, that's that. I just wanted to share that code with y'all.

Righto.

--
Adam

Tuesday 16 August 2016

Looking at the code from that puzzle last week

G'day:
So last Fri I asked this:

[...] here's a code puzzle.

Rules:
  • You have an array of strings, eg: ['a', 'at', 'cat', 'scat', 'catch'].
  • Return the first value that matches a regex pattern, eg: '.+at' would match cat, scat, catch; but we want cat returned.
  • Do not use any looping statements (eg: do/for/while etc).
  • Bear in mind functions are not statements ;-)
  • The array could be very long, the strings could be very long, and the pattern could be complex. But the desired value could be early in the array.
  • Use any language.

I didn't have much of a reason for asking. I had to do something similar at work, except the match criteria were more complicated, and I tried a few ways and didn't like any of them. When processing a "plural" data structure - and array or a struct or some other sort of collection - I like to avoid generic looping statements if I can, as I don't think they're terribly declarative. If one needs to somehow transform a collection into something else, I prefer to use collection-iteration methods or functions like map, reduce, filter etc. I initially did this sort of thing (JavaScript):

var words = ["a", "at", "cat", "scat", "catch"];

var match = words.reduce(function(match, word){
    if (match !== undefined) return match;
    if (word.match(/.+at/)){
        return word;
    }
}, undefined);

console.log(match);

That's all good in theory. If one is taking a collection and the data processing of it returns a single (different) value, then reduce makes sense. The problem is that reduce iterates over the entire collection whether one needs to or not, so after I've found "cat" I'm just wasting effort looking at "scat" and "catch". Now I'm not one to worry about wasting cycles and that sort of micro-optimisation, but it still didn't sit well with me.

So next I considered using the "early exit" iteration method, some. I can stop that when I've finished. The problem is that some returns a boolean. And I needed a cat. But I could solve that with some closure:

var match;
words.some(function(word){
    if (word.match(/.+at/)){
        return match = word;
    }
});
console.log(match);

Note that return statement contains an assignment, and that evaluates to true, thus exiting the collection iteration (some ends as soon as the callback returns true).

That's all well and good, except I actually needed this done in PHP and it doesn't have a some function.

In the end I just used a foreach loop and was done with it:

$words = ["a", "at", "cat", "scat", "catch"];

$match = null;
foreach($words as $word) {
    if (preg_match("/.+at/", $word) === 1) {
        $match = $word;
        break;
    }
}

echo $match;

Still: all this got me thinking it was an interesting exercise. Well: kinda. So posted the puzzle to see what people came up with.

I added the constraint of no looping statements to push people towards using something more interesting. It was an artificial constraint.

Anyway, enough about me. Let's have a look at other people's code.

Isaiah

First up was Isaiah with a Ruby answer. This is short and sweet:

puts %w(a at cat scat catch).detect{|w| w =~ /.+at/ }

(I added the puts so I could check it output something)

detect is exactly what I needed for this. As per the detect docs:

Passes each entry in enum to block. Returns the first for which block is not false.
So like a combination of some and reduce, really. Cool. This answer is gonna be hard to beat (NB: I am only looking at the answers for the first time now!).

Brad

Next was Brad with a CFML answer:

echo( ['a', 'at', 'cat', 'scat', 'catch'].filter( function( i ) { return reFind( '.+at', i ) } ).first() )

This one only runs on Lucee. This slight revision works on CF2016 as well:

words = ['a', 'at', 'cat', 'scat', 'catch'];
writeOutput( words.filter( function( i ) { return reFind( '.+at', i ); } )[1] );

Brad's answer falls into the same trap my reduce version did: it keeps iterating after it could stop.

I do like how terse Brad's answer is here. Although it's borderline (borderline) unreadable. It demonstrates CFML can do some good stuff though.

Jesse

Jesse's JavaScript solution is next, once again going for the terseness prize:

console.log(['a', 'at', 'cat', 'scat', 'catch'].find((e)=>/.+at/.test(e)));

(again... the console.log is my addition).

I didn't know about the find method I have to admit, so that's cool. This is the equivalent of Isaiah's Ruby answer.

Bonus points for using an arrow function there!

Tyler

Tyler has done another JavaScript solution, this time using filter:

function getFirstArrayRegexMatch(array, regex){
  return (
    array.filter(function(element){
      if(element.match(regex)){ return element; }
    })[0]
  );
}

var a = ["a","at","cat","scat","catch"]
,   r = /.+at/;

document.write(getFirstArrayRegexMatch(a, r));

Bonus point for putting it in a function.

Ryan

Ryan's given me two JavaScript answers: one short-hand, one long-hand with a bunch of testing.

var firstMatch = (input, pattern) => input.find(item => item.match(pattern));

console.log(firstMatch(['a', 'at', 'cat', 'scat', 'catch'], '.+at'));

Similar to Jesse's answer.

function f (input, pattern) {
    var reverse = str => str.split('').reverse().join('');
    var reversedIndex = (l, i) => l - (i + 1);
    
    var j = '\n' + input.join('\n') + '\n',
        i = j.search(pattern);

    if (i == -1) return undefined;

    var end = j.indexOf('\n', i);
    var start = reversedIndex(j.length, reverse(j).indexOf('\n', reversedIndex(j.length, end) + 1));
    return j.substr(start + 1, end - start - 1);
}

console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.+at'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.+cat'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.t'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], 'a'));
console.log(f(['a', 'at', 'cat', 'scat', 'catch'], '.+xat'));

This shows that Ryan paid attention to the requirements. I specifically said the answer might be late in the array, and he's catering for that here.

Plus tests! Including unhappy path tests!

Ryan knows how to get bonus points from me, eh?

Choop

Choop's used CFML. It's interesting how most of my readership is still made up of CFMLers, but most of them here have chosen not to use it. Well: a bit interesting.

/* lucee or adobe CF 11+  */
mystrings = ['a', 'at', 'cat', 'scat', 'catch'];
mytest = '.+at';
function firstMatch( strings, regex ) {
    var test = arguments.regex;
    var passing = arguments.strings.filter( function ( check ) {
        var matches = REMatch( test, check );
        return matches.Len() > 0;
    });
    return passing[ 1 ];
}
WriteOutput( firstMatch( mystrings, mytest ) );

Choop's eschewed terseness in favour of writing good clear code.

Mingo

Next up is Mingo's answer. I clearly spoke too soon before as he's stuck with CFML too. He's taken the reduce route, similar to my approach:

  arrayOfStrings = [ 'a', 'at', 'cat', 'scat', 'catch' ];
  regex = '.+at';
  writeOutput( arrayOfStrings.reduce( function( result='', item ){ return len( result ) ? result : item.reFind( regex ) ? item : ''; } ) );

Adam Tuttle made a knowing comment against this one:

I thought about nested ternaries too but figured Adam would chastise me for it. ;)
And yer bloody right! ;-)

I'm all good for a single ternary expression, but as soon as there's more than one... I think everyone's getting confused pretty quickly?

Adam

And indeed Adam's own JavaScript entry is up next. Props for using ES2015 constructs. I especially like how it can now do interpolated strings. Like CFML has been doing for about 20yrs.

"use strict"

function firstMatch ( pattern, data ) {
  return data.reduce( ( prev, curr ) => {
      if ( prev.length > 0 ) { return prev }
      if ( pattern.test( curr ) ) { return curr }
      return prev
  }, '')
}

const pattern = /.+at/
let data = ['a', 'at', 'cat', 'scat', 'catch']

console.log( `the answer is: ${firstMatch( pattern, data )}` )

His logic is very similar to my own reduce example too. (Oh, I needed to add "use strict" to that code to get it to run on my node install. Not sure why... it might be old).

As I am writing this up, I have noticed two entries stuck in my moderation queue. Sorry about that.

Eric

Here's Eric's CFML answer:

    function first( arr, predicate, defaultValue ) {
        if ( arrayIsEmpty( arr ) ) {
            if ( ! isNull( defaultValue ) ) {
                return defaultValue;
            }

            throw( "Cannot return the result because the array is either empty or no value matched the predicate with no default value provided." );
        }

        if ( isNull( predicate ) ) {
            return arr[ 1 ];
        } else {
            arguments.arr = arr.filter( predicate );
            structDelete( arguments, "predicate" );
            return first( argumentCollection = arguments );
        }
    }
    
    answer = first( ['a', 'at', 'cat', 'scat', 'catch'], function( str ) {
        return reFind( '.+at', str );
    } );
    
    writeOutput( answer );

Good to see some validation going on in there. I also really like how his own function takes a callback to do the check for what constitutes a match. Bloody nice that.

Quan Tran

Coincidentally the most interesting answer is the last one (it would not have been last had it not got stuck in the moderation queue, that said). Here's Quan Tran's recursive CFML answer:

// http://blog.adamcameron.me/2016/08/code-quiz.html
function regexSearchArray(regexString,arrStrings){
    var localArrStrings = arrStrings;
    
    if (not arrayLen(localArrStrings))
        return;
    else if (refind(regexString, localArrStrings[1]))
        return localArrStrings[1];
    else{
        ArrayDeleteAt(localArrStrings,1);
        return regexSearchArray(regexString,localArrStrings);
    }
}
arrStrings = ['a', 'at', 'cat', 'scat', 'catch'];
writeoutput(regexSearchArray('.+at',arrStrings));

That is a very inventive way of getting around my "no loop statements" rule. Very impressed.



That's quite a few entries for a wee code puzzle on this blog (the readership of which has pretty much died since I moved from CFML, even allowing for the distinct lack of content recently). So thanks for that.

I think all these answers had merit and discussion points and had something interesting about them. I like how terse Jesse and others managed to get their answers. I liked how Isaiah used Ruby instead of the usual suspects (for this blog, anyhow... not much Ruby going on around here). I especially like how Ryan provided more tests.

But the winner is Quan Tran with his recursive solution. It might not be the most performant, but it's def the most interesting.

Cheers all. I have a few other dumb-arse quiz questions I might continue to ask. We'll see.

Righto.

--
Adam