Monday 22 December 2014

Weekend code puzzle: my answer (Go version)

G'day:
I sat down on Saturday / Sunday and taught myself enough Go to be able to answer the puzzle question from "Something for the weekend? A wee code puzzle (in CFML, PHP, anything really...)". I've certainly got a lot of mileage out of that one topic, eh? Oh well: I think it's a good baseline exercise for researchng new languages, and I like seeing other people's code, too. I'm so pleased I got such a good range of submissions. However it does mean it's taking me an age to slog through them all.

Right, so I discussed Adam Presley's two Go entries over the last few days: "Weekend puzzle: Adam Presley's answer (Go)" and "Weekend puzzle: Adam Presley's answer - redux (Go)", and I did a "G'day World in Go" exercise in prep for understanding his code. Not that a "G'day world" exercise was sufficient for that, but at least I got Go installed and compiling, and located the docs etc.



I've already done a few different implementations of the puzzle answer:
I've implemented the Go solution using the same general approach. I was a bit naughty with this exercise in that I wrote the code first, then back-filled the tests. For all the wittering-on about TDD I do, this is not ideal. I did consider my options before starting, but I decided it was more prudent an approach to do the exercise in teaching myself the basics of the language first, thus to facilitate then knowing how to write the tests. Truth be told, I just decided it'd be more interesting writing the code for the exercise than getting up to speed with testing it. I will hasten to add, though, that I'm glad I did do the testing, as it turned up a bug in my initial implementation. So there you go: do your tests. And unlike me... do them first!

OK, so the code. I've got two files. One that handles the UI, one that does the actual work. First the UI handling one.

// runGetSubseries.go

package main

import (
    "flag"
    "fmt"
    "github.com/daccfml/golang/quiz/subseries"
    "strconv"
    "strings"
)

var seriesAsString string
var threshold int

func init() {
    flag.StringVar(&seriesAsString, "series", "", "Series from which to extract subseries")
    flag.IntVar(&threshold, "threshold", 0, "Threshold within which the subseries total must be")
    flag.Parse()
}

func main() {
    series := convertSeriesToArray(seriesAsString)

    subseries := subseries.Get(series, threshold)

    fmt.Println(subseries)
}

func convertSeriesToArray(seriesAsString string) (seriesAsIntegers []int) {
    for _, stringElement := range strings.Split(seriesAsString, ",") {
        integerElement, e := strconv.Atoi(stringElement)
        if e == nil {
            seriesAsIntegers = append(seriesAsIntegers, integerElement)
        }
    }
    return
}

When I say "UI", there is no "G" element to that, I just mean the interface between user and program. This code handles receiving input from the command line, pre-processing it into a format the function needs, calling the function, outputting the result. None of this is anything to do with the actual function for the exercise, hence separating it out.

Because this is all new to me, there are a few things to note here:

Packages

Go is a package-centric language. Everything goes in a package, and indeed to do anything useful one needs to import packages to avail any sort of functionality (even basic string operations!). If one wants to compile a program as an executable, it needs to be in package main (from "How to Write Go Code: Package Names").

One interesting thing is that the golang.org bods actively encourage packages to be homed on GitHub (from "How to Write Go Code: Package Paths"). The cool thing about this is that the go install command knows how to fetch dependency packages from GitHub, and does so seamlessly when compiling the program. See "Command go: Compile and install packages and dependencies".

Variable declaration and assignments

This was non-obvious, and took me a bit to get a handle on. There's three constructs here which all do similar things. Although had I just read the docs instead of fumbling about, I would have got it quicker ("The Go Programming Language Specification: Variable declarations").

Variables are declared and assigned an initial value via the var keyword. If the variable isn't given a value at initialisation, it needs to be given a type. If it is given a default value, then the type is inferred from the value.

There is a shorthand syntax for initialisation with a value: := (there's also a plain assignment operator, =, the difference between the two throwing me, to start with).

Flags

I think it was a poor naming decision to refer to these as flags, but so be it. The flag package handles the manipulation of command-line options (/flags) ("Package flag"). Flags to me are boolean, not just "any old value", and also aren't limited to the domain of command line options. Still: it's what the Go bods decided.

So, anyhow, this code:

import (
    "flag"
    "fmt"
    "github.com/daccfml/golang/quiz/subseries"
    "strconv"
    "strings"
)

var seriesAsString string
var threshold int

func init() {
    flag.StringVar(&seriesAsString, "series", "", "Series from which to extract subseries")
    flag.IntVar(&threshold, "threshold", 0, "Threshold within which the subseries total must be")
    flag.Parse()
}

reads two command line options: -series and -threshold. One can also access command line options via the os package's Args array ("Package os: variables"), but the flags approach seems to be favoured. So I ran with that.

Notice here I'm passing references (via the & prefix) to those variables into the StringVar() / IntVar() mathods, not the variables themselves. This is because those methods set the values for those variables, rather than using their values.

I'm setting some help text there as well, so when I've compiled the code, I can do this:

C:\src\go\src>quiz -h
Usage of quiz:
  -series="": Series from which to extract subseries
  -threshold=0: Threshold within which the subseries total must be

And an example of running it:

C:\src\go\src>quiz -series=100,100,200 -threshold=300
[100 200]

C:\src\go\src>

But that's getting ahead of myself.

init() function

As per the docs ("Effective Go: The init function"), this function is run after any package-level variable declarations, and before main() is called. So a good place - IMO - to get the args from the command line.

main() function

As I mentioned earlier, an executable program needs to have a main() function. This is the code that's run when calling the executable. Here it is again:

func main() {
    series := convertSeriesToArray(seriesAsString)

    subseries := subseries.Get(series, threshold)

    fmt.Println(subseries)
}

This is pretty simple. I've only been passed a comma-delimited list of series items, and I need that as an array (of integers), which is a bit of horsing about, so I've factored that out into a helper function (see below). Then I have the inputs I need, so I run my function (I'm getting that that), and output the result. I showed an example of this above.

One doesn't need to compile Go code to run it, one can just run it from source:

C:\src\go\src\github.com\daccfml\golang\quiz>go run runGetSubseries.go -series=100,200,300 -threshold=400
[100 200]

C:\src\go\src\github.com\daccfml\golang\quiz>

This is pretty bloody slow though.

convertSeriesToArray()

As I mentioned, this takes my comma-delimited string list and converts it to an array of integers (which is what my function requires to run):

func convertSeriesToArray(seriesAsString string) (seriesAsIntegers []int) {
    for _, stringElement := range strings.Split(seriesAsString, ",") {
        integerElement, e := strconv.Atoi(stringElement)
        if e == nil {
            seriesAsIntegers = append(seriesAsIntegers, integerElement)
        }
    }
    return
}

One interesting and quite cool thing here is that as part of the function declaration, one can specify not only the return type, but also the variable which will contain the return type (see "Effective Go: Named result parameters"). Meaning when one comes to return, one doesn't need to specify it. This makes a kind of sense.

Have a look at the for loop there; there's some slightly odd syntax with the _. for passes an index and a value into the loop body. Here we are not interested in the index, so we don't use it. _ just indicates we don't want to use that index. See "Effective Go: For" (third option, relating looping over arrays etc). The underscore is called the "blank identifier".

The other thing of note here is see how Atoi() returns two values: the result of the atoi process, as well as a potential error. See "Effective Go: (Functions) Multiple return values".

Oh, and Go has a built-in function for appending to slices (I'll get to those shortly). See "Built-in package: append".

The actual function

Right, so that was all lovely, but it doesn't actually have anything to do with the task at hand. Here's my Go implementation of the puzzle answer:

// Subseries.go

package subseries

func Get(series []int, threshold int) []int {
    subseries := series[0:0]
    low := 0
    for index, _ := range series {
        high := index + 1
        working := series[low:high]

        for sumSlice(&working) > threshold {
            low++
            working = series[low:high]
        }

        if len(working) > len(subseries) {
            subseries = working
        } else if (len(working) == len(subseries)) && (sumSlice(&working) > sumSlice(&subseries)) {
            subseries = working
        }
    }
    return subseries
}

It took me an age to work out how to get library packages and public methods sorted out. There's plenty of docs, and (kinda) helpful error messages, but it just didn't sink in for a while.

The chief thing that threw me is that to make a method public... start its name with a capital letter. That sounds like a bit of a crappy way of doing things, really. How does "use a capital letter" equate to "public". It's not intuitive. I would have thought "public unless otherwise stated" would be a better approach. I dislike having methods starting with capital letters, but I guess I'll need to get used to it.

Slices

This is the most interesting new thing for me. This concept Go has of "slices" ("Effective Go: Slices"). A slice is not just Go terminology for an array; Go also has arrays. A slice is kinda like a "window" into a section (/slice) of an underlying array. I'd not entirely sure why Go decided to go this route, if I'm honest.

So in my function the subseries I ultimately return is a slice of the series array. The algorithm here simply works out which section of the array holds the best subseries, and returns that slice of it. Note that a slice does not imply any slicing action on the array itself, it's literally just a sort of overlay which indicates a specific element range within the array. I think this analogy works really well for the requirements of the exercise here.

One interesting side effect of using a slice here is that when I loop through the series, I don't actually need to use the value from the series in my calculations. I just use the loop to set what the latter element of a potential subseries is. I do need to sum the values indicated by the slice, but that's only when checking if the current working slice is a "good 'un".

Note that when I initialise subseries, I am setting it to be a slice of series, currently looking at just the zeroth element. One doesn't have just "a slice", it's "a slice of a given array". There is a strong association between the slice and its underlying array.

Omission

One less good thing is that Go doesn't have the prefix increment operator, so I need to do this:

for sumSlice(&working) > threshold {
    low++
    working = series[low:high]
}

Rather than just this:

for sumSlice(&working) > threshold {
    working = series[++low:high]
}

Pity.

Summing a slice

I'm not too sure I approve of this, but Go doesn't have much built-in support for array (/slice) operations. Not even a sum function. So I've needed to roll me own (which, granted, is not too arduous):

func sumSlice(slice *[]int) (sum int) {
    for _, value := range *slice {
        sum += value
    }
    return
}

I'd've thought there's be a "slice" package with stuff like this, but I was buggered if I could find one. I did find a lot of ppl asking about it on Stack Overflow though.

In this function I'm passing a pointer (*) to the slice being summed, not the slice itself. Mostly because I can, but also because I don't actually need a copy of it, I just need to access its values (in a read-only fashion).

Go doesn't have any reduce functionality either, otherwise I would have used a reduction to get the sum, not a loop.

Testing

Go has testing built into it ("Package: testing"). This is pretty cool! Basically one just creates a file with a filename suffix of _test, and import the testing package, and off one goes. Any method within that file which starts with Test is... well... a test.

Here's an abbrev. of my tests:

// Subseries_test.go

package subseries

import (
    "reflect"
    "testing"
)

// [...]

func TestReturnsSubseriesWithHighestTallyWhenThereAreTwoEqualLengthSubseriesAndSecondSubseriesIsHigher(t *testing.T) {
    series := []int{100, 50, 50, 50, 50, 50, 500, 100, 60, 60, 60, 60, 60, 500}
    threshold := 500
    expected := []int{100, 60, 60, 60, 60, 60}
    result := Get(series, threshold)

    assertEquals(t, expected, result)
}

// [...]

func assertEquals(t *testing.T, expected []int, actual []int) {
    if !reflect.DeepEqual(actual, expected) {
        t.Errorf("\nexpected %d\n actual %d", expected, actual)
    }
}

As all the tests are much the same, I've only included the one with the stupidest name (the names are directly derived from the TestBox tests for my CFML version of this code.

This also shows how array-literal syntax is implemented in Go.

There's nothing really that surprising there. Oh... note that because this test file is in the same package as the actual method being tested, I do not need to qualify my references to the Get() method.

Go doesn't have any inbuilt assertions, so I've rolled my own assertEquals(). Also note that Go doesn't have a native way of comparing arrays, so I'm using the reflect package's DeepEqual() method. This is not a very performant way of doing the comparison, but as this is only tests, code clarity is more of a consideration than performance in my view.

The tests are run thus:

C:\src\go\src\github.com\daccfml\golang\quiz\subseries>go test
PASS
ok      github.com/daccfml/golang/quiz/subseries        0.638s

C:\src\go\src\github.com\daccfml\golang\quiz\subseries>

If there was an error, it'd be indicated thus:

C:\src\go\src\github.com\daccfml\golang\quiz\subseries>go test
--- FAIL: TestReturnsSubseriesWithHighestTallyWhenThereAreTwoEqualLengthSubserie
sAndSecondSubseriesIsHigher (0.00s)
        Subseries_test.go:144:
                expected [100 60 60 60 60 60 60]
                 actual [100 60 60 60 60 60]
FAIL
exit status 1
FAIL    github.com/daccfml/golang/quiz/subseries        0.518s

C:\src\go\src\github.com\daccfml\golang\quiz\subseries>

So that's all pretty easy.

Conclusion

I rather like Go, actually. It's got some idiosyncracies, but on the whole it seems well-thought-out. One thing I forgot to mention is that there's a really good Sublime Text plug-in, Go Sublime, which gives code completion, hinting, linting etc, which makes writing the code pretty easy too.

--
Adam