Showing posts with label Groovy. Show all posts
Showing posts with label Groovy. Show all posts

Tuesday 28 November 2017

That array_map quandary implemented in other languages

G'day:
A coupla days ago I bleated about array_map [having] a dumb implementation. I had what I thought was an obvious application for array_map in PHP, but it couldn't really accommodate me due to array_map not exposing the array's keys to the callback, and then messing up the keys in the mapped array if one passes array_map more than one array to process.

I needed to remap this:

[
    "2008-11-08" => "Jacinda",
    "1990-10-27" => "Bill",
    "2014-09-20" => "James",
    "1979-05-24" => "Winston"
]

To this:

array(4) {
  '2008-11-08' =>
  class IndexedPerson#3 (2) {
    public $date =>
    string(10) "2008-11-08"
    public $name =>
    string(7) "Jacinda"
  }
  '1990-10-27' =>
  class IndexedPerson#4 (2) {
    public $date =>
    string(10) "1990-10-27"
    public $name =>
    string(4) "Bill"
  }
  '2014-09-20' =>
  class IndexedPerson#5 (2) {
    public $date =>
    string(10) "2014-09-20"
    public $name =>
    string(5) "James"
  }
  '1979-05-24' =>
  class IndexedPerson#6 (2) {
    public $date =>
    string(10) "1979-05-24"
    public $name =>
    string(7) "Winston"
  }
}

Note how the remapped object also contains the original key value. That was the sticking point. Go read the article for more detail and more whining.

OK so my expectations of PHP's array higher order functions are based  on  my experience with JS's and CFML's equivalents. Both of which receive the key as well as the value in all callbacks. I decided to see how other languages achieve the same end, and I'll pop the codee in here for shits 'n' giggles.


CFML

Given most of my history is as a CFML dev, that one was easy.

peopleData = ["2008-11-08" = "Jacinda", "1990-10-27" = "Bill", "2014-09-20" = "James", "1979-05-24" = "Winston"]

people = peopleData.map((date, name) => new IndexedPerson(date, name))

people.each((date, person) => echo("#date# => #person#<br>"))

Oh, this presupposes the IndexedPerson component. Due to a shortcoming of how CFML works, components must be declared in a file of their own:

component {

    function init(date, name) {
        this.date = date
        this.name = name
    }

    string function _toString() {
        return "{date:#this.date#; name: #this.name#}"
    }
}


But the key bit is the mapping operation:

people = peopleData.map((date, name) => new IndexedPerson(date, name))

Couldn't be simpler (NB: this is Lucee's CFML implementation, not ColdFusion's which does not yet support arrow functions).

The output is:


2008-11-08 => {date:2008-11-08; name: Jacinda}
1990-10-27 => {date:1990-10-27; name: Bill}
2014-09-20 => {date:2014-09-20; name: James}
1979-05-24 => {date:1979-05-24; name: Winston}

Also note that CFML doesn't have associative arrays, it has structs, so the keys are not ordered. This does not matter here. (Thanks to Zac for correcting me here: CFML does have ordered structs these days).


JS

The next language I turned to was JS as that's the I'm next most familiar with. One thing that hadn't occurred to me is that whilst JS's Array implementation has a map method, we need to use an object here as the keys are values not indexes. And whilst I knew Objects didn't have a map method, I didn't know what the equivalent might be.

Well it turns out that there's no real option to use a map here, so I needed to do a reduce on the object's entries, Still: it's pretty terse and obvious:

class IndexedPerson {
    constructor(date, name) {
        this.date = date
        this.name = name
    }
}

let peopleData = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

let people = Object.entries(peopleData).reduce(function (people, personData) {
    people.set(personData[0], new IndexedPerson(personData[0], personData[1]))
    return people
}, new Map())

console.log(people)

This returns what we want:

Map {
  '2008-11-08' => IndexedPerson { date: '2008-11-08', name: 'Jacinda' },
  '1990-10-27' => IndexedPerson { date: '1990-10-27', name: 'Bill' },
  '2014-09-20' => IndexedPerson { date: '2014-09-20', name: 'James' },
  '1979-05-24' => IndexedPerson { date: '1979-05-24', name: 'Winston' } }

TBH I think this is a misuse of an object to contain basically an associative array / struct, but so be it. It's the closest analogy to the PHP requirement. I was able to at least return it as a Map, which I think is better. I tried to have the incoming personData as a map, but the Map prototype's equivalent of entries() used above is unhelpful in that it returns an Iterator, and the prototype for Iterator is a bit spartan.

I think it's slightly clumsy I need to access the entries value via array notation instead of some sort of name, but this is minor.

As with all my code, I welcome people showing me how I should actually be doing this. Post a comment. I'm looking at you Ryan Guill ;-)

Java

Next up was Java. Holy fuck what a morass of boilterplate nonsense I needed to perform this simple operation in Java. Deep breath...

import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class IndexedPerson {
    String date;
    String name;
    
    public IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }
    
    public String toString(){
        return String.format("{date: %s, name: %s}", this.date, this.name);
    }
}

class Collect {

    public static void main(String[] args) {

        HashMap<String,String> peopleData = loadData();

        HashMap<String, IndexedPerson> people = mapToPeople(peopleData);
            
        dumpIdents(people);
    }
    
    private static HashMap<String,String> loadData(){
        HashMap<String,String> peopleData = new HashMap<String,String>();
        
        peopleData.put("2008-11-08", "Jacinda");
        peopleData.put("1990-10-27", "Bill");
        peopleData.put("2014-09-20", "James");
        peopleData.put("1979-05-24", "Winston");
        
        return peopleData;
    }
    
    private static HashMap<String,IndexedPerson> mapToPeople(HashMap<String,String> peopleData) {
        HashMap<String, IndexedPerson> people = (HashMap<String, IndexedPerson>) peopleData.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> new IndexedPerson(e.getKey(), e.getValue())
            ));
            
        return people;
    }
    
    private static void dumpIdents(HashMap<String,IndexedPerson> people) {
        for (Map.Entry<String, IndexedPerson> entry : people.entrySet()) {
            System.out.println(String.format("%s => %s", entry.getKey(), entry.getValue()));
        }
    }
    
}

Result:
1979-05-24 => {date: 1979-05-24, name: Winston}
2014-09-20 => {date: 2014-09-20, name: James}
1990-10-27 => {date: 1990-10-27, name: Bill}
2008-11-08 => {date: 2008-11-08, name: Jacinda}

Most of that lot seems to be just messing around telling Java what types everything are. Bleah.

The interesting bit - my grasp of which is tenuous - is the Collectors.toMap. I have to admit I derived that from reading various Stack Overflow articles. But I got it working, and I know the general approach now, so that's good.

Too much code for such a simple thing though, eh?


Groovy

Groovy is my antidote to Java. Groovy makes this shit easy:

class IndexedPerson {
    String date
    String name

    IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }

    String toString(){
        String.format("date: %s, name: %s", this.date, this.name)
    }
}

peopleData = ["2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"]

people = peopleData.collectEntries {date, name -> [date, new IndexedPerson(date, name)]}

people.each {date, person -> println String.format("%s => {%s}", date, person)}

Bear in mind that most of that is getting the class defined, and the output. The bit that does the mapping is just the one line in the middle. That's more like it.

Again, I don't know much about Groovy… I had to RTFM to find out how to do the collectEntries bit, but it was easy to find and easy to understand.

I really wish I had a job doing Groovy.

Oh yeah, for the sake of completeness, the output was thus:

2008-11-08 => {date: 2008-11-08, name: Jacinda}
1990-10-27 => {date: 1990-10-27, name: Bill}
2014-09-20 => {date: 2014-09-20, name: James}
1979-05-24 => {date: 1979-05-24, name: Winston}


Ruby

Ruby's version was pretty simple too as it turns out. No surprise there as Ruby's all about higher order functions and applying blocks to collections and stuff like that.

class IndexedPerson

    def initialize(date, name)
        @date = date
        @name = name
    end

    def inspect
        "{date:#{@date}; name: #{@name}}\n"
    end
end

peopleData = {"2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"}

people = peopleData.merge(peopleData) do |date, name|
    IndexedPerson.new(date, name)
end

puts people

Predictable output:

{"2008-11-08"=>{date:2008-11-08; name: Jacinda}
, "1990-10-27"=>{date:1990-10-27; name: Bill}
, "2014-09-20"=>{date:2014-09-20; name: James}
, "1979-05-24"=>{date:1979-05-24; name: Winston}
}

I wasn't too sure about all that block nonsense when I first started looking at Ruby, but I quite like it now. It's easy to read.


Python

My Python skills don't extend much beyond printing G'day World on the screen, but it was surprisingly easy to google-up how to do this. And I finally got to see what Python folk are on about with this "comprehensions" stuff, which I think is quite cool.

class IndexedPerson:
    def __init__(self, date, name):
        self.date = date
        self.name = name

    def __repr__(self):
        return "{{date: {date}, name: {name}}}".format(date=self.date, name=self.name)

people_data = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

people = {date: IndexedPerson(date, name) for (date, name) in people_data.items()}

print("\n".join(['%s => %s' % (date, person) for (date, person) in people.items()]))


And now that I am all about Clean Code, I kinda get the "whitespace as indentation" thing too. It's clear enough if yer code is clean in the first place.

The output of this is identical to the Groovy one.

Only one more then I'll stop.

Clojure

I can only barely do G'day World in Clojure, so this took me a while to work out. I also find the Clojure docs to be pretty impentrable. I'm sure they're great if one already knows what one is doing, but I found them pretty inaccessible from the perspective of a n00b. It's like if the PHP docs were solely the user-added stuff at the bottom of each docs page. Most blog articles I saw about Clojure were pretty much just direct regurgitation of the docs, without much value-add, if I'm to be honest.

(defrecord IndexedPerson [date name])

(def people-data (array-map "2008-11-08" "Jacinda" "1990-10-27" "Bill" "2014-09-20" "James" "1979-05-24" "Winston"))

(def people
  (reduce-kv
    (fn [people date name] (conj people (array-map date (IndexedPerson. date name))))
    (array-map)
    people-data))

(print people)

The other thing with Clojure for me is that the code is so alien-looking to me that I can't work out how to indent stuff to make the code clearer. All the examples I've seen don't seem very clear, and the indentation doesn't help either, I think. I guess with more practise it would come to me.

It seems pretty powerful though, cos there's mot much code there to achieve the desired end-goal.

Output for this one:

{2008-11-08 #user.IndexedPerson{:date 2008-11-08, :name Jacinda},
1990-10-27 #user.IndexedPerson{:date 1990-10-27, :name Bill},
2014-09-20 #user.IndexedPerson{:date 2014-09-20, :name James},
1979-05-24 #user.IndexedPerson{:date 1979-05-24, :name Winston}}


Summary

This was actually a very interesting exercise for me, and I learned stuff about all the languages concerned. Even PHP and CFML.

I twitterised a comment regarding how pleasing I found each solution:


This was before I did the Clojure one, and I'd slot that in afer CFML and before JS, making the list:
  1. Python
  2. Ruby
  3. Groovy
  4. CFML
  5. Clojure
  6. JS
  7. PHP
  8. Java

Python's code looks nice and it was easy to find out what to do. Same with Ruby, just not quite so much. And, really same with Groovy. I could order those three any way. I think Python tips the scales slightly with the comprehensions.

CFML came out suprisingly well in this, as it's a bloody easy exercise to achieve with it.

Clojure's fine, just a pain in the arse to understand what's going on, and the code looks a mess to me. But it does a lot in little space.

JS was disappointing because it wasn't nearly so easy as I expected it to be.

PHP is a mess.

And - fuck me - Java. Jesus.

My occasional reader Barry O'Sullivan volunteered some input the other day:


Hopefully he's still up for this, and I'll add it to the list so we can have a look at that code too.

Like I said before, if you know a better or more interesting way to do this in any of the languages above, or any other languages, make a comment and post a link to a Gist (just don't put the code inline in the comment please; it will not render at all well).

I might have another one of these exercises to do soon with another puzzle a friend of mine had to recently endure in a job-interview-related coding test. We'll see.

Righto.

--
Adam

Saturday 30 January 2016

CFML (or probably LuceeLang) and what constitutes "The Truth"

G'day:
This all stems from a Jira ticket for Lucee: "Redefining truthy and falsey values (LDEV-449)". The way the ticket is worded means it's not convenient to copy and paste a chunk of its detail to explain the gist of it, so I'll reiterate.

At the moment CFML relies on type coercion rules determine which values are valid to be used where boolean values are expected. Obviously at the base of things boolean values are true or false. But because CFML is loosely and dynamically typed, one doesn't need to simply use actual booleans in boolean expressions. The following value for other types will also coerce to boolean when called for:

TypeValueCoerces to
String"true" (as opposed to true, which is a keyword, and fundamentally boolean already)true
String"false"false
String"Yes"true
String"No"false
Numeric!=0true
Numeric0false

There are other coercions that will feed into these, for example strings with numeric values - eg: "42" (as opposed to just 42) - will happily coerce into a numeric, then from there to a boolean.

What this ticket is about is extending this to have a notion of "truthy" and "falsy" for other values and other data types. This notion stems from Groovy (and perhaps other languages) which is pretty liberal with what it will consider truthy or falsy. These terms "truthy" and "falsy" denote the values are not actually boolean, but will pass / fail a boolean check.

Here's a quick demo, which is fairly self-explanatory:

print "populated string: "
println " " ? "truthy" : "falsy" 

print "empty string: "
println "" ? "truthy" : "falsy" 
println "============="

print "non-zero: "
println (-1 ? "truthy" : "falsy") 

print "zero: "
println 0 ? "truthy" : "falsy" 
println "============="

print "populated collection: "
println([null] ? "truthy" : "falsy") 

print "empty collection: "
println([] ? "truthy" : "falsy") 
println "============="

print "populated map: "
println([key:""] ? "truthy" : "falsy") 

print "empty map: "
println([:] ? "truthy" : "falsy")
println "============="

class Test{
    String value
    
    Test(value){
        this.value = value
    }
    
    boolean asBoolean(){
        return (value == "truthy")
    }
}

print "truthy object: "
println new Test("truthy") ? "truthy" : "falsy" 

print "falsy object: "
println new Test("anything else") ? "truthy" : "falsy" 
println "============="


And the output:

>groovy truthy.groovy
populated string: truthy
empty string: falsy
=============
non-zero: truthy
zero: falsy
=============
populated collection: truthy
empty collection: falsy
=============
populated map: truthy
empty map: falsy
=============
truthy object: truthy
falsy object: falsy
=============

>

So all the common data types, and indeed custom data types have a sense of truthiness (and, by inversion: falsiness).

This is the gist of the ticket. Basically... checking for emptiness in an object (irrespective of what the definition of "empty" is), is a very very common requirement, and this truthiness concept just gives the ability to abbreviate that via some synatactic sugar and an agreed-on method of identifying what it is to be empty, and fielding that as a boolean: empty is "falsy", and not-empty is "truthy".

I think it's a good idea.

There's been some kick-back on it by people who don't seem to understand the concept of "syntactic sugar" and simply "don't get" the ticket. I'll leave it to you to read their comments and form your own opinion.

Now, as for implementation, I think there is a critical step here, as suggested by that Groovy code. LAS need to implement this as an interface, for both Java and LuceeLang. See how the demo class I have there implements an asBoolean() method:

boolean asBoolean(){
    return (value == "truthy")
}

Any class that implements that method can then have its objects used in boolean expressions. Basically they need to implement something like a BooleanBehaviour interface. Obviously my class there doesn't implement any interfaces, but I have not got far enough into Groovy to know whether interface implementation is just fulfilled by behaviour, rather than by contract (like it is in CFML).

I think it's essential for Lucee's implemention of this to follow an approach like this. Not just to bung some behaviour in to allow its current types to magically behave like booleans, but implement it in a formal fashion. And also expose the same interface to both Lucee users' .lucee code, but Java code too.

The current ticket seems to be asking for this for Lucee's CFML implementation, but I think this is a mistake because some of the CFML's implementation's automatic coercion to boolean conflicts with the "truthy" approach. Most notably a string "0" in CFML is false, because it will be converted to a numeric, and 0 is false. But using sensible "truthy" behaviour for strings... "" is false, and any other string is true. It makes little sense (and violates POLA,  IMO) to have "" and "0" as false, and [any other string] be true. It would just make CFML look like it "didn't quite get it" when it implemented this feature.

Also note that Brad's narrative mentions toBoolean(), whereas Groovy's handling is asBoolean(). This might seem like a semantic difference, but I think it requires closer examination. Groovy has both toBoolean() and asBoolean(); and it looks like C# does too. Best LAS get this one right. At least understand the differences (which I don't, I have to say), before deciding how to approach.

Sorry this article isn't so interesting. I started to percolate away at it during the week and reckon I got enough content for an article, but never really achieved any real nuance to it. Still: for something that started as a Jira comment I guess it's all right.

Back to me Guinness...

--
Adam

Friday 22 January 2016

Groovy: making and using a jar file

G'day:
First up: there's nothing clever in any of this, and I'm approaching it as a newbie. I just had to work out how to do this - which I have - and want to document it. It's nothing you couldn't find on Google with 15min effort.

We need to start writing some Groovy - cool! - for some tests we need to write. We're using SoapUI to test some of our web service end points (JSON schema validation and the like) and its built-in scripting language is Groovy. Our app itself is PHP, but that's irrelevant as it's hidden behind HTTP requests anyhow.

We have some shared utils that we want to us in our tests, and we're thinking about sticking these in a jar file so they can be a) factored-out into a separate source control project; b) plus just making our tests more focused. So I have to find out how to actually compile Groovy code, and stick it in a jar. I know if we're using jars then the source language doesn't need to be Groovy, but we've already got a bunch of the code written, so this is a refactoring exercise.

For my part, all my Groovyage so far has just been individual test scripts, run by the command-line interpreter. I've never needed to compile anything.

I'm not in the situation to actually work with our real test utils yet, so this is just a contrived example.

Here's my class:

package me.adamcameron.greetingapp

class Greeter {

    String greet(String name){
        return "G'day ${name}";
    }

}

Note I've got the package statement in there too. This is homed in me/adamcameron/greetingapp.

I googled about how to compile Groovy, and strangely enough it's much the same as Java:

>groovyc me\adamcameron\greetingapp\Greeter.groovy

>

This creates me/adamcameron/greetingapp/Greeter.class. I then jar it up in the usual way:

>jar cvf greetingapp.jar -C me\adamcameron\greetingapp .
added manifest
adding: Greeter.class(in = 4996) (out= 2092)(deflated 58%)
adding: Greeter.groovy(in = 120) (out= 101)(deflated 15%)

>


From there I just need to test it:

// testGreeter.groovy
import me.adamcameron.greetingapp.*

greeter = new Greeter()

println greeter.greet("Zachary")

And for Groovy to know where the jar file is, I need it to be on the class path:

>groovy -cp greetingapp.jar testGreeter.groovy

G'day Zachary

>

So there we go.

I didn't expect it to be tricky, but pleased how easy it really is.

Told you there'd be nothing insightful in this one!

Righto.

--
Adam

Tuesday 19 January 2016

Floating point arithmetic with decimals

G'day:
As a human... what is the value of z, after you process this pseudocode with your wetware:

x = 17.76
y = 100
z = x * y

Hopefully you'd say "1776". It was not a trick question.

And that's an integer, right? Correct.

CFML

Now... try this CFML code:

x = 17.76;
y  = 100;
z = x*y;

writeOutput(z);

1776 So far so good.

But what about this:

writeOutput(isValid("integer", z));

You might think "YES" (or true if yer on Lucee), however it's "NO".

And this is where young players fall into the trap. They get all annoyed with isValid() getting it wrong, etc. Which, to be fair, is a reasonable assumption with isValid(), but it's not correct in this instance. It's the young player who is mistaken.

If we now do this:

writeOutput(z.getClass().getName());

We get: java.lang.Double

OK, but 1776 can be a Double, sure. But CFML should still consider a Double 1776 as a valid integer, as it should be able to be treated like one. So why doesn't it? What if we circumvent CFML, and go straight to Java:

writeOutput(z.toString());

1776.0000000000002

Boom. Floating point arithmetic inaccuracy.

Never ever ever forget, everyone... when you multiply floating point numbers with decimals... you will get "unexpected" (but you should pretty much expect it!) floating point accuracy issues. This is for the perennial reason that what's easy for us to express in decimal is actually quite hard for a computer to translate into binary accurately.

Aside: we were chatting about all this on the CFML Slack channel this morning, and one person asked "OK, so how come 17.75 x 100 works and 17.76 x 100 does not?". This is because a computer can represent 0.75 in binary exactly (2-1 + 2-2), whereas 0.76 can only be approximated, hence causing the "issue".

The problem really is that CFML should simply output 1776.0000000000002 when we ask it, and it should not try to be clever and hide this stuff. Because it's significant information. Then when the young player output the value, they'd go "oh yeah, better round that" or whatever they need to do before proceeding. CFML is not helping here.

This is pretty ubiquitous in programming. Let's have a trawl through the various languages I can write the simplest of code in:

JavaScript


x = 17.76;
y = 100;
z = x * y

console.log(z);


>node jsVersion.js
1776.0000000000002

>

JS just does what it's told. Unsurprisingly.

Groovy


x = 17.76
y = 100
z = x * y
println "x * y: " + z

println "x: " + x.getClass().getName()
println "y: " + y.getClass().getName()
println "z: " + z.getClass().getName()
println "z: " + z.toString()


>groovy32 groovyVersion.groovy

x * y: 1776.00
x: java.math.BigDecimal
y: java.lang.Integer
z: java.math.BigDecimal
z: 1776.00
>


This is interesting. Whilst Groovy keeps the result as a float (specifically a BigDecimal) - which is correct - it truncates it to the total number of decimal places expressed in its factors. That's how I was taught to do it in Physics at school, so I like this. This second example makes it more clear:

x = 3.30
y = 7.70
z = x * y
println "x * y: " + z

println "x: " + x.getClass().getName()
println "y: " + y.getClass().getName()
println "z: " + z.getClass().getName()
println "z: " + z.toString()


>groovy32 more.groovy
x * y: 25.4100
x: java.math.BigDecimal
y: java.math.BigDecimal
z: java.math.BigDecimal
z: 25.4100
>

In 3.30 and 7.70 there are four decimal places expressed (ie: two for each factor), so Groovy maintains that accuracy. Nice!


Java


import java.math.BigDecimal;

class JavaVersion {

    public static void main(String[] args){
        double x = 17.76;
        int y = 100;
        System.out.println(x*y);
        
        BigDecimal x2 = new BigDecimal(17.76);
        BigDecimal y2 = new BigDecimal(100);
        System.out.println(x2.multiply(y2));
        
    }
}

Here I added a different variation because I was trying to see why the Groovy code behaved the way it did, but it didn't answer my question. I suspected that perhaps it was a BigDecimal thing how it decided on the accuracy of the result, but it wasn't:


>java JavaVersion
1776.0000000000002
1776.000000000000156319401867222040891647338867187500

>

This is a good demonstration of how a simply base-10 decimal fraction is actually an irrational number in binary.

Saturday 7 December 2013

CFML weirdness with chr(0)

G'day:
I was trying to help someone on Stack Overflow y/day with their question "How to pad a string with null/zero-bytes in ColdFusion & difference between CF on MacOS and Windows". The question is actually shorten than the title: how to add null characters to a string in CFML.

I thought the answer was a simple "use chr(0)", but this turned out to not be a viable answer on ColdFusion (or Railo for that matter).

In response to my suggestion, Stack Overflow veteran Leigh made the observation "Unfortunately no. The character just disappears. But URLDecode("%00") will generate a single null byte". I've not known Leigh to be wrong so I didn't doubt him, but that sounded really odd, so I decided to check it out.

<!--- baseline.cfm --->
<cfset s = "test" & chr(0)>
<cfoutput>
string: [#s#]<br>
length: [#len(s)#]<br>
</cfoutput>

And the - surprising to me - output:

string: [test]
length: [4]


Um... where gone?

I tried this on Railo... thinking Railo's more likely to get it right than CF is, and it had the same output. Testing on OpenBD got what I'd consider to be the correct results:

string: [test]
length: [5]


The NULL isn't printable, so doesn't render anything, but it should still actually be there, and that string should consist of five bytes: 0, 116, 101, 115, 116. That's a length of five. As per OpenBD's output.

That said, I know that NULLs do have special meaning in some strings, for example C has the concept of a null-terminated string, in which the null signifies the end of the string, not a character in the string. I wasn't aware of this being a "thing" in Java, but maybe it was.

I refined my code somewhat to not be simply end-padding the string:

<!--- outputStringContainingChr0.cfm --->
<cfset s = chr(0) & "foo#chr(0)#">
<cfoutput>#s#:#len(s)#:#asc(left(s,1))#</cfoutput>

So I'm sticking a NULL at the beginning and end of the string. If it was acting as a terminator, s would simply be an empty string afterwards. But I get this (CF and Railo both):

foo:3:102

102 is indeed the ASCII code for f.


So what's the story here? I still had a suspicion that something "non-stupid" was happening here, and I just didn't get it. Maybe there's something about a NULL char's standard handling that means it's not added to strings. Although this seems far-fetched as obviously there's use-cases for it (see the Stack Overflow question), and indeed Leigh came up with the fudged way to do it:

Groovy: G'day World

G'day:
I'm investigating some ColdFusion stupidity (more on that after this article), and as part of my comparison of the behaviour in other languages, I figured I should also test on another JVM-based language. And given all the adulation of Groovy at the moment, and an exceptional presentation "Grails/Groovy Primer for ColdFusion Developers" from Scott Stroz, I decided to D/L and install Groovy.

Warning: this will be a short article, and will only take me as far as getting "G'day World" onto the screen. It's really aimed at CFMLers like myself who never find time to do anything else, and just a demonstration as to how quick this all was.

Previously I've written up similar exploits on PHP: "PHP: from zero to... Hello World" and Ruby: "Ruby: stream-of-consciousness".

OK, so here's what I did to get Groovy up and running on my machine: