Showing posts with label Java. Show all posts
Showing posts with label Java. Show all posts

Sunday 17 March 2024

CFML: solving a CFML problem with Java. Kinda.

G'day:

The other day on the CMFL Slack channel, Nick Petrie asked:

Anyone know of a native or plugin-based solution to pretty-formatting XML for display on the page? I'd like to output the XML nested this:
I'll use a simplified example> we wanna render this:
<aaa><bbb ccc="ddd"><eee/></bbb></aaa>
Like this:
<aaa>
    <bbb ccc="ddd">
        <eee/>
    </bbb>
</aaa

There's no native CFML way of doing this, but I figured "this is a solved problem: there'll be an easy way to do it in Java". I googled java prettier xml, and the first match was on the Baeldung website (Pretty-Print XML in Java) which is a site I trust to have good answers. But I checked that and a few others, and they all seem to be solving the problem the same way. So I decided to run with that.

The task at hand is to convert this to CFML (still using the Java libs to do the work, I mean, just runnable in CFML):

public static String prettyPrintByTransformer(String xmlString, int indent, boolean ignoreDeclaration) {

    try {
        InputSource src = new InputSource(new StringReader(xmlString));
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src);

        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        transformerFactory.setAttribute("indent-number", indent);
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, ignoreDeclaration ? "yes" : "no");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        Writer out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    } catch (Exception e) {
        throw new RuntimeException("Error occurs when pretty-printing xml:\n" + xmlString, e);
    }
}

To be very clear, this is not my code, it's from Pretty-Print XML in Java.

And also to be clear: I'm not gonna be doing anything unique or difficult or insightful or anything like that in this article. All I'm gonna do is to show how easy it is to convert native Java code to native CFML code, and this is a topical use case. It's gonna be a function with some object creation and some variable assigments. I'm gonna go line-by-line through that function above, and CFMLerise it. I just made that word up.

I'm gonna be using trycf.com to write and run this code, and the aim is to have a version that runs in both CF and Lucee.

First, let's run with the same function signature:

unformattedXml = '<aaa><bbb ccc="ddd"><eee/></bbb></aaa>' 

public string function prettyPrintByTransformer(required string xmlString, numeric indent=4, boolean ignoreDeclaration=true) {

}

prettyPrintByTransformer(unformattedXml)

In each step I'll be using that same unformattedXml input value, and just running the function to ensure it's got no compilation errors and each new statement I add "works" (in that it doesn't have runtime errors). The function won't do anything useful until it's done.

In this first step note that I've given the latter two parameters sensible defaults. This is a change from the Java version.

First things first: I don't like how they've put that largely pointless try/catch in the Java code. To me that sort of error-handling should be in the calling code if needed, not embedded in the implementation. If the implementation errors-out: let it. The actual exception will be more useful than swallowing the real exception and throwing a contrived one.

I'll include each statement I'm converting as a comment above the CFML version. This is only so I can draw your focus to it. I'd never include comments of this nature in my actual code.

public string function prettyPrintByTransformer(required string xmlString, numeric indent=4, boolean ignoreDeclaration=true) {
    // InputSource src = new InputSource(new StringReader(xmlString));
    var xmlAsStringReader = createObject("java", "java.io.StringReader").init(xmlString) // new StringReader(xmlString)
    var src = createObject("java", "org.xml.sax.InputSource").init(xmlAsStringReader)
}

I could have done this without the intermediary variable, but CFML is quite cumbersome making Java object proxies, so I think the code is clearer spread over two statements.

    // Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src);
    var document = createObject("java", "javax.xml.parsers.DocumentBuilderFactory").newInstance().newDocumentBuilder().parse(src)
    // TransformerFactory transformerFactory = TransformerFactory.newInstance();
    var transformerFactory = createObject("java", "javax.xml.transform.TransformerFactory").newInstance()

Now: this statement did not work for me:

transformerFactory.setAttribute("indent-number", indent);

The CFML equivalent is exactly the same as it happens, but I was just getting "Not supported: indent-number" (CF) or "TransformerFactory does not recognise attribute 'indent-number'." (Lucee). I presume it's a library version difference, but I was fairly limited in my troubleshooting as I was running this on trycf.com. Although I did verify I was getting the same thing on my local CF2023 container too. I googled a bit and found a work around, but I'll come back to this a bit further down.

    // Transformer transformer = transformerFactory.newTransformer();
    var transformer = transformerFactory.newTransformer()
    /*
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, ignoreDeclaration ? "yes" : "no");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    */
    var outputKeys = createObject("java", "javax.xml.transform.OutputKeys")
    transformer.setOutputProperty(outputKeys.ENCODING, "UTF-8")
    transformer.setOutputProperty(outputKeys.OMIT_XML_DECLARATION, ignoreDeclaration ? "yes" : "no")
    transformer.setOutputProperty(outputKeys.INDENT, "yes")
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", indent)

Once again, I need an intermediary variable here for the outputKeys class proxy. Just to save repetition in this case.

Also note the last line there is extra: it's the solution for the bit that wasn't working earlier. Easy.

    // Writer out = new StringWriter();
    var out = createObject("java", "java.io.StringWriter").init()
    // transformer.transform(new DOMSource(document), new StreamResult(out));
    var documentAsDomSource = createObject("java", "javax.xml.transform.dom.DOMSource").init(document)
    var outAsStreamResult = createObject("java", "javax.xml.transform.stream.StreamResult").init(out)
    transformer.transform(documentAsDomSource, outAsStreamResult)

More intermediary variables here.

    // return out.toString();
    return out.toString()

And that's it. I've not got much commentary in all that lot, because it's all so straight forward [shrug].

The end result in one piece is thus:

public string function prettyPrintByTransformer(required string xmlString, numeric indent=4, boolean ignoreDeclaration=true) {
    var xmlAsStringReader = createObject("java", "java.io.StringReader").init(xmlString)
    var src = createObject("java", "org.xml.sax.InputSource").init(xmlAsStringReader)
    
    var document = createObject("java", "javax.xml.parsers.DocumentBuilderFactory").newInstance().newDocumentBuilder().parse(src)
    
    var transformerFactory = createObject("java", "javax.xml.transform.TransformerFactory").newInstance()
    var transformer = transformerFactory.newTransformer()
    
    var outputKeys = createObject("java", "javax.xml.transform.OutputKeys")
    transformer.setOutputProperty(outputKeys.ENCODING, "UTF-8")
    transformer.setOutputProperty(outputKeys.OMIT_XML_DECLARATION, ignoreDeclaration ? "yes" : "no");
    transformer.setOutputProperty(outputKeys.INDENT, "yes")
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", indent)
    
    var out = createObject("java", "java.io.StringWriter").init()
    
    var documentAsDomSource = createObject("java", "javax.xml.transform.dom.DOMSource").init(document)
    var outAsStreamResult = createObject("java", "javax.xml.transform.stream.StreamResult").init(out)
    transformer.transform(documentAsDomSource, outAsStreamResult)

    return out.toString()
}

And a coupla test runs:

formattedXml = prettyPrintByTransformer(unformattedXml)
writeOutput("<pre>#encodeForHtml(formattedXml)#</pre>")
<aaa>
    <bbb ccc="ddd">
        <eee/>
    </bbb>
</aaa>
formattedXml = prettyPrintByTransformer(unformattedXml, 8, false)
writeOutput("<pre>#encodeForHtml(formattedXml)#</pre>")
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<aaa>
        <bbb ccc="ddd">
                <eee/>
        </bbb>
</aaa>

The output above is from Lucee. On CF we get this:

<?xml version="1.0" encoding="UTF-8"?><aaa>
        <bbb ccc="ddd">
                <eee/>
        </bbb>
</aaa>

Note this is nothing to do with the CFML code, it'll be some Java library variation on the Lucee and CF set-ups on trycf.com.

Right, so all this lot shows is that there's no reason to shy away from implementing a CFML version of some Java code you might find that solves a problem. So in turn, if you have some "algorithmic" issue that you wanna solve in CFML, don't shy away from googling for Java solutions, and checking how easy/hard it is to convert.

A runnable version of this code is @ trycf.com.

Righto.

--
Adam

Tuesday 28 November 2017

That array_map quandary implemented in other languages

G'day:
A coupla days ago I bleated about array_map [having] a dumb implementation. I had what I thought was an obvious application for array_map in PHP, but it couldn't really accommodate me due to array_map not exposing the array's keys to the callback, and then messing up the keys in the mapped array if one passes array_map more than one array to process.

I needed to remap this:

[
    "2008-11-08" => "Jacinda",
    "1990-10-27" => "Bill",
    "2014-09-20" => "James",
    "1979-05-24" => "Winston"
]

To this:

array(4) {
  '2008-11-08' =>
  class IndexedPerson#3 (2) {
    public $date =>
    string(10) "2008-11-08"
    public $name =>
    string(7) "Jacinda"
  }
  '1990-10-27' =>
  class IndexedPerson#4 (2) {
    public $date =>
    string(10) "1990-10-27"
    public $name =>
    string(4) "Bill"
  }
  '2014-09-20' =>
  class IndexedPerson#5 (2) {
    public $date =>
    string(10) "2014-09-20"
    public $name =>
    string(5) "James"
  }
  '1979-05-24' =>
  class IndexedPerson#6 (2) {
    public $date =>
    string(10) "1979-05-24"
    public $name =>
    string(7) "Winston"
  }
}

Note how the remapped object also contains the original key value. That was the sticking point. Go read the article for more detail and more whining.

OK so my expectations of PHP's array higher order functions are based  on  my experience with JS's and CFML's equivalents. Both of which receive the key as well as the value in all callbacks. I decided to see how other languages achieve the same end, and I'll pop the codee in here for shits 'n' giggles.


CFML

Given most of my history is as a CFML dev, that one was easy.

peopleData = ["2008-11-08" = "Jacinda", "1990-10-27" = "Bill", "2014-09-20" = "James", "1979-05-24" = "Winston"]

people = peopleData.map((date, name) => new IndexedPerson(date, name))

people.each((date, person) => echo("#date# => #person#<br>"))

Oh, this presupposes the IndexedPerson component. Due to a shortcoming of how CFML works, components must be declared in a file of their own:

component {

    function init(date, name) {
        this.date = date
        this.name = name
    }

    string function _toString() {
        return "{date:#this.date#; name: #this.name#}"
    }
}


But the key bit is the mapping operation:

people = peopleData.map((date, name) => new IndexedPerson(date, name))

Couldn't be simpler (NB: this is Lucee's CFML implementation, not ColdFusion's which does not yet support arrow functions).

The output is:


2008-11-08 => {date:2008-11-08; name: Jacinda}
1990-10-27 => {date:1990-10-27; name: Bill}
2014-09-20 => {date:2014-09-20; name: James}
1979-05-24 => {date:1979-05-24; name: Winston}

Also note that CFML doesn't have associative arrays, it has structs, so the keys are not ordered. This does not matter here. (Thanks to Zac for correcting me here: CFML does have ordered structs these days).


JS

The next language I turned to was JS as that's the I'm next most familiar with. One thing that hadn't occurred to me is that whilst JS's Array implementation has a map method, we need to use an object here as the keys are values not indexes. And whilst I knew Objects didn't have a map method, I didn't know what the equivalent might be.

Well it turns out that there's no real option to use a map here, so I needed to do a reduce on the object's entries, Still: it's pretty terse and obvious:

class IndexedPerson {
    constructor(date, name) {
        this.date = date
        this.name = name
    }
}

let peopleData = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

let people = Object.entries(peopleData).reduce(function (people, personData) {
    people.set(personData[0], new IndexedPerson(personData[0], personData[1]))
    return people
}, new Map())

console.log(people)

This returns what we want:

Map {
  '2008-11-08' => IndexedPerson { date: '2008-11-08', name: 'Jacinda' },
  '1990-10-27' => IndexedPerson { date: '1990-10-27', name: 'Bill' },
  '2014-09-20' => IndexedPerson { date: '2014-09-20', name: 'James' },
  '1979-05-24' => IndexedPerson { date: '1979-05-24', name: 'Winston' } }

TBH I think this is a misuse of an object to contain basically an associative array / struct, but so be it. It's the closest analogy to the PHP requirement. I was able to at least return it as a Map, which I think is better. I tried to have the incoming personData as a map, but the Map prototype's equivalent of entries() used above is unhelpful in that it returns an Iterator, and the prototype for Iterator is a bit spartan.

I think it's slightly clumsy I need to access the entries value via array notation instead of some sort of name, but this is minor.

As with all my code, I welcome people showing me how I should actually be doing this. Post a comment. I'm looking at you Ryan Guill ;-)

Java

Next up was Java. Holy fuck what a morass of boilterplate nonsense I needed to perform this simple operation in Java. Deep breath...

import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class IndexedPerson {
    String date;
    String name;
    
    public IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }
    
    public String toString(){
        return String.format("{date: %s, name: %s}", this.date, this.name);
    }
}

class Collect {

    public static void main(String[] args) {

        HashMap<String,String> peopleData = loadData();

        HashMap<String, IndexedPerson> people = mapToPeople(peopleData);
            
        dumpIdents(people);
    }
    
    private static HashMap<String,String> loadData(){
        HashMap<String,String> peopleData = new HashMap<String,String>();
        
        peopleData.put("2008-11-08", "Jacinda");
        peopleData.put("1990-10-27", "Bill");
        peopleData.put("2014-09-20", "James");
        peopleData.put("1979-05-24", "Winston");
        
        return peopleData;
    }
    
    private static HashMap<String,IndexedPerson> mapToPeople(HashMap<String,String> peopleData) {
        HashMap<String, IndexedPerson> people = (HashMap<String, IndexedPerson>) peopleData.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> new IndexedPerson(e.getKey(), e.getValue())
            ));
            
        return people;
    }
    
    private static void dumpIdents(HashMap<String,IndexedPerson> people) {
        for (Map.Entry<String, IndexedPerson> entry : people.entrySet()) {
            System.out.println(String.format("%s => %s", entry.getKey(), entry.getValue()));
        }
    }
    
}

Result:
1979-05-24 => {date: 1979-05-24, name: Winston}
2014-09-20 => {date: 2014-09-20, name: James}
1990-10-27 => {date: 1990-10-27, name: Bill}
2008-11-08 => {date: 2008-11-08, name: Jacinda}

Most of that lot seems to be just messing around telling Java what types everything are. Bleah.

The interesting bit - my grasp of which is tenuous - is the Collectors.toMap. I have to admit I derived that from reading various Stack Overflow articles. But I got it working, and I know the general approach now, so that's good.

Too much code for such a simple thing though, eh?


Groovy

Groovy is my antidote to Java. Groovy makes this shit easy:

class IndexedPerson {
    String date
    String name

    IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }

    String toString(){
        String.format("date: %s, name: %s", this.date, this.name)
    }
}

peopleData = ["2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"]

people = peopleData.collectEntries {date, name -> [date, new IndexedPerson(date, name)]}

people.each {date, person -> println String.format("%s => {%s}", date, person)}

Bear in mind that most of that is getting the class defined, and the output. The bit that does the mapping is just the one line in the middle. That's more like it.

Again, I don't know much about Groovy… I had to RTFM to find out how to do the collectEntries bit, but it was easy to find and easy to understand.

I really wish I had a job doing Groovy.

Oh yeah, for the sake of completeness, the output was thus:

2008-11-08 => {date: 2008-11-08, name: Jacinda}
1990-10-27 => {date: 1990-10-27, name: Bill}
2014-09-20 => {date: 2014-09-20, name: James}
1979-05-24 => {date: 1979-05-24, name: Winston}


Ruby

Ruby's version was pretty simple too as it turns out. No surprise there as Ruby's all about higher order functions and applying blocks to collections and stuff like that.

class IndexedPerson

    def initialize(date, name)
        @date = date
        @name = name
    end

    def inspect
        "{date:#{@date}; name: #{@name}}\n"
    end
end

peopleData = {"2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"}

people = peopleData.merge(peopleData) do |date, name|
    IndexedPerson.new(date, name)
end

puts people

Predictable output:

{"2008-11-08"=>{date:2008-11-08; name: Jacinda}
, "1990-10-27"=>{date:1990-10-27; name: Bill}
, "2014-09-20"=>{date:2014-09-20; name: James}
, "1979-05-24"=>{date:1979-05-24; name: Winston}
}

I wasn't too sure about all that block nonsense when I first started looking at Ruby, but I quite like it now. It's easy to read.


Python

My Python skills don't extend much beyond printing G'day World on the screen, but it was surprisingly easy to google-up how to do this. And I finally got to see what Python folk are on about with this "comprehensions" stuff, which I think is quite cool.

class IndexedPerson:
    def __init__(self, date, name):
        self.date = date
        self.name = name

    def __repr__(self):
        return "{{date: {date}, name: {name}}}".format(date=self.date, name=self.name)

people_data = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

people = {date: IndexedPerson(date, name) for (date, name) in people_data.items()}

print("\n".join(['%s => %s' % (date, person) for (date, person) in people.items()]))


And now that I am all about Clean Code, I kinda get the "whitespace as indentation" thing too. It's clear enough if yer code is clean in the first place.

The output of this is identical to the Groovy one.

Only one more then I'll stop.

Clojure

I can only barely do G'day World in Clojure, so this took me a while to work out. I also find the Clojure docs to be pretty impentrable. I'm sure they're great if one already knows what one is doing, but I found them pretty inaccessible from the perspective of a n00b. It's like if the PHP docs were solely the user-added stuff at the bottom of each docs page. Most blog articles I saw about Clojure were pretty much just direct regurgitation of the docs, without much value-add, if I'm to be honest.

(defrecord IndexedPerson [date name])

(def people-data (array-map "2008-11-08" "Jacinda" "1990-10-27" "Bill" "2014-09-20" "James" "1979-05-24" "Winston"))

(def people
  (reduce-kv
    (fn [people date name] (conj people (array-map date (IndexedPerson. date name))))
    (array-map)
    people-data))

(print people)

The other thing with Clojure for me is that the code is so alien-looking to me that I can't work out how to indent stuff to make the code clearer. All the examples I've seen don't seem very clear, and the indentation doesn't help either, I think. I guess with more practise it would come to me.

It seems pretty powerful though, cos there's mot much code there to achieve the desired end-goal.

Output for this one:

{2008-11-08 #user.IndexedPerson{:date 2008-11-08, :name Jacinda},
1990-10-27 #user.IndexedPerson{:date 1990-10-27, :name Bill},
2014-09-20 #user.IndexedPerson{:date 2014-09-20, :name James},
1979-05-24 #user.IndexedPerson{:date 1979-05-24, :name Winston}}


Summary

This was actually a very interesting exercise for me, and I learned stuff about all the languages concerned. Even PHP and CFML.

I twitterised a comment regarding how pleasing I found each solution:


This was before I did the Clojure one, and I'd slot that in afer CFML and before JS, making the list:
  1. Python
  2. Ruby
  3. Groovy
  4. CFML
  5. Clojure
  6. JS
  7. PHP
  8. Java

Python's code looks nice and it was easy to find out what to do. Same with Ruby, just not quite so much. And, really same with Groovy. I could order those three any way. I think Python tips the scales slightly with the comprehensions.

CFML came out suprisingly well in this, as it's a bloody easy exercise to achieve with it.

Clojure's fine, just a pain in the arse to understand what's going on, and the code looks a mess to me. But it does a lot in little space.

JS was disappointing because it wasn't nearly so easy as I expected it to be.

PHP is a mess.

And - fuck me - Java. Jesus.

My occasional reader Barry O'Sullivan volunteered some input the other day:


Hopefully he's still up for this, and I'll add it to the list so we can have a look at that code too.

Like I said before, if you know a better or more interesting way to do this in any of the languages above, or any other languages, make a comment and post a link to a Gist (just don't put the code inline in the comment please; it will not render at all well).

I might have another one of these exercises to do soon with another puzzle a friend of mine had to recently endure in a job-interview-related coding test. We'll see.

Righto.

--
Adam

Thursday 10 November 2016

PHP & CFML: I get thrown by how static variables work

G'day:
I like it when I demonstrate to myself I'm a bit thick. Today was one of those days. I've been working with PHP for a coupla years now, and pretty much have ported my OO knowledge over to it. Coming from a CFML background one thing I'm not so familiar with is static variables and methods. I "get" it, that's cool, but clearly I'm still mid-way up the learning curve, and some idiosyncrasies still elude me. I found one today.

I was doing some testing, and the situation I was in was that my test classes needed some subclassing... don't ask... it was legit but I cannot go into details and it'd bore you rigid anyhow. In my base class I had a static variable which I needed to have a different value for in the subclass (it was actually a sub-sub-class. Dumb tests). So I set it to a new value in my test method, then the test method called a method from the base class, and the test went splat cos it was still using the base class value. This confused the shit out of me at first. Here's a simplified example:

class GP {

    public static $test = "GP";
    
    public function fromGP(){
        return self::$test;
    }
    
    public function get(){
        return self::$test;
    }

}

class P extends GP {

    public static $test = "P";
    
    public function fromP(){
        return self::$test;
    }

}

class C extends P {

    public static $test = "C";
    
    public function fromC(){
        return self::$test;
    }

}

$c = new C();

echo "fromC: " . $c->fromC() . PHP_EOL;
echo "fromP: " . $c->fromP() . PHP_EOL;
echo "fromGP: " . $c->fromGP() . PHP_EOL;
echo "get: " . $c->get() . PHP_EOL;

So:

  • I've got a grandparent class (GP) which sets a static variable, and has two methods to get its value.
  • And a Parent class (P) that extends GP, with its own value for the static variable, and another different method to get its value.
  • And a Child class (C) which does the same thing, but to the Parent.
  • In my test rig I create a C instance, and call each of the methods.
Now cos I'm thick, I expected this output:

fromC: C
fromP: C
fromGP: C
get: C

Because C overrides the value for the static test variable, so the inherited methods should all use that value.

But what I did get is this:

fromC: C
fromP: P
fromGP: GP
get: GP

It took me a while to click. I changed my variable to be not static, and got the results I expected. Changed it back: back to the "wrong" (where "wrong" is "not what I wanted") answer. Then it dawned on me I guess. Static variables are bound to the class. I already knew this, but the ramifications didn't hit me initially. $c might be an object that inherits from P and GP, and obviously object variables will also follow inheritance rules, but static variables relate to the class, not the object. So in the class GP, in which the method fromGP is defined... self::$test refers to the static $test variable in that class. So: the value is GP. and on from there for fromP and down to fromC too (and get).

It made sense once I engaged my brain.

When I got home from work I decided to "ask" Java what it thought about this, and knocked together an analogous solution. Please pardon my shit Java skills.

package me.adamcameron.test;

public class GP {

    public static String test = "GP";
    
    public String fromGP(){
        return test;
    }
    
    public String get(){
        return test;
    }

}

package me.adamcameron.test;

public class P extends GP {

    public static String test = "P";
    
    public String fromP(){
        return test;
    }

}

package me.adamcameron.test;

public class C extends P {

    public static String test = "C";
    
    public String fromC(){
        return test;
    }

}

package me.adamcameron.test;

public class Test {

    public static void main(String[] args){
        C c = new C();
        
        System.out.println("fromC: " + c.fromC());
        System.out.println("fromP: " + c.fromP());
        System.out.println("fromGP: " + c.fromGP());
        System.out.println("get: " + c.get());
        
    }

}

And that - predictably - confirmed PHP's behaviour (not that I doubted PHP. Much):

>java me.adamcameron.test.Test
fromC: C
fromP: P
fromGP: GP
get: GP

Coolio. I'm also pleased I managed to write 20-odd lines of Java and get it right firstsecond time (it was a typo! ;-)

Last but not least, I remember that Lucee's flavour of CFML introduced static variables in Lucee 5, so I decided to see what it did:

component {

    static {
        test = "GP";
    }
    
    public function fromGP(){
        return static.test;
    }
    
    public function get(){
        return static.test;
    }
}

component extends=GP {

    static {
        test = "P";
    }
    
    public function fromP(){
        return static.test;
    }
    
    public function get(){
        return static.test;
    }
}

component extends=P {

    static {
        test = "C";
    }
    
    public function fromC(){
        return static.test;
    }
    
    public function get(){
        return static.test;
    }
}


<cfscript>
o = new C();

writeOutput("fromC: #o.fromC()#<br>");
writeOutput("fromP: #o.fromP()#<br>");
writeOutput("fromGP: #o.fromGP()#<br>");
writeOutput("get: #o.get()#<br>");
</cfscript>

When I ran this...

fromC: C
fromP: C
fromGP: C
get: C


Hrm. So Lucee does what I originally wanted, but I now reckon this is not what it should be doing. I best go ask someone from LAS what the story is there.

Anyway that was that. Just a small lesson learned for me today.

Oh, btw... the solution was to just make the variable not static. Then me code worked fine. There really was no need for it to be static in the first place, on reflection.

Update:
It's important to read Kalle's comment (below) regarding static:: in PHP.

Righto.

--
Adam

Tuesday 19 January 2016

Floating point arithmetic with decimals

G'day:
As a human... what is the value of z, after you process this pseudocode with your wetware:

x = 17.76
y = 100
z = x * y

Hopefully you'd say "1776". It was not a trick question.

And that's an integer, right? Correct.

CFML

Now... try this CFML code:

x = 17.76;
y  = 100;
z = x*y;

writeOutput(z);

1776 So far so good.

But what about this:

writeOutput(isValid("integer", z));

You might think "YES" (or true if yer on Lucee), however it's "NO".

And this is where young players fall into the trap. They get all annoyed with isValid() getting it wrong, etc. Which, to be fair, is a reasonable assumption with isValid(), but it's not correct in this instance. It's the young player who is mistaken.

If we now do this:

writeOutput(z.getClass().getName());

We get: java.lang.Double

OK, but 1776 can be a Double, sure. But CFML should still consider a Double 1776 as a valid integer, as it should be able to be treated like one. So why doesn't it? What if we circumvent CFML, and go straight to Java:

writeOutput(z.toString());

1776.0000000000002

Boom. Floating point arithmetic inaccuracy.

Never ever ever forget, everyone... when you multiply floating point numbers with decimals... you will get "unexpected" (but you should pretty much expect it!) floating point accuracy issues. This is for the perennial reason that what's easy for us to express in decimal is actually quite hard for a computer to translate into binary accurately.

Aside: we were chatting about all this on the CFML Slack channel this morning, and one person asked "OK, so how come 17.75 x 100 works and 17.76 x 100 does not?". This is because a computer can represent 0.75 in binary exactly (2-1 + 2-2), whereas 0.76 can only be approximated, hence causing the "issue".

The problem really is that CFML should simply output 1776.0000000000002 when we ask it, and it should not try to be clever and hide this stuff. Because it's significant information. Then when the young player output the value, they'd go "oh yeah, better round that" or whatever they need to do before proceeding. CFML is not helping here.

This is pretty ubiquitous in programming. Let's have a trawl through the various languages I can write the simplest of code in:

JavaScript


x = 17.76;
y = 100;
z = x * y

console.log(z);


>node jsVersion.js
1776.0000000000002

>

JS just does what it's told. Unsurprisingly.

Groovy


x = 17.76
y = 100
z = x * y
println "x * y: " + z

println "x: " + x.getClass().getName()
println "y: " + y.getClass().getName()
println "z: " + z.getClass().getName()
println "z: " + z.toString()


>groovy32 groovyVersion.groovy

x * y: 1776.00
x: java.math.BigDecimal
y: java.lang.Integer
z: java.math.BigDecimal
z: 1776.00
>


This is interesting. Whilst Groovy keeps the result as a float (specifically a BigDecimal) - which is correct - it truncates it to the total number of decimal places expressed in its factors. That's how I was taught to do it in Physics at school, so I like this. This second example makes it more clear:

x = 3.30
y = 7.70
z = x * y
println "x * y: " + z

println "x: " + x.getClass().getName()
println "y: " + y.getClass().getName()
println "z: " + z.getClass().getName()
println "z: " + z.toString()


>groovy32 more.groovy
x * y: 25.4100
x: java.math.BigDecimal
y: java.math.BigDecimal
z: java.math.BigDecimal
z: 25.4100
>

In 3.30 and 7.70 there are four decimal places expressed (ie: two for each factor), so Groovy maintains that accuracy. Nice!


Java


import java.math.BigDecimal;

class JavaVersion {

    public static void main(String[] args){
        double x = 17.76;
        int y = 100;
        System.out.println(x*y);
        
        BigDecimal x2 = new BigDecimal(17.76);
        BigDecimal y2 = new BigDecimal(100);
        System.out.println(x2.multiply(y2));
        
    }
}

Here I added a different variation because I was trying to see why the Groovy code behaved the way it did, but it didn't answer my question. I suspected that perhaps it was a BigDecimal thing how it decided on the accuracy of the result, but it wasn't:


>java JavaVersion
1776.0000000000002
1776.000000000000156319401867222040891647338867187500

>

This is a good demonstration of how a simply base-10 decimal fraction is actually an irrational number in binary.

Sunday 17 January 2016

Stepping looping

G'day:
Yeah, this is an odd thing to write about, and there's no trick to it: I really do mean this sort of thing:

for i=1 to 10 step 2

I got to thinking about this because of a comment from Chris Dawes on my article Lucee 5 beta: <cfloop> tweak, in which he sung the praises of the CFML tag equivalent of that general construct:

<cfloop index="i" from="1" to="10" step="2">

This is fine for a view, but I'd steer clear of it in my business logic. But what would I use instead? The obvious answer is this:

for (i=1; i <= 10; i+=2)

But these days I try to eschew generic looping like that, in favour of a more purposeful iteration exercise using iteration methods. If I was iterating a collection to remap data, I'd use map(); if I was doing so to translate the collection data to another data structure or object, I'd reduce(); if I was getting rid of stuff I'd filter(), and - if the language allowed it - if I was checking to see if one or any of the collection elements fulfilled some criteria, I'd use some() or every(). It's really seldom that one wants to loop "for the hell of it", if one thinks about it. It's an exercise in processing data or checking data.

Obviously in a view one is likely to be just outputting data from the data collection, so using a general loop is as good as anything. Also if one is skipping records... a stepped general purpose loop is perhaps better than using like an array-based loop and conditionally continuing the loop or some nonsense like that. All good.

But this got me thinking... let's say I was:

  1. writing business logic;
  2. wanting to step over array elements;
  3. still wanted to use array iteration methods.

This is more a mental exercise than a programming one, but it just happened to pique my interest.

CFML

In CFML this is dead easy. Here's an example:

numbers = ["tahi", "rua", "toru", "wha", "rima", "ono", "whitu", "waru", "iwa", "tekau"];

oddNumbers = numbers.filter(function(_,i){return i mod 2;});

writeDump(oddNumbers);

Here I'm using ColdFusion 12's CLI, and the results are as follows:

C:\src\cfml\languageComparison\steppedLoop>cf cfmlVersion.cfm
array

1) tahi
2) toru
3) rima
4) whitu
5) iwa

C:\src\cfml\languageComparison\steppedLoop>

filter() allows us to create an array of every second element simply by doing the usual mod operation. This gives us a new array.

If we wanted to actually render the elements as a string, we'd just chain a reduce() call onto that:

oddNumbersAsString = numbers.filter(function(_,i){return i mod 2;}).reduce(function(combined, oddNumber){
    return "#combined##oddNumber##chr(10)#";
}, "");
CLI.writeLn(oddNumbersAsString);


C:\src\cfml\languageComparison\steppedLoop>cf cfmlVersion.cfm
tahi
toru
rima
whitu
iwa

C:\src\cfml\languageComparison\steppedLoop>

Or, hey: not get so dogmatic about using collection iteration methods, and just leverage a list method:

oddNumbersAsString = numbers.filter(function(_,i){return i mod 2;}).toList(chr(10));

(yields the same output)

I turned my mind to other languages to compare.

JavaScript

Same as CFML really. Just some "spelling" differences:

oddNumbersAsString = ["tahi", "rua", "toru", "wha", "rima", "ono", "whitu", "waru", "iwa", "tekau"]
    .filter(function(_,i){return (i+1) % 2;})
    .reduce(function(combined, oddNumber){
        return combined + oddNumber + String.fromCharCode(10);
    }, "");

console.log(oddNumbersAsString);

Output:


C:\src\cfml\languageComparison\steppedLoop>node jsVersion.js
tahi
toru
rima
whitu
iwa


C:\src\cfml\languageComparison\steppedLoop>

PHP

PHP is a bit disappointing here for a coupla reasons. Firstly PHP is still at its roots still procedural, so the code is intrinsically more clunky, plus it's collection iteration methodsfunctions aren't implemented that thoughtfully, so I have to jump through some intermediary hoops:
$numbers = ["tahi", "rua", "toru", "wha", "rima", "ono", "whitu", "waru", "iwa", "tekau"];

$indexedNumbers = array_map(null, range(0, count($numbers)-1), $numbers);
$indexedOddNumbers = array_filter(
    $indexedNumbers,
    function($number){
        return ($number[0]+1) % 2;
    }
);
$reindexedIndexedOddNumbers = array_values($indexedOddNumbers);

$oddNumbers = array_map(
    function($number){
        return $number[1];
    },
    $reindexedIndexedOddNumbers
);

$oddNumbersAsString = array_reduce(
    $oddNumbers,
    function($combined, $oddNumber){
        return $combined . $oddNumber . PHP_EOL;
    },
    ""
);
echo $oddNumbersAsString;

The issues here are:

  • we want to filter on the array index position, but PHP doesn't think to pass this to the array_filter()'s callback, so I need to push an index into the data using array_map() and range(). One good thing here is that array_map(), if not given a callback then it just maps all the arrays it's passed together into sub arrays. Which suits my purposes here.
  • Then I can use array_filter() to do the actual filtering, but array_filter() is a bit sh!t for a second time because it leaves a sparse array after it filters. So after I filter I don't have a new array with elements 0-4, I have one with elements 0,2,4,6,8 (!!!!). How bloody stupid is that? I guess this is down to PHP's array implementation being quite rubbish and basically being "kinda an array, kinda a struct, kinda - as a result - neither".
  • I need to fix this by creating a new array with just the values.
  • As I said PHP is procedural, so I can't chain stuff. I could nest it, but that makes for awful, inside-out code. So I use individual statements and intermediary variables.
  • Having filtered I still need to get my data elements back to "normal" by getting rid of the index I embedded in them. Another call to array_map() to undo what I did before.
  • And finally I can reduce things back down to a string.
Way too much horsing around there. Note: I know this is - as a result - a contrived example. There's no way a sane person would do this sort of thing. In PHP, one's better off eschewing the array iteration methods and just using general forEach() loops. Old school, but they work.

Saturday 6 June 2015

Not a Lucee bug, but a ColdFusion one. And I learn more about Java

G'day:
OK so remember that glitch I had in my earlier blog article: "CFML: learning something new about directoryList()":


I finally managed to work out WTF is going on. And despite appearances it actually seems like a bug in ColdFusion, not a bug in Lucee. But I'm unsure.

But there was certainly a bug in my recollection of how Java works, which did not help.

Tuesday 12 May 2015

Lucee 5: using a Java class file

G'day:
Here's one that bit me on the bum with Lucee, and took me a while to sort out. Not least of all because of some... err... communication challenges I had along the way trying to get help on the issue. I'll get to that.


One of the most convenient features of ColdFusion for me - a dedicated CFML programmer and only an occasional Java hacker - is that on those rare occasions I do need to knock together some quick Java functionality, all I need to do is to write a quick class (although there's nothing quick about me and "writing Java classes"), sling it into the /WEB-INF/classes directory, and then I could just create instances of said class from CFML. I use this for "my" ClassViewer class which has proven invaluable to me over the last decade or so when trying to work out WTF is going on with CFML's interaction with Java. On any of my CFML servers one of the first things I do is chuck a copy of that ClassViewer.class into /WEB-INF/classes, and then it's there when I'm shufti-ing around:

// cv.cfm
cv = createObject("java", "ClassViewer");
try {
    1/0;
}catch (any e){
    writeOutput("<pre>#cv.viewObject(e)#</pre>");    
}

Easy. Oh, and this gives stuff like this:

public class coldfusion.runtime.DivideByZeroException
 extends coldfusion.runtime.ExpressionException
 extends coldfusion.runtime.NeoException
 extends java.lang.RuntimeException
 extends java.lang.Exception
 extends java.lang.Throwable
 extends java.lang.Object
 extends {
 /*** CONSTRUCTORS ***/
 public coldfusion.runtime.DivideByZeroException()


 /*** METHODS ***/
 public int getErrNumber()

 public final void setLocale(java.util.Locale)

 public java.lang.String getMessage()

 // [...]

(full output in this Gist: https://gist.github.com/adamcameron/fd5ab8db7c534acc68ac).

It's just easy. And has always "just worked".

Until Lucee 5, that is.

For reasons as yet unexplained coherently & honestly by Lucee, this functionality no longer works.

Fortunately the solution is easy... one just needs to stick the class in a JAR file and pop it in the /WEB-INF/lib directory instead. I have to admit I actually didn't know how to do two things regarding Java up until trying to work around this exercise:

  • how to implement namespacing;
  • how to make a JAR

Sad admissions indeed! Well I knew the rudimentaries of both, but had never needed to put them into practice, so didn't know the details.

Just in case you're as thick as me when it comes to Java, here's what I learned. Firstly a baseline:

class Greeter {
    public static String gday(String who){
        return "G'day " + who;
    }
}

If I compile that, I can just stick Greeter.class file into /WEB-INF/classes, and use it in my CFML, eg:

// sampleBasic.cfm

greeter = createObject("java", "Greeter");
writeOutput(greeter.gday("Zachary"));


And we get:

G'day Zachary

Next... namespacing it...

package me.adamcameron.miscellany;

class Greeter {
    public static String gday(String who){
        return "G'day " + who;
    }
}

And now this class needs to go in /WEB-INF/classes/me/adamcameron/miscellany before we can use it:

greeter = createObject("java", "me.adamcameron.miscellany.Greeter");

Note how the package needs to be expressed everywhere.

Finally, making it into a JAR instead. This uses the same code, but I needed to home Greeter.java in the correct subdirectory structure (me\adamcameron\miscellany\Greeter.java), recompile it, then run it through JAR:


C:\src\java>jar cf miscellany.jar me\adamcameron\miscellany\Greeter.class

C:\src\java>

This creates miscellany.jar, and to use that, I pop it into /WEB-INF/lib (not /WEB-INF/classes, nor /WEB-INF/lib/me/adamcameron/miscellany).

That was all pretty easy. However note that whilst the class-based solutions (both non-packaged and packaged versions) work A-OK on ColdFusion (all versions), Railo (all versions), and Lucee 4.5; they do not work on Lucee 5. On Lucee 5 the only option that works is the JAR version. This is not exactly a hardship, but it's worth knowing.

I tried to get some sort of acknowledgement of there being an issue here from Lucee, but that didn't pan out. Micha's dismissive, excuse-making, unhelpful attitude really irked me here: LDEV-315. He's basically blamed me, Tomcat, all other CFML engines (including Lucee 4.5 I guess) but not any action on his part which might have caused this regression. According to him Lucee 5 is working correctly, and all the other ones are behaving the way they do purely by coincidence (and different vendors implementing the same coincidence). At the same time, inferentially suggesting to Tomcat and Servlet docs have it wrong. He's beginning to sound like Rupesh from the Adobe ColdFusion Team in his excuse-making squirming.

I can surmise that the way the Lucee classloader has been rewritten for OSGi support either necessitates this change, or there's a glitch in it which causes it. I could not find OSGi docs making any claims one way or the other, unfortunately.

As I say on the ticket, at the very least Lucee need to document this change, but Micha - who's generally a bit averse to appropriate levels of documentation in the first place - didn't even accept that. The ticket is closed with the familar (from the Adobe bug tracker, anyhow) "Closed/Won't Fixed/Can't be arsed". This is a pity.

Anyway, at least I know about packaging and JARing in Java now. Win!

--
Adam

Sunday 26 April 2015

CFML: just cos something looks like an array doesn't mean it is an array

G'day:
One of my Aussie chums hit me up the other day, asking me for some help with some quirky code he was seeing. We worked out what the issue was, but it stands repeating because it was non-obvious from a CFMLer's point of view.

Friday 7 March 2014

Java Training

G'day:
My mate Mike Henke has passed on some good information, and said he wouldn't mind if I circulated to everyone else too.

I saw your recent post about CFML dying. Today I hit up webucator for a deal on their online java class for experienced programmers. They offered it for $1000 but I need 2 others to sign up too.

It starts April 28th 10am-5pm eastern time for 5 days..

http://www.webucator.com/java/course/java-programming-training-for-experienced-programmers.cfm

If you want to mention this in a blog post or tweet that would be awesome for CFML-ers looking to learn something else and if they are like me, needing a set, short class to ramp up fast.

Mike makes a good point that a rapid skill injection like this could be great for people who need help / opportunity to move on from CFML. I'm not necessarily be suggesting a move to Java itself would be a good idea, but there's an awful lot of good techniques to be learned from Java that a CFMLer might not necessarily acquire day to day.

Even if not moving from CFML, it's just a great opportunity to get some intense exposure to another language.

And $1000 is a bloody excellent price.

As per Mike's comment below: he can be contacted on  henke dot mike at gmail dot com.

--
Adam

Thursday 26 December 2013

Identifying when a regex pattern matches at position zero in a string

G'day:
Sean asked on Twitter today for input into how to best handle a shortfall in some TestBox functionality. The TestBox Jira ticket is this one (it also contains basically what I'm about to say here!): "toThrow() cannot match empty message & cannot match detail". The reason for the first stated problem is because of the way CFML's reFind() function works.

Consider this code:

// reFind.cfm
param name="URL.input" default="";
pattern = "^\d*$";
match = refind(pattern, URL.input);
writeDump(var=[{input=URL.input},{match=match}]);

The gist of this is that the reFind() call checks to see whether URL.input is a string which comprises (in its entirety) zero or more digits.

Here are the results with a few test inputs:

array
1
struct
INPUT123
2
struct
MATCH1

array
1
struct
INPUTabc
2
struct
MATCH0

array
1
struct
INPUT12c
2
struct
MATCH0

array
1
struct
INPUT1b3
2
struct
MATCH0

So this shows that it only lets solely-numeric values past: values like 1b3 return a zero match. OK, so far so good.

But hang on. The pattern was for zero or more digits. So zero digits is also legit here. Let's try giving an empty string (which is indeed zero digits) as the input:


array
1
struct
INPUT[empty string]
2
struct
MATCH0

This is correct. The match of zero digits occurs at position zero in that empty string. That's a match. However... reFind() returns 0 if no match was found. As it turns out, this was not sensible behaviour, because it means that there's no way to distinguish between no match, and a match at position zero. Oops.

Tuesday 15 October 2013

CFCamp: Quick unicode regex code snippet

G'day:
A question just came up in Kai's "Regular Expression Clinic" at CFCamp about unicode support in regular expressions. This is important in countries like Germany where they have a lot of non-ASCII characters in regular use in everyday words, eg: letters with umlauts above them ("ö"), or double-S characters ("ß"), etc. I didn't quite know the answer, so I had a look and came up with some code...

Sunday 29 September 2013

ColdFusion / JVM and DNS caching: maybe Adobe aren't out to get me after all!

G'day:
Well here's something I didn't know. Depending on how your ColdFusion server is configured, DNS look-ups it does might be cached "forever" (read: for the life of the JVM).

Ray followed up my earlier post about "Adobe possibly messing with ColdFusion community projects again", pointing this out in a comment:

Anyway - don't forget that CF has that bug where it caches DNS lookups. Maybe the IP changed, and your host CF install is holding on to the wrong IP. That could be why it worked just fine on your local machine and Sharon's.

Try using the IP on the host. Of course, if they have multiple servers on the box it won't resolve to the right virtual server, but you would get a response right away.
My response to Ray (and the next two paras are a re-edit of my reply to him) was that I didn't know that! We learn something every day.

Tuesday 5 March 2013

Regular expressions in CFML (part 9: Java support for regular expressions (1/3))

G'day:
Jetlag and stupidity are a fine combination of things. I'm off to Ireland today to see my son, so am currently on the Tube heading to LHR. It's just after 6am, and not being a Saturday morning person, I general pass the time on this 1.5h journey (which I do every 2-3wks) by dozing and listening to a podcast. Two flaws with this plan are that I am still not readjusted to being back on GMT as my brain thinks I am still in NZ, so I am wide awake, and have been since 4am; secondly I (this is the stupidity part) have left my headphones at home. So... what to do? Well: continue this series of articles on regular expressions (which has been on hiatus, as you might have noticed). I guess it's good cos I'll get about 1h of productive typing during the journey.

Wednesday 24 October 2012

Sending "Tweets" via ColdFusion (via Twitter4J)

G'day:
I'm writing an application than needs to send Twitter status updates.  This is an augmentation of the ColdFusion Bugs RSS feeds I have (in the box on the right); I'm gonna send tweets out when a new bug is raised as well.  Or that's the idea.

To do this, I need to work out how to send Twitter status updates (which sounds much less daft than "tweets") from ColdFusion somehow.

This is probably well-trod ground, but the CF-based resources I initially found are all out of date, so I thought I'd post the results of my investigations here in case they're helpful to anyone.

Tuesday 7 August 2012

ClassViewer.java

G'day
I've had a busy weekend and a crazy time at work and have been tied-up in the evenings recently, hence the radio silence. But anyway I'm gonna squeeze a quick one out now (oh... please don't dwell too much on the potential double entendres there. You probably wouldn't've until I mentioned that, I guess... Oops).

Today I'm just publicising some code that I've found very handy in working out what ColdFusion is getting up to behind the scenes, as well as understanding how the "internal" methods of CF's class implementations of its various data types work.