I was gonna do an article about ColdFusion 2016's new query iteration functions, but my fondness for using Maori as my sample data language has pointed me in the direction of a possible bug in ColdFusion 2016's CLI.
I've distilled it down to this:
CLI.writeln("kuputuhi tauira ki pūāhua nako");
If I run that via the ColdFusion 2016 CLI, I get this:
D:\src\CF12\cli\utf8>cf
CLI.writeln("kuputuhi tauira ki pūāhua nako");
^Z
kuputuhi tauira ki puahua nako
D:\src\CF12\cli\utf8>
Hmmm... where have the diacritic marks gone?
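One possible explanation, and it is only a guess: text fed into a console running a legacy code page can get "best-fit" converted, so a character the code page lacks is swapped for its closest plain lookalike before the program ever sees it. The Python sketch below does not call anything Windows itself uses; it just approximates the same visible effect by decomposing each character and dropping the combining marks. The function name strip_diacritics is made up for illustration.

```python
import unicodedata


def strip_diacritics(text):
    """Approximate a best-fit conversion: decompose, then drop combining marks."""
    # NFD splits "ū" into "u" plus a combining macron (U+0304).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, so they are skipped.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


stripped = strip_diacritics("kuputuhi tauira ki pūāhua nako")
print(stripped)
```

The result has the same shape as the STDIN output above: the macrons are gone but the base letters survive, which is different from the mojibake you get when bytes are decoded with the wrong code page.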
Initially I put this down to a tweak I've made to the CLI batch file which allows me to put code straight into STDIN instead of a file and run it, so I ran the code old-school:
D:\src\CF12\cli\utf8>cf cfmlExample.cfm
kuputuhi tauira ki p┼½─?hua nako
D:\src\CF12\cli\utf8>
Yikes! Worse!
I figured there was a chance that Windows wasn't handling encoding in its own CLI box that well, so decided to see what PHP made of this:
<?php
echo "kuputuhi tauira ki pūāhua nako";
And this yields:
D:\src\CF12\cli\utf8>php phpExample.php
kuputuhi tauira ki pūāhua nako
D:\src\CF12\cli\utf8>
OK, so ColdFusion is doing no different from PHP here when running a file. On a whim I decided to try Ruby too, and this fared better:
puts "kuputuhi tauira ki pūāhua nako"
D:\src\CF12\cli\utf8>ruby rubyExample.rb
kuputuhi tauira ki pūāhua nako
D:\src\CF12\cli\utf8>
So it wasn't like it was impossible to display the right characters in a Windows CLI box, but I decided to google anyhow.
I found a PHP answer which said I needed to change the code page in the CLI box to support UTF-8, by doing this:
D:\src\CF12\cli\utf8>chcp 65001
Active code page: 65001
D:\src\CF12\cli\utf8>
This is a new one on me, but it seemed to sort PHP's issues out:
D:\src\CF12\cli\utf8>php phpExample.php
kuputuhi tauira ki pūāhua nako
D:\src\CF12\cli\utf8>
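Why would chcp make a difference? A rough sketch of the idea, assuming the script file is saved as UTF-8 and the program writes those bytes out unchanged: the console decides how to draw each byte based on its active code page. The render helper below is hypothetical, and cp437 and cp850 are just two common legacy console code pages, not necessarily the one in use here.

```python
# The bytes a UTF-8 encoded script would emit for the sample string.
text = "kuputuhi tauira ki pūāhua nako"
raw = text.encode("utf-8")


def render(raw_bytes, code_page):
    """Decode bytes the way a console on the given code page would draw them."""
    return raw_bytes.decode(code_page, errors="replace")


# Same bytes, three different interpretations.
rendered = {cp: render(raw, cp) for cp in ("cp437", "cp850", "utf-8")}
for cp, shown in rendered.items():
    print(cp, shown)
```

Under the two legacy code pages each two-byte UTF-8 sequence is drawn as two unrelated glyphs; only the utf-8 interpretation (what code page 65001 selects) gives the macrons back.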
But it didn't really help ColdFusion:
D:\src\CF12\cli\utf8>cf cfmlExample.cfm
kuputuhi tauira ki pū�?hua nako
D:\src\CF12\cli\utf8>
On a further whim, I decided to extend my test bed to Python:
print("kuputuhi tauira ki pūāhua nako")
Without the chcp call, I got this:
D:\src\CF12\cli\utf8>py pythonExample.py
kuputuhi tauira ki pūāhua nako
D:\src\CF12\cli\utf8>
Which is exactly the same as ColdFusion's output. But once I make the chcp call:
D:\src\CF12\cli\utf8>chcp 65001
Active code page: 65001
D:\src\CF12\cli\utf8>py pythonExample.py
kuputuhi tauira ki pūāhua nako
D:\src\CF12\cli\utf8>
As you can see: it's fine.
So my conclusion here is that it's completely legit that ColdFusion might not be able to render UTF-8-encoded characters without the code page being actively changed, but even after that something is not right with the CLI. I suspect that when they load the file in, they don't give any thought to encoding at all, so the source code actually gets "corrupted" before it's run. I have to admit that my understanding of encoding is OK but not fantastic, and when one throws Windows CLI code pages into the mix too... I dunno what my expectations ought to be. But it ain't working, that's fer sure.
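To make the "corrupted before it's run" theory concrete, here is a small Python simulation. It rests on an unconfirmed assumption: that the UTF-8 file is read and written using windows-1252 (a common default on Windows). Nothing here is ColdFusion's actual code; it only replays what such a round trip would do to the bytes.

```python
source = "kuputuhi tauira ki pūāhua nako"
file_bytes = source.encode("utf-8")  # what is stored on disk

# Assumption: the file is misread as windows-1252. Byte 0x81 (the second
# byte of "ā") has no mapping there, so it becomes U+FFFD.
misread = file_bytes.decode("cp1252", errors="replace")

# Writing the string back out as windows-1252 turns U+FFFD into "?",
# so the original byte is lost for good.
written = misread.encode("cp1252", errors="replace")

# How a legacy console (cp850 here) and a UTF-8 console would draw it.
legacy_view = written.decode("cp850")
utf8_view = written.decode("utf-8", errors="replace")
print(legacy_view)
print(utf8_view)
```

Both views line up with the two ColdFusion outputs above: the "ū" bytes survive the round trip, so they come out right once the console is on code page 65001, whereas "ā" has already lost a byte and cannot be recovered by any chcp setting. That fits the theory, but it does not prove it.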
Could someone please try this on *nix and see what they get when using a more robust shell?
Now it's too late for me to look at query iteration functions, so that might be a job for before work tomorrow.
Righto.
--
Adam