Friday 25 April 2014

ColdFusion 11: custom serialisers. More questions than answers

G'day:
I've been wanting to write an article about the new custom serialiser one can have in ColdFusion 11, but having looked at it I have more questions than I have answers, so I have put it off. But, equally, I have no place to ask the questions, so I'm stymied. So I figured I'd write an article covering my initial questions. Maybe someone can answer them.

ColdFusion 11 has added the notion of a custom serialiser a website can have (docs: "Support for pluggable serializer and deserializer"). The idea is that whilst Adobe can dictate the serialisation rules for its own data types, it cannot sensibly infer how a CFC instance might get serialised: as each CFC represents a different data "schema", there is no "one size fits all" approach to handling it. So this is where the custom serialiser comes in. Kind of. If it wasn't a bit rubbish. Here's my exploration thus far.

One can specify a custom serialiser by adding a setting to Application.cfc:

component {

    this.name = "serialiser01";
    this.customSerializer="Serialiser";

}

In this case the value - Serialiser - is the name of a CFC, eg:

// Serialiser.cfc
component {

    public function canSerialize(){
        logArgs(args=arguments, from=getFunctionCalledName());
        return true;
    }

    public function canDeserialize(){
        logArgs(args=arguments, from=getFunctionCalledName());
        return true;
    }

    public function serialize(){
        logArgs(args=arguments, from=getFunctionCalledName());
        return "SERIALISED";
    }

    public function deserialize(){
        logArgs(args=arguments, from=getFunctionCalledName());
        return "DESERIALISED";
    }

    private function logArgs(required struct args, required string from){
        var dumpFile = getDirectoryFromPath(getCurrentTemplatePath()) & "dump_#from#.html";
        if (fileExists(dumpFile)){
            fileDelete(dumpFile);
        }
        writeDump(var=args, label=from, output=dumpFile, format="html");
    }
}

This CFC needs to implement four methods:
  • canSerialize() - indicates whether something can be serialised by the serialiser;
  • canDeserialize() - indicates whether something can be deserialised by the serialiser;
  • serialize() - the function used to serialise something
  • deserialize() - the function used to deserialise something
I'm being purposely vague on those functions for a reason. I'll get to that.

The first cock-up in the implementation here is that for the custom serialisation to work, all four of those methods must be implemented in the serialisation CFC. So common sense would dictate that a way to enforce that would be to require the CFC to implement an interface. That's what interfaces are for. Now I know people will argue the merit of having interfaces in CFML, but I don't really give a shit about that: CFML has interfaces, and this is what they're for. So when one specifies the serialiser in Application.cfc and it doesn't fulfil the interface requirement, it should error. Right then. When one specifies the inappropriate tool for the job. What instead happens is if the functions are omitted, one will get erratic behaviour in the application, through to outright errors when ColdFusion goes to call the functions and cannot find them. EG: if I have canSerialize() but no serialize() method, CF will error when it comes to serialise something:

JSON serialization failure: Unable to serialize to JSON.

Reason : The method serialize was not found in component C:/wwwroot/scribble/shared/git/blogExamples/coldfusion/CF11/customerserialiser/Serialiser.cfc.
The error occurred in C:/wwwroot/scribble/shared/git/blogExamples/coldfusion/CF11/customerserialiser/testBasic.cfm: line 4
2 : o = new Basic();
3 : 
4 : serialised = serializeJson(o);
5 : writeDump([serialised]);
6 : 

Note that the error comes when I go to serialise something, not when ColdFusion is told about the serialiser in the first place. This is just lazy/thoughtless implementation on the part of Adobe. It invites bugs, and is just sloppy.
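To make the interface idea concrete, here's a minimal sketch of the sort of contract I mean. ICustomSerialiser is a name I've made up: ColdFusion 11 does not ship such an interface, and the argument names and types are my guesses based on what I see the methods being passed later in this article.

```cfml
// ICustomSerialiser.cfc (hypothetical: not something ColdFusion 11 provides)
interface {
    // argument names and types are assumptions, not taken from the docs
    public boolean function canSerialize(required string type);
    public boolean function canDeserialize(required string type);
    public any function serialize(required any arg, required string type);
    public any function deserialize(required any arg, required string type);
}
```

```cfml
// Serialiser.cfc, now declaring that it fulfils the contract
component implements="ICustomSerialiser" {

    public boolean function canSerialize(required string type){
        return type == "JSON";
    }

    public boolean function canDeserialize(required string type){
        return type == "JSON";
    }

    public any function serialize(required any arg, required string type){
        return "SERIALISED";
    }

    public any function deserialize(required any arg, required string type){
        return "DESERIALISED";
    }
}
```

With something like this, leaving out one of the four methods should make ColdFusion complain as soon as the component is created, rather than at some random later point when something gets serialised.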

The second cock-up follows immediately on from this.

Given my sample serialiser above, I then run this test code to examine some stuff:

o = new Basic();

serialised = serializeJson(o);
writeDump([serialised]);

deserialised = deserializeJson(serialised);
writeDump([deserialised]);

So all I'm doing is using (de)serializeJson() as a baseline to see how the functions work. Here's Basic.cfc, btw:

component {

}

And the test output:

array
1 SERIALISED
array
1 DESERIALISED

This is as one would expect. OK, so that "works". But now... you'll've noted I am logging the arguments each of the serialisation methods receives, as I go.

Here's the arguments passed to canSerialize():

canSerialize - struct
1 XML

My reaction to that is: "WTF?" Why is canSerialize() being passed the string "XML" when I'm trying to serialise an object of type Basic.cfc?

Here's the docs for canSerialize() (from the page I linked to earlier):
CanSerialize - Returns a boolean value and takes the "Accept Type" of the request as the argument. You can return true if you want the customserialzer to serialize the data to the passed argument type. 
Again, back to "WTF?" What's the "Accept type" of the request? And what the hell has the request got to do with a call to serializeJson()? You might think that "Accept type" references some HTTP header or something, but there is no "Accept type" header in the HTTP spec (that I can find: "Hypertext Transfer Protocol -- HTTP/1.1: 14 Header Field Definitions"). There's an "Accept" header (in this case: "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"), and other ones like "Accept-Encoding", "Accept-Language"... but none of which contain a value of "XML". Even if there was... how would it be relevant to the question as to whether a Basic.cfc instance can be serialised? Raised as bug: 3750730.

serialize() gets more sensible arguments:

serialize - struct
1 serialize - component scribble.shared.git.blogExamples.coldfusion.CF11.customerserialiser.Basic
2 JSON

So the first is the object to serialise (which surely should be part of the question canSerialize() is supposed to ask), and the second is the format to serialise to. Cool.

canDeserialize() is passed this:

canDeserialize - struct
1 JSON

I guess it's because it's being called from deserializeJson(), so it's legit to expect the input value is indeed JSON. Fair enough. (Note: I'm not actually passing it JSON, but that's beside the point here).

And deserialize() is passed this:

deserialize - struct
1 SERIALISED
2 JSON
3 [empty string]

The first argument is the value to work on, and the second is the type of deserialisation to do. I have no idea what the third argument is for, and it's not mentioned directly or indirectly on that docs page. So dunno what the story is there.

The next issue isn't a code-oriented one, but an implementation one: how the hell are we expected to work with this?

The only way to work here is for each function to have a long array of IF/ELSEIF statements which somehow identify each object type that is serialisable, and then return true from canSerialize(), or in the case of serialize(), go ahead and do the serialisation. So this means this one CFC needs to know about everything which can be serialised in the entire application. Talk about a failure in "separation of concerns".

You know the best way of determining if an object can be serialised? Ask it! Don't rely on something else needing to know. This can be achieved very easily in one of two ways:

  • Check to see if the object implements a "Serializable" interface, which requires a serialize() method to exist.
  • Or simply take the duck-typing approach: if a CFC implements a serialize() method: it can be serialised. By calling that method. Job done.


Either approach would work fine, keeps things nicely encapsulated, and I see merits in both. And either makes far more sense than Adobe's approach. Which is like something from the "OO Failures Special Needs" class.
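Here's a rough sketch of the duck-typing option, just to show how little is needed. Person.cfc and serialiseAnything() are names I've invented for illustration; the latter stands in for what the ColdFusion engine could do internally. It assumes no this.customSerializer is set, so serializeJson() behaves as it always has.

```cfml
// Person.cfc (hypothetical example CFC)
component {

    public function init(required string firstName, required string lastName){
        variables.firstName = arguments.firstName;
        variables.lastName = arguments.lastName;
        return this;
    }

    // The object knows its own "schema", so it decides how it gets serialised
    public string function serialize(string type="JSON"){
        if (type != "JSON"){
            throw(type="UnsupportedSerialisationType", message="Person cannot be serialised as #type#");
        }
        return serializeJson({"firstName"=variables.firstName, "lastName"=variables.lastName});
    }
}
```

```cfml
// serialiseAnything() is a hypothetical stand-in for what the engine could do
function serialiseAnything(required any value, string type="JSON"){
    // If it's an object exposing a public serialize() method: ask it to serialise itself
    if (isObject(value) && structKeyExists(value, "serialize")){
        return value.serialize(type);
    }
    // Otherwise fall back to the built-in behaviour
    return serializeJson(value);
}

serialised = serialiseAnything(new Person("Adam", "Cameron"));
```

No central CFC needs to know Person exists: adding a new serialisable type means adding a serialize() method to that type, and nothing else.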

Deserialisation is trickier. Because it relies on somehow working out how to deserialise an object. I'm not sure of the best approach here, but - again - how to deserialise something should be as close to the thing needing deserialisation as possible. IE: something in the serialised data itself which can be used to bootstrap the process.

This could simply be a matter of specifying a CFC type at a known place in the serialised data. EG: Adobe stipulates that if the serialised data is JSON, and at the top level of the JSON is a key eg: type, and the value is an extant CFC... use that CFC's deserialize() method. Or it could look for an object which contains a type and a method, or whatever. But Adobe can specify a contract there.
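As a sketch only, that contract might look something like the following. deserialiseByType() is a made-up helper (not part of ColdFusion), the top-level "type" key is the assumed convention, and I'm again assuming no this.customSerializer is in play so deserializeJson() just does its usual thing.

```cfml
// deserialiseByType() is a hypothetical helper, not part of ColdFusion
function deserialiseByType(required string json){
    var data = deserializeJson(json);

    // Assumed contract: a top-level "type" key names the CFC that knows how to rebuild itself
    if (isStruct(data) && structKeyExists(data, "type") && isSimpleValue(data.type)){
        try {
            var handler = createObject("component", data.type);
        } catch (any e){
            // Not an extant CFC: hand back the plain deserialised data
            return data;
        }
        // Duck-typing again: only delegate if the CFC offers a deserialize() method
        if (structKeyExists(handler, "deserialize")){
            return handler.deserialize(data);
        }
    }
    return data;
}
```

In real code one would want to restrict which CFC names are acceptable, as instantiating whatever component the incoming data asks for is not something to do with untrusted input.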

The only place I see a centralised CFC being relevant here is for a mechanism for handling serialised data that is neither a ColdFusion internal type, nor identifiable as above. In this case, perhaps they could provide a mechanism for a serialisation router, which basically has a bunch of routes (if/elseifs if need be) which contain logic as to how to work out how to deserialise the data. But it should not be the actual deserialiser, it should simply have the mechanism to find out how to do it. This is actually pretty much the same in operation as the deserialize() approach in the current implementation, but it doesn't need the canDeserialize() method (it can return false at the end of the routing), and it doesn't need to know about serialising. And also it's not the main mechanism to do the deserialisation, it's just the fallback if the prescribed approach hasn't been used.

TBH, this still sounds a bit jerry-built, and I'm open to better suggestions. This is probably a well-trod subject in other languages, so it might be worth looking at how the likes of Groovy, Ruby or even PHP (eek!) achieve this.

There's still another issue with the current approach. And this demonstrates that the Adobe guys don't actually work with either CFML applications or even modern websites. This approach only works for a single, stand-alone website (like how we might have done in 2001). What if I'm not in the business of building websites, but I build applications such as FW/1 or ColdBox or the like? Or any sort of "helper" application. They cannot use the current Adobe implementation of the customSerializer. Why? Because the serialisation code needs to be in a website-specific CFC. There's no way for Luis to implement a custom serialiser in ColdBox (for example), and then have it work for someone using ColdBox. Because it relies on either editing Application.cfc to specify a different CFC, or editing the existing customSerializer CFC. Neither of which is a very good solution. This should have been immediately apparent to the Adobe engineer(s) implementing this stuff had they actually had any experience with modern web applications (which generally aren't just a single monolithic site, but an aggregation of various other sub applications). Equally, I know it's not a case of them having thought about this and me just missing something, because when I asked them the other day, at first they didn't even get what I was asking, but when I clarified were just like "oh yeah... um... err... yeah, you can't do that. We'll... have to... ah yeah". This has been raised as bug 3750731.

So I declare the intent here valid, but the implementation to be more alpha- / pre-release- quality, not release-ready.

Still: it could fairly easily be deprecated and reworked. I've raised this as bug 3750732.

Or am I missing something?

--
Adam