Tuesday 2 July 2013

Serialisation woes resolved: "screw JSON"

G'day:
(third one today... the other two were only wee ones, so hardly count).

Right, so you might've read me wittering on about "ColdFusion vs JSON. How to make a developer cry". I needed to serialise some data to put it in a cache, and ColdFusion 9's JSON-handling is not up to the job.

What were were trying to do is to reliably serialise a query, and have ColdFusion not mess up either the null values, nor the dates (actually the latter is a side-effect of the JSON processing, and more an issue with query of query).

I was trying to use JSON because it was faster and lighter than WDDX, and it's kinda become the de facto way of doing quick serialisation, I think.

Yesterday I was forced to write this email to my superiors:

What I’ve tried, and the results:

Vanilla (de)serialisation using JSON functions
Approach: using serializeJson() and deserializeJson()
Result: NULL values correctly serialised, however deserialised as “null” (ie: the four letter string, N-U-L-L). This is a known CF9 issue (https://bugbase.adobe.com/index.cfm?event=bug&id=3043777), fixed in CF10. This prevents this being any sort of solution. Basically JSON deserialisation is borked in CF9.

Assisted serialisation / vanilla deserialisation
Approach: We tried changing the serialised NULL values to empty strings after serialising, as CF should interpret empty strings in a deserialised query.
Result: This breaks any numeric column which has nulls in it, as CF interprets the column as a VARCHAR. This means we cannot do any query-of-query in which we filter on numeric columns (which we do do).

Conditional use of WDDX to serialise queries
Approach: if a query has nulls in a numeric column, use WDDX instead to serialise it
Result: whilst <cfwddx> does seem to deserialised the query with nulls in numeric colums, one still cannot do a QoQ filter on a numeric column for the same reason as above

Using GSON instead of CF’s deserialisation
Approach: Google have release GSON, an attempt at a generic Java-based (de)serialiser.
Result: works for structs. Gives a 500 error for queries.

Fast rebuild of query when deserialising
Approach: deserialised the data as a struct, then rebuild the query column by column. This has the benefit of being able to force the column type via queryAddColumn(), and entire columns can be added in one fell swoop.
Result: same as above. CF still insists the column is a varchar if there are any deserialised empty strings in place of null values. This is despite accepting the data as numeric when calling queryAddColumn()

Extract correctly-working deserialisation logic from Railo source code
Approach: Railo handles the deserialisation correctly, so we figured as it’s open source we could lift its deserialisation algorithm and use that instead.
Result: whilst this is viable, the code is all Java and written I true Java OO fashion, so there’d be a lot of work to extract just the algorithm as the logic is spread throughout several different class files (too many for a superficial analysis to go “ah, that’s how they handle it”). Have abandoned this idea for the time being.

That is where I have got to.

Other potential approaches:

Manual rebuild of query
Approach: deserialise the data as a struct, then loop over it, column by column, row by row, creating a new query.
Upside: it will work.
Downside: it will be slow.

Call a Railo web service
Approach: install Railo, and use it to handle the deserialisation.
Upside: well it’d work.
Downside: perhaps slower in the long-run to run than doing a manual rebuild, given the webservice call. Requires Railo installed everywhere (or at least on the Memcached server). Would need careful load testing.

Use Ehcache and Terracotta instead of Memcached for ColdFusion caching
Approach: install Terracotta on the Memcached server, reconfigure ColdFusion’s internal Ehcache to talk to Terracotta instead.
Upside: it’s basically native to CF. Odds-on will be faster than Memcached as there’s no (de)serialisation. It works: I’ve tested the local Ehcache, and can’t see why it’d be different with Terracotta involved. Probably the route we should have gone down in the first place.
Downside: more moving parts to support. Time overhead of getting Terracotta installed and running, and reconfiguring CF to use it. Testing to make sure it covers all the other requirements we had of Memcached.

Summary:
In order of quickest to implement: as written above.
In order of best to worst solution, all things considered: EhCache, manual rebuild, Railo web service
As I was discussing this with the other chaps, we had a flash of inspiration: "screw JSON".

I remembered at some point ColdFusion added in objectSave() and objectLoad(). I thought these were added in to CF10, but - no - they're in CF9 as well.

These supposedly work on query objects, so I did a quick test, and - once we worked around yet another bug in ColdFusion - we seem to have got the problem solved.

I'm gonna discuss the CF bug in another article. Making it the fourth one today...

--
Adam