Wednesday 11 June 2014

Super and mix-ins don't mix very well

G'day:
True to form for a "web log", here's a summary of the last six hours of my working life (Tues afternoon, Weds morning). The executive summary is "Oh FFS, ColdFusion".

I was tasked with doing dev testing of one of my colleague's work y/day just after lunch, and as soon as I hit the test URL I got a grim error:

Entity has incorrect type for being called as a function.

The symbol you provided setHeader is not the name of a function.

The thing is... yes it was. It was the correct type. The code was fine. I bounced it back to my mate, and he was like "erm... ah... 'it works OK on my machine'" (and, yes, that got a furrowed brow from me). He showed it working, and yeah, it was working. I restarted my CF instance, did an update from source control, made sure I had no uncommitted files (we were working on adjacent code, so it might be a case of "not working OK on my machine" sort of thing). Again with the error. So my colleague did the same... restarted, updated, etc, and now he got the error too. But a coupla minutes later having checked some things... error gone.

I hate these situations.

After a few min we twigged that the first hit to the code was erroring, but it was fine thereafter. This was replicable on a bunch of machines, so - yeah - something was afoot.

Now this is very old code we're maintaining, and I hate to admit it, but it's shocking. It's that app we all have tucked away in the corner and we all pray we never have to do any maintenance work on it. I presume you have an app like that too? So, anyway, it would not surprise anyone if it was some bodgy old shite somewhere interfering with the application-start-up (this app is still on Application.cfm... that's how old it is!), or how the path through subsequent requests might follow different logic. However it's a nightmare to troubleshoot, as we're talking about say a dozen contributing files, and - I just tallied up - over 3000 lines of code which would not have been great even close to ten years ago when it was written. Plus there's a bunch of same- or similar-named files, functions, CFCs etc in play.

Making matters more "interesting" is that our new code was using mix-ins to do some AOP across a few of our web services, and it was in one of these mixed-in functions we were seeing the problem. The one difference between the function which was erroring and the other mix-ins which were not was... it accessed the super scope of the object it was being mixed into. That should be fine. But it looked kinda like it was working like a closure instead of a standard function... having been mixed in it was seeing the super environment from the library CFC it was homed in, rather than the CFC it had been mixed into. And we're on CF9 so it ain't no closure issue.

At close of play y/day all my mate G-Man and I could must was a stream of swearing (OK, that was just from me), and a thought of "tomorrow is another day".

Over night I mulled things over, and decided I really should be taking my own advice: advice offered repeatedly on Stack Overflow to people raising questions when their code "isn't working, and they can't post the code because there's 1000s of lines of special secret code they can't share". The advice is that one should be building a repro case and investigating that, rather than wallowing in the 1000s of lines of app code. My rationale is that in these situations it's generally just one bung line of code, and the trick is to work out which one. And if one has a repro case, that's a lot less code to mess around with, and it's easier to test hypotheses.

So the break down of the moving parts of this erroring code were as follows:
  1. we are calling a method remotely (ie: it's a web service);
  2. that method calls a wrapper method within the same CFC;
  3. which in turn calls the mixed-in method (into that same CFC);
  4. that mix-in calls super.someOtherFunction();
  5. the inheritance hierarchy is such that the initial CFC extends another,
  6. which in turn extends another, and in there is a private method, someOtherFunction().
  7. another consideration is that the CFC being called has no init(),
  8. but the intermediary has one,
  9. and that's called from its own pseudo-constructor (these are remotely called CFCs, so we cannot rely on the calling-code to call init()),
  10. and the intermediary CFC calls the base CFC's init() too, via super.init().
  11. As it's a remote call, Application.cfm also runs, and loads some stuff into application scope, and that might matter because of the "fails first, then works after that" sort of arrangement.

To see if Application.cfm (or application.cfm... this is how sloppy this legacy code is) mattered here, I removed it and behaviour was no different, so I discarded that lot as being a contributing factor (making the repro easier).

So I implemented factors 1-10 thus:

// EndPoint.cfc
component extends="Intermediary" {

    mixinLib = createObject("MixinLib");
    variables.mixinMethod = mixinLib.mixinMethod;

    remote string function remoteMethod(){    
        return privateMethod();
    }

    private string function privateMethod(){
        return mixinMethod();
    }

}

// MixinLib.cfc
component {
    
    public string function mixinMethod(){
        return super.basePrivateMethod();
    }

}

// Intermediary.cfc
component extends="Base" {

    init();

    public Intermediary function init(){
        super.init();
        return this;
    }

}

// Base.cfc
component {

    public Base function init(){
        return this;
    }

    private string function basePrivateMethod(){
        return createUuid();
    }
    
}

Summary:

  1. We call EndPoint.cfc's remoteMethod();
  2. remoteMethod() calls a privateMethod();
  3. which calls the mixinMethod() from MixinLib.cfc;
  4. mixinMethod() calls super.basePrivateMethod() (super being Base.cfc which is extended by Intermediary.cfc, which is extended by EndPoint.cfc).
  5. Note also that Intermediary.cfc calls its own init() method along the way (so this will be triggered when EndPoint.cfc is instantiated),
  6. which calls Base.cfc's own init().
  7. All basePrivateMethod() does is to return a UUID.
Pleasingly... this code also errors in the same way:

The basePrivateMethod method was not found.

Either there are no methods with the specified method name and argument types or the basePrivateMethod method is overloaded with argument types that ColdFusion cannot decipher reliably. ColdFusion found 0 methods that match the provided arguments. If this is a Java object and you verified that the method exists, use the javacast function to reduce ambiguity.
The error occurred inD:/websites/www.scribble.local/shared/git/blogExamples/coldfusion/bugs/superbug/MixinLib.cfc: line 5

And when I reload, I get a UUID.

Excellent! I have a repro case. Now I have a total of 49 lines of code (incl. comments and spacing... 30 lines without counting that lot) to experiment with to try to fix it.

I started to factor things out.

  1. That it's being called remotely. I added a new file which instantiated EndPoint.cfc and called the method via CFML:
    // testViaDirectCall.cfm
    o = new Sub();
    writeOutput(o.callPrivateMethodDirectly());
    

    No change in behaviour. So we can scratch "remote calling" as a contributing factor.
  2. That it's a mix-in. I moved mixinMethod() into Endpoint.cfc. Problem went away. OK, so now I have a work around, if I cannot solve this. Win! But I want to solve this, so I move it back: we really want this code to be mixed-in.
  3.  At this point I was able to tell our PM that I had a work around, but would like to invest a coupla more hours trying to actually fix it. She was cool with this.
  4. Was it the weirdo init() handling in Intermediary.cfc? I removed that. Problem still there. But that's a bunch of code factored away. Good.
  5. Was it that there was an intermediary CFC in the inheritance chain? I made EndPoint.cfc extend Base.cfc instead. Problem still existed.
  6. Was it that basePrivateMethod() was private? Made it public. No change. It was not that.
  7. Was it just "super". I moved the method into EndPoint.cfc, and called it directly. Problem went away. So the problem seems to be with "super".

So by now I have pared my repro case back to this:

// testViaMixin.cfm
o = new Sub();
writeOutput(o.callPrivateMethodViaMixin());

// Sub.cfc
component extends="base" {

    mixinLib = createObject("MixinLib");
    variables.mixinMethod = mixinLib.mixinMethod;

    public string function callPrivateMethodViaMixin(){    
        return mixinMethod();
    }

    public string function callPrivateMethodDirectly(){    
        return basePrivateMethod();
    }

}

(and Base.cfc is the same as above, as is MixinLib.cfc).

I tested all this on CF10 and 11 too: broken there too. Railo? Unsurprisingly works fine.

On a whim I decided it was perhaps the fact we were using super, and there was some contextual issue. So I decided to proxy it instead.

// Sub.cfc
component extends="base" {

    mixinLib = createObject("MixinLib");
    variables.mixinMethod = mixinLib.mixinMethod;
    variables.superProxy = super;

    public string function callPrivateMethodViaMixin(){    
        return mixinMethod();
    }

    public string function callPrivateMethodDirectly(){    
        return basePrivateMethod();
    }

}

// MixinLib.cfc
component {
    
    public string function mixinMethod(){
        return variables.superProxy.basePrivateMethod();
    }

}

So instead of calling super in MixinLib.cfc, which was confusing CF, I am calling a different variable, so... problem goes away. And it did. This solved it.

I hastily raised a ticket with Adobe: 3774074, stuck a comment in the code where I set superProxy pointing at that ticket, committed my code and we had a wee rejoice.

That's been my day so far, but I can move onto other stuff now. Which I need to do now, as I'm out of lunchtime (as this is work-related and the other lads will need an explanation, I'm not so worried I've spilled over a bit today ;-)

Right... onto the next thing...

--
Adam