Tuesday 12 November 2013

Locking investigation for Brad

G'day:
I will admit I'm not in a blog mood at the moment, hence the silence over the last few days. However Brad Wood has asked for a second set of eyes on some code of his, so here are my findings.

The code Brad asked about has since been identified and accepted as a bug in ColdFusion, but I wanted to have a look anyhow. Brad's case was raised as 3662524: "Request is unable to reacquire the same named lock", and here's the gist of the issue, from that ticket:

My understanding is that ColdFusion supports a style of reentrant locks where the same request can reacquire the same lock again. [...] The issue is that if a request has obtained a named readonly lock, it will not be able to acquire an exclusive lock of the same name. [...] It appears that the exclusive lock is timing out because it's being blocked by the readonly lock even though it's held by the SAME request.
My recollection was the same as Brad's, but I was "sure" I'd seen it working in the past (although I'm not so sure of that now, in hindsight).

Here's my test rig for this (adapted from Brad's code):

<cfflush interval="1">
Before read-only named lock<br>
<cflock name="myLock" type="readonly" timeout="1">
    Within read-only named lock<br>
    Before exclusive named lock<br>
    <cflock name="myLock" type="exclusive" timeout="1">
        Within exclusive named lock<br>
    </cflock>
    After exclusive named lock<br>
</cflock>
After read-only named lock<br>
END<br>

And on ColdFusion 10 I get this:

Before read-only named lock
Within read-only named lock
Before exclusive named lock

The web site you are accessing has experienced an unexpected error.
Please contact the website administrator. 


The following information is meant for the website developer for debugging purposes.
Error Occurred While Processing Request

A timeout occurred while attempting to lock myLock.


And, indeed, the same on CF 9.0.1.

I figured Railo would get it right, but... no:

Before read-only named lock
Within read-only named lock
Before exclusive named lock

Railo 4.1.2.002 Error (lock)
Messagea timeout occurred on a exclusive lock with name [mylock] after 1 second

On a whim, I tried it on OpenBD:

Before read-only application-scope lock
Within read-only application-scope lock
Before exclusive application-scope lock
Within exclusive application-scope lock
After exclusive application-scope lock
After read-only application-scope lock
END


Which is what I was expecting from the other two. So in this case, a rare functionality win from OpenBD! Just to make sure locking was working at all on OpenBD, I modified the code to just test locking in general, with more than one request going on:

<cfapplication name="testObdLocks01" applicationtimeout="#createTimespan(0,0,0,10)#">

<cfparam name="application.requestCount" default="0">
<cfset request.count = ++application.requestCount>
    
<cfset writeMessage("Before read-only named lock")>
<cftry>
    <cflock name="myLock" type="readonly" timeout="20">
        <cfset writeMessage("Within read-only named lock")>
        <cfset sleep(5000)>
        <cfset writeMessage("Before exclusive named lock")>
        <cflock name="myLock" type="exclusive" timeout="20">
            <cfset writeMessage("Top of exclusive named lock")>
            <cfset sleep(5000)>
            <cfset writeMessage("Bottom of exclusive named lock")>
        </cflock>
        <cfset writeMessage("After exclusive named lock")>
    </cflock>
    <cfcatch>
        <cfset writeMessage("Exception occurred: #cfcatch.type# #cfcatch.message# #cfcatch.detail#")>
        <cfrethrow>
    </cfcatch>
</cftry>

<cfset writeMessage("After read-only named lock")>
<cfset writeMessage("END")>

<cfscript>
    function writeMessage(message){
        var ts = timeFormat(now(), "HH:MM:SS.LLL");
        message = "Request: #request.count#: #ts# : #message#";
        writeLog(file=application.applicationname, text=message);
        writeOutput(message & "<br>");
    }
</cfscript>

The only real difference here is I've given myself some time to run two separate requests, and have them deadlock.

The resultant log was along these lines:

Request: 1: 13:24:39.336 : Before read-only named lock
Request: 1: 13:24:39.350 : Within read-only named lock
Request: 2: 13:24:41.501 : Before read-only named lock
Request: 2: 13:24:41.502 : Within read-only named lock
Request: 1: 13:24:44.352 : Before exclusive named lock
Request: 2: 13:24:46.504 : Before exclusive named lock
Request: 1: 13:25:04.356 : Exception occurred: Application General Runtime Error CFLOCK lock timed out
Request: 2: 13:25:04.462 : Top of exclusive named lock
Request: 2: 13:25:09.463 : Bottom of exclusive named lock
Request: 2: 13:25:09.463 : After exclusive named lock
Request: 2: 13:25:09.464 : After read-only named lock
Request: 2: 13:25:09.465 : END

Initially this threw me... I expected request 1 to "win" here. But once I desk-checked this, the results (unsurprisingly) make sense:
  • Both requests set off, a coupla seconds after one another,
  • and whilst request 1 is doing the first sleep,
  • the second request enters the read only lock too (this is fine: any number of requests can share a read-only lock).
  • Request one stops sleeping first, then hits the exclusive lock...
  • ... but it cannot get in because whilst its own read-only lock doesn't impact its ability to make a same-named exclusive lock, the read-only lock on request 2 does prevent it from getting the exclusive lock.
  • Next, request 2 stops sleeping and also arrives at the exclusive lock, and it too cannot get past because request 1 still has a read-only lock.
  • So the two requests are currently deadlocked.
  • After 20 sec, request 1 errors out... which releases its read-only lock,
  • and request 2 can now make an exclusive lock, and follows through to the end of the request. 

All good, and as is intended. Except <cfflush interval="n"> doesn't seem to work on OpenBD. But that's just an inconvenience here.

Brad: are you interested in raising a ticket for Railo, too?

--
Adam