Tuesday 24 July 2012

What are the most common CFML tags?

G'day. There was a bit of a break there: no post yesterday. I'm not on the wane yet, it's just that it was such a cracker of a day in London yesterday that I went to stand outside the pub in the sunshine after work, and by the time I got home I was... err... well, a bit pissed, actually.

Note to Americans: that means "drunk" in our vernacular, not "angry".

Anyway, I decided not to write anything. I think we can all be grateful for that.

Yesterday's pre-work research was to write some code to tally up which CF tags we use in our main codebase, and to compare this with the codebase of one of our older and less-well-maintained apps, as well as our brand-spanking-new app's codebase. Just to see if it turned up anything interesting.

Before I go on (and on... and on... and on... ;-), do me a favour? Jot down what you think would be the ten most-commonly-used CFML tags, and once you're done reading, post 'em as a comment. There's no right or wrong here, I just wanna gauge people's impressions. This would be my list:

  • CFIF
This is based on the notion that the most intrinsic thing about code is setting variables and deciding what to do with them. We also get data and loop over it, outputting it. Everyone does this. Hopefully a lot of this is refactored into functions, but probably a lot of it is refactored simply by chucking stuff into separate files. That seems reasonable?

OK, so what about reality? Here's the full tally for our main site's codebase:

1: cfset (9999)
2: cfargument (8949)
3: cffunction (5909)
4: cfreturn (3653)
5: cfif (3370)
6: cfscript (2746)
7: cfparam (2678)
8: cfprocparam (2111)
9: cfoutput (1817)
10: cfelse (1037)
11: cfloop (882)
12: cfstoredproc (714)
13: cfcomponent (676)
14: cfprocresult (643)
15: cfsilent (592)
16: cfcase (537)
17: cfquery (527)
18: cfinclude (485)
19: cfthrow (321)
20: cfqueryparam (303)
21: cftry (293)
22: cfcatch (291)
23: cfsavecontent (268)
24: cfprocessingdirective (237)
25: cfelseif (208)
26: cfdump (199)
27: cfabort (137)
28: cfinvokeargument (113)
29: cfswitch (112)
30: cfhttpparam (100)
31: cfsetting (74)
32: cfinvoke (69)
33: cfdefaultcase (64)
34: cffile (47)
35: cfwddx (32)
36: cflock (29)
37: cfhttp (25)
38: cftransaction (23)
39: cfheader (21)
40: cfdirectory,cfmailpart (20)
41: cfcontent (19)
42: cfbreak,cfmail (17)
43: cftrace (14)
44: cflog,cfcookie (12)
45: cfrethrow (11)
46: cfmodule,cflocation (9)
47: cfhtmlhead (8)
48: cfimage (6)
49: cfexit,cfproperty (5)
50: cfinterface (4)
51: cfflush,cftimer (2)
52: cfapplication,cfobjectcache,cfexecute,cfmailparam,cfxml (1)

This codebase has just shy of 200000 lines of code in it. This includes our API, but does not include any of the unit test code (which would add another 70000-odd lines).

I feel like going and committing one more <cfset> tag... ;-)

Anyway, this is quite interesting!  By far the most common tag is <cfset>: no surprise there.  However <cfargument> being the next one surprises me, but I'm pleased we've seemingly got so much of our code factored into functions.

It's also interesting that those top three tags are an awful lot more widely-used than the rest of the top ten.

It might seem odd that <cfquery> is so far down the list, but all those <cfquery> tags are either old legacy stuff (most of which is obsolete, but just not removed yet), or the odd QoQ to massage data that the API doesn't quite return in the correct format for its usage.  The bulk of our DB interaction is done via stored procs (hence <cfprocparam> being so high up the list).

Note what you don't see: a single <cfform> or <cfchart> or any other "UI wizard" sort of tag.  We don't use 'em.  Never have.  Never will.

Another quick script I just ran (I'll stick all these scripts at the bottom, so you can see how I did these metrics) was to tally up how many lines of code are in <cfscript> blocks: just short of 60000, or 30% of our code. And 100000 lines of the total code are in CFCs.
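To give a rough idea of how a <cfscript> line count like that could be done, here's a minimal sketch in Python. It is not the script behind the figures in this post. The function name count_cfscript_lines is made up for illustration, and skipping blank lines is an assumption (the figures above may well include them).

```python
import re

# Captures everything between <cfscript> and </cfscript>, case-insensitively.
# DOTALL lets "." span newlines, and the non-greedy ".*?" stops at the first
# closing tag so that separate blocks are not merged into one.
SCRIPT_BLOCK = re.compile(
    r"<cfscript\b[^>]*>(.*?)</cfscript\s*>",
    re.IGNORECASE | re.DOTALL,
)


def count_cfscript_lines(source):
    """Count the non-blank lines inside every <cfscript> block of one file's source.

    Assumption: blank lines are not counted as lines of code.
    """
    total = 0
    for body in SCRIPT_BLOCK.findall(source):
        total += sum(1 for line in body.splitlines() if line.strip())
    return total
```

Run that over each file's contents and sum the results, then divide by the total line count to get a percentage like the 30% above.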

Looking at one of our old and unloved apps, we get these top ten tags:

1: cfset (15555)
2: cfif (11641)
3: cfargument (10263)
4: cffunction (6258)
5: cfoutput (4139)
6: cfreturn (3909)
7: cfelse (3219)
8: cfscript (3065)
9: cfprocparam (3019)
10: cfloop (2155)

This is a codebase of 380k lines of code (120k in CFCs, 65k in <cfscript> blocks).  And for the latest app that we've just launched:

1: cfargument (4348)
2: cfset (3439)
3: cffunction (2219)
4: cfreturn (1360)
5: cfprocparam (1265)
6: cfif (1051)
7: cfscript (700)
8: cfstoredproc (402)
9: cfprocresult (361)
10: cfelse (347)

(52k lines of code, 35k in CFCs and 17k in <cfscript>).

All of these contain our API, which is 35k lines of code, 28k of which is in CFCs (I'm surprised there are that many CFMs in there, actually?!), and 8k in CFScript.

There's not much difference in the tag usage of the old code and the new code, which is a surprise to me.

That's a bit of a raw-data-dump rather than any sort of analysis... I'm still mulling this over.  But I need to crack on with some work, so I'll finish up with the code I used to do this extraction.

I'm dead keen to hear other people's findings.  Please do add a comment with your own stats.
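If you want to gather your own stats, something along these lines would do it. This is a minimal sketch in Python rather than CFML, and it is not the script that produced the figures above. The names tally_tags, tally_codebase and ranked are made up for illustration. It assumes source lives in .cfm and .cfc files, and it counts opening tags only, so each <cfif>...</cfif> pair scores once. It's a plain pattern match, so tags sitting inside comments or strings get counted too, and custom tags (cf_whatever) are ignored.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the name of an opening CFML tag such as <cfset ...> or <CFIF ...>.
# Closing tags (</cfif>) start with "</" and so never match.
TAG_PATTERN = re.compile(r"<(cf[a-z]+)", re.IGNORECASE)


def tally_tags(source):
    """Count opening CFML tags in one string of source, ignoring case."""
    return Counter(name.lower() for name in TAG_PATTERN.findall(source))


def tally_codebase(root, extensions=(".cfm", ".cfc")):
    """Sum the tag counts of every matching file under the root directory."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            totals += tally_tags(path.read_text(errors="ignore"))
    return totals


def ranked(totals):
    """Format counts as 'rank: tag[,tag...] (count)', with tied tags sharing a rank."""
    by_count = {}
    for tag, count in totals.items():
        by_count.setdefault(count, []).append(tag)
    lines = []
    for rank, count in enumerate(sorted(by_count, reverse=True), start=1):
        lines.append(f"{rank}: {','.join(sorted(by_count[count]))} ({count})")
    return lines
```

Calling ranked(tally_codebase("/path/to/your/app")) gives a list of lines in the same shape as the tallies in this post; the path is a placeholder for wherever your code lives.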