Thursday 8 November 2012

G'day:
It's been a quiet week on the blog front, sorry... this is down to it being a pretty busy week in general for me.  Nothing exciting... just too much work, so little motivation to spend even more time in front of the screen when I get home.  And, to be frank, I've not been in the mood to write anything on the train in the morning either. This isn't for a lack of stuff to write; it's just that what I've got in the list of things to write about takes research and... well... a bunch of actual work to get the thing ready.

So this is a bit of a filler article.


Today I made a bit of a mistake.  And better, I made it on Twitter. Today I suggested that the practice of closing CFML tags with the little extra(~neous) slash at the end was a sub-optimal practice.  Oops.  I should learn to not raise topics on which people have opinions. Especially ones that don't match mine ;-)

Anyway, I've been meaning not to write an article on this topic since day one of this blog, but I figured it'd happen eventually so given it's marginally topical for me today, here goes.

What am I on about?  Consider these two lines of code:

<cfset summer = "raumati">

<cfset winter = "takurua" />

People tend to have an opinion on which they prefer. Rarely it's a strong opinion, but sometimes it's a fairly firm one.  And - if I'm honest - it really doesn't matter one way or the other.  That's not going to stop me banging on about it though ;-)

My personal preference is for the former.  My reason for this is simple. That's the way CFML was intended to be written.  There.  Simple.


Back in the old days - before XML was really relevant to many people, and before XHTML had started to clutter up HTML - I cannot recall seeing anyone unnecessarily closing their CFML tags.  There was no precedent for it, it wasn't (and, overwhelmingly, still isn't) suggested in the CF docs, and I can't recall seeing even a wayward reference or usage of that syntax in the CFWACK. Or in any code that I ever came across out there in the community. There probably was some, but I didn't spot it.

Then XML and XHTML started getting some traction, and suddenly people started not only closing their HTML tags; for some reason they started closing their CFML tags too.  Huh?

I questioned this (of various people) and the response at the time was "well one should well-form one's tags... you know... like XHTML".  What?  Honestly what?? My response to this is "are you not aware of the difference between XHTML and CFML?  Just because they cohabit in the same file sometimes doesn't make them 'the same'". Or even similar.  Or even give them any sort of related purpose.  Even someone with a very minimal grasp of CFML understands that the mark-up is just "noise" to ColdFusion, and has no relevance to it.  The mark-up is just something the browser is interested in.

And it was very popular at the time to play the "let's write mark-up that passes the W3C XHTML validator" game.  If one is writing standards-compliant mark-up (something, incidentally, the real-world importance of which I think was always overstated), then one needs to close one's tags. It's one of the rules. So it's one of the goals of generating mark-up: do it so it follows the rules, and validates.

On the other hand, CFML - which for all intents and purposes resides in another plane of existence from the mark-up in a source code file - doesn't have any such standard, and is not XHTML. Not even almost.  Not even "kinda... except for <cfelse>". Not even remotely.  Even forgetting about things like <cfelse> for a second, one cannot write "XHTML-compliant CFML", because XHTML is a specific dialect, and it has specific tags. So as soon as one writes "<cfset "... bam. Not XHTML-compliant.  There is no "<cfset />" tag in XHTML. The whole enterprise is a fool's errand.


Right, so next XML. That was suddenly pretty popular, wasn't it?  Maybe CFML should be written in as much of an XML-compliant way as possible! People have said this too. Well here's something we all know: XML-compliance is a binary state. Something either is XML compliant, or it isn't XML compliant.  There is no "strict" or "transitional" in XML compliance. And CFML isn't.  And can't be.  Something cannot be "kinda XML-compliant".  There's no such thing.  And whilst people will go "[grunt] must close tag... make like XML", they don't go "ooh, hang on... I better use '&amp;' instead of '&'" (try to concatenate two strings with "&amp;"...).  Or go "hey ('lol')! Almost forgot to put CDATA tags around that character data [slaps forehead]".  No, because it's a) dumb; b) impossible to write CFML like that.  Just like it's impossible to write XML-esque CFML by putting the odd extra slash in where it's not needed.

Now... there's an interesting word: "needed".  Why is all this XHTML/XML compliance needed? Well I think the XHTML compliance side of things mostly boiled down to "no frickin' real world reason at all" on the whole; but there are good grounds for XML compliance: so a machine can read and parse it. So a machine can read and parse it.  If a machine is tasked with parsing some XML, then the data needs to actually be XML.  If it ain't XML compliant, the machine will reject it. So XML compliance in XML is critically important.

Obviously XML is intended to be a format that both machines and humans can easily read. But here's a test. Read this:

<person>
    <firstname>Zachary</firstname>
    <lastname>Cameron Lynch</lastname>
<person>

Could you read that OK? Did you get the impression that "XML" represents a person by the name of Zachary Cameron Lynch? (That's my wee son, btw).  Did you stumble at any point or get confused?  Probably not.  OK: "definitely not", right?  Even though it's malformed, and not valid XML (the closing </person> tag is missing its slash. Did you even notice?)?

Is this easier for you to understand:

<person>
    <firstname>Zachary</firstname>
    <lastname>Cameron Lynch</lastname>
</person>

Any different at all? Nah, it's not is it? People are a bit more forgiving when it comes to stuff like that. We don't need to be quite so strict about things for us to still understand it. I know that's a contrived and small example, but I could make it a lot more complex and miss out the odd closing slash or capitalise a letter in a tag and you'd still be able to follow it just the same, wouldn't you? Yup. That's the thing: for the human, it's the words that are important, we're forgiving with the punctuation.
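For the machine it's a different story. Here's a minimal sketch of that, assuming a CFML engine that has isXml() and xmlParse() (the variable names are just mine, for illustration). I'd expect the first check to come back false and the second true, and the xmlParse() call on the malformed string to end up in the catch block:

```cfml
<!--- Sketch only: the same two "person" snippets, one missing the slash on its closing tag --->
<cfset sMalformed = "<person><firstname>Zachary</firstname><lastname>Cameron Lynch</lastname><person>">
<cfset sWellFormed = "<person><firstname>Zachary</firstname><lastname>Cameron Lynch</lastname></person>">

<cfoutput>
    Malformed passes isXml(): #isXml(sMalformed)#<br>
    Well-formed passes isXml(): #isXml(sWellFormed)#<br>
</cfoutput>

<cftry>
    <cfset xPerson = xmlParse(sMalformed)>
    <cfoutput>Parsed OK</cfoutput>
    <cfcatch type="any">
        <!--- the parser throws rather than guessing what was meant --->
        <cfoutput>Parser rejected it: #cfcatch.message#</cfoutput>
    </cfcatch>
</cftry>
```

You read both versions just fine; the parser only accepts one of them.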


OK, so all the shenanigans with the XML-compliance is for the machine. The machine that's expecting XML.  If the machine is expecting JSON, you can go mad trying to put slashes in it all over the place, but unless the data is valid JSON, the machine will reject it.  That's the thing about compliance: compliance is adherence to a set of rules. That's it.  It's nothing about closing tags. Hey, even in the context of mark-up it's not even necessarily about angle brackets and slashes and stuff.  There's plenty of types of mark-up that don't use those constructs at all.  And there's angle-bracket-based web-environment mark-up that doesn't intrinsically require tags to be paired: all versions of HTML, for example. It was only that one embarrassing blip that was XHTML that needed that, and even then: browsers generally still coped fine if it missed the odd slash here and there.

Let's look at ColdFusion code: CFML. For a machine to process CFML, the code needs to be compliant with CFML's rules. Rules like matching a <cfif> with a </cfif>, using # characters correctly, and that there's no such tag as <cfslartibartfast>. There's nothing in the CFML rules that says anything about "closing" tags implicitly. In fact this cuts both ways: the closing slashes are so inconsequential to the CF compiler that code using these extraneous slashes compiles into exactly the same bytecode as the same code without the extraneous slashes.

So they're not relevant to the machine (OK, yes they can be: I'm getting to that).

But that's just the "implicit" (or perhaps "short-hand") closing tags. CFML absolutely rejects actual closing tags for tags that don't have them:

<cfset red = "whero"></cfset>

That will not compile.  The CFML compiler will "ignore" a trailing slash when it's not necessary, but that doesn't mean there's such a thing as a closing tag for tags that don't usually have one.  <cfset> has no closing tag.  <cfquery> does.

OK: where are we? There is no legitimate reason to include these things, from either the machine's perspective, or the human's perspective. But people choose to do it. Why? As demonstrated, it's not for any effort to meet some sort of compliance (and, face it: that excuse was always just bullshit anyhow).


One rationalisation I get is "it makes my code easier to read". What? Peppering your code with extraneous punctuation characters (and always the same one) makes your code easier to read? No it doesn't. No, really: it doesn't. Not to you, not to me, not to anyone. That's ballocks too. It doesn't make it (that much) harder to read, but it does not make it easier to read.

Here's some sample code (contrived minimal example again, sorry):


<cfset blue = "kahurangi">

<cfset yellow = "kowhai" />


People tell me that having the trailing slash makes it easier for them "to tell when the tag has finished... you know: it could be a single statement, or it could be a block of code". So the slash says "it's the end of the statement, it's not a block". But doesn't the fact that it's a <cfset> statement kinda tell you that?? There is no block-syntax for <cfset>. There's no ambiguity to resolve. Same with this:

<cfquery name="myData" datasource="#application.dsn#" cachedwithin="#createTimespan(0,1,0,0)#" cachedafter="#dateAdd('d', 2, now())#" maxrows="10" result="metadata">

Is that the end of the statement?  Is that a complete <cfquery>? No. It's not. Because a <cfquery> is always a block, and always ends with a </cfquery> tag. It is never simply a <cfquery> tag by itself. No ambiguity.

If you are trying to tell me that you require a closing slash to remind you whether a tag is a single statement or a block: I will tell you you are in the wrong line of work. Simple as that.


Ah, but apparently sometimes "it's easier when just scanning code to look for the closing tags [coughbullshit]" (that last bit was me interjecting, in case it's not obvious ;-). When the hell are you doing that? When is it that one needs to scan past a certain number of ends-of-statements to locate what one is scanning for?  When yer reading code (and, I hazard to guess, when you read anything) you read it left to right.  From the start to the end.  And accordingly the important bit of any statement is at the beginning.  Here's a test: where does the <cfelse> block start in the code that this lot is the ends of each statement:

>
    />
    >
        />
        />
        />
    >
>
>
    >
    >
>
    >
    >
    >
        
        />
        />
    >
    >
    />
>

Given it's from a complete and syntactically correct code block - with good indentation preserved, and no empty blocks - I think there's only one possible place it can be. It's still not completely obvious though, is it?  It's almost like that's a completely daft way of quickly scanning code.

OK, what about from this one:

<cfloop
    <cfset
    <cfif
        <span
        <cfset
        <cfset
    </cfif
</cfloop
<cfif
    <!---
    <span
<cfelse
    <!---
    <span
    <cfquery
        update
        set
        where
    </cfquery
    <span
    <br
</cfif

That's much easier to find the statement we're after, innit? But the reason we're after the if/else is because when colour = "tawa", we're getting the wrong result.  Why?  We need to look at the condition, don't we? And then see what the IF and ELSE blocks are doing.  We're not simply counting off a number of statements/blocks, which is all one can mentally do when scanning whether a tag is closed.  That rationale is just nonsense.

OMG: I've just had an epiphany!  When we're scanning code, we're not simply counting off the ends of blocks (or just the beginning), we're actually scanning the code.  The code.  We don't necessarily read every single character and evaluate every single statement, but we do kinda look at the important bits of each statement.  And I think this demonstrates that the closing of the tags / ending of the blocks are not the important bits.

So here's all the code (this is cut from the middle of some random file on my HDD that matched a search for "<cfelse>". The code itself is meaningless in the context of this article):

<cfloop query="qCorrect">
    <cfset sCorrectPath = replace(stFileStore.serverPath & path, "/", "\", "ALL")>
    <cfif fileExists(sCorrectPath)>
        <span style="color:green;margin-left:50px;">[<cfoutput>#sCorrectPath#</cfoutput>] does exist</span><br />
        <cfset iMatchCount = iMatchCount + 1>
        <cfset uGood = obj_uuid>
    <cfelse>
        <span style="color:orange;margin-left:50px;">[<cfoutput>#sCorrectPath#</cfoutput>] does not exist: skipping</span><br />
    </cfif>
</cfloop>
<cfif iMatchCount gt 1>
    <!--- We can't correctly second-guess which it's supposed to be, so just skip it --->
    <span style="color:red;margin-left:50px;">[<cfoutput>#iMatchCount#</cfoutput>] possible re-matches found: cannot automate rematch.</span><br />
<cfelse>
    <!--- we found a match and the file exists: Use it. --->
    <span style="color:green;margin-left:50px;">Single good rematch found: [<cfoutput>#sCorrectPath#</cfoutput>] Updating DB </span>
    <cfquery datasource="#request.siteContext.dbConnection.dsn#">
        update    obj_newsitem
        set        image    = <cfqueryparam value="#uGood#" cfsqltype="cf_sql_varchar">
        where    image    = <cfqueryparam value="#obj_uuid#" cfsqltype="cf_sql_varchar">
    </cfquery>
    <span style="color:green;"><strong> Updated in DB</strong></span>
    <br />
</cfif>

And again:

<cfloop query="qCorrect">
    <cfset sCorrectPath = replace(stFileStore.serverPath & path, "/", "\", "ALL") />
    <cfif fileExists(sCorrectPath)>
        <span style="color:green;margin-left:50px;">[<cfoutput>#sCorrectPath#</cfoutput>] does exist</span><br />
        <cfset iMatchCount = iMatchCount + 1 />
        <cfset uGood = obj_uuid />
    <cfelse>
        <span style="color:orange;margin-left:50px;">[<cfoutput>#sCorrectPath#</cfoutput>] does not exist: skipping</span><br />
    </cfif>
</cfloop>
<cfif iMatchCount gt 1>
    <!--- We can't correctly second-guess which it's supposed to be, so just skip it --->
    <span style="color:red;margin-left:50px;">[<cfoutput>#iMatchCount#</cfoutput>] possible re-matches found: cannot automate rematch.</span><br />
<cfelse>
    <!--- we found a match and the file exists: Use it. --->
    <span style="color:green;margin-left:50px;">Single good rematch found: [<cfoutput>#sCorrectPath#</cfoutput>] Updating DB </span>
    <cfquery datasource="#request.siteContext.dbConnection.dsn#">
        update    obj_newsitem
        set        image    = <cfqueryparam value="#uGood#" cfsqltype="cf_sql_varchar" />
        where    image    = <cfqueryparam value="#obj_uuid#" cfsqltype="cf_sql_varchar" />
    </cfquery>
    <span style="color:green;"><strong> Updated in DB</strong></span>
    <br />
</cfif>


And a third time (it's still exactly the same code):

<CFLOOP QUERY="QCORRECT">
<CFSET SCORRECTPATH = REPLACE(STFILESTORE.SERVERPATH & PATH, "/", "\", "ALL") />
<CFIF FILEEXISTS(SCORRECTPATH)>
<SPAN STYLE="COLOR:GREEN;MARGIN-LEFT:50PX;">[<CFOUTPUT>#SCORRECTPATH#</CFOUTPUT>] DOES EXIST</SPAN><BR />
<CFSET IMATCHCOUNT = IMATCHCOUNT + 1 />
<CFSET UGOOD = OBJ_UUID />
<CFELSE>
<SPAN STYLE="COLOR:ORANGE;MARGIN-LEFT:50PX;">[<CFOUTPUT>#SCORRECTPATH#</CFOUTPUT>] DOES NOT EXIST: SKIPPING</SPAN><BR />
</CFIF>
</CFLOOP>
<CFIF IMATCHCOUNT GT 1>
<!--- WE CAN'T CORRECTLY SECOND-GUESS WHICH IT'S SUPPOSED TO BE, SO JUST SKIP IT --->
<SPAN STYLE="COLOR:RED;MARGIN-LEFT:50PX;">[<CFOUTPUT>#IMATCHCOUNT#</CFOUTPUT>] POSSIBLE RE-MATCHES FOUND: CANNOT AUTOMATE REMATCH.</SPAN><BR />
<CFELSE>
<!--- WE FOUND A MATCH AND THE FILE EXISTS: USE IT. --->
<SPAN STYLE="COLOR:GREEN;MARGIN-LEFT:50PX;">SINGLE GOOD REMATCH FOUND: [<CFOUTPUT>#SCORRECTPATH#</CFOUTPUT>] UPDATING DB </SPAN>
<CFQUERY DATASOURCE="#REQUEST.SITECONTEXT.DBCONNECTION.DSN#">
UPDATE    OBJ_NEWSITEM
SET        IMAGE    = <CFQUERYPARAM VALUE="#UGOOD#" CFSQLTYPE="CF_SQL_VARCHAR" />
WHERE    IMAGE    = <CFQUERYPARAM VALUE="#OBJ_UUID#" CFSQLTYPE="CF_SQL_VARCHAR" />
</CFQUERY>
<SPAN STYLE="COLOR:GREEN;"><STRONG> UPDATED IN DB</STRONG></SPAN>
<BR />
</CFIF>

Of those three: which is the hardest to read?  It's the third one, right?  And neither of the first two blocks is actually easier or harder to read than the other, is it? No.  So don't give me any nonsense about how having unnecessary trailing slashes makes it any easier to scan, follow or actually read code.  It's not true.

Irony would have it that one person I know (and will be reading this ;-) who advocates this reason for closing their CFML tags also is an advocate for making semi-colons optional in CFScript.  But how will they know where their CFScript statement ends? Won't they get all confused?  Chortle.


There is another group of developers who put trailing slashes on some tags, but not others.  They'll uniformly put them on <cfargument> and <cfset> tags - for example - but not on <cfparam> or <cfdump> etc.  What the hell is going on in their brains, I have no idea at all. Weirdos.


This leaves one last rationale that I can think of: "I just do it out of habit... it's the way I learned to do it, and I'm used to it and it looks right to me".  Cool.  I can't fault that.  And I never would. Go yer hardest.

This gets to the whole nub of my annoyance with the practice of unnecessarily closing tags like this.  It's not that people do it, it's that people try to rationalise why they do it, and just spout nonsense when they go about it. All of the technical or procedural rationalisations are founded on logical fallacies or just a load of ballocks. You do it because you want to. That's why. You might have started doing it through some misjudgement or misunderstanding about the relationship between XHTML/XML/CFML and the notion of compliance. Or you might have heard someone say it's easier to scan the code, but really have not stopped to think about whether it actually is. Or you might have started to do it that way because whoever taught you showed you that way, and you didn't know any better, or even stop to think about it.  I really don't believe it's because you think it helps you work out when a statement ends. But the reason you continue to do it is not for any good or valid reason beyond it's just cos you want to, and you like to write code like that.  And there's nowt wrong with that.




Just a footnote.  There are a few CFML tags that do work either as a single statement or as a block. <cfinvoke> is an example:

<cfinvoke method="foo" returnvariable="bar">

<cfinvoke method="foo" returnvariable="bar">
    <cfinvokeargument name="moo" value="cow">
</cfinvoke>

This is one situation in which I think it's probably a good idea to close the single tag, eg:


<cfinvoke method="foo" returnvariable="bar" />


This removes ambiguity. I can only think of <cfinvoke> and <cfhttp> (oh, sorry: in case you can't read that, I meant <cfinvoke /> and <cfhttp />.  Better? ;-) ) that fall into this category. There might be others?

There's also a situation in which - and we've all been caught by this - it is actually hazardous to put extraneous slashes about the place:

<cfmodule template="myModule.cfm" />

Or more familiar:

<cf_myModule />

That syntax will actually execute those statements twice. Once for the opening tag, and once for the implicit closing tag.  And if you don't write your module to deal with its execution mode properly, you can get unexpected results with that.
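As a rough sketch of what "dealing with its execution mode" can look like: this is a hypothetical myModule.cfm (the file contents are my assumption for illustration, not code from any real project), which checks thisTag.executionMode so the work only happens on the opening pass:

```cfml
<!--- myModule.cfm: a hypothetical custom tag, for illustration only --->
<cfif thisTag.executionMode eq "start">
    <!--- the actual work goes here, so it runs exactly once per call --->
    <cfoutput>myModule ran</cfoutput>
</cfif>
<!--- in "end" mode (a closing tag or a trailing slash from the caller) nothing happens --->
```

With a guard like that in place, <cf_myModule> and <cf_myModule /> should behave the same. Without it, everything in the file runs on both passes.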

One other thing I've heard people say when discussing this. Two people will each have divergent opinions, and each will "fix" the other's code when they encounter it.  Don't do that.  It's a waste of time, it will just incur irritation from your colleague, and it runs the risk of accidentally breaking code that actually works. In the bigger scheme of things, it doesn't matter. My rule of thumb is if I am working on a specific line of code that has a trailing slash, I'll remove it as part of my edit, but I'll leave the rest alone. Although really, one line that does/doesn't have a slash at the end amongst a bunch that are the other way around (say a bunch of <cfargument> tags) probably looks worse rather than better. So perhaps just leave 'em be.

The best thing to do is to adopt a standard (whichever way), and stick to it for all new code, and leave the old code as-is until such time as it can be overhauled.


I'm sure I don't have to say this, but - as always - I welcome your comments on this topic :-)

Footnote (13/11/2012):
Shawn Holmes has written an article on the same topic, from a rather different perspective, which complements this one.  It's a good read.

--
Adam