Sunday 29 July 2012

Are CFINSERT and CFUPDATE as bad as everyone makes them out to be?

This is a bit of a lazy-Sunday sort of posting.  I was on the Adobe forums last week, and CFINSERT and CFUPDATE came up in conversation.  The person asking the question was asking about <cfqueryparam> and I responded to that (and cross referenced to my post on that topic, should you want to read it), but I mostly left the <cfinsert> and <cfupdate> side of things alone, sniffing with distain and moving on to something else.

I decided I didn't like leaving the question unanswered, so revisited it today.

I could recall a discussion a while back about these two tags not parameterising their values, which kinda invalidates them in my view.  But I decided to find out where they're at now.

I found this bug that was raised on the old bug tracker which covers this topic, and the ticket is still open so it seemed like the problem still existed.  However I never trust the Adobe bug tracker because... well... err... it's just not very good and not very accurate, so I decided to re-investigate for myself.  I knocked together some quick code, thus:

There's no surprised here:
  1. I check to see if a form has been submitted (which it won't have been on first pass);
  2. if the ADD button has been pressed, do a CFINSERT;
  3. otherwise if it was an UPDATE, then use CFUPDATE.
  4. Then get all the records as they currently stand.
  5. Display some forms: a stand-alone one for adding a new entry,
  6. and a series of other forms for each existing record.
I hasten to add that this is absolutely rubbish code, and was written to demonstrate this test, not to suggest how this sort of thing should be managed!

I've got debugging output switched on, and am displaying DB activity.

When I add a new record, I get this DB output:

CFINSERT (Datasource=scratch_mysql, Time=1ms, Records=0) in C:\ColdFusion10\cfusion\wwwroot\shared\CF\CFML\tags\database\update\test.cfm @ 13:15:34.034
insert into tbl_test (tst_data) values (?)

And if I do an update, I get this:

CFUPDATE (Datasource=scratch_mysql, Time=0ms, Records=0) in C:\ColdFusion10\cfusion\wwwroot\shared\CF\CFML\tags\database\update\test.cfm @ 13:16:31.031
update tbl_test set tst_data=? where tst_id=?

The short version is that these days, CFINSERT and CFUPDATE do indeed parametrise their values.  I tested this on CF10, CF9.0.2 and CF8.0.1 (so "these days" is "the last five years").  Given that bug was raised after CF8.0.1 came out, I now question whether it was actually an issue.

On the face of it, one could question what's so bad about these tags?  They seem to do what it says on the tin, and they seem to do an adequate job.  Yet I've always been one of these ones that rails against them, considering them to be the devil's instrument.  And it's not like I'm the only one that thinks this.

Let's be objective about this though. What's good about these tags:
  1. they work;
  2. they're dead easy to use;
  3. at least these days they're SQL-injection proof
One can't contest the veracity of those statements too much.  However I think the first one can be discarded; it's facile.  That a tag "works" is kinda the minimum barrier to entry for it to be in the language.

I also challenge the validity of the second point.  Whilst researching this posting, I googled-up what had been said about these things in the past, can came across this blog entry from Ben Forta, from way back in 2006.  I'll quote a bit of what he says:
But there are lots of very simple ColdFusion applications out there. Some are created by absolute beginners who are having to learn CFML, HTML, some JavaScript, and some SQL all at once - and they can (and should) use whatever shortcuts they can use to get the job done. [...]

Here are some facts to keep in mind. [...] Fact: Most ColdFusion developers struggle with SQL statements. Fact: Most ColdFusion developers have no idea what a methodology or framework is. [...] Fact: Most ColdFusion developers use less than 10 CFML tags on a regular basis. Fact: ColdFusion makes lots of entry-to-mid-level developers productive and successful.
To that my response (in the comments of the same blog post) was:

[...] I see the dumbing down (or keeping them dumb) of low-end CF developers as a *bad* thing for everyone concerned.


I think the reason why there is a demographic of CF developer out there that struggles with such simple SQL as INSERT and UPDATE queries is down to <cfinsert> and <cfupdate>. Adobe (and its evangelists) should not encourage people to stay dumb. They should encourage them to improve.

Having - and encouraging - this demographic is counter-productive for CF [...]
Hmmm.  I've mellowed a bit since 2006 (OK, but not very much, I grant you), so I'm perhaps no so emphatic about this these days.  I stand by what I said, but I don't think this is the fault of <cfinsert> and <cfupdate>.  And, equally, for some situations <cfinsert> and <cfupdate> probably do a better job of the SQL than some person not so au fait with the idiosyncracies of DBs might.  For once thing, <cfinsert> and <cfupdate> parameterise their SQL.  How many <cfquery> statements do I see with hard-coded values in them?  I see them every time I care to look.  If anything, misusage of <cfquery> is perhaps a bigger problem than with these two other tags!

The one thing that resonates here is that there's an education requirement.  However just pandering to people's lack of knowledge on this topic is not the right approach, I think.  That said is it the job of the documentation to try to encourage this?  Would be be right to add a warning to the docs of <cfinsert> and <cfupdate> to the effect that they are limited in scope and really not intended for production use, and one should use <cfquery> and <cfstoredproc> instead?  The docs for <cfquery> advise to always use <cfqueryparam>, but don't actually then demonstrate this in the sample code, and don't explain why right on that page.  There's just a (broken!) link off to a security bulletin, and another link off to "Accessing and Using Data", on page 13 of which there's a poorly-articulated and not particularly compelling explanation of why one might want to use <cfqueryparam>.  This sort of thing should be front and centre on the <cfquery> docs, and there should never be an code example in the docs that doesn't use a <cfqueryparam>.  And the explanation as to why one ought to use them needs to be clearer.

I'll stop digressing for a bit and list what's bad about these tags, building on what I say above, but these next points are also a coalescing of various things I've found dotted around the web:
  1. They kind of encourage people not to learn how to use the tools essential to their jobs.  Sorry, but if you don't know enough SQL to do CRUD statements, then you need to learn.
  2. They're only really good for trivial situations which probably rarely exist in the real world. 
  3. They don't perform any validation on the data.
  4. They're intrinsically bound to using the FORM scope.
  5.  They can be leveraged by hackers to expose underlying SQL.
I've covered the first point.  Adding to this, I know how to use SQL just fine, but after fiddling with these tags today, my position is "there's a time and a place", and when it's that time, and it's that place, I'll probably resume using these tags now. That kinda covers the second point.

The third point was made in a blog comment somewhere, but it's an utterly specious point.  <cfquery> doesn't perform any validation either.  Nor does <cfstoredproc>.  Form validation isn't the job of the DB-storage mechanism, it's the job of the form handler ie: the form action page), and that validation should be done - and passed or failed - long before any thought is given to storing the stuff in the DB.

Point four is a valid one.  This is an example of thoughtless implementation on the part of... oooh... I guess someone back at Allaire.  Why did they innately link these tags to the form scope?  They should have been implemented from the outset to simply take [a struct] as one of its arguments.  This could be the form scope, or it could be the URL scope.  Or it could be myStructOfValidatedFormFields (which is what it ought to always be).  I guess I am applying hindsight here, and also forgetting that the ubiquity of structs in CF came about a number of years after these tags did.  And they also came about back when the web and web apps lacked the maturity they have now.  Still: we're not back in the 1990s now, so the tags could have been updated to accommodate [any struct] now.

The last point is an interesting one.  Unless told otherwise, <cfinsert> and <cfupdate> will try to add all form fields to the SQL statement.  And if there are form fields that don't correspond to DB columns, errors will ensue.  So the idea was that someone could post more form fields than are intended, to break the DB call, and maybe cause an error to show up on the screen which shows more info than it should.  This potential problem doesn't exist with <cfquery> or <cfstoredproc> because one needs to specifically code for columns and their values.  However it also relies on the <cfinsert> or <cfupdate> tag to not simply be passing through a subset of form fields it wants (see my code example).  In fact this is almost always going to be necessary, because otherwise stuff like the submit button and that sort of thing will end up being passed to the DB.  People will find out very quickly about this.  But I guess the risk is theoretically there, so it's a consideration.  I hasten to add that everyone should always have an error handler which doesn't expose the nature of any error that's occurred, instead displaying a nicely formatting "oops!" page, and returning a 404 or 50x HTTP status code as appropriate.  So it shouldn't matter if someone tries this sort of hack: they'll just get an "oops" message.  And it'll be their own fault.

OK, so have I answered my initial question: "Are CFINSERT and CFUPDATE as bad as everyone makes them out to be?"?  Well - for my own part - like I said earlier I've actually warmed to them slightly today, and I think in situations where I have simple data structures, I'll probably go back to using them again.  So it's answered as far as I'm concerned: no, they're not so bad after all.

I may have missed some very important reasons why they're either still very bad, or actually very good.  Do sing out.