Monday 28 July 2014

CFML survey results: scoping of variables-scope variables

G'day:
So I'll close off the survey "Survey: to scope or not to scope" now. I got a good level of response: 79 responses. Here's the aggregation of said responses. Including some charts just for you, Phil Duba.

I've actually "cheated" a bit, and just screen-capped the charts from Survey Monkey. Anyway, here's the business.

Q1: How vigorous are you at scoping variables-scope variables in CFM & CFC files?





             1 - never scope   2             3 - about 50/50   4             5 - always scope   Total
CFM files    29.11% (23)       13.92% (11)   13.92% (11)       22.78% (18)   20.25% (16)        79
CFC files    5.06% (4)         8.86% (7)     6.33% (5)         15.19% (12)   64.56% (51)        79

The results for CFCs are about what I expected, and kinda mirror my own position here; due to the daft way CF handles its scopes in CFCs, it's kinda necessary to scope variables-scoped variables a lot of the time in CFCs... just so one can keep track of the fact the variable is object-wide. That said, with small methods, it should not be so necessary because one should be able to clearly see where a local variable is declared, and accordingly anything not obviously declared is clearly in the variables scope.

There were quite a few comments with this one:

In CFCs I scope the variables when I define them in the init() method so that they are private to the instance, but not thereafter. I rarely use the variables scope in .cfms. Most of my code uses FW/1 so vars are usually safely referenced as rc. or local.

not requiring variable declaration is one of main problem with cf. It is like not switching on Option Explicit in 'another language' But at least we should scope everything
Requiring variable declaration is ceremonial code that shouldn't be necessary. If an unscoped variable is first assigned within a function: it's function-local. If it's first declared outwith a function: it's variables scoped. If one wants to declare a variables-scoped variable within a function: scope it. There's no need for specific variable-declaration syntax.

In a CFC you need to highlight shared state so it's important to be explicit about VARIABLES scope. I never scope arguments or local variables (and I use VAR, not LOCAL). That way all unscoped variables in a CFC are (or should be!) local to the method (either true locals or arguments). In a CFM, there's no point in making this distinction.
Very good reasoning.

I tend to lean toward not scoping unless there's ambiguity, but it's all just for readability sake. I used to always do local. in CFCs just to make sure I didn't have var scoping issues, but once I got used to always doing "var x" I got out of the habit for readability again.
I will use local in a function IFF I'm forced to use tags for some reason (eg: <cfquery>) and to save me a separate var statement.

In cfm files I normally scope everything except the variables scope. In cfc files I normally scope all variables (this,variables,local,arguments). If they are in some other scope hopefully I am not accessing them directly. I think a lot of my desire to scope everything is because in the past it would be common for my code to have functions that stretch on for several pages. In my newer code I would normally try to break up and organize things better. In shorter more organized code it is generally pretty easy to see where variables came from.
Yep. We're moving in this direction too.

I try to use the variables scope only when needed. When I do so, I don't always scope it. When working on legacy code I do scope the variables scope a lot when I know that's where the variables are. But that's more to make old, dirty code better.


CFC: Never scope VARIABLES, LOCAL, ARGUMENTS - scope others but try not to use them to not break encapsulation. CFM: LOCAL if necessary - but I try to wrap cfm calls with: new Wrapper().render(path); - to make scoping unnecessary.
OK, so your CFMs are called from within CFC methods; I guess they need to follow CFC-scoping rules then (which is not always possible...)

only exception is variables scope in cfm never, but in cfc always

only for component level variables, everything inside functions is var'd

Functions should always be small in number of lines so scoping not much needed like with templates.

Only really ever scope when I feel it's required. Any other times seems like extra work for no benefit.
Yep. To slavishly follow guidelines here is counter-productive, I think.

To be honest, inheriting a lot of code that didn't meet many standards at all, it was hard to get excited about putting variables into some container/scope when so much code was written with the default variables (To UPPERCASE OR to Not, That's another question.) Coming from a C background, everything should be automatically scoped by the language to 'local' unless it has an override. I'd let the ARGUMENTS scope be another default search. Can anyone answer this question? If I declare a variable as [cfset VARIABLE.URL = StructNew()] and then reference URL.var - where will it first search for bareword "var", and where will it search for "URL.var"? I realize this is pathological but if there isn't a clear search rule there'll always be problems.
The search rules are in the docs.

In cfm files I scope for clarity with vars like URL, FORM etc. but I usually won't write out VARIABLES.foo and opt for just foo. CFCs on the other hand, I scope everything. Visually, I can trace my code better for offending data.
Scope everything? Even the local scope? Bleah.

I used to be in the camp that thought "thou must scope every variable". But eventually I even got too lazy to do variables.something in view files.
I think changing one's practice due to laziness is not something one should admit to. At least in this case it's anonymous ;-)

I like scoping because in a large, messy application, you can always easily find out where the variable is from. I have seen too much code without scoping, and I have to sit there and guess or figure out where the variable is set or instantiated. But if the code says request.foo, then at least I know where to start. Also, I was taught long ago to always scope since it helps with performance (less things for CF to look up).
One good point here is that my question only really assumed "for new code", but it's a very real consideration that rules might differ between old and new code.

I always never sometimes scope my variables, ever.
I'm kind of exactly same generally. Almost often all the time.

The variables scope duality makes it important to scope in CFCs, especially with persistent components.

I *try* to scope everything, just because it makes everything blindingly clear and doesn't allow for any confusion, but I don't always scope super-local stuff.

I always use var x = "whatever" in cfc's At least I thought I did. In functions I always use var, but at the top of a cfc I usually have property name="x"; I always assumed those were only accessible either inside the function or by using accessors, but I guess I am not certain of that.
I have to admit I have only ever used property declarations when experimenting with ORM, and never checked the inner workings of how they're exposed.

I have Cascading set to "strict" in the Railo administrator, but Search resultsets is turned on.
What are the ramifications of that, for those who can't be arsed looking up the Railo docs?

CFCs can be trickier to debug and might benefit from a stricter scoping.

Variables in CFM files only get scoped if they fall within a construct where (I think) scoping is important - e.g. inside a function.

In cfm files I scope everything except the variables scope. in cfc's I scope everything except local variables

In CFM files the variables scope is the assumed scope, and the CORRECT scope. Because of the unusual way CF handles unscoped variables in CFC methods (at least in CF 9) I scope all variables in CFCs.
I'd love to hear your rationalisation as to why the variables scope is the correct default scope in a CFC.

Didn't always. Actually, just recently started being more vigorous since it seemed like a best practice. Using "local." everywhere for local scope vars seems overkill, but I think I will adjust and be better for it in the long run.
Using the local scope is just code-clutter IMO.

In a CFM I just don't see the need to scope unless there are clashes going on. Inside a CFC, I always scope the variables.

Honestly, I feel dirty if I don't scope variables in a CFC. In my newest position, I've inherited a coldbox app where scoping is occasional at best. Being the inheritor of a complex non-scoped app certainly makes one appreciate the extra button presses.
Haha. Is it not perhaps a case of poorly-written code making it hard to follow, not necessarily the variable scoping (or lack thereof)?

Since the only CFM files I touch now are usually 'dumb' views - most of the page specific variables are just loop counters and small tmp variables.

as I get the scoping more into my daily work life, I scope more.

I don't use variables scope in cfm Views. If using FW/1 I try and use the local scope, but sometimes I forget :)

We scope everything in .cfm/.cfc, except for variables.X in a .cfm. As using variables.X in a .cfm file in ACF is a slower lookup than just using X (at least in CF10 and below, as they try to find variables.variables.X apparently). No difference on Railo.
I've heard Gert say that scoping variables-scoped variables in CF is actually slower than not scoping. And measurably so. I've yet to test this.

I feel scoping helps your code to be 'self-documenting', granted you don't make an utter mess of things.

I almost never specify the variables scope even in a CFC other than to create the variable in the first place. I make sure the variable name itself conveys that it's not a function-local variable, or my functions are so brief it's obvious. However I am more likely to scope one in a CFC for the sake of removing ambiguity.
This is my own answer.


I think the general gist seems to be "scope 'em in CFCs; don't bother in CFMs".

Q2: Do you have either a formal or personal coding standard you follow regarding scoping of variables-scope variables?




Option      Responses
Formal      32.91% (26)
Personal    70.89% (56)
None        12.66% (10)

It's interesting how this sort of thing is more often covered in a personal guideline, rather than a formal coding standard. It's specifically in our coding standard to scope everything. My personal guideline is "only if one cannot make the code unambiguous any other way".

And to those of you who have no guideline one way or the other... what do you do when you come to type a reference to a variable into your code?!

Comments:

As simple as possible, as complex as necessary. Only scope if needed - when defining cfc instance variables, or in the rare occasions where there's a risk of a collision (can't remember the last time that came up though).

Easy: Don't. It solves a problem you shouldn't have.

Variable naming conventions typically indicate data type. e.g. S_surname far more important is to indicate scope e.g. L_surname (that is 'L' for local)
Oh god. Remind me never to work on any code you've written. That's horrendous. Pseudo-hungarian notation and underscores. Jesus.

In a CFC: always use "variables." for VARIABLES scope, "this." for THIS scope. Never scope local or argument variables. Always use all-cfscript for CFCs. Everywhere, always scope any of the shared scopes.


Can't say I follow these 100%, but I use it as a good reference. http://wiki.coldbox.org/wiki/DevelopmentBestPractices.cfm
Ooh... I must have a look at those. To see how much of it I disagree with ;-)

As I said above I scope almost everything inside cfc files including the variables scope. In cfm files I scope all variables except variables scope. I generally don't scope function calls inside cfc files. I have done that some in the past, but it requires a lot of updating if you change a functions access.

Use the variables scope only when needed (which is almost never) and scope as much as possible.

component {
    property numeric age;
    property boolean isOfFullAge;

    package function init( required date birthday, numeric fullAge = 18 ){
        age = dateDiff( "yyyy", birthday, now() );
        isOfFullAge = age >= fullAge ? true : false;
    }
    
    numeric function getAge(){
        return age;
    }
    
    boolean function getIsOfFullAge(){
        return isOfFullAge;
    }
}
OK. What about variables which aren't properties?

My personal standard changes every time ColdFusion catches up with the times. When we used to have to var at the first line, I used the built in local scope EVERYWHERE. Now that we can var anywhere I less often use the local scope but wouldn't use it at all if I could comma separate my var'ing like JavaScript/Railo and also var a variable to nothing which you can do in CF. var x, myVar=3
In CFML, I dunno why one would want to declare a variable separately from its first usage. CF handles the hoisting of the declarations to the top of the code for you anyhow. There's no need to (want to) do it by hand.

To be honest, inheriting a lot of code that didn't meet many standards at all, it was hard to get excited about putting variables into some container/scope when so much code was written with the default variables (To UPPERCASE OR to Not, That's another question.) Coming from a C background, everything should be automatically scoped by the language to 'local' unless it has an override. I'd let the ARGUMENTS scope be another default search. Can anyone answer this question? If I declare a variable as [cfset VARIABLE.URL = StructNew()] and then reference URL.var - where will it first search for bareword "var", and where will it search for "URL.var"? I realize this is pathological but if there isn't a clear search rule there'll always be problems.

Since I typically will scope everything in a CFC (and almost everything in a cfm), one of the other things I do with the code is capitalize the scope type when writing it. FORM.foo URL.foo etc. I also var every variable at the top ahead of time in a function that's not initially scoped; even though these days, I could var as I go in the code.
Interesting. I capitalise URL because it's an abbrev.; same with CGI. But not the scopes which are just words.

Shifts over time and I don't adhere to it religiously. So maybe I should have checked none.
Well at any given time you have your current practice.

Pretty much try to scope all the time simply because if I stop doing it I will then forget to at some point when it is important, also it should produce generally quicker code as CF/Railo doesn't have to search the scope chain to find it.

In CFCs I'll scope when accessing global CFC variables that are meant to be "private" properties to the CFC. For other things I always var scope within functions and getter/setter properties for "state/instance" variables. I don't variables scope injected stuff. In CFMs I never scope variables scoped variables - that'd be crazy.

No formal standard here (there are only three of us developing here), but the general rule of thumb is "scope everything".

None as far as specifying variables scope. I don't recall ever using . I do specify FORM scope when setting or accessing a FORM variable, and if I am looking for a url variable I specify url scope. I also var everything in a function because it makes me feel good.

Pretty much scope everything - even arguments and local inside of cfcs. Exception would be inside a loop.
Why the exception? I'm not disagreeing, just wondering why.

We have a best practices policy guide and keep adding coding suggestions to keep our code looking consistent no matter who works on it.

I prefer using "var," but our current coding standard uses "local." I feel like "local" clutters up the code, but either is better than no scoping.
Agreed.

It's a personal standard that I am slowly introducing my team to
Good stuff!

At work, the standard is to specify the variables scope, which is not my preferred style. However, given previous poor coding practices/standards and the lack of scoping for all other scopes (FORM, URL, you name it), I can see the advantage of requiring the developers to always use the scope. For stuff I write on my own, I sometimes scope local variables out of habit, but generally prefer not to. For CFCs, always scope.
I would require developers to write clear code, rather than apply rules intended to facilitate that.

Scope everything in cfc's (including arguments) Scope everything in cfm's except variables

At work we have formal definitions for coding standards for ColdFusion, Java, and C# apps since we are a large company with many products and tech stacks. Personally I follow some simple rules. Everything is tabbed appropriately, make liberal use of line breaks, upper case all constants and enums, use camel-casing, scope all CFC variables, etc...

Right now. The standard is scope everything. "local.", "session.". Trying to make global variables properties of a global vars CFC. The only thing I am not explicitly scoping is arguments. I use an "arg" prefix in the variable name anyway, so "arguments.argLastName" seems overly repetatively doubly redundant.

While there's no formal, the other developers on my team follow my lead when it comes to development styles so whatever I do they tend to do.

Maintenance, technical debt, 'pay it forward', whatever you want to call it, speaking for the unknown guy that has to update/fix your code, PLEASE be a professional and DO IT!

When setting a variable using cfset i do not add the scope for variables-scope variables. When I use that variable in e.g. cfif, cfoutput, cfset, BIFs etc I always add the scope #variables.foo#

My company typically tries to scope everything. Makes it easier to understand when there are multiple developers on a project.

My personal standards have been evolving over time. It used to be that all references to existing variables must be scoped, and variable-sets could be implied or explicit at my discretion at time of writing. These days my code is broken into smaller, more discrete, more testable chunks. I used to prefer "local" scoping everything in CFCs (var local = {}; local.foo=1; local.bar=2;) but lately I've found myself just using good old classic var scoping as I find it more terse in the long run. I think my current behavior very much mimics the way that I write my JavaScript, because as time goes on I write more and more JavaScript and CFScript, and less and less CFML Tags. JavaScript has less scoping mechanisms exposed to the developer, so I tend to use unique names and small decoupled functions.
You know that since CF9, this has been redundant, yes (indeed, it's ignored):
var local = {};

The formal coding standard at work says that the VARIABLES scope must always be used in both CFCs and CFMs. In practice, however, it's a standard that's not enforced.

always scope

I use snippets to spit out some basic boilerplate for each of my components / functions. I've got a generic block at the head of each for holding variables. Nothing special, but it ensures that everything I write is formatted the same way every time.
Hmmmm. Sounds like COBOL ;-)

I've been using some coding standards that have evolved over the last few years and have tried to change them as the technology has evolved. Right now the Coldbox Coding standards, http://wiki.coldbox.org/wiki/DevelopmentBestPractices.cfm, is very useful. I pass on the URL to other developers to use regularly.

I am working on more of a personal standard, but nothing is followed at my job, but that is another conversation.

In Cfcs I generally initialise variables.instance in the init method and put all instance variables in there. It's a habit I picked up early in my Cf "career" from a tool I used called "Rooibos generator" which afaik doesn't exist anymore. I think it's overkill, but it stuck.

We have a standard we enforce on the team, but not organizational wide.

Formal and personal because we have set standards, but I also have a preference (so sometimes I deviate, depending on who will see the code)

Make sure everything makes sense: application.athing.thingTwo.appID is a no. application.app.appID is a yes.

I've got a coding standard at work which requires the variables to be scoped. For my own code, I do not bother.
Said Adam.


Q3: Anything else to add?

Would be interested to know whether or not people always scope their arguments vars within functions. Personally I used to but no longer unless there's a clash. Legibility has become more and more important to me.
Well hopefully this lot goes some way to show what others do / think...

The key thing is readability: use the minimum ceremony needed to be clear about the intent of the code. Using "local." and "argument." just adds clutter, as does "variables." where it doesn't matter (CFM files).
Perfect.

Since I use ColdBox, I don't ever need the FORM or URL scopes, but I don't mind rc.x or prc.x since it's so short. Even though scopes like cgi or cookie might be in the lookup order, I always prefix them just because they're not places I'd naturally expect a variable to come from. In fact, most of my code is in CFCs and I try to keep them encapsulated (only relying on variables declared in the method, passed into the method, or DI'd into the CFC) and I avoid reusing the same variable in two scopes so I think that helps reduce the ambiguity in what scope the variable is expected to live in. i.e., someone reading the code shouldn't have to make a wild guess.
I like your observation about the CGI and cookie scopes. That makes good sense.

I mostly started doing all of this scoping to avoid mistakes in large functions and also because of fear of the automatic scope lookup. For a long time I didn't know the order that the scopes were checked, especially in cfc files.
If you have "large functions", you have a mistake right there already ;-)

Never scope! ACF is much slower with scopes, because: VARIABLES.foo; searches for: -> local.variables -> arguments.variables -> variables.variables ... -> local.foo -> arguments.foo -> variables.foo -> found -
"much slower"? Got some figures?

Looping over a query "qsomething": refer to column always with qsomething.column
I dunno about this one either. If one gets far enough out of sight of the initial loop construct that one needs reminding, that might suggest the loop is doing too much.

Scoping is hardly necessary at all when you don't use includes that pass all available variables to a template that doesn't specify what is expected as input. Loose languages like ColdFusion make us think scoping is a requirement. Programming is about input/output and if you don't specify your input and what you do with the output, you're programming procedurally top to bottom with no clear indication of what happens in the middle. CFInclude is the worst tag, you don't specify your input and who knows what's coming out so you need scoping more to "better" indicate what context/scope it is that is being worked with @ runtime.
Worse than <cfclient>? ;-)

After seeing a discussion with Gert on Twitter, I try to no longer scope the closest scope (variables in a cfm, local in cfc)
Yeah, I want to see something to back that up before I make a decision based on that.

This is a good question, and I'm interested in where other folks stand on this. For me, my style has changed over the years due to experience with other languages as well as improvements in CFML itself. Remember that "var local = structNew();" crap? Still trying to get rid of all that.
Yeah. Still weeding that stuff out of our codebase too.

While there may be some marginal performance improvement by scoping everything, the main reason I do it is as a way to reduce coding errors and speed debugging.
I'd like to see examples of how scoping variables mitigates coding errors.

Looking forward to hearing results. Definitely want to hone my style to fit in.

Explicitly scoped variable will perform better. If you RTFM, there are 5ish scopes that have to be searched before 'variables': Local (function-local, UDFs and CFCs only), Arguments, Thread local (inside threads only), Query (not a true scope; variables in query loops), Thread, THEN Variables. Also, not scoping the 'variables' scope in a CFC, to me, is akin to not providing minimal comments where appropriate. "Magic" sux when you have to figure out someone else's tricks!
If - having RTFMed - you actually thought about it... you'd see that in mainline code in a CFM file we have this look-up order:

  1. local (not relevant);
  2. arguments (not relevant);
  3. thread local (not relevant);
  4. query (yep, relevant);
  5. thread (not relevant, and not even correct anyhow);
  6. variables.

So surely explicitly scoping as a matter of course is just pointless clutter, other than on those rare occasions in which there is ambiguity between your variables and your query columns whilst within a query loop. That being the case, the better tactic would be to never scope, other than in those ambiguous situations, so as to make the ambiguity a) stand out; b) not be ambiguous.
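
To make that one ambiguous case concrete, here's a minimal sketch of the query-column clash in a CFM file. The names (people, name) are made up purely for illustration, and I've not benchmarked or cross-checked this on every engine: it just follows the look-up order listed above.

```cfm
<!--- Hypothetical names: "people" and "name" exist only for this illustration --->
<cfset name = "variables-scope value">

<!--- A one-row query with a column that clashes with the variable above --->
<cfset people = queryNew("name", "varchar")>
<cfset queryAddRow(people)>
<cfset querySetCell(people, "name", "query-column value")>

<cfloop query="people">
    <cfoutput>
        <!--- Unscoped: per the order above, the query is checked before the variables scope --->
        unscoped: #name#<br>
        <!--- Explicitly scoped: no ambiguity, this is the variables-scope variable --->
        variables-scoped: #variables.name#<br>
        <!--- Prefixed with the query name: no ambiguity, this is the column --->
        query-prefixed: #people.name#<br>
    </cfoutput>
</cfloop>
```

Going by the documented order, I'd expect the unscoped reference to resolve to the query column on ColdFusion; on Railo I believe it depends on the "Search resultsets" setting one respondent mentioned. Either way, this is the sort of spot where a scope earns its keep, and the rest of the file can stay unscoped.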

I think it's this non-analytical adherence to rules that gets us into the situations of "always scope your variables" in the first place.

But I do not so vigorously disagree with you regarding CFC code. Except for the fact your functions should be small enough to take onboard in one hit anyhow, so non-local variables ought to be obvious.


To me, the use of the VARIABLES scope in CFM files leads to more cluttered code for very little technical gain, making the file both more difficult to read and maintain. It's useful in CFCs, but elsewhere, it's just cruft, in my opinion.

maintaining legacy applications where variables were not scoped can be a nightmare, especially when variables are defined in other files. If verbosity is the problem, I have used things like: so that you can use
Yeah, it's a shame that old code that could be helped by this practice never did it, but newer, clearer code still gets lumbered with it.

would find it great to get a more CF users wide standard on many of these topics, scoping, naming, etc, then add the standards to a github site like you and Ray did for cfXXX tags.
This is an idea I am mulling-over, after this comment.

I also refer to function args by scope after once naming one "url" and subsequently getting a name collision with the url scope. So I always do, for example: function(foo) { writeOutput(arguments.foo); }

The one area I always scope is my query variables, don't know why but it just seems to be the right thing to do...

Nope!
But... erm... ;-)


OK, so having read all that input, my position is now this:

  • In CFCs, always scope variables-scope variables. This is slightly divergent from where I was before, which was "your code should be clear enough not to have to". I think the key here is "should be". It might not be.
  • In CFM code... don't bother. Unless there's ambiguity to resolve which can't be resolved in some other code-reading-friendly way.


Has your opinion been influenced at all?

Anything else to add?

Cheers, everyone, for participating. Even - perhaps especially! - those people I did not agree with. It's the stuff I disagree with that better helps me form my own opinion, most of the time.

--
Adam