I've got part 2 of the array discussion underway, but I'm just going to digress slightly to write some notes up about how CFML deals with assigning and passing-around complex objects.
Again there's probably nothing new here for most bods out there, but hopefully it'll be useful to some people.
OK, so first up, there are a number of data types in CFML:
- string (including lists)
- date
- boolean
- numeric (including both integer and floating point)
- array
- struct
- query
- XML
- objects (specifically native CFML ones, ie: component instances)
- [others]
The first four are all "simple values", and the rest are "complex values". Basically a simple value is one that has only one part to its value; the complex ones are multi-part. For example an array has a number of elements, a struct has a bunch of key/value pairs, etc.
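As a quick sanity check of that simple/complex split, CFML's built-in isSimpleValue() will report which side of the line a given value falls on. This is just a sketch: the samples struct and its key names are mine, made up for illustration.

```cfml
<cfscript>
// "samples" and its keys are hypothetical, purely for illustration
samples = {
    aString  = "Tahi",
    aDate    = now(),
    aBoolean = true,
    aNumber  = 42,
    anArray  = ["Tahi", "Rua"],
    aStruct  = {one = "Tahi"}
};

// isSimpleValue() is true for strings, dates, booleans and numbers; false for the complex types
for (key in samples) {
    writeOutput(key & ": " & isSimpleValue(samples[key]) & "<br>");
}
</cfscript>
```

How the boolean is rendered varies a bit between engines, so don't read anything into the exact text of the output.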
I'm not going to say much about simple values in this, other than that they're all - clearly - copied by value. Given this code:
s1 = "Tahi, TWO, toru, wha";
s2 = s1;
s1 = replace(s1, "TWO", "rua");
writeDump(variables);
We predictably get this output:
struct | |
---|---|
S1 | Tahi, rua, toru, wha |
S2 | Tahi, TWO, toru, wha |
IE: after assigning s2 the value of s1, s1 and s2 are completely different values. So changing s1 does not impact s2 (even though in this case it'd've been convenient if it had).
Things get a bit more complicated with complex objects. To demonstrate, let's run some code that's much the same as the code above, except using a struct:
st1 = {
one = "Tahi",
two = "TWO",
three = "Toru",
four = "Wha"
};
st2 = st1;
st1.two = "Rua";
writeDump(variables);
This outputs:
struct | |
---|---|
ST1 | struct: FOUR = Wha, ONE = Tahi, THREE = Toru, TWO = Rua |
ST2 | struct: FOUR = Wha, ONE = Tahi, THREE = Toru, TWO = Rua |
So that's different. Changing the value of a key in st1 also changes it in st2. Some clever people will nod knowingly and go "yeah, it's because structs are copied by reference in CFML". Sorry, but no they're not.
Structs are still copied by value, but unlike simple objects wherein the "value" is the contents of the string (or the date, etc), for structs (and most other complex objects), the value being copied is a reference.
So how's that different from copying by reference? It sounds the same.
To be honest, I understand what's going on, but I find it difficult to articulate. And I'm not 100% sure of the absolute vagaries, but I reckon it's something like this:
Copy by value:
Variable | Reference ID | Memory address |
---|---|---|
st1 | @12345678 | 0x00001234 |
st2 | @9ABCDEF0 | 0x00001234 |
Copy by reference:
Variable | Reference ID | Memory address |
---|---|---|
st1 | @12345678 | 0x00001234 |
st2 | @12345678 | 0x00001234 |
(I've made the reference ID and memory addresses look all hexadecimal and stuff for illustrative purposes).
You can see in both examples they all point to the same location of memory, but in the copy-by-value example the references are actually different. st2 is a new reference, it's just pointing to the same memory location as st1. Whereas in the latter example, st1 and st2 both point to exactly the same reference (and, accordingly, the same location in memory).
For almost all situations you'll encounter in CF the end results are the same, but I've got an example later on that demonstrates the difference.
Now... having said that, for the rest of this article I am going to use the term "copy by reference" for the sake of brevity. I actually mean "copy by reference value".
Now the gist of this article is to demonstrate how the various CFML complex object types behave when being assigned or passed.
I'm going to start with the odd-one-out. Arrays.
Arrays
Arrays are complex data types, but for historical reasons (and - IMO - a bad decision on the part of Macromedia's CF team) arrays are actually passed by value in ColdFusion.
Here's some illustrative code:
a1 = ["Tahi", "TWO", "Toru", "Wha"];
a2 = a1;
a1[2] = "Rua"; // oops, got one of the values wrong: fix it
writeDump(variables);
And the output:
struct | |
---|---|
A1 | array: [1] Tahi, [2] Rua, [3] Toru, [4] Wha |
A2 | array: [1] Tahi, [2] TWO, [3] Toru, [4] Wha |
If arrays were copied by reference, then one would expect a2 to have been "fixed" too. But because assigning a1's value to a2 is a value-copy, the values of a1 and a2 are thereafter completely distinct from each other, so the "fix" is not propagated to a2.
What's completely weird is that ColdFusion's array functions all seem to actually pass the input array by reference, as demonstrated here:
a1 = ["Tahi", "Rua", "Toru"];
arrayAppend(a1, "Wha");
writeDump(variables);
Resulting in:
struct | |
---|---|
A1 | array: [1] Tahi, [2] Rua, [3] Toru, [4] Wha |
So arrayAppend() doesn't hand back a new array (all it returns is a boolean): the passed-in array is simply modified "inline", implying it's passed by reference. Seriously? I do not know what Allaire / Macromedia were thinking by implementing arrays the way they did. That's partly pithy derision, partly an admission I actually don't know... there might've been a good reason for it. I doubt there actually was though.
Notice I italicised "ColdFusion" in the first para of this section, when describing that arrays are passed by value. If I was to run that first example on Railo or Lucee, I'd get slightly different results.
WARNING:
Those that have any sort of sense of the aesthetic might want to look away now, because I'm about to show you a <cfdump> from Railo (and they look frickin' ghastly).
Here goes...
Scope | |
---|---|
A1 | array: [1] Tahi, [2] Rua, [3] Toru, [4] Wha |
A2 | array: [1] Tahi, [2] Rua, [3] Toru, [4] Wha |
Once your eyes recover from that visual assault, you'll be able to note how a2 has also been updated with the "fix" to the element at index 2. This is because Railo decided that ColdFusion's behaviour here was daft, so they "fixed" it in Railo (discussion here).
On OpenBD, the code runs the same as on ColdFusion.
For the rest of this doc, one can assume the results are the same on all three of ColdFusion, Railo/Lucee and OpenBD, other than where I indicate otherwise.
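The examples so far only assign arrays; the same split ought to show up when an array is passed into a function. Here's a minimal sketch of that. addWha() is a made-up helper of mine, not anything built in.

```cfml
<cfscript>
// addWha() is a hypothetical helper, purely for illustration
function addWha(required array a) {
    arrayAppend(arguments.a, "Wha"); // modifies whatever array the function actually received
    return arguments.a;
}

a1 = ["Tahi", "Rua", "Toru"];
a2 = addWha(a1);

writeOutput("a1 has #arrayLen(a1)# elements; a2 has #arrayLen(a2)# elements");
</cfscript>
```

Going by the behaviour described above, I'd expect ColdFusion to leave a1 with three elements (the function only got a copy), whereas Railo/Lucee should show four in both. I'd treat that as an expectation to verify on your own engine rather than gospel.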
If one really wanted to pass arrays by reference value in CFML, there is a way to kind of shoe-horn it in. Ben Nadel wrote an article a while back which covered how an ArrayList is passed by reference, so one could just use an ArrayList instead. I've slightly fine-tuned his approach here: one can use a CF array until the last moment, then turn it into an ArrayList when the pass-by-reference-value is needed:
a1 = ["Tahi", "TWO", "Toru", "Wha"];
a1 = createObject("java", "java.util.ArrayList").init(a1);
a2 = a1;
a1[2] = "Rua";
writeDump(variables);
And this outputs:
struct | |
---|---|
A1 | array: [1] Tahi, [2] Rua, [3] Toru, [4] Wha |
A2 | array: [1] Tahi, [2] Rua, [3] Toru, [4] Wha |
Which demonstrates that the "array" is indeed being assigned by its reference value, instead of its actual value. I am not sure of the merits of doing this, but it's there to do should the need arise.
Structs
Structs do the whole "copying by reference value" thing. Here's some sample code:
st1 = {
one = "Tahi",
two = "TWO",
three = "Toru",
four = "Wha"
};
st2 = st1;
st1.two = "Rua";
writeDump(variables);
And the results:
struct | |
---|---|
ST1 | struct: FOUR = Wha, ONE = Tahi, THREE = Toru, TWO = Rua |
ST2 | struct: FOUR = Wha, ONE = Tahi, THREE = Toru, TWO = Rua |
This demonstrates that the refs for both st1 and st2 are pointing to the same struct in memory.
I'm going to sidetrack slightly into some code that demonstrates exactly the same thing, but it threw me when I first encountered it, despite it being obvious what's going on.
st1 = {
inner1 = {
one = "Tahi"
}
};
st1.inner2 = st1.inner1; // so those two references are pointing to the same struct in memory
st2 = st1;
st1.inner1.two = "Rua";
writeDump(variables); // note that TWO has been set into both inner1 and inner2
st3 = duplicate(st1); // make a proper value-based copy; so completely different references pointing to different bits of memory
st3.inner1.three = "Toru";
writeDump(variables); // inner2 also had a copy of "THREE"
The first dump shows:
struct | |
---|---|
ST1 | struct: INNER1 = {ONE = Tahi, TWO = Rua}, INNER2 = {ONE = Tahi, TWO = Rua} |
ST2 | struct: INNER1 = {ONE = Tahi, TWO = Rua}, INNER2 = {ONE = Tahi, TWO = Rua} |
That's all predictable. After the duplicate(), though, I expected all the substructs to be discrete entities. So adding a THREE to st3.inner1 oughtn't also add it to st3.inner2. But it does:
struct | |
---|---|
ST1 | struct: INNER1 = {ONE = Tahi, TWO = Rua}, INNER2 = {ONE = Tahi, TWO = Rua} |
ST2 | struct: INNER1 = {ONE = Tahi, TWO = Rua}, INNER2 = {ONE = Tahi, TWO = Rua} |
ST3 | struct: INNER1 = {ONE = Tahi, THREE = Toru, TWO = Rua}, INNER2 = {ONE = Tahi, THREE = Toru, TWO = Rua} |
After a while of doubting my sanity, it occurred to me that the duplicate() had made everything discrete from a memory point of view, but the two references st3.inner1 and st3.inner2 - whilst being different from their counterparts in st1 & st2 - still pointed to the same piece of memory (a different piece of memory from the other two structs, but the same as each other). So I squinted and went "oh yeah... makes sense I guess". And moved on.
Note: there appears to be a bug in Railo here, in that the output of the st3 is:
[Railo dump of st3: a Struct with the keys inner1 and INNER2]
That's not right (OpenBD is also wrong here, in the same way).
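If one actually wants inner1 and inner2 to be discrete in the copy, I reckon the simplest thing is to break the shared reference by hand after the duplicate(). This is a sketch reusing the variable names from above; it's one way of doing it, not the only one.

```cfml
<cfscript>
st1 = {
    inner1 = {
        one = "Tahi"
    }
};
st1.inner2 = st1.inner1; // both keys reference the same struct

st3 = duplicate(st1); // st3 is discrete from st1, but its two inner keys may still share one struct
st3.inner2 = duplicate(st3.inner1); // give inner2 its own copy, breaking that shared reference

st3.inner1.three = "Toru";
writeDump(st3); // THREE should now only be in inner1
</cfscript>
```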
Here's an example showing that it's not just assignments and passings that point to the same bit of memory. The rather useful structFindKey() returns an array of structs which all reference the original struct, so manipulating the result from the function call also manipulates the source struct:
st1 = {
one = "Tahi",
two = "TWO",
three = "Toru",
four = "Wha"
};
a = structFindKey(st1, "two");
a[1].owner.two = "Rua";
writeDump(variables);
struct | |
---|---|
A | array: [1] struct with keys OWNER, PATH and VALUE; OWNER is st1 itself, so OWNER.TWO = Rua |
ST1 | struct: FOUR = Wha, ONE = Tahi, THREE = Toru, TWO = Rua |
Note how I change a[1].owner.two, but that change is reflected in st1 as well. That's quite handy.
Less handy... the pretty much useless structCopy() just messes stuff up. Have a look at this code & output:
st1 = {
one = "Tahi",
two = "TWO",
inner = {
three = "THREE",
four = "Wha"
}
};
st2 = structCopy(st1);
st1.two = "Rua";
st1.inner.three = "Toru";
writeDump(variables);
struct | |
---|---|
ST1 | struct: INNER = {FOUR = Wha, THREE = Toru}, ONE = Tahi, TWO = Rua |
ST2 | struct: INNER = {FOUR = Wha, THREE = Toru}, ONE = Tahi, TWO = TWO |
structCopy() does some sort of weird / hybrid / neither-use-nor-ornament kind of copy (ie: a shallow copy) wherein the top level items are discrete, but the rest of the thing is not. So in st2 the code "fixes" the THREE/Toru, but not the TWO/Rua. I have no idea why one would want to copy a struct like this.
Lastly (as far as structs go), here's a demo of structs being passed rather than just being assigned:
st1 = {
one = "Tahi",
two = "TWO",
three = "Toru",
four = "Wha"
};
st2 = f(st1);
function f(st){
st.two = "Rua";
return st;
};
writeDump(variables);
I'll dispense with the output this time, because you get the idea, I think.
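If the calling code doesn't want f() messing with its struct, one option is to hand the function a duplicate() instead of the original. A sketch along the same lines as the code above:

```cfml
<cfscript>
function f(st) {
    st.two = "Rua"; // changes whichever struct the passed-in reference points to
    return st;
}

st1 = {
    one = "Tahi",
    two = "TWO",
    three = "Toru",
    four = "Wha"
};

st2 = f(duplicate(st1)); // f() receives a reference to a discrete copy, not to st1's struct

writeDump(st1); // TWO is still "TWO"
writeDump(st2); // TWO is "Rua"
</cfscript>
```

The cost is a full deep copy on every call, so it's probably only worth doing when the function is known to modify its argument.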
Queries
Queries behave the same way as structs do, so I'll be quick with this one:
q1 = queryNew("digit,maori", "Integer,Varchar", [
[1, "tahi"],
[2, "TWO"],
[3, "Toru"],
[4, "Wha"]
]);
q2 = q1;
q1.maori[2] = "Rua";
writeDump(variables);
struct | |
---|---|
Q1 | query (DIGIT, MAORI): (1, tahi), (2, Rua), (3, Toru), (4, Wha) |
Q2 | query (DIGIT, MAORI): (1, tahi), (2, Rua), (3, Toru), (4, Wha) |
One of the new syntactical features of ColdFusion 10 is that one can now load data into a query straight in the queryNew() expression (as per above), as well as in queryAddRow(). That's a cool addition to the language. An example of using queryAddRow() like this is below:
q1 = queryNew("digit,maori", "Integer,Varchar", [
[1, "tahi"],
[2, "Rua"],
[3, "Toru"],
[4, "Wha"]
]);
st = {
digit = 5,
maori = "FIVE"
};
a = [st];
queryAddRow(q1, a);
st.maori = "Rima";
writeDump(variables);
Note what I'm testing here: I'm doing the old "update the struct" trick, to see whether it propagates into the query as well, wondering if the query data might somehow still use the struct's reference. No. And now that I think about it, it was a pretty daft thing to try. Oh well. Here's the result anyways ('cos, like, we're not sick of dumps saying "tahi, rua, toru, wha" yet, eh?):
struct | |
---|---|
A | array: [1] struct: DIGIT = 5, MAORI = Rima |
Q1 | query (DIGIT, MAORI): (1, tahi), (2, Rua), (3, Toru), (4, Wha), (5, FIVE) |
ST | struct: DIGIT = 5, MAORI = Rima |
There's nothing more to say about queries here. Pretty dull.
Oh... Railo supports this new syntax (and works the same as CF does), but it seems OpenBD does not: it just errors with that code.
XML
XML also works via reference, as demonstrated here:
<cfxml variable="x1">
<numbers>
<one>Tahi</one>
<two>TWO</two>
<three>Toru</three>
<four>Wha</four>
</numbers>
</cfxml>
<cfset x2 = x1>
<cfset x1.numbers.two.xmlText = "Rua">
<cfdump var="#variables#">
struct | |
---|---|
X1 | xml document: numbers: one = Tahi, two = Rua, three = Toru, four = Wha |
X2 | xml document: numbers: one = Tahi, two = Rua, three = Toru, four = Wha |
Handily - and similar to what we saw with structFindKey() before - xmlSearch() also returns an array of XML nodes which actually reference the original XML doc, so we can do this:
<cfxml variable="x1">
<numbers>
<one>Tahi</one>
<two>TWO</two>
<three>Toru</three>
<four>Wha</four>
</numbers>
</cfxml>
<cfset a = xmlSearch(x1, "/numbers/two/")>
<cfset a[1].xmlText = "Rua">
<cfdump var="#variables#">
<cfoutput>#a[1].getClass().getName()#</cfoutput>
struct | |
---|---|
A | array: [1] xml element: two = Rua |
X1 | xml document: numbers: one = Tahi, two = Rua, three = Toru, four = Wha |
Note that last line just outputs the underlying datatype of the objects returned by xmlSearch().
Here's another trap for young players (I'm not young now, but I was when this tricked me... OK, well I wasn't even young then. So it's a trap for... err... daft people, I guess).
Have a look at this code, which is much the same as above:
<cfxml variable="x1">
<numbers>
<one>Tahi</one>
<TWO>TWO</TWO>
<three>Toru</three>
<four>Wha</four>
</numbers>
</cfxml>
<cfset a = xmlSearch(lcase(x1), "/numbers/two/")>
<cfset a[1].xmlText = "Rua">
<cfdump var="#variables#">
The difference here is that I'm lower-casing the XML, because I want to use a case-insensitive XPath look-up. One can do this with XPath 1.0, but it's a bit of a hack, so often when I want to just find stuff, and don't need to use the values, I just lcase() the XML. I seem to recall that CF10 uses XPath 2.0 now, so I could just use a case-insensitive look-up without messing about. I still have to look at XPath 2.0, and will probably write something about that at a later date.
However this was outputting:
struct | |
---|---|
A | array: [1] xml element: two = Rua |
X1 | xml document: numbers: one = Tahi, TWO = TWO, three = Toru, four = Wha |
Note how it's changed the xmlText in the search result, but hasn't updated the XML doc. It took me ages to twig why I was being a muppet here.
When I do the lcase() call, I'm no longer using the original XML document: as lcase() is a string function, it passes copies of the value around, so the XML I'm doing the search on is not x1. It's a copy of it (well: it's a copy of it that's been cast to a string, then copied again, then turned back into an XML doc).
OK. Getting close to the end now. Cheers for sticking with me thus far.
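Going back to that lcase() trap for a moment: the XPath 1.0 hack I alluded to is to lower-case the node names inside the XPath expression itself, using translate(), so the search still runs against the original document. This is a sketch of that approach; the xpath variable name is mine, and I've not tested it on every engine.

```cfml
<cfxml variable="x1">
<numbers>
<one>Tahi</one>
<TWO>TWO</TWO>
<three>Toru</three>
<four>Wha</four>
</numbers>
</cfxml>
<!--- translate() and local-name() are standard XPath 1.0 functions; "xpath" is just an illustrative variable name --->
<cfset xpath = "/numbers/*[translate(local-name(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = 'two']">
<cfset a = xmlSearch(x1, xpath)>
<cfset a[1].xmlText = "Rua">
<cfdump var="#x1#">
```

Because x1 itself is what gets searched, the nodes in the result should reference the original document, so the change to a[1] ought to show up in x1 too.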
Objects
Here's some code to demonstrate objects are passed by object reference value too:
// C.cfc
component {
structAppend(
THIS,
{
one = "Tahi",
two = "TWO",
three = "Toru",
four = "Wha"
}
);
}
// component.cfm
o1 = new C();
o2 = o1;
o1.two = "Rua";
writeDump(o1);
writeDump(o2); // doing them separately to prevent CF "helpfully" suppressing o2's dump because it's the same object as o1
Running component.cfm yields this:
component shared.CF.data_types.assignment.C | |
---|---|
ONE | Tahi |
THREE | Toru |
FOUR | Wha |
TWO | Rua |
component shared.CF.data_types.assignment.C | |
---|---|
ONE | Tahi |
THREE | Toru |
FOUR | Wha |
TWO | Rua |
Perfect.
(NB: I did not test this on OpenBD because it does not support CFScript-only CFCs, so the code didn't work. I could not be bothered refactoring it to demonstrate what I strongly suspect to be the case: OpenBD performs the same here..?)
One last thing
Here's a demonstration of how CFML (any flavour) doesn't actually pass things by reference. Consider this code:
st1 = {
one = "Tahi",
two = "Rua",
three = "Toru",
four = "Wha"
};
st2 = f(st1);
function f(st){
st = {
one = "Ichi",
two = "Ni",
three = "San",
four = "Shi"
};
return st;
};
writeDump(st1);
writeDump(st2);
This yields:
struct | |
---|---|
FOUR | Wha |
ONE | Tahi |
THREE | Toru |
TWO | Rua |
struct | |
---|---|
FOUR | Shi |
ONE | Ichi |
THREE | San |
TWO | Ni |
If st1 was actually being passed by reference, then st (arguments.st, inside the function) would be exactly the same reference. So if we then assigned it a different value, that would be assigning st1 the same value as well. Which is not the case. The reference passed into f() is a new reference which happens to point at the same memory location as st1. However when we reassign st, we're pointing it to a new memory address. But st1 still points at the old one.
Hence in CFML, complex objects are not passed by reference.
Make sense?
This ended up being way longer than I expected, but there's a lot of code and dumpery going on amongst the narrative. I hope it was a bit useful / interesting for some people. I will admit it actually formalised a few things in my head as I typed this stuff in, so it was good for me if nothing else.
Dinner time.
--
Adam