A week or so ago I was fascinated by this Stack Overflow question: "How to changing value in $array2 without referring $array1?". It offers this code:
// baseline.php
$array1 = array(1,20);
$x = &$array1[1];
$array2 = $array1;
$array2[1] = 22;
print_r($array1[1]); // Output is 22
And this result:
22
In PHP, the = operator makes a copy of the variable being assigned, so one (and when I say "one", I mean "the person asking the question, and myself as well") might be surprised to see the answer is "22", rather than the expected "20". Surely $array1[1] is discrete from $array2[1], and $array1[1] should not be impacted by the change to $array2[1]. $x is not a macguffin in this: remove that line, and things behave "as expected". How is making $x - a reference to $array1[1] - somehow intertwined with $array2?
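To make the role of $x concrete, here is a minimal side-by-side sketch. It is not from the original question: the variable names ($a1, $b1, $c1 and so on) are mine, and the third variant (dropping the reference with unset() before the write) is an extra case I've added to show that the effect depends on the reference still being alive.

```php
<?php
// Variant A: no reference is ever taken, so the copy is independent.
$a1 = array(1, 20);
$a2 = $a1;
$a2[1] = 22;
echo $a1[1], PHP_EOL; // 20

// Variant B: a reference to the element exists when the copy is written to.
$b1 = array(1, 20);
$x = &$b1[1];
$b2 = $b1;
$b2[1] = 22;
echo $b1[1], PHP_EOL; // 22

// Variant C: the reference is dropped before the write, so the element
// goes back to behaving like a plain value.
$c1 = array(1, 20);
$y = &$c1[1];
unset($y);
$c2 = $c1;
$c2[1] = 22;
echo $c1[1], PHP_EOL; // 20
```

Variant B is the code from the question; A and C differ from it only in whether a live reference to the element exists.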
That initial question was followed up by another one: "Assign by reference bug", and that has a good answer which explains theoretically what's going on. The key part is this:
This is explained over at the PHP manual (even if you have to spend more time than you should have to in order to find it), specifically over at http://php.net/manual/en/language.types.array.php#104064
And the relevant extract from the docs says:
The "shared" data stays shared, with the initial assignment just acting as an alias. It's not until you start manipulating the arrays with independent operations like ...[] = ... that the interpreter starts to treat them as divergent lists, and even then the shared data stays shared so you can have two arrays with a shared first n elements but divergent subsequent data.
please note that when arrays are copied, the "reference status" of their members is preserved (http://www.php.net/manual/en/language.references.whatdo.php).
And on that page:
In other words, the reference behavior of arrays is defined in an element-by-element basis; the reference behavior of individual elements is dissociated from the reference status of the array container.
OK, cool, I believe it. However I wanted to see it in action.
My first hurdle here was getting a lucid/helpful answer to "PHP determine if a variable is a reference". Most answers I spotted initially were "no, can't be done. Why even would you want to?" (not in a "let's see if there's another approach" sense, but in an "I'm getting defensive about PHP" sense).
However that's not strictly true, as it turns out. I switched my googling to "PHP reference count", and the first link pointed me in the direction of Xdebug, which enables one to inspect reference counts and stuff like that.
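As an aside, for array elements specifically there is also a way to ask the question without Xdebug. This is a hedged sketch rather than anything from the original investigation: it assumes PHP 7.4 or later, where the built-in ReflectionReference class is available, and the helper variable names ($firstIsRef, $idInOriginal and so on) are mine.

```php
<?php
// Assumes PHP 7.4+, which provides ReflectionReference.
$numbers = array("tahi", "rua", "toru");
$refToSecondElement = &$numbers[1];

// fromArrayElement() returns null when the element is not a reference.
$firstIsRef  = ReflectionReference::fromArrayElement($numbers, 0) !== null;
$secondIsRef = ReflectionReference::fromArrayElement($numbers, 1) !== null;
var_dump($firstIsRef);  // bool(false)
var_dump($secondIsRef); // bool(true)

// Copy the array and write to an unrelated element to force a real copy.
$copyOfNumbers = $numbers;
$copyOfNumbers[0] = "one";

// getId() identifies the underlying reference, so equal IDs mean the two
// array elements point at the same reference.
$idInOriginal = ReflectionReference::fromArrayElement($numbers, 1)->getId();
$idInCopy     = ReflectionReference::fromArrayElement($copyOfNumbers, 1)->getId();
var_dump($idInOriginal === $idInCopy); // bool(true)
```

This only reports whether an element is a reference and whether two elements share one; it does not expose refcounts, which is where Xdebug comes in.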
Here's a reworked version of the code above, with some debug:
// baselineWithDebug.php
echo "<hr><h3><code>numbers</code> created</h3>";
$numbers = array("tahi", "rua", "toru");
xdebug_debug_zval('numbers');
echo "<hr><h3><code>refToSecondElement</code> created</h3>";
$refToSecondElement = &$numbers[1];
xdebug_debug_zval('numbers');
xdebug_debug_zval('refToSecondElement');
echo "<hr><h3><code>copyOfNumbers</code> created</h3>";
$copyOfNumbers = $numbers;
xdebug_debug_zval('numbers');
xdebug_debug_zval('refToSecondElement');
xdebug_debug_zval('copyOfNumbers');
echo "<hr><h3><code>copyOfNumbers[1]</code> changed</h3>";
$copyOfNumbers[1] = "two";
xdebug_debug_zval('numbers');
xdebug_debug_zval('refToSecondElement');
xdebug_debug_zval('copyOfNumbers');
echo "<hr><h3><code>copyOfNumbers[2]</code> changed</h3>";
$copyOfNumbers[2] = "three";
xdebug_debug_zval('numbers');
xdebug_debug_zval('refToSecondElement');
xdebug_debug_zval('copyOfNumbers');
The key elements here are:
- there's an initial array, $numbers;
- I make a reference to one element of it as $refToSecondElement;
- I copy $numbers as $copyOfNumbers;
- I change the value of the second element of $copyOfNumbers;
- and also the third element of same.
- Along the way I output some debug regarding each variable.
numbers created
numbers:
(refcount=1, is_ref=0),
array (size=3)
  0 => (refcount=1, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=1, is_ref=0),string 'rua' (length=3)
  2 => (refcount=1, is_ref=0),string 'toru' (length=4)

refToSecondElement created
numbers:
(refcount=1, is_ref=0),
array (size=3)
  0 => (refcount=1, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=2, is_ref=1),string 'rua' (length=3)
  2 => (refcount=1, is_ref=0),string 'toru' (length=4)
refToSecondElement:
(refcount=2, is_ref=1),string 'rua' (length=3)

copyOfNumbers created
numbers:
(refcount=2, is_ref=0),
array (size=3)
  0 => (refcount=1, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=2, is_ref=1),string 'rua' (length=3)
  2 => (refcount=1, is_ref=0),string 'toru' (length=4)
refToSecondElement:
(refcount=2, is_ref=1),string 'rua' (length=3)
copyOfNumbers:
(refcount=2, is_ref=0),
array (size=3)
  0 => (refcount=1, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=2, is_ref=1),string 'rua' (length=3)
  2 => (refcount=1, is_ref=0),string 'toru' (length=4)

copyOfNumbers[1] changed
numbers:
(refcount=1, is_ref=0),
array (size=3)
  0 => (refcount=2, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=3, is_ref=1),string 'two' (length=3)
  2 => (refcount=2, is_ref=0),string 'toru' (length=4)
refToSecondElement:
(refcount=3, is_ref=1),string 'two' (length=3)
copyOfNumbers:
(refcount=1, is_ref=0),
array (size=3)
  0 => (refcount=2, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=3, is_ref=1),string 'two' (length=3)
  2 => (refcount=2, is_ref=0),string 'toru' (length=4)

copyOfNumbers[2] changed
numbers:
(refcount=1, is_ref=0),
array (size=3)
  0 => (refcount=2, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=3, is_ref=1),string 'two' (length=3)
  2 => (refcount=1, is_ref=0),string 'toru' (length=4)
refToSecondElement:
(refcount=3, is_ref=1),string 'two' (length=3)
copyOfNumbers:
(refcount=1, is_ref=0),
array (size=3)
  0 => (refcount=2, is_ref=0),string 'tahi' (length=4)
  1 => (refcount=3, is_ref=1),string 'two' (length=3)
  2 => (refcount=1, is_ref=0),string 'three' (length=5)
Observations:
- Each variable and array element has a refcount and an is_ref.
- When $numbers is first created, its refcount is 1 (itself), and its is_ref is false. Fair enough.
- When $refToSecondElement is made, the refcount and is_ref of $numbers don't change, but note that its second element now has a refcount of two (itself and $refToSecondElement), and it now states is_ref=1. So not only is $refToSecondElement a reference, so is $numbers[1].
- When $copyOfNumbers is made, initially it's just a reference. Note the refcount on both reflects this, but the is_ref does not. I guess this is because how PHP handles value copying internally is distinct from actively creating references. Not sure.
- Also note that the refcount on $refToSecondElement (and $numbers[1] and $copyOfNumbers[1]) still reflects two references. As $copyOfNumbers is itself a reference at this point, $numbers[1] and $copyOfNumbers[1] are exactly the same thing. Not a new reference.
- Now $copyOfNumbers[1] changes, so PHP has to actually make $copyOfNumbers a copy now, not just a reference back to $numbers. This is an example of copy-on-write, and is a performance optimisation. If all one is doing is reading a copied value, then leaving it as a reference is fine. It's not until the copied data changes that it needs to take on a life of its own. And this is borne out by the fact that $numbers and $copyOfNumbers now have a refcount of one apiece.
- On the other hand, the refcount on $numbers[1], $refToSecondElement and $copyOfNumbers[1] is now three, because $copyOfNumbers[1] is now a separate reference from $numbers[1].
- Also note that - as one would expect - the value change to $copyOfNumbers[1] is reflected in $refToSecondElement too. And indeed back to the reference in $numbers[1] too. This latter bit is the thing we didn't initially expect, but it kinda makes sense now.
- Lastly we change $copyOfNumbers[2], and we see the refcount for it and $numbers[2] decrement, because again we are seeing copy-on-write: PHP has ceased using a reference to $numbers[2] for $copyOfNumbers[2], and it's now its own value.
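For anyone without Xdebug to hand, the visible end result of those observations can be checked with plain PHP. This is only a sketch of the values involved (it says nothing about refcounts), reusing the variable names from the debug script above:

```php
<?php
$numbers = array("tahi", "rua", "toru");
$refToSecondElement = &$numbers[1];
$copyOfNumbers = $numbers;

$copyOfNumbers[1] = "two";   // element 1 is a shared reference: the write goes through it
$copyOfNumbers[2] = "three"; // element 2 is an ordinary value: only the copy changes

var_dump($numbers[1]);         // string(3) "two"
var_dump($refToSecondElement); // string(3) "two"
var_dump($numbers[2]);         // string(4) "toru"
var_dump($copyOfNumbers[2]);   // string(5) "three"
```

So the two arrays end up sharing the referenced element while diverging on the others, which matches the "shared data stays shared" wording in the manual note quoted earlier.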
Xdebug came in handy here, demonstrating what's going on and letting me see how the values / references change as values are changed.
I feel just that slight bit less ignorant about how PHP works now. Nice one.
--
Adam