What is the difference between the following two Perl variable declarations?
my $foo = 'bar' if 0;
my $baz;
$baz = 'qux' if 0;
The difference is significant when these appear at the top of a loop. For example:
use warnings;
use strict;
foreach my $n (0,1){
    my $foo = 'bar' if 0;
    print defined $foo ? "defined\n" : "undefined\n";
    $foo = 'bar';
    print defined $foo ? "defined\n" : "undefined\n";
}
print "==\n";
foreach my $m (0,1){
    my $baz;
    $baz = 'qux' if 0;
    print defined $baz ? "defined\n" : "undefined\n";
    $baz = 'qux';
    print defined $baz ? "defined\n" : "undefined\n";
}
results in
undefined
defined
defined
defined
==
undefined
defined
undefined
defined
It seems that the if 0 condition fails, so $foo is never reinitialized to undef. In this case, how does it get declared in the first place?
First, note that my $foo = 'bar' if 0; is documented to be undefined behaviour, meaning it's allowed to do anything, including crash. But I'll explain what happens anyway.
my $x has three documented effects:
- It declares a symbol at compile time.
- It creates a new variable when executed.
- It returns the new variable when executed.
In short, it's supposed to be like Java's Scalar x = new Scalar();, except it returns the variable if used in an expression.
But if it actually worked that way, the following would create 100 variables:
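A loop of roughly this shape shows the kind of code in question. It is only a sketch: the loop bounds, the use of rand(), and the $count bookkeeping are assumptions chosen for illustration. Under the documented model, each of the 100 executions of my $x would allocate a fresh scalar.

```perl
use strict;
use warnings;

# Illustrative only: under the documented model, every pass through the
# body would create a brand-new scalar for $x, 100 in total.
my $count = 0;
for my $i (1 .. 100) {
    my $x = rand();          # `my` is executed once per iteration
    $count++ if defined $x;  # count how many times the body ran
}
print "my \$x was executed $count times\n";
```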
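This would mean two or three memory allocations per loop iteration for the my alone! A very expensive prospect. Instead, Perl only creates one variable and clears it at the end of the scope. So in reality, my $x actually does the following:
- It declares a symbol at compile time.
- It creates the variable at compile time[1].
- When executed, it places a directive on the stack that will clear[2] the variable when the scope is exited.
- It returns the variable when executed.
As such, only one variable is ever created[2]. This is much more CPU-efficient than creating one every time the scope is entered.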
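One way to observe the reuse is to record the address of the scalar behind $x on each iteration. This is a sketch, not a guarantee: it relies on refaddr from the core Scalar::Util module and on implementation behaviour that may differ between perl builds, so no particular output is promised. On a typical perl it tends to report a single address for all iterations.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my %seen;
for my $i (1 .. 5) {
    my $x = $i;
    # refaddr(\$x) is the address of the scalar currently bound to $x.
    # The temporary reference is released once the statement finishes,
    # so nothing else holds on to $x when the scope exits.
    $seen{ refaddr(\$x) }++;
    print "iteration $i: \$x lives at ", refaddr(\$x), "\n";
}
printf "%d distinct scalar(s) used over 5 iterations\n", scalar keys %seen;
```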
Now consider what happens if you execute a my conditionally, or never at all. By doing so, you are preventing it from placing the directive to clear the variable on the stack, so the variable never loses its value. Obviously, that's not meant to happen, so that's why my ... if ...; isn't allowed.
Some take advantage of the implementation by writing something like my $state if 0; at the top of a sub, which gives them a variable that keeps its value from one call to the next.
But doing so requires ignoring the documentation forbidding it. The above can be achieved safely using
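One safe form is a named sub that closes over a lexical declared in an enclosing bare block. The sketch below assumes a counter starting at 5; the sub name next_value and the starting value are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# The bare block limits $state's visibility to next_value(), while the
# closure keeps that one $state alive between calls.
{
    my $state = 5;            # runs once, when this block is executed
    sub next_value {          # hypothetical name for this sketch
        return $state++;      # return the current value, then advance
    }
}

print next_value(), "\n" for 1 .. 3;   # prints 5, 6, 7 on separate lines
```

Note that the block must be executed before the first call, otherwise $state has not yet been assigned its initial value.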
or
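A state variable, available since Perl 5.10 through the state feature, is the other safe option. As before, this is a sketch: the sub name next_id and the starting value 5 are hypothetical.

```perl
use strict;
use warnings;
use feature 'state';          # enables the `state` keyword (Perl 5.10+)

sub next_id {                 # hypothetical name for this sketch
    state $state = 5;         # initialised only on the first call
    return $state++;          # return the current value, then advance
}

print next_id(), "\n" for 1 .. 3;      # prints 5, 6, 7 on separate lines
```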
Notes:
[1] "Variable" can mean a couple of things. I'm not sure which definition is accurate here, but it doesn't matter.
[2] If anything but the sub itself holds a reference to the variable (REFCNT>1), or if the variable contains an object, the directive replaces the variable with a new one (on scope exit) instead of clearing the existing one. This allows the following to work as it should:
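A minimal sketch of such code, assuming a loop that saves a reference to its lexical on every pass (the array name, loop bounds, and values are illustrative):

```perl
use strict;
use warnings;

my @refs;
for my $i (1 .. 3) {
    my $x = $i * 10;
    push @refs, \$x;          # @refs now holds a reference to this $x
}

# Each stored reference still sees the value from its own iteration,
# because a referenced $x is replaced rather than cleared on scope exit.
print join(', ', map { $$_ } @refs), "\n";   # prints "10, 20, 30"
```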
See ikegami's better answer, probably above.
In the first example, you never define $foo inside the loop because of the conditional, so when you use it, you're referencing and then assigning a value to an implicitly declared global variable. Then, the second time through the loop that outside variable is already defined.
In the second example, $baz is defined inside the block each time the block is executed. So the second time through the loop it is a new, not yet defined, local variable.