If I understand correctly, calling if (exists $ref->{A}->{B}->{$key}) { ... } will cause $ref->{A} and $ref->{A}->{B} to spring into existence even if they did not exist prior to the if!
This seems highly unwanted. So how should I check if a "deep" hash key exists?
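A minimal sketch that reproduces the behaviour described above, assuming an empty hash reference and a made-up key name (some_key is only a placeholder):

```perl
use strict;
use warnings;

my $ref = {};            # starts out completely empty
my $key = 'some_key';    # hypothetical key name, for illustration only

# The test itself is false, because the deep key is not there ...
my $found = exists $ref->{A}->{B}->{$key};

# ... but Perl had to create the intermediate levels to look inside them.
print "found:      ", ( $found               ? 'yes' : 'no' ), "\n";
print "A exists:   ", ( exists $ref->{A}     ? 'yes' : 'no' ), "\n";
print "A.B exists: ", ( exists $ref->{A}{B}  ? 'yes' : 'no' ), "\n";
```

Only the last level ($key) is left alone; every level that had to be dereferenced on the way there is created as an empty hash.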
It's much better to use something like the autovivification module to turn off that feature, or to use Data::Diver. However, this is one of the simple tasks that I'd expect a programmer to know how to do on his own. Even if you don't use this technique here, you should know it for other problems. This is essentially what Data::Diver is doing once you strip away its interface.
This is easy once you get the trick of walking a data structure (if you don't want to use a module that does it for you). In my example, I create a check_hash subroutine that takes a hash reference and an array reference of keys to check. It checks one level at a time. If the key is not there, it returns nothing. If the key is there, it prunes the hash to just that part of the path and tries again with the next key. The trick is that $hash is always the next part of the tree to check. I put the exists in an eval in case the next level isn't a hash reference, since dereferencing it as one would otherwise die. The other thing to get right is not to fail if the hash value at the end of the path is some sort of false value. Here's the important part of the task:
    sub check_hash {
        my( $hash, $keys ) = @_;

        return unless @$keys;

        foreach my $key ( @$keys ) {
            return unless eval { exists $hash->{$key} };
            $hash = $hash->{$key};
        }

        return 1;
    }
Don't be scared by all the code in the next bit. The important part is just the check_hash subroutine. Everything else is testing and demonstration:
    #!perl
    use strict;
    use warnings;
    use 5.010;

    use Data::Dumper;  # just to remember the structure

    sub check_hash {
        my( $hash, $keys ) = @_;

        return unless @$keys;

        foreach my $key ( @$keys ) {
            return unless eval { exists $hash->{$key} };
            $hash = $hash->{$key};
        }

        return 1;
    }

    my %hash = (
        a => {
            b => {
                c => {
                    d => {
                        e => {
                            f => 'foo!',
                        },
                        f => 'foo!',
                    },
                },
                f => 'foo!',
                g => 'goo!',
                h => 0,
            },
            f => [ qw( foo goo moo ) ],
            g => undef,
        },
        f => sub { 'foo!' },
    );

    my @paths = (
        [ qw( a b c d     ) ], # true
        [ qw( a b c d e f ) ], # true
        [ qw( b c d       ) ], # false
        [ qw( f b c       ) ], # false
        [ qw( a f         ) ], # true
        [ qw( a f g       ) ], # false
        [ qw( a g         ) ], # true
        [ qw( a b h       ) ], # true (the value 0 is false, but the key exists)
        [ qw( a           ) ], # true
        [ qw(             ) ], # false
    );

    say Dumper( \%hash );

    foreach my $path ( @paths ) {
        printf "%-12s --> %s\n",
            join( ".", @$path ),
            check_hash( \%hash, $path ) ? 'true' : 'false';
    }
Here's the output (minus the data dump):
    a.b.c.d      --> true
    a.b.c.d.e.f  --> true
    b.c.d        --> false
    f.b.c        --> false
    a.f          --> true
    a.f.g        --> false
    a.g          --> true
    a.b.h        --> true
    a            --> true
                 --> false
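As a quick sanity check that this walk does not autovivify anything, here is a small sketch that runs check_hash on paths that don't exist and then looks at the keys again. The %data hash and the path names are made up for the demonstration:

```perl
use strict;
use warnings;

# Same check_hash as above.
sub check_hash {
    my( $hash, $keys ) = @_;
    return unless @$keys;
    foreach my $key ( @$keys ) {
        return unless eval { exists $hash->{$key} };
        $hash = $hash->{$key};
    }
    return 1;
}

my %data = ( a => { g => undef } );    # hypothetical test data

# Neither path exists: the first fails at the top level, the second
# tries to walk past an undef value.
my $missing  = check_hash( \%data, [ qw( x y z ) ] );
my $too_deep = check_hash( \%data, [ qw( a g deeper ) ] );

# The structure should look exactly as it did before the checks.
my @top    = sort keys %data;
my @a_keys = sort keys %{ $data{a} };
print "top-level keys: @top\n";
print "keys under a:   @a_keys\n";
print "a.g is still undef\n" unless defined $data{a}{g};
```

This works because $hash inside the subroutine is only a copy of each value along the path, so anything that happens to it stays local to the call.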
Now, you might want to have some other check instead of exists. Maybe you want to check that the value at the chosen path is true, or a string, or another hash reference, or whatever. That's just a matter of supplying the right check once you have verified that the path exists. In this example, I pass a subroutine reference that will check the value I left off with. I can check for anything I like:
    #!perl
    use strict;
    use warnings;
    use 5.010;

    use Data::Dumper;  # just to remember the structure

    sub check_hash {
        my( $hash, $sub, $keys ) = @_;

        return unless @$keys;

        foreach my $key ( @$keys ) {
            return unless eval { exists $hash->{$key} };
            $hash = $hash->{$key};
        }

        # The path exists; let the callback decide about the value.
        return $sub->( $hash );
    }

    my %hash = (
        a => {
            b => {
                c => {
                    d => {
                        e => {
                            f => 'foo!',
                        },
                        f => 'foo!',
                    },
                },
                f => 'foo!',
                g => 'goo!',
                h => 0,
            },
            f => [ qw( foo goo moo ) ],
            g => undef,
        },
        f => sub { 'foo!' },
    );

    my %subs = (
        hash_ref  => sub {   ref $_[0] eq ref {}  },
        array_ref => sub {   ref $_[0] eq ref []  },
        true      => sub { ! ref $_[0] &&   $_[0] },
        false     => sub { ! ref $_[0] && ! $_[0] },
        exist     => sub { 1 },
        foo       => sub { $_[0] eq 'foo!' },
        'undef'   => sub { ! defined $_[0] },
    );

    my @paths = (
        [ exist     => qw( a b c d     ) ], # true
        [ hash_ref  => qw( a b c d     ) ], # true
        [ foo       => qw( a b c d     ) ], # false
        [ foo       => qw( a b c d e f ) ], # true
        [ exist     => qw( b c d       ) ], # false
        [ exist     => qw( f b c       ) ], # false
        [ array_ref => qw( a f         ) ], # true
        [ exist     => qw( a f g       ) ], # false
        [ 'undef'   => qw( a g         ) ], # true
        [ exist     => qw( a b h       ) ], # true (the value 0 is false, but the key exists)
        [ hash_ref  => qw( a           ) ], # true
        [ exist     => qw(             ) ], # false
    );

    say Dumper( \%hash );

    foreach my $path ( @paths ) {
        my $sub_name = shift @$path;
        my $sub = $subs{$sub_name};

        printf "%10s --> %-12s --> %s\n",
            $sub_name,
            join( ".", @$path ),
            check_hash( \%hash, $sub, $path ) ? 'true' : 'false';
    }
And its output:
         exist --> a.b.c.d      --> true
      hash_ref --> a.b.c.d      --> true
           foo --> a.b.c.d      --> false
           foo --> a.b.c.d.e.f  --> true
         exist --> b.c.d        --> false
         exist --> f.b.c        --> false
     array_ref --> a.f          --> true
         exist --> a.f.g        --> false
         undef --> a.g          --> true
         exist --> a.b.h        --> true
      hash_ref --> a            --> true
         exist -->              --> false
You could use the autovivification pragma to deactivate the automatic creation of references:
    use strict;
    use warnings;
    no autovivification;

    my %foo;
    print "yes\n" if exists $foo{bar}{baz}{quux};

    print join ', ', keys %foo;
The pragma is also lexical, meaning it only deactivates autovivification inside the scope where you declare it.
Check every level for existence before looking at the level below it.
    if (exists $ref->{A} and exists $ref->{A}{B} and exists $ref->{A}{B}{$key}) {
    }
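A short sketch of why the chained test is safe, assuming an empty hash reference and a placeholder key name (some_key is made up). Because and short-circuits, the deeper lookups never run once an earlier level turns out to be missing:

```perl
use strict;
use warnings;

my $ref = {};            # nothing in it yet
my $key = 'some_key';    # hypothetical key name, for illustration only

# The first exists is false, so the second and third are never evaluated.
my $found = ( exists $ref->{A}
          and exists $ref->{A}{B}
          and exists $ref->{A}{B}{$key} ) ? 1 : 0;

my $count = scalar keys %$ref;
print "found: $found, top-level keys: $count\n";
```

Each exists only ever dereferences levels that the previous test has already confirmed, so nothing new gets created.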
If you find that annoying you could always look on CPAN. For instance, there is Hash::NoVivify.
Take a look at Data::Diver. E.g.:
    use Data::Diver qw(Dive);

    my $ref = { A => { foo => "bar" } };
    my $key = 'baz';    # hypothetical key, only here so the snippet compiles

    my $value1 = Dive($ref, qw(A B), $key);
    my $value2 = Dive($ref, qw(A foo));
Pretty ugly, but if $ref is a complicated expression that you don't want to use in repeated exists tests:
    if ( exists ${ ${ ${ $ref || {} }{A} || {} }{B} || {} }{key} ) {
        ...
    }