Result: Many lines of HASH(0x1948958) ARRAY(0x1978250) ./directory/filename
Desired result: [Key of first hash] [Key of second hash] ./directory/filename #(elements of array, currently working)
Catch: Should carry across to N level structures, hence my attempt at using Data::Walk.
What I really want to do as I walk the structure is to reference the key that is being used. Kind of like Data::Dumper but tab-separated instead of in code format. I think the likely solutions (in order of preference) are:
- Some call to Data::Walk that I've overlooked.
- A better module for this task that I don't know about.
- A quick code snippet that I can inline.
- My own module / fork of Data::Walk / Data::Dumper (big frown) that will add this functionality.
use strict;
use warnings;
use File::Basename;
use Digest::SHA;
use Data::Walk;
my $files;
while (<>) {
    chomp;
    # ls -l output in a file; referencing the filename from it (8th column)
    my @line = split(/ /, $_, 8);
    # fileparse is exported by File::Basename
    my ($name, $path) = fileparse($line[7]);
    open(my $fh, '<', $path . $name) or die "Cannot open $path$name: $!";
    my $sha = Digest::SHA->new('sha1');
    $sha->addfile($fh);
    # finding files by basename, then unique hash, then however many places it is stored
    # (don't ask why I don't use the hash as the first field)
    # basename digest path
    push(@{$files->{$name}->{$sha->hexdigest}}, $path . $name);
}
my @val;
sub walkit {
    # remember the node seen at the current depth
    $val[$Data::Walk::depth - 1] = $_;
    if ($Data::Walk::depth == 3) {
        print join("\t", @val), "\n";
    }
}
&walk (\&walkit, %$files);
Gurus?
Edit: against my better judgement, I'll try to answer this question again.
Here's a simple approach to print what you want. Using Data::Walk is not feasible because you don't have key context when you are inside a hash (you just get a pointer to the container).
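A minimal sketch of such a function, assuming the name -> digest -> list-of-paths layout from the question. The name path_lines and the sample data are made up for illustration; this is one way to do it, not necessarily the code originally posted here. It carries the keys seen so far down the recursion, which is exactly the context the walker callback lacks:

```perl
use strict;
use warnings;

# Hypothetical helper: recursively walks nested hashes and arrays and
# returns one tab-separated line per leaf, prefixed by the hash keys
# that lead to it.
sub path_lines {
    my ($node, @path) = @_;

    if (ref $node eq 'HASH') {
        # Descend into each value, appending its key to the path.
        return map { path_lines($node->{$_}, @path, $_) } sort keys %$node;
    }
    if (ref $node eq 'ARRAY') {
        # Skip arrays with a single element (no duplicate to report).
        if (@$node > 1) {
            return map { path_lines($_, @path) } @$node;
        }
        return;
    }
    # Plain scalar leaf: emit the accumulated keys plus the value.
    return join("\t", @path, $node);
}

# Made-up sample data in the same shape as $files in the question.
my $files = {
    'a.txt' => { 'deadbeef' => [ './x/a.txt', './y/a.txt' ] },
    'b.txt' => { 'cafef00d' => [ './x/b.txt' ] },
};

print "$_\n" for path_lines($files);
```

With the sample data above, only the two a.txt lines are printed, because b.txt has a single path under its digest.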
This function works for somewhat complicated structures. Of course it will not give proper output if you put a function reference or something wonky in there.
(I'm assuming you're looking for duplicates, so I print nothing when there's only one path associated with a name-digest combo. You can remove the if condition if you want to print everything.)
Some other cleanup:
("qfn" stands for "qualified file name", which means the path to the file, which wasn't what $path contained. You were repeatedly building the path even though $line[7] contained it.)
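The cleanup described above might look something like the following. This is a sketch under assumptions: index_files is a hypothetical name, the input is taken to have the qualified file name as its 8th space-separated field (as in the question), and unreadable files are skipped with a warning instead of aborting the run.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use Digest::SHA;

# Hypothetical helper: reads listing lines from the handle $in and returns
# { basename => { SHA-1 hex digest => [ qualified file names ] } }.
sub index_files {
    my ($in) = @_;
    my %files;
    while (my $line = <$in>) {
        chomp $line;
        # Same split as the question: the 8th field is the file name.
        my $qfn = (split / /, $line, 8)[7];
        next unless defined $qfn && length $qfn;

        my ($name) = fileparse($qfn);    # only the basename is needed

        open(my $fh, '<', $qfn)
            or do { warn "Cannot open $qfn: $!\n"; next };
        binmode $fh;
        my $digest = Digest::SHA->new('sha1')->addfile($fh)->hexdigest;
        close $fh;

        # Store the qualified name as-is instead of rebuilding it.
        push @{ $files{$name}{$digest} }, $qfn;
    }
    return \%files;
}

# Usage sketch: perl script.pl listing.txt
if (@ARGV) {
    my $files = index_files(\*ARGV);
    for my $name (sort keys %$files) {
        for my $digest (sort keys %{ $files->{$name} }) {
            print join("\t", $name, $digest, $_), "\n"
                for @{ $files->{$name}{$digest} };
        }
    }
}
```

Taking the handle as a parameter keeps the reading logic separate from where the listing comes from, so the same function works on a file, STDIN, or an in-memory string.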