Question:

I want to store DNA sequences of length n in the data structure described below. Each hash can contain the keys C, G, A, and T, whose values are themselves hashes of exactly the same kind: four keys, C, G, A, and T, each with a hash as its value.

This structure is consistent for n levels of hashes. However, the last level of hashes will instead have integer values, which represent the count of the sequence from level 1 to level n.

Given the data ('CG', 'CA', 'TT', 'CG'), the sequences CG, CA, and TT occur twice, once, and once, respectively. For this data, the depth is 2.

This would produce a hash: %root = ( 'C' => { 'G' => 2, 'A' => 1}, 'T' => {'T' => 1 })

How would one create this hash from the data?

Answer 1:

What you need is a function get_node($tree, 'C', 'G') that returns a reference to the hash element for "CG". Then you can just increment the referenced scalar.

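sub get_node {
   my $p = \shift;                  # reference to the caller's tree variable (@_ is aliased)
   $p = \( ($$p)->{$_} ) for @_;    # descend one key at a time, autovivifying missing levels
   return $p;                       # reference to the final element, i.e. the count
}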

my @seqs = qw( CG CA TT CG );

my $tree;
++${ get_node($tree, split //) } for @seqs;
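The one-line loop in get_node is dense, so here is a long-form sketch of what it appears to do. The name get_node_verbose is hypothetical and not part of the answer. Unlike get_node, which relies on @_ aliasing, this version assumes the caller passes a reference to the tree variable explicitly.

```perl
use strict;
use warnings;

# Hypothetical long-form equivalent of get_node.
# $tree_ref is a reference to the variable that holds the tree.
sub get_node_verbose {
    my ($tree_ref, @keys) = @_;
    my $slot = $tree_ref;    # reference to the scalar we are about to descend into
    for my $key (@keys) {
        # Taking a reference to $$slot->{$key} autovivifies $$slot as a hash
        # reference (if it is still undef) and creates the element for $key.
        $slot = \( $$slot->{$key} );
    }
    return $slot;            # reference to the last element, which holds the count
}

my $tree;
for my $seq (qw( CG CA TT CG )) {
    my $count_ref = get_node_verbose(\$tree, split //, $seq);
    $$count_ref++;           # undef becomes 1 on the first increment
}

print "CG seen $tree->{C}{G} times\n";   # prints "CG seen 2 times"
```

Each pass through the loop moves $slot one level deeper, so after the last key it points at the scalar that stores the count for the whole sequence.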

The thing is, this function already exists as Data::Diver's DiveRef.

use Data::Diver qw( DiveRef );

my @seqs = qw( CG CA TT CG );

my $tree = {};
++${ DiveRef($tree, split //) } for @seqs;

In both cases,

use Data::Dumper qw( Dumper );
print(Dumper($tree));

prints

$VAR1 = {
          'T' => {
                   'T' => 1
                 },
          'C' => {
                   'A' => 1,
                   'G' => 2
                 }
        };

Answer 2:

The following should work:

use Data::Dumper;

my %data;
my @sequences = qw(CG CG CA TT);

foreach my $sequence (@sequences) {
    my @vars = split(//,$sequence);
    $data{$vars[0]} = {} if (!exists($data{$vars[0]}));
    my $startref = $data{$vars[0]};
    for(my $i = 1; $i < $#vars; $i++) {
        $startref->{$vars[$i]} = {} if (!exists($startref->{$vars[$i]}));
        $startref = $startref->{$vars[$i]};
    }
    $startref->{$vars[$#vars]}++;
}

print Dumper(\%data);

Produces:

$VAR1 = {
          'T' => {
                   'T' => 1
                 },
          'C' => {
                   'A' => 1,
                   'G' => 2
                 }
        };
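One way to check a tree built by either answer is to walk it back into a flat list of sequences and counts. The following is a minimal sketch under that assumption. The helper flatten_tree is hypothetical, and the tree is written out literally from the output above rather than rebuilt.

```perl
use strict;
use warnings;

# Hypothetical helper: recursively collect "sequence => count" pairs.
sub flatten_tree {
    my ($node, $prefix, $out) = @_;
    for my $base (sort keys %$node) {
        my $value = $node->{$base};
        if (ref $value eq 'HASH') {
            # Inner level: descend, extending the sequence read so far.
            flatten_tree($value, $prefix . $base, $out);
        } else {
            # Last level: the value is the integer count.
            $out->{ $prefix . $base } = $value;
        }
    }
    return $out;
}

my %data = ( C => { G => 2, A => 1 }, T => { T => 1 } );
my $counts = flatten_tree(\%data, '', {});
print "$_ => $counts->{$_}\n" for sort keys %$counts;
# prints:
# CA => 1
# CG => 2
# TT => 1
```

Because the recursion stops at the first non-hash value, the same sketch should work for any depth n, not just 2.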