Question:
I have some problems reading a file into a hash in Perl.
Chr1_supercontig_000000000 1 500
 PILOT21_588_1_3_14602_59349_1
Chr1_supercontig_000000001 5 100
 PILOT21_588_1_21_7318_90709_1
 PILOT21_588_1_43_18803_144592_1
 PILOT21_588_1_67_13829_193943_1
 PILOT21_588_1_42_19678_132419_1
 PILOT21_588_1_67_4757_125247_1
...
Given the file above, my desired output is a hash with the "Chr1" lines as keys and the "PILOT" lines that follow each of them as values:
Chr1_supercontig_000000000 => PILOT21_588_1_3_14602_59349_1
Chr1_supercontig_000000001 => PILOT21_588_1_21_7318_90709_1, PILOT21_588_1_43_18803_144592_1,...
As far as I know, multiple values can be assigned to a key only by reference. Is that correct?
I got stuck at this point and need help.
Answer 1:
You are right: the hash values need to be array references, each pointing to an array that holds the PILOT lines for that key.
Here's a way to do it:
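my %hash;
# Three-argument open with a lexical filehandle
open my $fh, '<', 'filename.txt' or die "Cannot open filename.txt: $!";
my $key;
while (my $line = <$fh>) {
    chomp $line;
    if ($line !~ /^\s/) {
        # Key line: keep only the first whitespace-delimited field
        ($key) = $line =~ /^(\S+)/;
        $hash{$key} = [];
    } else {
        # Value line: strip leading whitespace, append to the current key
        $line =~ s/^\s+//;
        push @{ $hash{$key} }, $line;
    }
}
close $fh;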
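Once the hash is built, the values are read back by dereferencing each array reference. The following is only a sketch: the hash is filled by hand with a few lines from the question instead of being read from a file, and format_hash is a hypothetical helper name, not something from the answer above.

```perl
use strict;
use warnings;

# Sample data from the question, in the shape the loop above produces
my %hash = (
    Chr1_supercontig_000000000 => ['PILOT21_588_1_3_14602_59349_1'],
    Chr1_supercontig_000000001 => [
        'PILOT21_588_1_21_7318_90709_1',
        'PILOT21_588_1_43_18803_144592_1',
    ],
);

# Hypothetical helper: one "key => v1, v2" string per key, sorted by key
sub format_hash {
    my ($h) = @_;
    my @out;
    for my $key (sort keys %$h) {
        # @{ ... } dereferences the array reference stored under $key
        push @out, "$key => " . join(', ', @{ $h->{$key} });
    }
    return @out;
}

print "$_\n" for format_hash(\%hash);

# A single element can be reached directly through the reference
print "first value: ", $hash{Chr1_supercontig_000000001}[0], "\n";
```

Each hash value stays a single scalar (the reference), which is why one key can carry any number of PILOT lines.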
Answer 2:
You can read the file line-by-line keeping track of the current hash key:
open my $fh, '<', 'file' or die $!;
my (%hash, $current_key);
while (<$fh>) {
    chomp;
    # A line that starts with non-whitespace is a key line: remember its first field
    $current_key = $1, next if /^(\S+)/;
    s/^\s+//; # remove leading space
    push @{ $hash{$current_key} }, $_;
}
Answer 3:
How about:
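#!/usr/bin/perl
use strict;
use warnings;
use Data::Dump qw(dump);
my %hash;
my $key;
while (<DATA>) {
    chomp;
    if (/^(Chr1_supercontig_\d+)/) {
        # Key line: start a new, empty array for this key
        $key = $1;
        $hash{$key} = [];
    } else {
        # Value line: stored as-is, including its leading whitespace
        push @{ $hash{$key} }, $_;
    }
}
dump %hash;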
__DATA__
Chr1_supercontig_000000000 1 500
 PILOT21_588_1_3_14602_59349_1
Chr1_supercontig_000000001 5 100
 PILOT21_588_1_21_7318_90709_1
 PILOT21_588_1_43_18803_144592_1
 PILOT21_588_1_67_13829_193943_1
 PILOT21_588_1_42_19678_132419_1
 PILOT21_588_1_67_4757_125247_1
output:
(
  "Chr1_supercontig_000000001",
  [
    " PILOT21_588_1_21_7318_90709_1",
    " PILOT21_588_1_43_18803_144592_1",
    " PILOT21_588_1_67_13829_193943_1",
    " PILOT21_588_1_42_19678_132419_1",
    " PILOT21_588_1_67_4757_125247_1",
  ],
  "Chr1_supercontig_000000000",
  [" PILOT21_588_1_3_14602_59349_1"],
)
Answer 4:
Many good answers already, so I'll add one that relies not on regexes but on the fact that the key lines contain three space- or tab-delimited entries and the value lines only one.
Because split with no arguments splits $_ on whitespace and skips leading whitespace, it strips the indentation and the trailing newline automatically, which is somewhat convenient.
use strict;
use warnings;
my %hash;
my $key;
while (<DATA>) {
    my @row = split;
    if (@row > 1) {
        $key = shift @row;
    } else {
        push @{$hash{$key}}, shift @row;
    }
}
use Data::Dumper;
print Dumper \%hash;
__DATA__
Chr1_supercontig_000000000 1 500
 PILOT21_588_1_3_14602_59349_1
Chr1_supercontig_000000001 5 100
 PILOT21_588_1_21_7318_90709_1
 PILOT21_588_1_43_18803_144592_1
 PILOT21_588_1_67_13829_193943_1
 PILOT21_588_1_42_19678_132419_1
 PILOT21_588_1_67_4757_125247_1
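To see why the field count is enough to tell the two kinds of lines apart, here is a small sketch of how a whitespace split treats one key line and one indented value line. The helper name fields is hypothetical; it only wraps split ' ', which is what a bare split does to $_.

```perl
use strict;
use warnings;

# Hypothetical helper: whitespace-delimited fields of one line,
# equivalent to what a bare `split` does on $_
sub fields {
    my ($line) = @_;
    return split ' ', $line;
}

# A key line and an indented value line, both with trailing newlines
my @key_row   = fields("Chr1_supercontig_000000000 1 500\n");
my @value_row = fields("  \tPILOT21_588_1_3_14602_59349_1\n");

print scalar(@key_row), " fields: @key_row\n";
print scalar(@value_row), " field: [" . $value_row[0] . "]\n";
```

The key line yields three fields and the value line yields one, with neither the indentation nor the newline left in the result, so no chomp or s/^\s+// is needed.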
Answer 5:
Here is another fairly short, clear version:
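my (%p, $c);
while (<>) {
    if (/^Chr\S+/) {
        $c = $&;                  # key line: the matched text is the key
    } elsif (/\S+/) {
        push @{ $p{$c} }, $&;     # value line: first run of non-whitespace
    }
}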
And to print the results:
foreach my $pc ( sort keys %p ) {
    print "$pc => ".join(", ", @{$p{$pc}})."\n";
}
This is a shorter way to print the results (though the first one seems more readable to me):
map { print "$_ => ".join(", ", @{$p{$_}})."\n" } sort keys %p;
One-liner from the command line (as written, the shell redirection <1 reads the input from a file named 1):
perl <1 -e 'while(<>){ if(/^Chr\S+/){ $c=$&; }else{ /\S+/; push(@{$p{$c}},$&);} } map { print "$_ => ".join(", ", @{$p{$_}})."\n" } sort keys %p;'
Answer 6:
Try this:
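#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my ( $fh, $cur );
my $hash = {};    # hash reference
open $fh, '<', 'file' or die "Cannot open file: $!\n";
while (<$fh>) {
    chomp;
    if ( /^(Chr\S+)/ ) {
        # Key line: capture the first field without the trailing space
        $cur = $1;
        $hash->{$cur} = '';
    }
    else {
        # Value line: strip leading whitespace, append to a comma-separated string
        s/^\s+//;
        $hash->{$cur} .= $_ . ',';
    }
}
print Dumper $hash;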
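The comma-separated string built above ends with a trailing comma. If that matters, one option is to collect the values in an array first and join them once at the end. This is only a sketch of that variation, not the answer's own code: the sample lines are held in memory rather than read from a file, and parse_lines is a hypothetical helper name.

```perl
use strict;
use warnings;

# A few sample lines from the question, held in memory instead of a file
my @lines = (
    "Chr1_supercontig_000000000 1 500\n",
    " PILOT21_588_1_3_14602_59349_1\n",
    "Chr1_supercontig_000000001 5 100\n",
    " PILOT21_588_1_21_7318_90709_1\n",
    " PILOT21_588_1_43_18803_144592_1\n",
);

# Hypothetical helper: returns a hash reference { key => "v1,v2,..." }
sub parse_lines {
    my @input = @_;
    my ( %values, $cur );
    for my $line (@input) {
        my @fields = split ' ', $line;
        next unless @fields;              # skip blank lines
        if ( $fields[0] =~ /^Chr/ ) {
            $cur = $fields[0];            # key line: first field is the key
            $values{$cur} = [];
        }
        elsif ( defined $cur ) {
            push @{ $values{$cur} }, $fields[0];
        }
    }
    # Join once at the end, so no trailing comma is produced
    my %joined;
    $joined{$_} = join( ',', @{ $values{$_} } ) for keys %values;
    return \%joined;
}

my $hash = parse_lines(@lines);
print "$_ => $hash->{$_}\n" for sort keys %$hash;
```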