I would like to substitute each element in an array with its corresponding hash value. To make this clearer: I have two files, ref.tab and data.tab. The reference file contains data like:
A a
B b
C c
D d
The data file contains data like:
1 apple red A
2 orange orange B
3 grapes black C
4 kiwi green D
What I would like to do in Perl is replace every value in column 4 of data.tab with the corresponding value from ref.tab.
My code is as follows:
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
# Define file containing the reference values:
open DFILE, 'ref.tab' or die "Cannot open data file";
# Store each column to an array:
my @caps;
my @small;
while(<DFILE>) {
my @tmp = split/\t/;
push @caps,$tmp[0];
push @small,$tmp[1];
}
print join(' ', @caps),"\n";
print join(' ', @small),"\n";
# convert individual arrays to hashes:
my %replaceid;
@replaceid{@caps} = @small;
print "$_ $replaceid{$_}\n" for (keys %replaceid);
# Define the file in which column values are to be replaced:
open SFILE,'output.tab' or die "Cannot open source file";
# Store the required columns in an array:
my @col4;
while(<SFILE>) {
my @tmp1 = split/\t/;
push @col4,$tmp1[4];
}
for $_ (0..$#col4) {
if ($_ = keys $replaceid[$col4[$_]]){
~s/$_/values $replaceid[$col4[$_]]/g;
}
}
print "@col4";
close (DFILE);
close (SFILE);
exit;
The above program results in this error:
Use of uninitialized value $tmp1[3] in join or string at replace.pl line 4.
What is the solution?
New issue:
I would like to leave the field blank if there is no corresponding replacement. Any idea how this could be done? For example:
ref.tab:
A a
B b
C c
D d
F f
data.tab:
1 apple red A
2 orange orange B
3 grapes black C
4 kiwi green D
5 melon yellow E
6 citron green F
Desired output:
1 apple red a
2 orange orange b
3 grapes black c
4 kiwi green d
5 melon yellow
6 citron green f
How can I do this?
New issue, 2
I now have another issue with the AWK solution. It does leave the field blank when there is no match, but I have additional columns after the 4th, so whenever no match is found, the value in the fifth column shifts into the fourth column.
1 apple red a sweet
2 orange orange b sour
3 grapes black c sweet
4 kiwi green d sweet
5 melon yellow sweet
6 citron green f sour
Notice what happens on line 5: because no replacement was found, the value in the 5th column has shifted into the 4th column.
Perl solution:
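One possible approach, offered as a minimal sketch rather than a definitive implementation: it assumes both files are tab-separated, as the split/\t/ calls in the question suggest. The helper names build_lookup and replace_col4 are hypothetical, and the sample lines are inlined so the script runs without ref.tab or data.tab on disk. A key with no entry in the lookup becomes an empty field, so later columns stay in place.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a lookup hash from "KEY<TAB>VALUE" lines (the ref.tab layout).
sub build_lookup {
    my @lines = @_;
    my %lookup;
    for my $line (@lines) {
        chomp $line;
        my ($key, $value) = split /\t/, $line;
        next unless defined $key && defined $value;   # skip blank or short lines
        $lookup{$key} = $value;
    }
    return %lookup;
}

# Replace the 4th column (index 3) of one tab-separated line.
# A key that is missing from the lookup is replaced by an empty string.
sub replace_col4 {
    my ($line, $lookup) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;   # limit -1 keeps trailing empty fields
    if (@fields > 3) {
        my $key = $fields[3];
        $fields[3] = exists $lookup->{$key} ? $lookup->{$key} : '';
    }
    return join "\t", @fields;
}

# Inlined sample data standing in for ref.tab and data.tab.
my %replaceid = build_lookup("A\ta\n", "B\tb\n", "F\tf\n");
for my $line ("1\tapple\tred\tA\tsweet\n", "5\tmelon\tyellow\tE\tsweet\n") {
    print replace_col4($line, \%replaceid), "\n";
}
```

With real files, the same two helpers could be fed from filehandles opened with the three-argument form, for example open my $fh, '<', 'data.tab' or die "Cannot open data.tab: $!"; and then calling replace_col4 on each line read from $fh.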
AWK solution:
The value in the 4th column is $tmp1[3], not $tmp1[4], because Perl arrays are indexed from 0.
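The indexing point can be checked with a short sketch. It assumes a four-column, tab-separated line like those in data.tab; the variables $fourth and $index4_defined are introduced here only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One sample line in the data.tab layout (four tab-separated columns).
my $line = "1\tapple\tred\tA\n";
chomp $line;
my @tmp1 = split /\t/, $line;

# Indices run 0..3, so the 4th column lives at index 3.
my $fourth = $tmp1[3];

# Index 4 would be a 5th column, which this line does not have.
my $index4_defined = defined $tmp1[4] ? 1 : 0;

print "4th column: $fourth\n";
print "index 4 is ", ($index4_defined ? "defined" : "undefined"), "\n";
```

Reading an element that does not exist yields undef, which is what produces an "uninitialized value" warning once it is interpolated or joined.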