I would like to match and print data from two files (File1.txt and File2.txt). Currently, I'm trying to match the letter in the second column of File1.txt to the first letter of the third column in File2.txt.
File1.txt
1 H 35
1 C 22
1 H 20
File2.txt
A 1 HB2 MET 1
A 2 CA MET 1
A 3 HA MET 1
Desired output:
1 MET HB2 35
1 MET CA 22
1 MET HA 20
Here is my script. I tried following this submission: "In Perl, mapping between a reference file and a series of files".
#!/usr/bin/perl
use strict;
use warnings;
my %data;
open (SHIFTS,"file1.txt") or die;
open (PDB, "file2.txt") or die;
while (my $line = <PDB>) {
    chomp $line;
    my @fields = split(/\t/,$line);
    $data{$fields[4]} = $fields[2];
}
close PDB;
while (my $line = <SHIFTS>) {
    chomp($line);
    my @columns = split(/\t/,$line);
    my $value = ($columns[1] =~ m/^.*?([A-Za-z])/ );
}
print "$columns[0]\t$fields[3]\t$value\t$data{$value}\n";
close SHIFTS;
exit;
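Here's one way, using nested split() calls. It pairs the lines of the two files by position and prints each PDB line whose third column starts with the letter from file1.txt: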
#!/usr/bin/perl
use strict;
use warnings;
my $f1 = 'file1.txt';
my $f2 = 'file2.txt';
# Read every line of the PDB file into memory, in order.
my @pdb;
open my $pdb_file, '<', $f2
    or die "Can't open the PDB file $f2: $!";
while (my $line = <$pdb_file>){
    chomp $line;
    push @pdb, $line;
}
close $pdb_file;
open my $shifts_file, '<', $f1
    or die "Can't open the SHIFTS file $f1: $!";
while (my $line = <$shifts_file>){
    chomp $line;
    # Pair this SHIFTS line with the PDB line at the same position.
    my $pdb_line = shift @pdb;
    # Stop if file2.txt has fewer lines than file1.txt.
    last unless defined $pdb_line;
    # - inner split: get the third element from the $pdb_line
    # - outer split: get the first element (character) from the
    #   result of the inner split
    my $criteria = (split('', (split('\s+', $pdb_line))[2]))[0];
    # - compare the 2nd element of the file1.txt line against
    #   the above split() operations
    if ((split('\s+', $line))[1] eq $criteria){
        print "$pdb_line\n";
    }
    else {
        print "**** >$pdb_line< doesn't match >$line<\n";
    }
}
close $shifts_file;
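The nested split() expression is dense, so here is a small sketch that unpacks it step by step and checks that both forms agree. The helper name first_char_of_field is hypothetical (it is not part of the script above), and the sample line is taken from file2.txt.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): return the first
# character of the whitespace-separated field at zero-based index $idx,
# or nothing if the line has no such field.
sub first_char_of_field {
    my ($line, $idx) = @_;
    my @fields = split ' ', $line;        # split on runs of whitespace
    return unless defined $fields[$idx];  # line is too short
    return substr($fields[$idx], 0, 1);   # first character of that field
}

my $pdb_line = 'A 1 HB2 MET 1';

# The one-liner from the script: inner split picks field 3 ("HB2"),
# outer split breaks it into characters and takes the first one.
my $nested   = (split('', (split('\s+', $pdb_line))[2]))[0];

# The same result, one step at a time.
my $stepwise = first_char_of_field($pdb_line, 2);

print "nested: $nested, stepwise: $stepwise\n";
```

Using substr() instead of a second split() avoids building a list of characters only to throw most of it away, but either form works for lines like these.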
Files:
file1.txt (note that I changed line two to check that a non-match is reported):
1 H 35
1 A 22
1 H 20
file2.txt:
A 1 HB2 MET 1
A 2 CA MET 1
A 3 HA MET 1
Output:
./app.pl
A 1 HB2 MET 1
**** >A 2 CA MET 1< doesn't match >1 A 22<
A 3 HA MET 1
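The script above prints the matching PDB line unchanged. If you want the merged layout from the question instead (for example "1 MET HB2 35"), something like the following sketch may help. It assumes the output columns are: column 1 of file1.txt, column 4 of file2.txt, column 3 of file2.txt, column 3 of file1.txt, separated by single spaces. That column order is inferred from the sample output, and merge_pair is a hypothetical helper name. The data is held in arrays here rather than read from files, so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: merge one SHIFTS line with the PDB line at the
# same position. Returns the merged line, or nothing when either line is
# too short or the atom name does not start with the SHIFTS letter.
sub merge_pair {
    my ($shift_line, $pdb_line) = @_;
    my ($res_num, $element, $value)      = split ' ', $shift_line;
    my (undef, undef, $atom, $res_name)  = split ' ', $pdb_line;
    return unless defined $value && defined $res_name;
    return unless substr($atom, 0, 1) eq $element;
    return join ' ', $res_num, $res_name, $atom, $value;
}

# Sample data copied from the question.
my @shifts = ('1 H 35', '1 C 22', '1 H 20');
my @pdb    = ('A 1 HB2 MET 1', 'A 2 CA MET 1', 'A 3 HA MET 1');

for my $i (0 .. $#shifts) {
    # Use an empty string if the PDB list is shorter than the SHIFTS list.
    my $merged = merge_pair($shifts[$i], $pdb[$i] // '');
    print defined($merged)
        ? "$merged\n"
        : "**** no match for >$shifts[$i]<\n";
}
```

With the question's data this prints the three lines "1 MET HB2 35", "1 MET CA 22" and "1 MET HA 20". Like the script above, it relies on the two files listing atoms in the same order.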