The Perl script below cuts the input sequence at each "E" but skips the "E" positions listed in @nobreak, and produces an array of fragments as output. However, I want a script that generates a set of such arrays, one for each skipped position, taking all positions in @nobreak into account: say, set 1 contains the fragments that result from skipping the "E" at 37, set 2 from skipping the "E" at 45, and so on. The script below, which I wrote, is not working correctly. I want to generate 4 different arrays in the output, taking one position of @nobreak at a time. Please help!
my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN';
print "Results of 1-Missed Cleavage:\n\n";
my @nobreak = (37, 45, 57, 59);
{
@nobreak = map { $_ - 1 } @nobreak;
foreach (@nobreak) {
substr($s, $_, 1) = "\0";
}
my @a = split /E(?!P)/, $s;
$_ =~ s/\0/E/g foreach (@a);
$result = join "E,", @a;
@final = split /,/, $result;
print "@final\n";
}
It looks like you want to split the string after every E character, but not before any P character. The approach here is to change the E at each offset in @nobreak to an e (much better than "\0" for debugging) and split on /(?<=E)(?!P)/, that is, after an E but not before a P. The e is changed back to an E afterwards using tr/e/E/.
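A minimal sketch of how that idea could be applied once per @nobreak position, giving one set of fragments per skipped "E" as the question asks. The helper name fragments_skipping and the %sets hash are hypothetical names chosen for illustration; the masking is done on a copy so that every pass starts from the original sequence.

```perl
use strict;
use warnings;

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN';
my @nobreak = (37, 45, 57, 59);    # 1-based positions of the E characters to protect

# Hypothetical helper: fragments obtained when only the E at $pos is protected.
sub fragments_skipping {
    my ($seq, $pos) = @_;          # $seq is a copy, so the caller's string is untouched
    die "No E at position $pos\n" unless substr($seq, $pos - 1, 1) eq 'E';
    substr($seq, $pos - 1, 1) = 'e';            # mask the protected E
    my @frags = split /(?<=E)(?!P)/, $seq;      # cut after E, but not before P
    tr/e/E/ for @frags;                         # restore the masked E
    return @frags;
}

# One array of fragments per protected position (assumed output layout).
my %sets;
$sets{$_} = [ fragments_skipping($s, $_) ] for @nobreak;

for my $pos (@nobreak) {
    print "Skipping E at $pos:\n";
    print "  $_\n" for @{ $sets{$pos} };
}
```

Each set is built independently, so protecting several positions at once would only require masking more than one offset before the split.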
Loop over @nobreak?
To split the string at every 'E' without consuming it in the process, use a lookbehind:
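A small sketch of the difference, using a short made-up string rather than the sequence from the question. A plain pattern removes the "E" from the fragments, while a zero-width lookbehind leaves it at the end of each fragment.

```perl
use strict;
use warnings;

my $demo = 'AAEBBECC';                    # hypothetical short input

my @consumed = split /E/,      $demo;     # the E is used up as the separator
my @kept     = split /(?<=E)/, $demo;     # zero-width match just after each E

print "@consumed\n";                      # AA BB CC
print "@kept\n";                          # AAE BBE CC
```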
To exert finer control over which 'E' to split on (which you left unspecified), you would change the regex.
In case a variable-length lookbehind is needed, one could use \K
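As a rough illustration, assuming the goal is to cut after a whole run of "E" characters (a case the question does not actually specify), \K (available since Perl 5.10) discards everything matched so far from the reported match, so the split point ends up zero-width without a lookbehind. The input string is made up.

```perl
use strict;
use warnings;

my $demo = 'AAEEBBECC';                 # hypothetical input with a run of two E's

# E+ matches a run of any length; \K keeps that run out of the separator,
# so each fragment retains its trailing E characters.
my @frags = split /E+\K/, $demo;

print "@frags\n";                       # AAEE BBE CC
```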
...