renaming fasta headers in order

Posted 2019-09-11 01:14

Question:

I have multiple FASTA files, each with 8 headers that are always in the same order (in terms of species). For example,

 grep -o -E "^>\w+" batch1.seq

gives

 jgi
 jgi
 augustus_masked
 augustus_masked
 augustus_masked
 jgi
 augustus_masked
 augustus_masked

and

 grep -o -E "^>\w+" batch2.seq

gives

jgi
jgi
maker
maker
maker
jgi
maker
maker

Regardless of the current headers, I want to rename all 8 FASTA headers in every file in the folder to:

Ara
Soy
Gly
Tom
Whe
Cor
Nat
Blu

Answer 1:

awk to the rescue!

awk 'NR==FNR{names[NR]=$0; next} 
        /^>/{$1=">"names[++c]}1' names fasta > fasta.new

Keep your new header list, one name per line, in a file called names when using the script.
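
The command above handles a single file, while the question asks about every file in a folder. A minimal sketch of one way to loop over them follows. It assumes the files end in .seq (taken from the batch1.seq example) and that the names file sits in the same directory; rename_headers is a hypothetical helper name, not part of the original answer. Each file is processed by its own awk run, so the header counter c starts again from zero for every file.

```shell
# rename_headers NAMES FASTA
# Print FASTA with the first field of its i-th header replaced by line i of NAMES.
# (rename_headers is a hypothetical helper name used only for this sketch.)
rename_headers() {
    awk 'NR==FNR { names[NR] = $0; next }   # first file: remember the new names
         /^>/    { $1 = ">" names[++c] }    # header line: swap in the next name
         1' "$1" "$2"
}

# Assumption: the FASTA files match *.seq in the current directory.
# Results go to batch1.seq.new, batch2.seq.new, and so on; originals are untouched.
for f in *.seq; do
    [ -e "$f" ] || continue                 # skip if the glob matched nothing
    rename_headers names "$f" > "$f.new"
done
```

Afterwards, running grep '^>' batch1.seq.new should list the eight new names in the order given in names.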



Tags: awk sed grep