Storing the Output to a FASTA file

Posted 2019-08-11 21:40

Question:

from Bio import SeqIO
from Bio import SeqRecord
from Bio import SeqFeature
for rec in SeqIO.parse("C:/Users/Siva/Downloads/sequence.gp","genbank"):
    if rec.features:
        for feature in rec.features:
            if feature.type == "Region":
                seq1 = feature.location.extract(rec).seq
                print(seq1)
                SeqIO.write(seq1,"region_AA_output1.fasta","fasta")

I am trying to write the output to a FASTA file, but I am getting an error. Can anybody help me? This is the error I got:

Traceback (most recent call last):
  File "C:\Users\Siva\Desktop\region_AA.py", line 10, in <module>
    SeqIO.write(seq1,"region_AA_output1.fasta","fasta")
  File "C:\Python34\lib\site-packages\Bio\SeqIO\__init__.py", line 472, in write
    count = writer_class(fp).write_file(sequences)
  File "C:\Python34\lib\site-packages\Bio\SeqIO\Interfaces.py", line 211, in write_file
    count = self.write_records(records)
  File "C:\Python34\lib\site-packages\Bio\SeqIO\Interfaces.py", line 196, in write_records
    self.write_record(record)
  File "C:\Python34\lib\site-packages\Bio\SeqIO\FastaIO.py", line 190, in write_record
    id = self.clean(record.id)
AttributeError: 'str' object has no attribute 'id'

Answer 1:

First, you're trying to write a bare sequence as a FASTA record. A FASTA record consists of an ID line (prefixed with ">") followed by the sequence. SeqIO.write expects SeqRecord objects, which carry an ID; the bare sequence you pass has none, which is why the writer fails when it looks up record.id. You should either write the whole record, or turn the sequence into a FASTA record by adding an ID yourself.

Second, even if your approach did write something, each call to SeqIO.write with a filename overwrites the same file. You'd end up with only the last record in the file.
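
The overwriting can be reproduced without Biopython. The sketch below assumes that writing to a filename once per record behaves like opening the file in "w" mode on each pass through the loop; the file name demo.fasta and the sample records are made up for illustration:

```python
import os
import tempfile

# Made-up records, for illustration only.
records = [(">region_1", "MKT"), (">region_2", "GAV"), (">region_3", "LLP")]

path = os.path.join(tempfile.mkdtemp(), "demo.fasta")

# Opening with mode "w" inside the loop truncates the file every time,
# so each record replaces the one written before it.
for header, seq in records:
    with open(path, "w") as handle:
        handle.write(header + "\n" + seq + "\n")

with open(path) as handle:
    overwritten = handle.read()

# Opening once and writing inside the loop keeps every record.
with open(path, "w") as handle:
    for header, seq in records:
        handle.write(header + "\n" + seq + "\n")

with open(path) as handle:
    complete = handle.read()

print(overwritten.count(">"))  # 1
print(complete.count(">"))     # 3
```

Only the last record survives the first loop, which is the same symptom the original script would show once the ID problem is fixed.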

A simpler approach is to store everything in a list, and then write the whole list once the loop is done. For example:

from Bio import SeqIO

# Collect one FASTA-formatted entry per Region feature
new_fasta = []
for rec in SeqIO.parse("C:/Users/Siva/Downloads/sequence.gp", "genbank"):
    if rec.features:
        for feature in rec.features:
            if feature.type == "Region":
                seq1 = feature.location.extract(rec).seq
                # Use an appropriate string for id
                new_fasta.append('>%s\n%s' % (rec.id, seq1))

# Write all entries in one go, so nothing gets overwritten
with open('region_AA_output1.fasta', 'w') as f:
    f.write('\n'.join(new_fasta))
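
The comment about choosing an appropriate ID matters when a record has more than one Region feature, because every region would otherwise get the same header line. Here is a minimal sketch, without Biopython, of one way to build distinct headers and wrap long sequences. The helper format_fasta, the _regionN suffix, the 60-character line width, and the sample data are all assumptions made for illustration:

```python
def format_fasta(entries, width=60):
    """Return FASTA text for (identifier, sequence) pairs.

    Sequences longer than `width` are split across several lines.
    """
    lines = []
    for identifier, sequence in entries:
        lines.append(">" + identifier)
        sequence = str(sequence)
        # Slice the sequence into fixed-width chunks.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


# Made-up record ID and region sequences, for illustration only.
record_id = "example_protein"
regions = ["MKTAYIAKQR", "GAVLLPQWER" * 7]

# Number the regions so that every header is unique.
entries = [
    ("%s_region%d" % (record_id, n), seq)
    for n, seq in enumerate(regions, start=1)
]

fasta_text = format_fasta(entries)
print(fasta_text)
```

In the loop above, the same idea would mean keeping a per-record counter and appending it to rec.id before adding each entry to the list.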