Question:
I am using the Biopython package and would like to save the results as a TSV file. How can I write the output of this print statement to a .tsv file?
from Bio import SeqIO
for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
    print ("%s %s %s" % (record.id,record.seq, record.format("qual")))
Thank you.
Answer 1:
That is fairly simple: instead of printing each record, write it to a file, separating the fields with tab characters (\t).
from Bio import SeqIO
with open("records.tsv", "w") as record_file:
    for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
        record_file.write("%s\t%s\t%s\n" % (record.id, record.seq, record.format("qual")))
And if you want to name the columns in the file, write a header row first:
record_file.write("Record_Id\tRecord_Seq\tRecord_Qual\n")
So the complete code may look like:
from Bio import SeqIO
with open("records.tsv", "w") as record_file:
    # Header row naming the three columns
    record_file.write("Record_Id\tRecord_Seq\tRecord_Qual\n")
    for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
        record_file.write(str(record.id)+"\t"+str(record.seq)+"\t"+ str(record.format("qual"))+"\n")
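One caveat worth flagging: record.format("qual") returns a multi-line QUAL block (a ">" header line followed by the scores), so each record would spill over several lines of the file. Below is a minimal sketch of one way around that, assuming the per-base scores are available in record.letter_annotations["phred_quality"], as they are for records parsed from FASTQ. The function name write_records_tsv and the column names are illustrative and not part of Biopython.

```python
def write_records_tsv(records, path):
    """Write one tab-separated row per record: id, sequence, quality scores."""
    with open(path, "w") as record_file:
        record_file.write("Record_Id\tRecord_Seq\tRecord_Qual\n")
        for record in records:
            # Join the integer Phred scores with spaces so they stay in one field.
            quals = " ".join(str(q) for q in record.letter_annotations["phred_quality"])
            record_file.write("%s\t%s\t%s\n" % (record.id, record.seq, quals))
```

It could be called as write_records_tsv(SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"), "records.tsv").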
Answer 2:
My preferred solution is to use the CSV module. It's a standard module, so:
- Somebody else has already done all the heavy lifting.
- It allows you to leverage all the functionality of the CSV module.
- You can be fairly confident it will function as expected (not always the case when I write it myself).
- You're not going to have to reinvent the wheel, either when you write the file or when you read it back in on the other end (I don't know your record format, but if one of your records contains a TAB, the csv module will quote that field for you so it still reads back as a single value).
- It will be easier to support when the next person has to go in to update the code 5 years after you've left the company.
The following code snippet should do the trick for you:
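#!/usr/bin/env python3
import csv
from Bio import SeqIO

# newline='' lets the csv module control the line endings itself
with open('records.tsv', 'w', newline='') as tsvfile:
    # csv.writer has no newline argument; the line ending is set with lineterminator
    writer = csv.writer(tsvfile, delimiter='\t', lineterminator='\n')
    for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
        writer.writerow([record.id, record.seq, record.format("qual")])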
Note that this is for Python 3.x. If you're using 2.x, the open and writer = ... lines will be slightly different.
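To illustrate the point about embedded tabs, here is a small round-trip sketch. The rows are made-up sample data, and io.StringIO stands in for a real file so the example is self-contained.

```python
import csv
import io

# Hypothetical rows; the third field of the first row contains a tab.
rows = [["read_1", "ACGT", "note with\ta tab"], ["read_2", "TTGA", "plain"]]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)

# The field containing a tab is wrapped in quotes rather than split in two.
text = buffer.getvalue()

# Reading it back with the same delimiter recovers the original fields.
read_back = list(csv.reader(io.StringIO(text), delimiter="\t"))
```

After this runs, read_back should equal rows, which a hand-rolled "\t".join would not guarantee for the first row.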
Answer 3:
If you want to use the .tsv to label your word embeddings in TensorBoard, use the following snippet. It uses the CSV module (see Doug's answer).
#!/usr/bin/env python3
import csv

def save_vocabulary(word_count):
    # word_count: list of (word, count) tuples
    label_file = "word2context/labels.tsv"
    with open(label_file, 'w', encoding='utf8', newline='') as tsv_file:
        tsv_writer = csv.writer(tsv_file, delimiter='\t', lineterminator='\n')
        tsv_writer.writerow(["Word", "Count"])
        for word, count in word_count:
            tsv_writer.writerow([word, count])
word_count is a list of tuples like this:
[('the', 222594), ('to', 61479), ('in', 52540), ('of', 48064) ... ]
Answer 4:
The following snippet:
from __future__ import print_function
with open("output.tsv", "w") as f:
    print ("%s\t%s\t%s" % ("asd", "sdf", "dfg"), file=f)
    print ("%s\t%s\t%s" % ("sdf", "dfg", "fgh"), file=f)
yields a file output.tsv containing the following, with the fields separated by tab characters:
asd sdf dfg
sdf dfg fgh
So, in your case:
from __future__ import print_function
from Bio import SeqIO
with open("output.tsv", "w") as f:
    for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
        print ("%s\t%s\t%s" % (record.id,record.seq, record.format("qual")), file=f)
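On Python 3 the same idea can be written without a format string, since print accepts a sep argument. A minimal sketch, where the helper name print_tsv_row is hypothetical and io.StringIO stands in for the output file:

```python
import io

def print_tsv_row(fields, file):
    """Print the fields as one tab-separated line to the given file object."""
    print(*fields, sep="\t", file=file)

out = io.StringIO()
print_tsv_row(["asd", "sdf", "dfg"], out)
print_tsv_row(["sdf", "dfg", "fgh"], out)
```

With a real file, pass the handle from open("output.tsv", "w") in place of out.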
Answer 5:
I prefer using join() in this type of code:
for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
    print('\t'.join((str(record.id), str(record.seq), str(record.format("qual")))))
The 'tab' character is \t, and join() takes a single iterable of strings (here a tuple of three) and returns them as one string with a tab between each pair; print then outputs that string.
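Because join() only accepts strings, each field has to be converted first. A small sketch of a helper that does this for any number of fields; the name tsv_line and the sample values are made up for illustration:

```python
def tsv_line(*fields):
    """Convert every field to str and join them with tab characters."""
    return "\t".join(map(str, fields))

line = tsv_line("read_1", "ACGT", 40)
```

Inside the loop above this would be called as tsv_line(record.id, record.seq, record.format("qual")).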