I’m writing a Python script (version 2.7) that will convert every input file (.nexus format) within a specified directory into .fasta format. The Biopython function SeqIO.convert handles the conversion perfectly for individually specified files, but when I try to automate the process over a directory using os.walk, I’m unable to correctly pass the pathname of each input file to SeqIO.convert. Where am I going wrong? Do I need to use join() from the os.path module and pass the full path names on to SeqIO.convert?
#Import modules
import sys
import re
import os
import fileinput
from Bio import SeqIO
#Specify directory of interest
PSGDirectory = "/Users/InputDirectory"
#Create a class that will run the SeqIO.convert function repeatedly
def process(filename):
    count = SeqIO.convert("files", "nexus", "files.fa", "fasta", alphabet=IUPAC.ambiguous_dna)
#Make sure os.walk works correctly
for path, dirs, files in os.walk(PSGDirectory):
    print path
    print dirs
    print files
#Now recursively do the count command on each file inside PSGDirectory
for files in os.walk(PSGDirectory):
    print("Converted %i records" % count)
    process(files)
When I run the script I get this error message:
Traceback (most recent call last):
  File "nexus_to_fasta.psg", line 45, in <module>
    print("Converted %i records" % count)
NameError: name 'count' is not defined
This conversation was very helpful but I don’t know where to insert the join() function statements. Here is an example of one of my nexus files
Thanks for your help!
There are a few things going on.
First, your process function isn't returning 'count'. You probably want:
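A minimal sketch of the idea, not your exact function: `count` only exists inside process(), so it has to be returned and captured by the caller before it can be printed. The `convert` parameter and `stand_in_convert` are hypothetical placeholders of mine so the snippet runs without Biopython; in your script the real SeqIO.convert call would take their place.

```python
def process(filename, convert):
    # `count` is local to process(); returning it is the only way
    # the calling code can see it.
    count = convert(filename)
    return count


def stand_in_convert(filename):
    # Hypothetical placeholder for the real SeqIO.convert call.
    return 3


# Capture the return value first, then print it.
count = process("example.nexus", stand_in_convert)
message = "Converted %i records" % count
print(message)
```

Note that the print comes after the call to process(), not before it as in your loop.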
Also, when you write
for files in os.walk(PSGDirectory)
you're operating on the 3-tuple (path, dirs, files) that os.walk yields for each directory, not on individual files. You want to do something like this (note the use of os.path.join):
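Here is one way that loop could look, as a sketch. The helper name `nexus_paths` is hypothetical, and I'm assuming your input files end in ".nexus", as your question suggests.

```python
import os


def nexus_paths(top):
    """Yield the full path of every .nexus file found under `top`."""
    # os.walk yields one (path, dirs, files) tuple per directory.
    for path, dirs, files in os.walk(top):
        for name in files:
            # Assumption: input files are identified by their extension.
            if name.endswith(".nexus"):
                # `name` is a bare file name; join it to its directory.
                yield os.path.join(path, name)
```

Each path it yields can then be passed to your process function, for example with `for full_path in nexus_paths(PSGDirectory):`.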
So I looked at the documentation for SeqIO.convert, and it expects to be called with:
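As I read the documentation, the four required positional arguments are in_file, in_format, out_file and out_format, in that order. The sketch below just lays them out for one file. `convert_args` is a hypothetical helper, and the ".fa" suffix follows the "files.fa" name in your script. I've left out the optional fifth argument (alphabet in the Biopython releases of that era), since as far as I know FASTA output does not need it.

```python
import os


def convert_args(in_file):
    """Return the four positional arguments for SeqIO.convert, in order."""
    # Swap the input extension for ".fa" to name the output file.
    out_file = os.path.splitext(in_file)[0] + ".fa"
    return (in_file, "nexus", out_file, "fasta")


# With Biopython installed, the call would then be:
#   count = SeqIO.convert(*convert_args(full_path))
```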
in_file is the name of the file to convert, and originally you were just calling SeqIO.convert with the literal string "files" instead of the file's actual path.
So your process function should probably be something like this:
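A sketch under a couple of assumptions: the output file is written next to the input with a ".fa" suffix, and the optional `convert` parameter is a hook I've added (not part of Biopython) so the function can be tried without the library installed. Left at its default, it uses the real SeqIO.convert.

```python
import os


def process(in_file, convert=None):
    """Convert one NEXUS file to FASTA and return the record count."""
    if convert is None:
        # Imported lazily so this module still loads without Biopython.
        from Bio import SeqIO
        convert = SeqIO.convert
    # Derive the output name from the input path, e.g. a.nexus -> a.fa
    out_file = os.path.splitext(in_file)[0] + ".fa"
    # Pass the real input path, not the literal string "files".
    return convert(in_file, "nexus", out_file, "fasta")
```

Called with a full path built by os.path.join, this returns the count that you can then print.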