How can I run a python script on many files to get

I am new to programming and have written a script to extract text from a VCF file. I am running Ubuntu in a Linux virtual machine. I run the script from the command line by changing to the directory that contains the VCF file and then entering python script.py.

My script knows which file to process because the beginning of my script is:

my_file = open("inputfile1.vcf", "r+")
outputfile = open("outputfile.txt", "w")

The script puts the information I need into a list and then I write it to outputfile. However, I have many input files (all .vcf) and want to write them to different output files with a similar name to the input (such as input_processed.txt).

Do I need to run a shell script to iterate over the files in the folder? If so, how would I change the Python script to accommodate this, i.e. writing the list to an output file?

Tags: python linux bash shell python-2.7

5 Answers

【Aperson】

Post #2 · 2019-08-15 08:28

You don't need to write a shell script. Maybe this question will help you:

How to list all files of a directory?


霸刀☆藐视天下

Post #3 · 2019-08-15 08:34

I would integrate it within the Python script, which will allow you to easily run it on other platforms too and doesn't add much code anyway.

import glob
import os

# Find all files ending in 'vcf'
for vcf_filename in glob.glob('*.vcf'):
    vcf_file = open(vcf_filename, 'r')  # read-only access is enough for the input

    # Similar name with a different extension
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'
    outputfile = open(output_filename, 'w')

    # Process the data
    ...

To write the resulting files to a separate directory, I would do this:

import glob
import os

output_dir = 'processed'
# exist_ok requires Python 3.2+; on Python 2.7, check os.path.isdir(output_dir) first
os.makedirs(output_dir, exist_ok=True)

# Find all files ending in 'vcf'
for vcf_filename in glob.glob('*.vcf'):
    vcf_file = open(vcf_filename, 'r')  # read-only access is enough for the input

    # Similar name with a different extension
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'
    outputfile = open(os.path.join(output_dir, output_filename), 'w')

    # Process the data
    ...
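
The loops above leave the "process the data" step open and never close the files. A minimal sketch of one way to fill that in is shown here; extract_records is a hypothetical placeholder for the real extraction logic, the _processed.txt suffix is taken from the question, and process_all and its parameters are names invented for illustration.

```python
import glob
import os


def extract_records(lines):
    """Hypothetical placeholder for the real extraction logic.

    Here it simply drops lines starting with '#' (VCF header lines).
    """
    return [line for line in lines if not line.startswith('#')]


def process_all(input_dir='.', output_dir='processed'):
    """Process every .vcf file in input_dir; return the output paths written."""
    # Create the output directory only if it is missing
    # (this avoids exist_ok, which Python 2.7 does not support)
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)

    written = []
    for vcf_path in sorted(glob.glob(os.path.join(input_dir, '*.vcf'))):
        # 'sample.vcf' -> 'sample_processed.txt'
        base = os.path.splitext(os.path.basename(vcf_path))[0]
        output_path = os.path.join(output_dir, base + '_processed.txt')
        # 'with' closes both files even if processing raises an error
        with open(vcf_path, 'r') as vcf_file, open(output_path, 'w') as outputfile:
            outputfile.writelines(extract_records(vcf_file))
        written.append(output_path)
    return written
```

Calling process_all() from the folder that holds the .vcf files would write one output file per input into the processed directory.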


走好不送

Post #4 · 2019-08-15 08:41

You can use os.listdir (you need to write a condition to filter for the particular extension) or glob. I generally prefer glob. For example:

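import os
import glob

for filename in glob.glob('*.py'):
    output_name = os.path.splitext(filename)[0] + '.txt'
    # Copy the input file's content into the matching .txt file;
    # 'with' closes both files when the block ends
    with open(filename, 'r') as data, open(output_name, 'w') as output:
        output.write(data.read())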

This code reads the content of each input file and writes it to the corresponding output file.
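
For comparison, the listdir alternative mentioned above might look roughly like this. It is only a sketch: list_by_extension is a name invented for illustration, and the case-insensitive match is a choice made here rather than something the answer specifies.

```python
import os


def list_by_extension(directory, extension):
    """Return the sorted names of regular files in directory ending with extension."""
    return sorted(
        name for name in os.listdir(directory)
        # The filter condition: match the extension, ignoring case
        if name.lower().endswith(extension.lower())
        # Skip directories that happen to carry the same suffix
        and os.path.isfile(os.path.join(directory, name))
    )
```

list_by_extension('.', '.vcf') would return the .vcf files in the current folder. Unlike glob.glob('*.vcf') on Linux, this version also matches names such as SAMPLE.VCF.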


\"骚年 ilove

Post #5 · 2019-08-15 08:45

It depends on how you implement the iteration logic.

If you want to implement it in Python, do the iteration inside the script.
If you want to implement it in a shell script, change your Python script to accept parameters, and then use the shell script to call the Python script with suitable parameters.
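
As a rough sketch of the second option, the script could take its input file names from sys.argv instead of hard-coding them. The names process_file and main are hypothetical, the _processed.txt suffix comes from the question, and the copy loop is only a stand-in for the real extraction logic.

```python
import os
import sys


def process_file(input_path):
    """Read one input file and write '<name>_processed.txt' next to it."""
    output_path = os.path.splitext(input_path)[0] + '_processed.txt'
    with open(input_path, 'r') as my_file, open(output_path, 'w') as outputfile:
        # Placeholder: the real script would build and write its list here
        for line in my_file:
            outputfile.write(line)
    return output_path


def main(paths):
    """Process each path given on the command line and report the output file."""
    for path in paths:
        print(process_file(path))


if __name__ == '__main__':
    # sys.argv[0] is the script name; the remaining items are the file names
    main(sys.argv[1:])
```

With this in place, running python script.py *.vcf lets the shell expand the pattern into one argument per file, so a separate shell loop is optional.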


唯我独甜

Post #6 · 2019-08-15 08:51

I frequently use a script that uses PyQt5 to pop up a window prompting the user to select a file. It then goes through that file's directory to find all of the files I want:

# Figure out the pathname by finding the last '/'
pathname = first_fname[:(first_fname.rfind('/') + 1)]
# Make a new pathname to be added to the names of new files so that they're
# put in another directory (their names will also be altered)
new_pathname = pathname + 'for release/'

# Make a list of the files in the directory that end in .xls and don't have
# key words in their names that would indicate they're not the kind of file I want
file_list = [f for f in os.listdir(pathname)
             if f.lower().endswith('.xls')
             and 'map' not in f.lower()
             and 'check' not in f.lower()]

You need to import os to use the os.listdir command.
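
The same idea can be sketched without manual '/' searching by letting os.path find the directory. This is an illustrative variant, not the answerer's script: sibling_spreadsheets and skip_words are hypothetical names, and the PyQt5 file dialog is left out entirely.

```python
import os


def sibling_spreadsheets(first_fname, skip_words=('map', 'check')):
    """List the .xls files in first_fname's directory, skipping unwanted names."""
    # Fall back to the current directory when the path has no directory part
    pathname = os.path.dirname(first_fname) or '.'
    return sorted(
        f for f in os.listdir(pathname)
        if f.lower().endswith('.xls')
        # Drop names containing any of the key words (case-insensitive)
        and not any(word in f.lower() for word in skip_words)
    )
```

Using os.path.dirname avoids depending on '/' as the separator, which matters if the script is ever run on Windows.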
