Python Read from Stdin with Arguments

Posted 2019-02-15 23:51

Question:

I want to read from stdin in Python while also accepting command-line options in my program. When I pass an option to my program, I get a "file not found" error and my arguments are discarded.

For parsing the arguments I use the following code:

import argparse

parser = argparse.ArgumentParser(description='Training and Testing Framework')

parser.add_argument('--text', dest='text',
                   help='The text model',required=True)
parser.add_argument('--features', dest='features',
                   help='The features model',required=True)
parser.add_argument('--test', dest='testingset',
                   help='The testing set.',required=True)
parser.add_argument('--vectorizer', dest='vectorizer',
                   help='The vectorizer.',required=True)
args = vars(parser.parse_args())

For reading from the stdin I use the following code:

for line in sys.stdin.readlines():
    print(preprocess(line,1))

Command Line

echo "dsfdsF" |python ensemble.py -h
/usr/local/lib/python2.7/dist-packages/pandas/io/excel.py:626: UserWarning: Installed openpyxl is not supported at this time. Use >=1.6.1 and <2.0.0.
  .format(openpyxl_compat.start_ver, openpyxl_compat.stop_ver))
Traceback (most recent call last):
  File "ensemble.py", line 38, in <module>
    from preprocess import preprocess
  File "/home/nikos/experiments/mentions/datasets/preprocess.py", line 7, in <module>
    with open(sys.argv[1], 'rb') as csvfile:
IOError: [Errno 2] No such file or directory: '-h'

Answer 1:

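Your preprocess.py file reads sys.argv[1] and tries to open it as a file. The traceback shows that this happens at module level, so it runs as soon as ensemble.py imports preprocess.

If you pass -h on the command line, sys.argv[1] is '-h', so it tries to open a file with that name.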

Split command line parsing from processing

Your preprocess function should not care about command line parameters; it should receive an open file object as an argument.

So after you parse the command line parameters, you take care of providing that file object. In your case it will be sys.stdin.
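A minimal sketch of that split. The real body of preprocess is not shown in the question, so the one here is only a placeholder, and preprocess_stream is a hypothetical helper name. The point is only that nothing reads sys.argv, and that the caller decides which stream to pass in:

```python
import io


def preprocess(line, mode):
    # Placeholder body: the real preprocessing is not shown in the question.
    # `mode` mirrors the second argument used there, preprocess(line, 1).
    return line.strip().lower()


def preprocess_stream(stream, mode=1):
    # Hypothetical helper: accepts any iterable of lines, such as sys.stdin,
    # an open file, or io.StringIO. It never looks at sys.argv.
    return [preprocess(line, mode) for line in stream]


# Stand-in for sys.stdin so the sketch runs without piped input.
fake_stdin = io.StringIO(u"dsfdsF\nSecond Line\n")
results = preprocess_stream(fake_stdin)
print(results)
```

In ensemble.py the call would then be preprocess_stream(sys.stdin), made only after parser.parse_args() has run.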

Sample solution using docopt

There is nothing wrong with argparse, but my favourite parser is docopt, so I will use it to illustrate the typical split into three steps: parsing the command line, preparing the final function call, and making that call. You can achieve the same with argparse too.

First install docopt:

$ pip install docopt

Here comes the fromstdin.py code:

"""fromstdin - Training and Testing Framework
Usage: fromstdin.py [options] <input>

Options:
    --text=<textmodel>         Text model [default: text.txt]
    --features=<features>      Features model [default: features.txt]
    --test=<testset>           Testing set [default: testset.txt]
    --vectorizer=<vectorizer>  The vectorizer [default: vector.txt]

Read data from <input> file. Use "-" for reading from stdin.
"""
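from __future__ import print_function  # lets the print() calls run on Python 2.7 too

import sys

def main(fname, text, features, test, vectorizer):
    # "-" means: read the data from stdin instead of a named file
    if fname == "-":
        process(sys.stdin, text, features, test, vectorizer)
    else:
        # the with block closes the file once processing is finished
        with open(fname) as f:
            process(f, text, features, test, vectorizer)
    print("main func done")

def process(f, text, features, test, vectorizer):
    # f is any open file object; this function knows nothing about the command line
    print("processing")
    print("input parameters", text, features, test, vectorizer)
    print("reading input stream")
    for line in f:
        print(line.strip("\n"))
    print("processing done")


if __name__ == "__main__":
    # command line parsing happens only here, when run as a script
    from docopt import docopt
    args = docopt(__doc__)
    print(args)
    infile = args["<input>"]
    textfile = args["--text"]
    featuresfile = args["--features"]
    testfile = args["--test"]
    vectorizer = args["--vectorizer"]
    main(infile, textfile, featuresfile, testfile, vectorizer)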

Called without arguments, it prints the usage line:

$ python fromstdin.py
Usage: fromstdin.py [options] <input>

Show the help:

$ python fromstdin.py -h
fromstdin - Training and Testing Framework
Usage: fromstdin.py [options] <input>

Options:
    --text=<textmodel>         Text model [default: text.txt]
    --features=<features>      Features model [default: features.txt]
    --test=<testset>           Testing set [default: testset.txt]
    --vectorizer=<vectorizer>  The vectorizer [default: vector.txt]

Read data from <input> file. Use "-" for reading from stdin.

Use it, feeding from stdin:

$ ls | python fromstdin.py -
{'--features': 'features.txt',
 '--test': 'testset.txt',
 '--text': 'text.txt',
 '--vectorizer': 'vector.txt',
 '<input>': '-'}
processing
input parameters text.txt features.txt testset.txt vector.txt
reading input stream
bcmd.py
callit.py
fromstdin.py
scrmodule.py
processing done
main func done
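For comparison, here is how the same split might look with argparse, which the answer says works just as well. Treat it as a sketch under assumptions. The option defaults mirror the docopt usage text above. build_parser, the argv and stdin parameters of main, and a process function that returns the lines instead of printing them are hypothetical choices, made so the example can be run without a real pipe:

```python
import argparse
import io
import sys


def build_parser():
    # Defaults copied from the docopt usage text; adjust as needed.
    parser = argparse.ArgumentParser(description="Training and Testing Framework")
    parser.add_argument("--text", default="text.txt", help="Text model")
    parser.add_argument("--features", default="features.txt", help="Features model")
    parser.add_argument("--test", dest="testingset", default="testset.txt",
                        help="Testing set")
    parser.add_argument("--vectorizer", default="vector.txt", help="The vectorizer")
    parser.add_argument("input", help='Input file, or "-" to read from stdin')
    return parser


def process(stream, args):
    # Placeholder processing: collect the lines without their newline.
    # `args` carries the parsed options for the real implementation to use.
    return [line.rstrip("\n") for line in stream]


def main(argv=None, stdin=None):
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    stdin = sys.stdin if stdin is None else stdin
    if args.input == "-":
        return process(stdin, args)
    with open(args.input) as f:
        return process(f, args)


# Simulates: echo "dsfdsF" | python script.py --text model.txt -
lines = main(["--text", "model.txt", "-"], stdin=io.StringIO(u"dsfdsF\n"))
print(lines)
```

In a real script the last two lines would become a plain main() call under an if __name__ == "__main__": guard, so that importing the module never triggers argument parsing.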