I want to read from stdin in Python but also accept command-line options in my program. When I try to pass an option to my program, I get a "file not found" error and my arguments are discarded.
For parsing the arguments I use the following code:
parser = argparse.ArgumentParser(description='Training and Testing Framework')
parser.add_argument('--text', dest='text',
                    help='The text model', required=True)
parser.add_argument('--features', dest='features',
                    help='The features model', required=True)
parser.add_argument('--test', dest='testingset',
                    help='The testing set.', required=True)
parser.add_argument('--vectorizer', dest='vectorizer',
                    help='The vectorizer.', required=True)
args = vars(parser.parse_args())
For reading from the stdin I use the following code:
for line in sys.stdin.readlines():
    print(preprocess(line, 1))
Command Line
echo "dsfdsF" |python ensemble.py -h
/usr/local/lib/python2.7/dist-packages/pandas/io/excel.py:626: UserWarning: Installed openpyxl is not supported at this time. Use >=1.6.1 and <2.0.0.
  .format(openpyxl_compat.start_ver, openpyxl_compat.stop_ver))
Traceback (most recent call last):
  File "ensemble.py", line 38, in <module>
    from preprocess import preprocess
  File "/home/nikos/experiments/mentions/datasets/preprocess.py", line 7, in <module>
    with open(sys.argv[1], 'rb') as csvfile:
IOError: [Errno 2] No such file or directory: '-h'
Your preprocess.py file tries to read sys.argv[1] and open it as a file. This happens as soon as ensemble.py imports it.
If you pass -h on the command line, it tries to open a file with that name.
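The traceback shows the failure happening at import time: `from preprocess import preprocess` runs every top-level statement in preprocess.py, including the `open(sys.argv[1], 'rb')` call. Below is a minimal sketch of how preprocess.py could be laid out so that importing it has no side effects. The body of `preprocess` and the meaning of its second argument are assumptions, since the original module is not shown.

```python
import csv
import sys


def preprocess(line, mode):
    """Hypothetical stand-in for the real preprocessing step."""
    # Assumption: mode 1 means "strip whitespace and lowercase".
    text = line.strip()
    return text.lower() if mode == 1 else text


def main(paths):
    # File access lives here, so it runs only when the script is executed
    # directly, never as a side effect of "from preprocess import preprocess".
    for path in paths:
        with open(path, newline="") as csvfile:
            for row in csv.reader(csvfile):
                print(preprocess(",".join(row), 1))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With this layout, ensemble.py can import `preprocess` and parse its own options without preprocess.py ever looking at sys.argv.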
Split command line parsing from processing
Your preprocess function should not care about command line parameters; it should receive an open file object as an argument.
So after you parse the command line parameters, you take care of providing the file object, which in your case is sys.stdin.
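The split described above can also be sketched with argparse, reusing the options from the question. This is only an illustration under assumptions: the `process` function, the optional positional `input` argument (with "-" meaning stdin) and the `stdin` parameter of `main` are hypothetical names that do not appear in the original program.

```python
import argparse
import sys


def process(stream, text, features, testingset, vectorizer):
    """Hypothetical processing step: works on any iterable of lines."""
    return [line.rstrip("\n") for line in stream]


def build_parser():
    parser = argparse.ArgumentParser(description="Training and Testing Framework")
    parser.add_argument("--text", required=True, help="The text model")
    parser.add_argument("--features", required=True, help="The features model")
    parser.add_argument("--test", dest="testingset", required=True,
                        help="The testing set.")
    parser.add_argument("--vectorizer", required=True, help="The vectorizer.")
    # Assumed positional argument: "-" (the default) selects stdin.
    parser.add_argument("input", nargs="?", default="-",
                        help='Input file, "-" for stdin')
    return parser


def main(argv=None, stdin=None):
    args = build_parser().parse_args(argv)
    options = (args.text, args.features, args.testingset, args.vectorizer)
    if args.input == "-":
        # Only this function decides where the data comes from.
        lines = process(stdin if stdin is not None else sys.stdin, *options)
    else:
        with open(args.input) as f:
            lines = process(f, *options)
    for line in lines:
        print(line)
    return lines
```

Because `process` only sees an iterable of lines, it can be exercised with a plain list of strings, while a script that calls `main()` with no arguments would parse sys.argv and read from the real stdin.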
Sample solution using docopt
There is nothing wrong with argparse, but my favourite parser is docopt, and I will use it to illustrate the typical split into command line parsing, preparing the final function call, and the final function call itself. You can achieve the same with argparse too.
First install docopt:
$ pip install docopt
Here is the fromstdin.py code:
"""fromstdin - Training and Testing Framework

Usage: fromstdin.py [options] <input>

Options:
  --text=<textmodel>         Text model [default: text.txt]
  --features=<features>      Features model [default: features.txt]
  --test=<testset>           Testing set [default: testset.txt]
  --vectorizer=<vectorizer>  The vectorizer [default: vector.txt]

Read data from <input> file. Use "-" for reading from stdin.
"""
from __future__ import print_function

import sys


def main(fname, text, features, test, vectorizer):
    # Decide where the input comes from, then hand the open file to process().
    if fname == "-":
        f = sys.stdin
    else:
        f = open(fname)
    process(f, text, features, test, vectorizer)
    print("main func done")


def process(f, text, features, test, vectorizer):
    # Works on any open file object; knows nothing about the command line.
    print("processing")
    print("input parameters", text, features, test, vectorizer)
    print("reading input stream")
    for line in f:
        print(line.strip("\n"))
    print("processing done")


if __name__ == "__main__":
    from docopt import docopt
    # Command line parsing happens only here.
    args = docopt(__doc__)
    print(args)
    infile = args["<input>"]
    textfile = args["--text"]
    featuresfile = args["--features"]
    testfile = args["--test"]
    vectorizer = args["--vectorizer"]
    main(infile, textfile, featuresfile, testfile, vectorizer)
Calling it without arguments prints the usage line:
$ python fromstdin.py
Usage: fromstdin.py [options] <input>
Show the help:
$ python fromstdin.py -h
fromstdin - Training and Testing Framework

Usage: fromstdin.py [options] <input>

Options:
  --text=<textmodel>         Text model [default: text.txt]
  --features=<features>      Features model [default: features.txt]
  --test=<testset>           Testing set [default: testset.txt]
  --vectorizer=<vectorizer>  The vectorizer [default: vector.txt]

Read data from <input> file. Use "-" for reading from stdin.
Use it, feeding from stdin:
$ ls | python fromstdin.py -
{'--features': 'features.txt',
'--test': 'testset.txt',
'--text': 'text.txt',
'--vectorizer': 'vector.txt',
'<input>': '-'}
processing
input parameters text.txt features.txt testset.txt vector.txt
reading input stream
bcmd.py
callit.py
fromstdin.py
scrmodule.py
processing done
main func done