Passing meta-characters to Python as arguments fro

Posted 2019-01-19 14:32

Question:

I'm making a Python program that will parse the fields in some input lines. I'd like to let the user enter the field separator as an option from the command line. I'm using optparse to do this. I'm running into the problem that entering something like \t will separate literally on \t, rather than on a tab, which is what I want. I'm pretty sure this is a Python thing and not the shell, since I've tried every combo of quotes, backslashes, and t's that I can think of.

If I could get optparse to let the argument be plain input (is there such a thing?) rather than raw_input, I think that would work. But I have no clue how to do that.

I've also tried various substitutions and regex tricks to turn the string from the two character "\t" into the one character tab, but without success.

Example, where input.txt is:

field 1[tab]field\t2

(Note: [tab] is a tab character and field\t2 is an 8 character string)

parseme.py:

#!/usr/bin/python
from optparse import OptionParser  
parser = OptionParser()  
parser.add_option("-d", "--delimiter", action="store", type="string",  
    dest="delimiter", default='\t')  
parser.add_option("-f", dest="filename")  
(options, args) = parser.parse_args()  
Infile = open(options.filename, 'r')  
Line = Infile.readline()  

Fields = Line.split(options.delimiter)  
print Fields[0]  
print options.delimiter  

Infile.close()  

This gives me:

$ parseme.py -f input.txt  
field 1  
[tab]

Hey, great, the default setting worked properly. (Yes, I know I could just make \t the default and forget about it, but I'd like to know how to deal with this type of problem.)

$ parseme.py -f input.txt -d '\t'  
field 1[tab]field  
\t

This is not what I want.

Answer 1:

>>> r'\t\n\v\r'.decode('string-escape')
'\t\n\x0b\r'
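
The string-escape codec shown above exists only in Python 2. A rough Python 3 counterpart, offered as a sketch rather than a drop-in replacement, is the unicode_escape codec from the standard codecs module. The helper name unescape is hypothetical, and the sketch assumes the delimiter is plain ASCII, since unicode_escape can mangle non-ASCII characters.

```python
import codecs


def unescape(text):
    # Interpret backslash escapes such as \t or \n found in a str.
    # Assumption: text is ASCII-only; non-ASCII input may be mangled.
    return codecs.decode(text, "unicode_escape")


# What the script receives for -d '\t' is the two characters backslash + t.
delimiter = unescape(r"\t")
print(repr(delimiter))
```

The result could then be passed to Line.split() in place of options.delimiter.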


Answer 2:

The quick and dirty way is to eval it, wrapped in quotes so that it is evaluated as a string literal, like this:

eval('"%s"' % options.delimiter, {}, {})

The extra empty dicts are there to prevent accidental clobbering of your program.
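
Note that eval can still execute arbitrary expressions even with empty namespaces. A more conservative variant of the same idea, sketched here under the assumption that the delimiter should be read as the body of a Python string literal, uses ast.literal_eval, which only accepts literals. The function name parse_delimiter is hypothetical.

```python
import ast


def parse_delimiter(raw):
    # Wrap the raw argument in double quotes so it forms a string literal,
    # then let ast.literal_eval interpret any backslash escapes in it.
    try:
        return ast.literal_eval('"' + raw + '"')
    except (SyntaxError, ValueError):
        # Not a valid literal body (e.g. a lone backslash): keep it as typed.
        return raw


print(repr(parse_delimiter(r"\t")))
```

Falling back to the raw text is a design choice for this sketch; a stricter script might report the bad option instead.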



Answer 3:

To solve it from within your script (after importing re):

options.delimiter = re.sub("\\\\t","\t",options.delimiter)

You can adapt the regex above to match more escaped characters (\n, \r, etc.).
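
One way to extend the substitution to several escapes at once is a lookup table driven by a single pattern. This is only a sketch: the names ESCAPES, PATTERN and expand_escapes are made up for illustration, and the set of escapes it handles is an arbitrary choice.

```python
import re

# Two-character escape sequences and the single characters they stand for.
ESCAPES = {r"\t": "\t", r"\n": "\n", r"\r": "\r", r"\\": "\\"}

# A backslash followed by one of: t, n, r, or another backslash.
PATTERN = re.compile(r"\\[tnr\\]")


def expand_escapes(text):
    # A function is used as the replacement so that re.sub does not apply
    # its own escape processing to the substituted text.
    return PATTERN.sub(lambda match: ESCAPES[match.group(0)], text)


print(repr(expand_escapes(r"\t")))
```

Because a doubled backslash is matched first, a user can still ask for a literal backslash followed by t by typing \\t.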

Another way is to solve the problem outside Python. When you call your script from the shell, do it like this:

parseme.py -f input.txt -d '^V<tab>'

Here ^V means "press Ctrl+V"; then press the normal Tab key. This passes a literal tab character to your Python script.



Answer 4:

The callback option is a good way to handle tricky cases:

parser.add_option("-d", "--delimiter", action="callback", type="string",
                  callback=my_callback, default='\t')

with the corresponding function, which must be defined before the parser:

def my_callback(option, opt, value, parser):
    val = value
    if value == '\\t':
        val = '\t'
    elif value == '\\n':
        val = '\n'
    parser.values.delimiter = val

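You can check that this works from the command line: python test.py -f test.txt -d '\t'. In a POSIX shell such as bash, keep the quotes: an unquoted \t reaches the script as a plain t, because the shell consumes the backslash.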

It has the advantage of handling the option via the 'optparse' module, not via post-processing the parsing results.
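
For reference, the callback approach can be assembled into a small self-contained script. This is a sketch rather than the answer's exact code: the names ESCAPES, unescape_delimiter and build_parser are hypothetical, dest is spelled out explicitly, and the argument list is passed to parse_args directly so the example does not depend on the real command line.

```python
from optparse import OptionParser

# Escape sequences recognised on the command line (an illustrative subset).
ESCAPES = {"\\t": "\t", "\\n": "\n"}


def unescape_delimiter(option, opt, value, parser):
    # Store the translated delimiter, or the value unchanged if it is
    # not one of the known escape sequences.
    setattr(parser.values, option.dest, ESCAPES.get(value, value))


def build_parser():
    parser = OptionParser()
    parser.add_option("-d", "--delimiter", action="callback", type="string",
                      dest="delimiter", callback=unescape_delimiter,
                      default="\t")
    parser.add_option("-f", dest="filename")
    return parser


# Simulates: parseme.py -f input.txt -d '\t'
options, args = build_parser().parse_args(["-f", "input.txt", "-d", "\\t"])
print(repr(options.delimiter))
```

Using a dictionary lookup keeps the callback short as more escapes are added, compared with a chain of if/elif tests.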