Parser in python3 does not take delimiter values f

Posted 2019-05-10 10:43

Question:

I have written a simple script as an advanced tool for my awk/sed requirements. The script compares two files on the basis of the values in one column of the query file and then extracts the matching entries in full from the master file. It lets you specify the column and the delimiter for each file.

The problem is that the 'delimiter' options are not recognized by the script when they are given on the command line.

Here is my code (partial):

##- - - - - - - - - - - - - Arguments - - - - - - - - - - - - - -##
import argparse

parser = argparse.ArgumentParser()

## Command line options
parser.add_argument("-m",  "--master",     dest="master", help="master file")
parser.add_argument("-q",  "--query",      dest="query",  help="queries to be extracted")
parser.add_argument("-d",  "--delimiter",  dest="delimiter",  default='\t', help="delimiter in master")
parser.add_argument("-p",  "--position",   dest="position",   default='1',  help="position/column of value in master")
parser.add_argument("-d2", "--delimiter2", dest="delimiter2", default='\t', help="delimiter in query")
parser.add_argument("-p2", "--position2",  dest="position2",  default='1',  help="position/column of value in query")

args = parser.parse_args()

def Extractor(master, query):

    out_file = ('%s_matched_%s' % (query, master))
    fh_out = open(out_file, 'w')

    query_set = set() ## Unique set of query keys
    for i in query:
        ## Key is the value on which matching will be done; split on the
        ## parsed option itself, not on the literal string 'args.delimiter2'
        key = i.split(args.delimiter2)[int(args.position2)]
        query_set.add(key)

So as you see, I take options for the 'query file' delimiter from the command line and use them in the script via argparse, but that does not work. It only works if I explicitly mention the delimiter in the script like:

key = i.split('\t')[int(args.position2)] ## Key is the value on which matching will be done

The command line option I give is:

$ py3 ExtractHeaders_v01.py -m ABC.csv -q XYZ.list -d2 \t -d , -p 1 -p2 0

where

  • ABC.csv is the master file from which to extract entries.
    • The second column will be used for matching (-p 1)
    • Its delimiter is comma (-d ,)
  • XYZ.list is the query file.
    • The first column will be used for matching (-p2 0)
    • Its delimiter is tab (-d2 \t)

Please help me understand why the delimiters are not used by the script when they are given on the command line.

Answer 1:

You can also pass a literal Tab character in a *nix shell (bash, for example): inside single or double quotes, press Ctrl+V and then Tab, i.e. type " Ctrl+V Tab ".
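
A minimal sketch of why this works: when the shell hands over a real Tab character, argparse stores it unchanged and no unescaping is needed. The argument list given to parse_args below stands in for what the shell would pass, and the sample line is a made-up example rather than data from the question.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-d2", "--delimiter2", default="\t", help="delimiter in query")
parser.add_argument("-p2", "--position2", type=int, default=0,
                    help="position/column of value in query")

# Stand-in for typing:  script.py -d2 "<Ctrl+V><Tab>" -p2 0
# The second list element is one real Tab character, not backslash + t.
args = parser.parse_args(["-d2", "\t", "-p2", "0"])

# Hypothetical query line, used only for illustration.
line = "id42\tsome description\n"
key = line.rstrip("\n").split(args.delimiter2)[args.position2]
print(repr(args.delimiter2), key)
```

Declaring type=int for the position is an assumption added here; it removes the need to call int() on the value each time it is used.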



Answer 2:

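Your shell is interpreting the \t on your command line, so what gets passed to Python is, most likely, a single t. Use \\t or '\t' to get the literal two-character escape sequence into argv. Then you'll need to unescape this string in Python. The "string-escape" codec exists only in Python 2; in Python 3, where str has no decode method, an equivalent that is fine for ASCII delimiters such as tab or comma is:

delimiter = delimiter.encode().decode("unicode_escape")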
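
As a rough sketch of both points in Python 3, the snippet below uses shlex.split to approximate what a POSIX shell puts into argv, and an argparse type= function to do the unescaping. The names unescape and build_parser are hypothetical helpers, not part of the original script, and codecs.decode with "unicode_escape" is used because Python 3 strings have no decode method. shlex only approximates real shell behaviour, so treat the first part as an illustration.

```python
import argparse
import codecs
import shlex


def unescape(value):
    """Turn escape sequences such as backslash + t into the real character."""
    # Assumes ASCII delimiters; non-ASCII input may not round-trip cleanly.
    return codecs.decode(value, "unicode_escape")


def build_parser():
    parser = argparse.ArgumentParser()
    # type= is applied to command-line values and to string defaults alike;
    # a real tab contains no backslash, so the defaults pass through unchanged.
    parser.add_argument("-d", "--delimiter", type=unescape, default="\t")
    parser.add_argument("-d2", "--delimiter2", type=unescape, default="\t")
    parser.add_argument("-p", "--position", type=int, default=1)
    parser.add_argument("-p2", "--position2", type=int, default=1)
    return parser


# Unquoted: the backslash is consumed by the shell and only "t" arrives.
unquoted = shlex.split(r"-d2 \t -d , -p 1 -p2 0")
# Single-quoted: the two characters backslash and t arrive intact.
quoted = shlex.split(r"-d2 '\t' -d , -p 1 -p2 0")

bad = build_parser().parse_args(unquoted)
good = build_parser().parse_args(quoted)

print(repr(bad.delimiter2))   # the letter t
print(repr(good.delimiter2))  # a real tab after unescaping
```

With a parser built this way, the command from the question would be expected to work once the delimiter is quoted, for example -d2 '\t'.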