I would like to parse a string like this:
-o 1 --long "Some long string"
into this:
["-o", "1", "--long", 'Some long string']
or similar.
This is different than either getopt, or optparse, which start with sys.argv parsed input (like the output I have above). Is there a standard way to do this? Basically, this is "splitting" while keeping quoted strings together.
My best function so far:
import csv
def split_quote(string,quotechar='"'):
'''
>>> split_quote('--blah "Some argument" here')
['--blah', 'Some argument', 'here']
>>> split_quote("--blah 'Some argument' here", quotechar="'")
['--blah', 'Some argument', 'here']
'''
s = csv.StringIO(string)
C = csv.reader(s, delimiter=" ",quotechar=quotechar)
return list(C)[0]
I believe you want the shlex module.
Before I was aware of
shlex.split
, I made the following:This works with Python 2.x and 3.x. On Python 2.x, it works directly with byte strings and Unicode strings. On Python 3.x, it only accepts [Unicode] strings, not
bytes
objects.This doesn't behave exactly the same as shell argv splitting—it also allows quoting of CR, LF and TAB characters as
\r
,\n
and\t
, converting them to real CR, LF, TAB (shlex.split
doesn't do that). So writing my own function was useful for my needs. I guessshlex.split
is better if you just want plain shell-style argv splitting. I'm sharing this code in case it's useful as a baseline for doing something slightly different.