I want to receive a delimiter like '\t' (tab) from the command line,
and use it to parse a text file.
If I put
delimiter = sys.argv[1]
in the code, and type from the command line
$ python mycode.py "\t"
then delimiter is '\\t',
i.e., Python receives the input string as is: a literal backslash followed by t, not a tab.
I want to convert this to '\t' so that I can use e.g.,
'a\tb\tc'.split(delimiter)
to get ['a','b','c'].
I've tried to convert '\\' to '\', but failed.
Is there a built-in Python function to handle regex from the command line?
In Python 2 you can use str.decode('string_escape'):
>>> '\\t'.decode('string_escape')
'\t'
In Python 3 you have to encode the string to bytes first and then use unicode_escape:
>>> '\\t'.encode().decode('unicode_escape')
'\t'
Both solutions decode every escape sequence their codec understands (string_escape in Python 2 does not handle \u escapes), so in Python 3 you could even use some fancy unicode stuff:
>>> '\\t\\n\\u2665'.encode().decode('unicode_escape')
'\t\n♥'
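Applied to the original use case, this could look roughly like the sketch below for Python 3. The helper name unescape and the hard-coded raw value (standing in for sys.argv[1]) are assumptions made for illustration only.

```python
def unescape(text):
    # encode() yields UTF-8 bytes; the 'unicode_escape' codec then
    # interprets backslash escapes in those bytes.
    # Caveat: the codec reads the bytes as Latin-1, so non-ASCII
    # characters in the input would come out mangled.
    return text.encode().decode('unicode_escape')


# What sys.argv[1] holds after running: python mycode.py "\t"
raw = '\\t'
delimiter = unescape(raw)

fields = 'a\tb\tc'.split(delimiter)
print(fields)  # ['a', 'b', 'c']
```

In a real script, raw would be read from sys.argv[1] instead of being hard-coded.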
It's not really regexp you're looking for, it's escape sequences.
You could use eval, as long as you're perfectly aware of the terrible security consequences, or roll your own string replacement/regexp based escape sequence unescaper.
(Who knows, maybe arg = arg.replace("\\t", "\t") is enough for you?)
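A hand-rolled unescaper along those lines might look like the following minimal sketch. The names ESCAPES and unescape_simple are made up for this example, and the table covers only a handful of escapes; extend it as needed.

```python
import re

# Escape sequences this sketch understands (an illustrative subset).
ESCAPES = {'t': '\t', 'n': '\n', 'r': '\r', '0': '\0', '\\': '\\'}


def unescape_simple(text):
    """Replace a backslash plus a known character with what it stands for."""
    def replace(match):
        char = match.group(1)
        # Unknown escapes are left untouched rather than raising an error.
        return ESCAPES.get(char, match.group(0))

    # Match a backslash followed by any single character, left to right,
    # so an escaped backslash is consumed before the character after it.
    return re.sub(r'\\(.)', replace, text)


print(repr(unescape_simple('\\t')))      # '\t'
print(repr(unescape_simple('a\\\\tb')))  # 'a\\tb'
```

Unlike a plain replace call, the left-to-right regexp pass treats an escaped backslash followed by t as a literal backslash and a t instead of turning it into a tab.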
As a workaround you could do
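$ python mycode.py "`echo -ne '\t'`"
to (ab)use the Unix echo command to do the unescaping for you. The double quotes around the backticks matter: without them, the shell's word splitting would discard the tab and no argument would be passed at all.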