This question already has an answer here:
- Python Replace \\ with \ 6 answers
I need to replace \\
with \
in python3 in a complex string. I know that this question had been asked several times, but most of the time for simple strings so that none of the (accepted) answers really works for complex strings.
This is also different from this one where the problem could be solved with .decode('unicode_escape')
which does not work for this problem. See below.
Assuming the string is:
my_str = '\\xa5\\xc0\\xe6aK\\xf9\\x80\\xb1\\xc8*\x01\x12$\\xfbp\x1e(4\\xd6{;Z\\x'
Straight forward approach would be:
my_str.replace('\\','\')
which leads to:
SyntaxError: EOL while scanning string literal
This answer suggests using:
my_str.replace('\\\\','\\')
Which results in:
'\\xa5\\xc0\\xe6aK\\xf9\\x80\\xb1\\xc8*\x01\x12$\\xfbp\x1e(4\\xd6{;Z\\x'
So, there is no change.
This answer suggests:
b = bytes(my_str, encoding='utf-8')
b.decode('unicode-escape')
But this doesn't work for such a complex string:
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 49-50: truncated \xXX escape
Using decode (as suggested here) results in:
my_str.decode('unicode_escape')
AttributeError: 'my_str' object has no attribute 'decode'
A combination of encoding and then decoding using unicode_esacpe
returns a totally different string (probably due to using utf-16
, but utf-8
results in an error, see above. Also, e.g. latin1
doesn't work):
my_str.encode('utf-16').decode('unicode_escape')
'ÿþ\\\x00x\x00a\x005\x00\\\x00x\x00c\x000\x00\\\x00x\x00e\x006\x00a\x00K\x00\\\x00x\x00f\x009\x00\\\x00x\x008\x000\x00\\\x00x\x00b\x001\x00\\\x00x\x00c\x008\x00*\x00\x01\x00\x12\x00$\x00\\\x00x\x00f\x00b\x00p\x00\x1e\x00(\x004\x00\\\x00x\x00d\x006\x00{\x00;\x00Z\x00\\\x00x\x00'