My aim is to convert a URL-encoded (percent-encoded) string such as "%EB" into "\xEB". However, as soon as I tried, I found that it is hard and cannot be achieved with either string.replace or re.sub.
My code failed as follows:
target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'
target.replace('%','\x')
-> ValueError: invalid \x escape
re.sub('%','\x',target)
-> ValueError: invalid \x escape
UPDATED:
Thanks for the comments. I tried r'\x' and '\\x'; however, neither of them seems to be a solution.
For example,
target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'
converted1 = target.replace('%',r'\x')
converted2 = target.replace('%','\\x')
converted1
-> '\\xEB\\xAF\\xB8\\xEB\\x9F\\xAC\\xEC\\x8A\\xA4\\x20\\xEC\\x97\\xA3\\xEC\\xA7\\x80'
converted2
-> '\\xEB\\xAF\\xB8\\xEB\\x9F\\xAC\\xEC\\x8A\\xA4\\x20\\xEC\\x97\\xA3\\xEC\\xA7\\x80'
Results:
print converted1
\xEB\xAF\xB8\xEB\x9F\xAC\xEC\x8A\xA4\x20\xEC\x97\xA3\xEC\xA7\x80
print converted2
\xEB\xAF\xB8\xEB\x9F\xAC\xEC\x8A\xA4\x20\xEC\x97\xA3\xEC\xA7\x80
What I want to have is:
print "\xEB\xAF\xB8\xEB\x9F\xAC\xEC\x8A\xA4\x20\xEC\x97\xA3\xEC\xA7\x80"
미러스 엣지
The replace method cannot decode a URL-encoded string. It just replaces the character % with \x. If you want to decode a URL-encoded string, you should use urllib.unquote.
Why is '\x' invalid in Python?
For the second part of your code, use:
Though this fixes your error, the best solution is the one by @kamae
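As a minimal sketch of the urllib.unquote suggestion above: the question's code is Python 2, where the function lives at urllib.unquote; the fallback import for Python 3 (urllib.parse.unquote_to_bytes) and the variable names raw and text are my own additions for illustration, not part of the original answers.

```python
try:
    # Python 3: unquote_to_bytes returns the raw bytes of the decoded string
    from urllib.parse import unquote_to_bytes as unquote
except ImportError:
    # Python 2: urllib.unquote returns a byte string (str)
    from urllib import unquote

target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'

# Each %XX sequence becomes the single byte 0xXX
raw = unquote(target)

# The bytes are the UTF-8 encoding of the Korean text, so decode them
text = raw.decode('utf-8')
print(text)
```

Here raw holds the same bytes as the literal "\xEB\xAF\xB8..." from the question, and decoding it as UTF-8 gives the text the question expects to see.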
I think you missed the difference between a string literal typed into the interactive Python prompt (or written in source code) and a string built at run time. What your code actually does is change each "%" character in the string into the two characters "\" and "x".
What you do at the Python command line is enter a string literal whose escape codes are interpreted at the moment the string is created (when you press Enter). That string then contains the raw bytes (the UTF-8 encoding) of your Korean characters, not backslash-and-x text.
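The difference can be seen with a small sketch. It uses the ASCII escape \x41 instead of the Korean bytes so that it behaves the same on Python 2 and 3; the names literal and replaced are only illustrative.

```python
# Escape interpreted by the parser: the literal is ONE character
literal = '\x41'

# Replacement done at run time: the result is FOUR characters
# (backslash, 'x', '4', '1'); nothing interprets them as an escape
replaced = '%41'.replace('%', '\\x')

print(len(literal))
print(len(replaced))
print(literal == replaced)
```

The escape sequence only has meaning while Python is parsing a literal, which is why replacing "%" with "\x" after the fact never produces the byte itself.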
The question "Converting unicode codepoints to UTF8 hex in Python" may help you.