How can I replace '%' to '\x' in P

Posted 2019-07-20 13:37

My aim is to convert a percent-encoded (URL-encoded) string such as "%EB" into "\xEB". However, as soon as I tried, I found that it is harder than expected: neither string.replace nor re.sub can do it.

My attempts failed as shown below:

target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'

target.replace('%','\x')
-> ValueError: invalid \x escape

re.sub('%','\x',target)
-> ValueError: invalid \x escape

UPDATED:

Thanks for the comments. I tried '\\x' and r'\x'; however, neither seems to be a solution.

for example,

target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'
converted1 = target.replace('%',r'\x')
converted2 = target.replace('%','\\x')
converted1
-> '\\xEB\\xAF\\xB8\\xEB\\x9F\\xAC\\xEC\\x8A\\xA4\\x20\\xEC\\x97\\xA3\\xEC\\xA7\\x80'
converted2
-> '\\xEB\\xAF\\xB8\\xEB\\x9F\\xAC\\xEC\\x8A\\xA4\\x20\\xEC\\x97\\xA3\\xEC\\xA7\\x80'

Results:

print converted1
\xEB\xAF\xB8\xEB\x9F\xAC\xEC\x8A\xA4\x20\xEC\x97\xA3\xEC\xA7\x80
print converted2
\xEB\xAF\xB8\xEB\x9F\xAC\xEC\x8A\xA4\x20\xEC\x97\xA3\xEC\xA7\x80

What I want to have is:

print "\xEB\xAF\xB8\xEB\x9F\xAC\xEC\x8A\xA4\x20\xEC\x97\xA3\xEC\xA7\x80"
미러스 엣지

3 answers
地球回转人心会变
Reply #2 · 2019-07-20 14:06

The replace method cannot decode a URL-encoded string. It only replaces the character % with the two characters \x. To decode a URL-encoded (percent-encoded) string, use urllib.unquote.

import urllib
target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'
print urllib.unquote(target)
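
The snippet above is Python 2. As a minimal sketch for readers on Python 3 (an assumption, since the question itself uses Python 2), the same function lives in urllib.parse, and unquote_to_bytes returns the raw bytes that the "\xEB..." literal in the question stands for:

```python
# Python 3 sketch; the answer above targets Python 2.
from urllib.parse import unquote, unquote_to_bytes

target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'

# unquote() turns each %XX triplet into a byte and then decodes the
# resulting byte sequence as UTF-8 (its default encoding).
decoded = unquote(target)
print(decoded)

# unquote_to_bytes() stops after the first step and returns the raw bytes.
raw = unquote_to_bytes(target)
print(raw)
```

Here decoded is the Korean text the question asks for, and raw holds the UTF-8 bytes behind it.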
叼着烟拽天下
Reply #3 · 2019-07-20 14:29
>>> target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'
>>> target.replace('%',r'\x')
'\\xEB\\xAF\\xB8\\xEB\\x9F\\xAC\\xEC\\x8A\\xA4\\x20\\xEC\\x97\\xA3\\xEC\\xA7\\x80'

Why is '\x' invalid in Python?

For the second part of your code, use:

print target.replace('%',r'\x').decode('string-escape')

Though this fixes your error, the best solution is the one by @kamae
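
The 'string-escape' codec used above exists only in Python 2. A rough Python 3 equivalent of the same replace-then-unescape idea, offered as a sketch rather than a recommended approach (the variable names are illustrative), could look like this:

```python
# Python 3 sketch of the replace-then-unescape approach.
target = '%EB%AF%B8%EB%9F%AC%EC%8A%A4%20%EC%97%A3%EC%A7%80'

# Step 1: plain text substitution. The result still contains literal
# backslash characters, not interpreted escapes.
escaped = target.replace('%', r'\x')

# Step 2: interpret the \xNN escapes. 'unicode_escape' maps each \xNN to
# the code point U+00NN, so encoding as Latin-1 recovers the byte values.
raw = escaped.encode('ascii').decode('unicode_escape').encode('latin-1')

# Step 3: those bytes are UTF-8, so decode them to get the text.
text = raw.decode('utf-8')
print(text)
```

This takes three conversions to do what a URL-unquoting function does in one call, which is why the unquote-based answer is the simpler choice.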

不美不萌又怎样
Reply #4 · 2019-07-20 14:31

I think you missed the difference between typing a string literal (for example at the interactive Python prompt) and building a string at run time. What your code actually does is replace the character "%" in the string with the two characters "\" and "x".

When you type the literal at Python's command line, the escape codes are interpreted at the moment the string is created (when you press Enter). The resulting string then holds the binary (UTF-8 byte) representation of your Korean characters.

Converting unicode codepoints to UTF8 hex in Python may help you.
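
The difference described in this answer can be seen by comparing lengths. This is a small Python 3 sketch (an assumption, since the thread uses Python 2; the variable names are illustrative):

```python
# An escape in a literal is interpreted when the source code is parsed.
literal = '\xEB'      # one character, U+00EB

# A string assembled at run time keeps the backslash as an ordinary character.
built = r'\x' + 'EB'  # four characters: backslash, 'x', 'E', 'B'

print(len(literal))      # 1
print(len(built))        # 4
print(literal == built)  # False

# In a bytes literal, each \xNN escape is a single byte; three of them
# form the UTF-8 encoding of the first Korean character in the question.
byte_literal = b'\xEB\xAF\xB8'
print(len(byte_literal))  # 3
print(byte_literal.decode('utf-8'))
```

So replacing "%" with "\x" at run time never produces the one-byte-per-escape value that the literal does; the escapes still have to be decoded in a separate step.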
