Convert from hex to Arabic text with Python

Posted 2019-09-06 05:25

Question:

I wrote code that converts a hexadecimal string into Unicode escape format, but when I print the result the conversion does not happen: the escape sequences are printed as-is. When I copy that output and paste it into print(u'output'), the Arabic text appears.

Python Code

input = "062A06450020062A62C062F064A062F0020"
i = 0
n = "\\" + "u"

while i < len(input):
    n += input[i:i+4] + "\\" + "u"
    i = i + 4

output = str(n[0:(len(n)-2)])
print(u'%s' % output)

Output:

\u062A\u0645\u0020\u062A\u62C0\u62F0\u64A0\u62F0\u020

Copying the output into a Unicode string literal and printing it:

print (u'\u062A\u0645\u0020\u062A\u62C0\u62F0\u64A0\u62F0\u020')

The Arabic text appears.

Answer 1:

No, you can't produce Unicode code points by prepending \u to string values, because the \u sequence is part of the string literal syntax. It is handled by the Python parser when it reads the source code, not by the interpreter when the code runs.
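
To make the difference concrete, here is a small sketch, assuming Python 3. The variable names are only for illustration. It compares a string assembled at runtime with the same escape written as a literal, and shows that chr() can build a code point from a number at runtime:

```python
# Built at runtime: six ordinary characters (backslash, 'u', four hex digits).
runtime = "\\" + "u" + "062A"

# Written as a literal: the parser turns the escape into a single code point.
literal = "\u062A"

print(len(runtime))  # 6
print(len(literal))  # 1

# chr() builds a code point from an integer at runtime, no escape syntax needed.
built = chr(int("062A", 16))
print(built == literal)  # True
```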

Your input is also too short; you'd need one more digit somewhere. It looks like you are missing a 0 in the middle, presumably before 62C.
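
One quick way to spot this kind of problem is to check the length and split the string into 4-digit groups. This is only a rough sketch, assuming Python 3 and assuming every character is meant to be encoded as exactly four hex digits; hex_text is an illustrative name:

```python
# Hex string copied from the question.
hex_text = "062A06450020062A62C062F064A062F0020"

# With 4 hex digits per character, a valid string has a length divisible by 4.
leftover = len(hex_text) % 4
print(len(hex_text), leftover)  # 35 3

# Split into 4-digit groups to see where the values go out of step.
groups = [hex_text[i:i + 4] for i in range(0, len(hex_text), 4)]
print(groups)
```

The first four groups look like Arabic-block values (06xx); from the fifth group onward they no longer do, which points at a digit missing around 62C.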

You essentially have hexadecimal UTF-16 in big-endian order; just convert the hex to bytes and decode those bytes as utf-16-be:

from binascii import unhexlify
unhexlify(input).decode('utf-16-be')

Demo, with corrected input data:

>>> from binascii import unhexlify
>>> input ="062A06450020062A062C062F064A062F0020"
>>> unhexlify(input).decode('utf-16-be')
'تم تجديد '
>>> print(unhexlify(input).decode('utf-16-be'))
تم تجديد
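
If you need this in more than one place, the same idea can be wrapped in a small helper. This is a sketch, assuming Python 3; hex_to_text is a hypothetical name, and it uses the built-in bytes.fromhex in place of binascii.unhexlify, which should give the same bytes for plain hex digits. The length check is an added assumption so that truncated input fails with a clear message:

```python
def hex_to_text(hex_text):
    """Decode hex digits holding big-endian UTF-16 code units into a str."""
    # Each UTF-16 code unit is 2 bytes, i.e. 4 hex digits.
    if len(hex_text) % 4:
        raise ValueError(
            "expected a multiple of 4 hex digits, got %d" % len(hex_text)
        )
    return bytes.fromhex(hex_text).decode("utf-16-be")


# Corrected input from the demo above.
text = hex_to_text("062A06450020062A062C062F064A062F0020")
print(text)
```

Calling it with the original 35-digit string from the question raises ValueError instead of producing garbled text.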