If I call os.urandom(64), I am given 64 random bytes. With reference to "Convert bytes to a Python string", I tried:
a = os.urandom(64)
a.decode()
a.decode("utf-8")
but got a traceback stating that the bytes are not valid UTF-8:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 0: invalid start byte
with the bytes
b'\x8bz\xaf$\xb6\x93q\xef\x94\x99$\x8c\x1eO\xeb\xed\x03O\xc6L%\xe70\xf9\xd8
\xa4\xac\x01\xe1\xb5\x0bM#\x19\xea+\x81\xdc\xcb\xed7O\xec\xf5\\}\x029\x122
\x8b\xbd\xa9\xca\xb2\x88\r+\x88\xf0\xeaE\x9c'
Is there a foolproof method to decode these bytes into some string representation? I am generating pseudo-random tokens to keep track of related documents across multiple database engines.
The code below will work on both Python 2.7 and 3:
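A minimal sketch of such code, assuming the standard-library `base64` module is acceptable; the variable names `random_bytes` and `token` are illustrative only:

```python
import os
from base64 import b64encode, b64decode

# 64 bytes from the operating system's random source.
random_bytes = os.urandom(64)

# Base-64 output contains only ASCII characters, so decoding it
# as ASCII (or UTF-8) into a text string always succeeds.
token = b64encode(random_bytes).decode("ascii")

# The original bytes can be recovered from the token when needed.
recovered = b64decode(token.encode("ascii"))
```

For 64 input bytes the token is 88 characters long, including padding.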
You can use base-64 encoding. In this case:
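As a sketch of the encode step, assuming Python 3, where the bytes-to-bytes `"base64"` codec is reached through `codecs.encode` rather than through a method on the bytes object; the names `encoded`, `text`, and `restored` are illustrative:

```python
import codecs
import os

a = os.urandom(64)

# Encode: arbitrary bytes -> base-64 bytes (ASCII only; this codec
# also inserts newlines into long output).
encoded = codecs.encode(a, "base64")

# The base-64 bytes are plain ASCII, so they decode cleanly to text.
text = encoded.decode("ascii")

# Decoding the base-64 form gives back the original random bytes.
restored = codecs.decode(encoded, "base64")
```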
Also note that I'm using encode here rather than decode, as decode is trying to take it from whatever format you specify into unicode. So in your example, you're treating the random bytes as if they form a valid utf-8 string, which is rarely going to be the case with random bytes.
You have random bytes; I'd be very surprised if that ever was decodable to a string.
If you have to have a unicode string, decode from Latin-1, because it maps bytes one-to-one to the corresponding Unicode code points.
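A minimal sketch of the Latin-1 approach, assuming Python 3; the names `s` and `round_trip` are illustrative:

```python
import os

a = os.urandom(64)

# Latin-1 defines a character for every byte value 0-255,
# so this decode cannot raise UnicodeDecodeError.
s = a.decode("latin-1")

# Each character's code point equals the byte it came from,
# so encoding back to Latin-1 restores the original bytes.
round_trip = s.encode("latin-1")
```

Note that the resulting string can contain control and non-printable characters, so whether it is suitable as a token depends on what the databases involved accept.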