I need to save a params file in Python. The file contains some parameters that I don't want to leave in plain text, so I encode the entire file as Base64 (I know this isn't the most secure encoding in the world, but it works for the kind of data I need to use).
Encoding works well: I encode the content of my file (a simple text file with a custom extension) and save the result. The problem comes with decoding. I print the encoded text before saving the file and the encoded text read back from the saved file, and they look exactly the same. But for a reason I don't understand, decoding the text read from the saved file gives me this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8d in position 1: invalid start byte
whereas decoding the text before saving the file works fine.
Any idea how to resolve this issue?
This is my code. I have tried converting everything to bytes, to strings, and so on...
import base64

params = open('params.bpr','r').read()
paramsencoded = base64.b64encode(bytes(params,'utf-8'))
print(paramsencoded)
paramsdecoded = str(base64.b64decode(str(paramsencoded,'utf-8')),'utf-8')
newparams = open('paramsencoded.bpr','w+',encoding='utf-8')
newparams.write(str(paramsencoded))
newparams.close()
params2 = open('paramsencoded.bpr',encoding='utf-8').read()
print(params2)
paramsdecoded = str(base64.b64decode(str(paramsencoded,'utf-8')),'utf-8')
paramsdecoded = base64.b64decode(str(params2))
print(str(paramsdecoded,'utf-8'))
Your error lies in your handling of the bytes object returned by base64.b64encode(): you called str() on that object when writing it to the file, in newparams.write(str(paramsencoded)).
That doesn't decode the bytes object. For example, str(b'abc=') returns the string "b'abc='".
Note the b'...' notation. You produced the representation of the bytes object, which is a string containing Python syntax that can reproduce the value for debugging purposes (you can copy that string value and paste it into Python to re-create the same bytes value).
This may not be that easy to notice at first, as base64.b64encode() otherwise only produces output with printable ASCII bytes.
But your decoding problem originates from there, because the value read back from the file includes the b' characters at the start. Those first two characters are interpreted as Base64 data too; the b is a valid Base64 character, and the ' is ignored by the parser.
As a result the output is completely different, because the Base64 decoding is now starting from the wrong place: the b supplies the first 6 bits of the first byte (making the first decoded byte one of 6C, 6D, 6E or 6F, so l, m, n or o in ASCII).
You could properly decode the value (using paramsencoded.decode('ascii') or str(paramsencoded, 'ascii')), but you shouldn't treat any of this data as text.
Instead, open your files in binary mode. Reading and writing then operate on bytes objects, and the base64.b64encode() and base64.b64decode() functions also operate on bytes, making for a perfect match.
Where a bytes value does have to become text, I explicitly use bytes.decode(codec) rather than str(..., codec) to avoid accidental str(...) calls.
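To make the str() pitfall described above concrete, here is a minimal sketch comparing the three conversions on a small sample value. The variable names and the sample payload are illustrative and are not taken from the question.

```python
import base64

# Sample value; any bytes would do (illustrative, not the asker's data).
encoded = base64.b64encode(b'hello world')

as_repr = str(encoded)               # representation of the bytes object, keeps b'...'
as_text = str(encoded, 'ascii')      # actually decodes the bytes to text
as_text2 = encoded.decode('ascii')   # same result, harder to get wrong by accident

print(as_repr)   # b'aGVsbG8gd29ybGQ='
print(as_text)   # aGVsbG8gd29ybGQ=
```

Only the first form carries the b' prefix and the closing quote into the string, which is what ended up in the saved file.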
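The effect of the stray b' prefix on decoding can be reproduced with the same kind of sample value. This is a sketch with an arbitrary payload, not the asker's data; with payloads of other lengths the extra character may instead upset the padding and make base64.b64decode() raise an error rather than return wrong bytes.

```python
import base64

payload = b'hello world'               # arbitrary sample data
encoded = base64.b64encode(payload)    # b'aGVsbG8gd29ybGQ='

# Correct round trip: bytes in, bytes out.
good = base64.b64decode(encoded)

# Faulty round trip: str() puts b' in front and ' at the end. The quotes
# are skipped by the default (non-validating) decoder, but the leading b
# is consumed as Base64 data and shifts everything that follows.
bad = base64.b64decode(str(encoded))

print(good)      # b'hello world'
print(bad[:1])   # b'm'
```

The first decoded byte lands in the 6C to 6F range because of the leading b, and the remaining bytes are no longer valid UTF-8, which is where the UnicodeDecodeError comes from.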
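The binary-mode approach recommended above could look roughly like the following. It is a sketch, not a drop-in replacement: it reuses the file names from the question but places them in a temporary directory, and the sample file content is made up for the demonstration.

```python
import base64
import os
import tempfile

# File names from the question, placed in a temporary directory (assumption)
# so that no real file is touched.
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, 'params.bpr')
encoded_path = os.path.join(workdir, 'paramsencoded.bpr')

# Create a sample params file; the content is a placeholder.
with open(plain_path, 'wb') as f:
    f.write(b'user=demo\nsecret=hunter2\n')

# Read raw bytes, encode them, and write the encoded bytes unchanged.
with open(plain_path, 'rb') as f:
    params = f.read()
with open(encoded_path, 'wb') as f:
    f.write(base64.b64encode(params))

# Read the encoded bytes back and decode them.
with open(encoded_path, 'rb') as f:
    decoded = base64.b64decode(f.read())

# Convert to text only at the very end, with an explicit codec.
text = decoded.decode('utf-8')
print(text)
```

Because every step works on bytes, there is no point at which str() could slip a b'...' wrapper into the file.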