I'm trying to download BVLC-trained model and I'm stuck with this error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 110: invalid start byte
I think it's because of the following function (complete code)
# Closure-d function for checking SHA1.
def model_checks_out(filename=model_filename, sha1=frontmatter['sha1']):
with open(filename, 'r') as f:
return hashlib.sha1(f.read()).hexdigest() == sha1
Any idea how to fix this?
Since there is not a single hint in the documentation nor src code, I have no clue why, but using the b char (i guess for binary) totally works (tf-version: 1.1.0):
For more information, check out: gfile
You are opening a file that is not UTF-8 encoded, while the default encoding for your system is set to UTF-8.
Since you are calculating a SHA1 hash, you should read the data as binary instead. The
hashlib
functions require you pass in bytes:Note the addition of
b
in the file mode.See the
open()
documentation:and from the
hashlib
module documentation:You didn't specify to open the file in binary mode, so
f.read()
is trying to read the file as a UTF-8-encoded text file, which doesn't seem to be working. But since we take the hash of bytes, not of strings, it doesn't matter what the encoding is, or even whether the file is text at all: just open it, and then read it, as a binary file.but