I am new to coding and have run into a problem trying to encode a string.
>>> import hashlib
>>> a = hashlib.md5()
>>> a.update('hi')
Traceback (most recent call last):
File "<pyshell#22>", line 1, in <module>
a.update('hi')
TypeError: Unicode-objects must be encoded before hashing
>>> a.digest()
b'\xd4\x1d\x8c\xd9\x8f\x00\xb2\x04\xe9\x80\t\x98\xec\xf8B~'
Is (a) now considered to be encoded?
Second question: When I run the same code above in a script I get this error:
import hashlib
a = hashlib.md5()
a.update('hi')
a.digest()
Traceback (most recent call last):
File "C:/Users/User/Desktop/Logger/Encoding practice.py", line 3, in
a.update('hi')
TypeError: Unicode-objects must be encoded before hashing
Why is the code working in the shell and not the script?
I am working with Windows and Python 3.4
Thanks.
Since you are encoding simple strings, I deduce that you are running Python 3, where all strings are Unicode objects. You have two options:
- Provide an encoding for the strings, e.g.:
"Nobody inspects".encode('utf-8')
- Use binary strings as shown in the manuals:
m.update(b"Nobody inspects")
m.update(b" the spammish repetition")
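To see that the two options are equivalent for this input, here is a minimal sketch. It assumes Python 3 and UTF-8 as the encoding; the variable names are only for illustration.

```python
import hashlib

text = "Nobody inspects the spammish repetition"

# Option 1: encode the str explicitly before hashing
m1 = hashlib.md5()
m1.update(text.encode("utf-8"))

# Option 2: feed bytes literals; successive update() calls
# behave like hashing the concatenated data
m2 = hashlib.md5()
m2.update(b"Nobody inspects")
m2.update(b" the spammish repetition")

# Both objects have hashed the same bytes, so the digests match
same = m1.hexdigest() == m2.hexdigest()
print(same)
```

The digests match here only because the text is plain ASCII, so its UTF-8 encoding is byte-for-byte the same as the bytes literals.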
The behaviour differs between the script and the shell because the script stops at the error, whereas in the shell the last line is a separate command. That command still does not do what you want, though, because the previous update failed.
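A small sketch of what happens in the shell session, assuming Python 3: the failed update() leaves the hash object untouched, so digest() returns the MD5 of empty input. The try/except is only there so the snippet runs to the end as a script.

```python
import hashlib

a = hashlib.md5()

try:
    a.update('hi')  # str instead of bytes: raises TypeError on Python 3
    error = None
except TypeError as exc:
    error = exc

# Nothing was hashed, so the digest equals that of a fresh md5 object
unchanged = a.digest() == hashlib.md5().digest()
print(type(error).__name__, unchanged)
```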
The solution I've found is to simply encode the data right away in the line where you're hashing it:
hashlib.sha256("a".encode('utf-8')).hexdigest()
It worked for me, hope it helps!
The behaviour differs between Python versions. I use Python 2.7, and the same code you wrote works fine there.
In Python 3, the data passed to hashlib.md5(data) must be of type 'bytes'. That is to say, we must convert the data to bytes before hashing.
The conversion is required before hashing because the same string has different byte values under different encodings (utf8, gbk, ...), so the conversion has to be explicit to avoid ambiguity.
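As an illustration of why the encoding matters, here is a short sketch assuming Python 3. The sample string is an arbitrary choice of two non-ASCII characters, written with escapes so the snippet does not depend on the file's own encoding.

```python
import hashlib

text = "\u4e2d\u6587"  # two Chinese characters, chosen only as a non-ASCII example

# The same str becomes different byte sequences under different encodings
utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

# Different bytes in, different digests out
utf8_digest = hashlib.md5(utf8_bytes).hexdigest()
gbk_digest = hashlib.md5(gbk_bytes).hexdigest()

print(utf8_bytes != gbk_bytes, utf8_digest != gbk_digest)
```

Whichever encoding you pick, use the same one everywhere the hash is computed or compared.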
It's not working in the REPL. It's hashed nothing, since you've passed it nothing valid to hash. Try encoding first.
3>> hashlib.md5().digest()
b'\xd4\x1d\x8c\xd9\x8f\x00\xb2\x04\xe9\x80\t\x98\xec\xf8B~'
3>> a = hashlib.md5()
3>> a.update('hi'.encode('utf-8'))
3>> a.digest()
b'I\xf6\x8a\\\x84\x93\xec,\x0b\xf4\x89\x82\x1c!\xfc;'