Can someone explain this?
>>> import pickle
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.') == pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.')
True
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.')
('.', '.')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.')
('.', '.')
There seem to be a long and a short pickled form for tuples that contain the same element more than once.
Other examples:
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00#q\x00X\x01\x00\x00\x00#q\x01\x86q\x02.')
('#', '#')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00#q\x00h\x00\x86q\x01.')
('#', '#')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00$q\x00X\x01\x00\x00\x00$q\x01\x86q\x02.')
('$', '$')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00$q\x00h\x00\x86q\x01.')
('$', '$')
I'm trying to index items by their pickle but I'm not finding the items because their pickles seem to be changing.
I'm using Python 3.3.2 on Ubuntu.
Pickles aren't unique; the pickle format is actually a tiny little programming language, and different programs (pickles) can produce the same output (unpickled object). From the docs:
There's even a pickletools.optimize function that takes a pickle and returns an equivalent, usually shorter pickle by dropping unused PUT opcodes. You're going to need to redesign your program.
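To make that concrete, here is a small sketch using the two byte strings from the question. It lists the opcodes each pickle runs, passes both through pickletools.optimize, and then keys a dict on the unpickled value instead of on the bytes. The names short, long_, opcode_names and index are only illustrative, and keying on the value assumes your items are hashable, which may not hold for your real data.

```python
import pickle
import pickletools

# The two byte strings from the question: same tuple, different pickle "programs".
short = b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.'
long_ = b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.'

def opcode_names(data):
    """Return the names of the opcodes in a pickle, in execution order."""
    return [op.name for op, arg, pos in pickletools.genops(data)]

# The short form reuses the first string via a memo lookup (BINGET);
# the long form writes the string out a second time (BINUNICODE).
print(opcode_names(short))
print(opcode_names(long_))

# optimize() strips PUT opcodes that no GET refers to. The result still
# unpickles to the same object, but it is not a canonical form.
short_opt = pickletools.optimize(short)
long_opt = pickletools.optimize(long_)
print(pickle.loads(short_opt) == pickle.loads(long_opt))

# One possible redesign (an assumption about the use case): key the index
# on the unpickled value itself, which compares equal for both pickles.
index = {}
index[pickle.loads(short)] = 'first item'
print(index[pickle.loads(long_)])
```

Since the unpickled values compare equal even when the bytes differ, the last lookup succeeds, whereas a lookup keyed on the raw pickle bytes would miss.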