With Python 3:
>>> from collections import OrderedDict
>>> d1 = OrderedDict([('foo', 'bar')])
>>> d2 = OrderedDict([('foo', 'bar')])
I wanted to check for equality:
>>> d1 == d2
True
>>> d1.keys() == d2.keys()
True
But:
>>> d1.values() == d2.values()
False
Do you know why values are not equal?
I've tested this with Python 3.4 and 3.5.
Following this question, I posted on the Python-Ideas mailing list to have additional details:
https://mail.python.org/pipermail/python-ideas/2015-December/037472.html
In Python 3, dict.keys()
and dict.values()
return special iterable classes - respectively a collections.abc.KeysView
and a collections.abc.ValuesView
. The first one inherit it's __eq__
method from set
, the second uses the default object.__eq__
which tests on object identity.
In python3, d1.values()
and d2.values()
are collections.abc.ValuesView
objects:
>>> d1.values()
ValuesView(OrderedDict([('foo', 'bar')]))
Don't compare them as an object, c
onvert them to lists and then compare them:
>>> list(d1.values()) == list(d2.values())
True
Investigating why it works for comparing keys, in _collections_abc.py
of CPython, KeysView
is inheriting from Set
while ValuesView
does not:
class KeysView(MappingView, Set):
class ValuesView(MappingView):
Tracing for __eq__
in ValuesView
and its parents:
MappingView ==> Sized ==> ABCMeta ==> type ==> object
.
__eq__
is implemented only in object
and not overridden.
In the other hand, KeysView
inherits __eq__
directly from Set
.
Unfortunately, both current answers don't address why this is but focus on how this is done. That mailing list discussion was amazing, so I'll sum things up:
For odict.keys
/dict.keys
and odict.items
/dict.items
:
odict.keys
(subclass of dict.keys
) supports comparison due to its conformance to collections.abc.Set
(it's a set-like object). This is possible due to the fact that keys
inside a dictionary (ordered or not) are guaranteed to be unique and hashable.
odict.items
(subclass of dict.items
) also supports comparison for the same reason as .keys
does. itemsview
is allowed to do this since it raises the appropriate error if one of the item
s (specifically, the second element representing the value) is not hashable, uniqueness is guaranteed, though (due to keys
being unique):
>>> od = OrderedDict({'a': []})
>>> set() & od.items()
TypeErrorTraceback (most recent call last)
<ipython-input-41-a5ec053d0eda> in <module>()
----> 1 set() & od.items()
TypeError: unhashable type: 'list'
For both these views keys
, items
, the comparison uses a simple function called all_contained_in
(pretty readable) that uses the objects __contain__
method to check for membership of the elements in the views involved.
Now, about odict.values
/dict.values
:
As noticed, odict.values
(subclass of dict.values
[shocker]) doesn't compare like a set-like object. This is because the values
of a valuesview
cannot be represented as a set, the reasons are two-fold:
- Most importantly, the view might contain duplicates which cannot be dropped.
- The view might contain non-hashable objects (which, on it's own, isn't sufficient to not treat the view as set-like).
As stated in a comment by @user2357112 and by @abarnett in the mailing list, odict.values
/dict.values
is a multiset, a generalization of sets that allows multiple instances of it's elements.
Trying to compare these is not as trivial as comparing keys
or items
due to the inherent duplication, the ordering and the fact that you probably need to take into consideration the keys that correspond to those values. Should dict_values
that look like this:
>>> {1:1, 2:1, 3:2}.values()
dict_values([1, 1, 2])
>>> {1:1, 2:1, 10:2}.values()
dict_values([1, 1, 2])
actually be equal even though the values that correspond to the keys isn't the same? Maybe? Maybe not? It isn't straight-forward either way and will lead to inevitable confusion.
The point to be made though is that it isn't trivial to compare these as is with keys
and items
, to sum up, with another comment from @abarnett on the mailing list:
If you're thinking we could define what multisets should do, despite not having a standard multiset type or an ABC for them, and apply that to values views, the next question is how to do that in better than quadratic time for non-hashable values. (And you can't assume ordering here, either.) Would having a values view hang for 30 seconds and then come back with the answer you intuitively wanted instead of giving the wrong answer in 20 millis be an improvement? (Either way, you're going to learn the same lesson: don't compare values views. I'd rather learn that in 20 millis.)