Why are the values of an OrderedDict not equal?

2019-02-01 16:36发布

问题:

With Python 3:

>>> from collections import OrderedDict
>>> d1 = OrderedDict([('foo', 'bar')])
>>> d2 = OrderedDict([('foo', 'bar')])

I wanted to check for equality:

>>> d1 == d2
True
>>> d1.keys() == d2.keys()
True

But:

>>> d1.values() == d2.values()
False

Do you know why values are not equal?

I've tested this with Python 3.4 and 3.5.


Following this question, I posted on the Python-Ideas mailing list to have additional details:

https://mail.python.org/pipermail/python-ideas/2015-December/037472.html

回答1:

In Python 3, dict.keys() and dict.values() return special iterable classes - respectively a collections.abc.KeysView and a collections.abc.ValuesView. The first one inherit it's __eq__ method from set, the second uses the default object.__eq__ which tests on object identity.



回答2:

In python3, d1.values() and d2.values() are collections.abc.ValuesView objects:

>>> d1.values()
ValuesView(OrderedDict([('foo', 'bar')]))

Don't compare them as an object, c onvert them to lists and then compare them:

>>> list(d1.values()) == list(d2.values())
True

Investigating why it works for comparing keys, in _collections_abc.py of CPython, KeysView is inheriting from Set while ValuesView does not:

class KeysView(MappingView, Set):

class ValuesView(MappingView):
  • Tracing for __eq__ in ValuesView and its parents:

    MappingView ==> Sized ==> ABCMeta ==> type ==> object.

    __eq__ is implemented only in object and not overridden.

  • In the other hand, KeysView inherits __eq__ directly from Set.



回答3:

Unfortunately, both current answers don't address why this is but focus on how this is done. That mailing list discussion was amazing, so I'll sum things up:

For odict.keys/dict.keys and odict.items/dict.items:

  • odict.keys (subclass of dict.keys) supports comparison due to its conformance to collections.abc.Set (it's a set-like object). This is possible due to the fact that keys inside a dictionary (ordered or not) are guaranteed to be unique and hashable.
  • odict.items (subclass of dict.items) also supports comparison for the same reason as .keys does. itemsview is allowed to do this since it raises the appropriate error if one of the items (specifically, the second element representing the value) is not hashable, uniqueness is guaranteed, though (due to keys being unique):

    >>> od = OrderedDict({'a': []})
    >>> set() & od.items()
    TypeErrorTraceback (most recent call last)
    <ipython-input-41-a5ec053d0eda> in <module>()
    ----> 1 set() & od.items()
    
    TypeError: unhashable type: 'list'
    

    For both these views keys, items, the comparison uses a simple function called all_contained_in (pretty readable) that uses the objects __contain__ method to check for membership of the elements in the views involved.

Now, about odict.values/dict.values:

  • As noticed, odict.values (subclass of dict.values [shocker]) doesn't compare like a set-like object. This is because the values of a valuesview cannot be represented as a set, the reasons are two-fold:

    1. Most importantly, the view might contain duplicates which cannot be dropped.
    2. The view might contain non-hashable objects (which, on it's own, isn't sufficient to not treat the view as set-like).

As stated in a comment by @user2357112 and by @abarnett in the mailing list, odict.values/dict.values is a multiset, a generalization of sets that allows multiple instances of it's elements. Trying to compare these is not as trivial as comparing keys or items due to the inherent duplication, the ordering and the fact that you probably need to take into consideration the keys that correspond to those values. Should dict_values that look like this:

>>> {1:1, 2:1, 3:2}.values()
dict_values([1, 1, 2])
>>> {1:1, 2:1, 10:2}.values()
dict_values([1, 1, 2])

actually be equal even though the values that correspond to the keys isn't the same? Maybe? Maybe not? It isn't straight-forward either way and will lead to inevitable confusion.

The point to be made though is that it isn't trivial to compare these as is with keys and items, to sum up, with another comment from @abarnett on the mailing list:

If you're thinking we could define what multisets should do, despite not having a standard multiset type or an ABC for them, and apply that to values views, the next question is how to do that in better than quadratic time for non-hashable values. (And you can't assume ordering here, either.) Would having a values view hang for 30 seconds and then come back with the answer you intuitively wanted instead of giving the wrong answer in 20 millis be an improvement? (Either way, you're going to learn the same lesson: don't compare values views. I'd rather learn that in 20 millis.)