id() vs `is` operator. Is it safe to compare `id`s?

Posted 2020-02-12 07:04

How much can I rely on the object's id() and its uniqueness in practice? E.g.:

  • Does id(a) == id(b) mean a is b or vice versa? What about the opposite?
  • How safe is it to save an id somewhere to be used later (e.g. into some registry instead of the object itself)?

(Written as a proposed canonical in response to Canonicals for Python: are objects with the same id() the same object, `is` operator, unbound method objects)

1 answer
祖国的老花朵
#2 · 2020-02-12 07:29

According to the id() documentation, an id is only guaranteed to be unique

  1. for the lifetime of the specific object, and
  2. within a specific interpreter instance

As such, comparing ids is not safe unless you also somehow ensure that both objects whose ids are taken are still alive at the time of comparison (and belong to the same Python interpreter instance, though you have to go out of your way to violate that).

Which is exactly what `is` does -- which makes comparing ids redundant. If you cannot use the `is` syntax for whatever reason, there's always operator.is_.
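
To make the equivalence concrete, here is a small sketch (the names `a`, `b` and `c` are only illustrative). As long as every object involved is bound to a name and therefore alive, `is`, `operator.is_` and an id comparison all give the same answer:

```python
import operator

a = [1, 2, 3]
b = a            # second name for the same list
c = [1, 2, 3]    # equal contents, but a distinct object

# All objects are referenced by names, so they are alive here and
# comparing their ids is meaningful (if redundant).
same_ab = (a is b, operator.is_(a, b), id(a) == id(b))
same_ac = (a is c, operator.is_(a, c), id(a) == id(c))

print(same_ab)  # (True, True, True)
print(same_ac)  # (False, False, False)
```

`operator.is_` is mostly useful where a callable is needed, for example as an argument to another function.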


Now, whether an object is still alive at the time of comparison is not always obvious (and sometimes is grossly non-obvious):

  • Accessing some attributes (e.g. bound methods of an object) creates a new object each time. So, the result's id may or may not be the same on each attribute access.

    Example:

    >>> class C(object): pass
    >>> c=C()
    >>> c.a=1
    
    >>> c.a is c.a
    True        # same object each time
    
    >>> c.__init__ is c.__init__
    False       # a different object each time
    
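    # The above two are not the only possible cases.
    # An attribute may be implemented to sometimes return the same object
    # and sometimes a different one. (Site, check_for_new_version() and
    # get_new_version() are placeholders for whatever the application defines.)
    class Site(object):
        @property
        def page(self):
            if check_for_new_version():
                self._page = get_new_version()
            return self._page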
    
  • If an object is created as a result of calculating an expression and not saved anywhere, it's immediately discarded [1], and any object created after that can take up its id.

    • This is even true within the same code line. E.g. the result of id(create_foo()) == id(create_bar()) is undefined.

      Example:

      >>> id([])     #the list object is discarded when id() returns
      39733320L
      >>> id([])     #a new, unrelated object is created (and discarded, too)
      39733320L      #its id can happen to be the same
      >>> id([[]])
      39733640L      #or not
      >>> id([])
      39733640L      #you never really know
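
A minimal sketch of the usual way around both pitfalls above: bind each result to a name before comparing, so that both objects are guaranteed to be alive at the same time. The class `C` and its `method` are made up for illustration.

```python
class C:
    def method(self):
        return None


c = C()

# Each attribute access builds a fresh bound-method object.
m1 = c.method
m2 = c.method

# Both objects are kept alive by m1 and m2, so their ids must differ.
methods_identical = m1 is m2          # False
method_ids_equal = id(m1) == id(m2)   # False
methods_equal = m1 == m2              # True: same function, same instance

# Same idea for temporaries: hold references first, then compare.
x, y = [], []
list_ids_equal = id(x) == id(y)       # False while both are alive

# Without the names, e.g. id(c.method) == id(c.method) or id([]) == id([]),
# the result is unpredictable: the first object may already have been
# freed and its id reused by the second.
```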
      

Due to the above safety requirements when comparing ids, saving an id instead of the object is not very useful because you have to save a reference to the object itself anyway -- to ensure that it stays alive. Neither is there any performance gain: the implementation of `is` is as simple as comparing pointers.
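
As a hedged sketch of what such a registry ends up looking like (the `Registry` class and its method names are hypothetical, not taken from any library): to make the stored ids trustworthy it has to hold the objects themselves, at which point the id is merely a dictionary key.

```python
class Registry:
    """Hypothetical registry keyed by id() that keeps its members alive."""

    def __init__(self):
        self._by_id = {}

    def add(self, obj):
        # Storing obj itself pins it in memory, so its id cannot be
        # reused by another object while it is registered.
        self._by_id[id(obj)] = obj
        return id(obj)

    def contains(self, obj):
        # Look up by id, then confirm identity against the stored object.
        return self._by_id.get(id(obj)) is obj

    def remove(self, obj):
        self._by_id.pop(id(obj), None)


registry = Registry()
first = ["a"]
registry.add(first)

found_first = registry.contains(first)    # True
found_other = registry.contains(["a"])    # False: equal, but another object

registry.remove(first)
found_after_remove = registry.contains(first)  # False
```

Storing only the integer returned by `add` and dropping every reference to the object would defeat the purpose: the number could later describe an unrelated object.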


Finally, as an internal optimization (and implementation detail, so this may differ between implementations and releases), CPython reuses some often-used simple objects of immutable types. As of this writing, that includes small integers and some strings. So even if you got them from different places, their ids might coincide.

This does not (technically) violate the above id() documentation's uniqueness promises: the reused object stays alive through all the reuses.
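
The following sketch shows how one might observe this reuse. The identity results are deliberately not stated as facts, since they are an implementation detail and may differ between interpreters and releases; only the `==` results are guaranteed by the language.

```python
import platform

# The values are parsed at run time, so the compiler cannot merge them
# into a single constant.
small_a = int("7")
small_b = int("7")

big_a = int("10" * 20)
big_b = int("10" * 20)

print(platform.python_implementation())
print(small_a is small_b)  # often True on CPython (cached small integer)
print(big_a is big_b)      # often False on CPython; not guaranteed either way

# What is guaranteed: equal values compare equal with ==.
values_equal = (small_a == small_b) and (big_a == big_b)
print(values_equal)        # True
```

This is why values should be compared with `==`, with `is` reserved for questions about identity.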

This is also not a big deal because whether two variables point to the same object only matters in practice if the object is mutable: if two variables point to the same mutable object, mutating it through one will (unexpectedly) change what the other sees, too. Immutable types don't have that problem, so for them, it doesn't matter if two variables point to two identical objects or to the same one.
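
A short illustration of that difference, using made-up variable names:

```python
shared = [1, 2]
alias = shared           # same list object
separate = list(shared)  # equal contents, separate list

alias.append(3)

print(shared)    # [1, 2, 3]  changed through the alias
print(separate)  # [1, 2]     unaffected

t1 = (1, 2)
t2 = t1          # same tuple object, but tuples cannot be mutated
t2 += (3,)       # builds a new tuple and rebinds t2; t1 is untouched

print(t1)  # (1, 2)
print(t2)  # (1, 2, 3)
```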


[1] Sometimes, this is called an "unnamed expression".
