I'm doing some things in Python (using Python 3.3.3), and I came across something that is confusing me, since to my understanding classes get a new id each time they are called.
Let's say you have this in some .py file:
class someClass: pass
print(someClass())
print(someClass())
The above returns the same id, which is confusing me: since I'm calling the class each time, it shouldn't be the same, right? Is this how Python works when the same class is called twice in a row? It gives a different id when I wait a few seconds, but if I do it at the same time, like in the example above, it doesn't seem to work that way.
>>> print(someClass());print(someClass())
<__main__.someClass object at 0x0000000002D96F98>
<__main__.someClass object at 0x0000000002D96F98>
It returns the same thing, but why? I also notice it with ranges, for example:
for i in range(10):
    print(someClass())
Is there any particular reason for Python doing this when the class is called quickly? I didn't even know Python did this. Or is it possibly a bug? If it is not a bug, can someone explain how to work around it, or a method that generates a different id each time the class is called? I'm pretty puzzled about how this happens, because if I wait, the id does change, but not if I call the same class two or more times in a row.
The id of an object is only guaranteed to be unique during that object's lifetime, not over the entire lifetime of a program. The two someClass objects you create only exist for the duration of the call to print; after that, they are available for garbage collection (and, in CPython, deallocated immediately). Since their lifetimes don't overlap, it is valid for them to share an id.
It is also unsurprising in this case, because of a combination of two CPython implementation details: first, it does garbage collection by reference counting (with some extra magic to avoid problems with circular references), and second, the id of an object is related to the value of the underlying pointer for the variable (i.e., its memory location). So the first object, which was the most recently allocated object, is immediately freed; it isn't too surprising that the next object allocated ends up in the same spot (although this potentially also depends on details of how the interpreter was compiled).
If you are relying on several objects having distinct ids, you might keep them around, say in a list, so that their lifetimes overlap. Otherwise, you might implement a class-specific id that has different guarantees, e.g. a counter that is incremented for each new instance.
Try calling the following:
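A minimal sketch of such an experiment, assuming one instance is bound to the name a while a loop creates throwaway instances (someClass follows the question; kept_id, temp_ids, and the loop count are illustrative choices):

```python
class someClass:
    pass

a = someClass()      # bound to a name, so it stays alive for the whole script
kept_id = id(a)

temp_ids = []
for i in range(5):
    # Each temporary instance has no other reference, so it can be
    # freed as soon as id() returns.
    temp_ids.append(id(someClass()))

print(hex(kept_id))
print([hex(t) for t in temp_ids])   # on CPython these are often all identical
print(kept_id in temp_ids)          # False: a is still alive, so its id is not reused
```

Whether the temporaries actually share one id depends on the interpreter and allocator; the only thing the language guarantees here is that none of them can collide with the id of a.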
You'll see something different. Why? Because the memory that was released by the first object in the "foo" loop was reused. On the other hand, the memory held by a is not reused, since a is retained.
If you read the documentation for id, it says that two objects with non-overlapping lifetimes may have the same id() value.
And that's exactly what's happening: you have two objects with non-overlapping lifetimes, because the first one is already out of scope before the second one is ever created.
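To see the non-overlapping lifetimes directly, here is a small sketch that records when each object is created and when it is finalized. It assumes CPython's reference counting; the helper make_and_drop and the events list are hypothetical names used only for illustration, and weakref.finalize requires Python 3.4 or later.

```python
import weakref

class someClass:
    pass

events = []

def make_and_drop(label):
    obj = someClass()
    # Register a callback that runs when obj is garbage collected.
    weakref.finalize(obj, events.append, label + " freed")
    events.append(label + " created")
    # obj's only reference disappears when the function returns.

make_and_drop("first")
make_and_drop("second")
print(events)
# On CPython: ['first created', 'first freed', 'second created', 'second freed']
```

On an implementation without reference counting, the "freed" entries may appear later or in a different order, which is exactly why the id reuse is not something to rely on.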
But don't trust that this will always happen, either, especially if you need to deal with other Python implementations or with more complicated classes. All that the language says is that these two objects may have the same id() value, not that they will. And the fact that they do depends on two implementation details:
First, the garbage collector has to clean up the first object before your code even starts to allocate the second object. This is guaranteed to happen with CPython or any other ref-counting implementation (when there are no circular references), but pretty unlikely with a generational garbage collector as in Jython or IronPython.
Second, the allocator under the covers has to have a very strong preference for reusing recently freed objects of the same type. This is true in CPython, which has multiple layers of fancy allocators on top of basic C malloc, but most of the other implementations leave a lot more to the underlying virtual machine.
One last thing: the fact that object.__repr__ happens to contain a substring that happens to be the same as the id as a hexadecimal number is just an implementation artifact of CPython. According to the docs, when a repr cannot be a valid Python expression, it should be a string of the form <...some useful description...>.
The fact that CPython's object happens to put hex(id(self)) there (actually, I believe it's doing the equivalent of sprintf-ing its pointer through %p, but since CPython's id just returns the same pointer cast to a long, that ends up being the same) isn't guaranteed anywhere, even if it has been true since… before object even existed in the early 2.x days. You're safe to rely on it for this kind of simple "what's going on here" debugging at the interactive prompt, but don't try to use it beyond that.
An example where the memory location (and id) is not released is:
Now the ids are all unique.
I sense a deeper problem here. You should not be relying on id to track unique instances over the lifetime of your program. You should simply see it as a non-guaranteed memory location indicator for the duration of each object instance. If you immediately create and release instances, then you may very well create consecutive instances in the same memory location.
Perhaps what you need to do is track a class static counter that assigns each new instance a unique id, and increments the class static counter for the next instance.
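One way the counter idea could look, as a sketch rather than a definitive implementation. The class name Tracked and the attributes _counter and uid are hypothetical, and itertools.count is just one convenient way to hold the class-level counter:

```python
import itertools

class Tracked:
    # Shared by all instances; yields 0, 1, 2, ... one value per new instance.
    _counter = itertools.count()

    def __init__(self):
        # Unlike id(), this value is never handed out again after the
        # instance is garbage collected.
        self.uid = next(Tracked._counter)

first = Tracked().uid    # the temporary instance is discarded right away
second = Tracked().uid
print(first, second)     # 0 1
```

The uid values stay distinct even though the two temporary instances may well occupy the same memory location.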
Python releases the first instance because nothing retained it. Since nothing else has happened to that memory in the meantime, the second instance is created at the same location.