I've often heard that these methods (Object.hashCode and System.identityHashCode) return the address of the object, or something computed quickly from the address; but I'm also pretty sure the garbage collector moves and compacts objects. Since the hash code cannot change, this presents a problem. I know this is not something one needs to know for everyday work, but I'd like to understand the internals. So, does anyone know how this is implemented in Java? Or .NET, since they are probably similar.
.NET's implementation is intentionally not published (and when you attempt to decompile it, you will find that it makes an unmanaged framework call). The only documentation as such is here, which only states that it is "not guaranteed to produce a different value for each object", and "may change between framework versions". Making any assumptions about how it actually works is probably ill-advised.
Java's implementation is better understood (though it presumably could differ across JVMs), and is covered specifically in this question: Will .hashcode() return a different int due to compaction of tenure space?
The gist of the Java implementation is that, by contract, the value of an object's hash code is not relevant until it is retrieved for the first time. After that, it must remain constant. Thus the GC moving the object doesn't matter until the object's hashCode() method is called for the first time. After that, a cached value is used.
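To make the "compute once, then cache" idea concrete, here is a minimal sketch. It is a toy model, not the actual JVM code: `SimObject`, `headerHash`, and `relocate` are hypothetical names invented for illustration, and the hash is derived from a simulated address only as an assumption about how a first value might be produced. The last part of `main` uses the real `System.identityHashCode`, whose value is documented to stay the same for a given object.

```java
// Toy model of an object whose identity hash is cached in its header.
// All names here are hypothetical; a real JVM does this in native code.
public class LazyHashSketch {
    static final class SimObject {
        long address;        // simulated location; the "GC" may change it
        int headerHash = 0;  // 0 means "not computed yet"

        SimObject(long address) { this.address = address; }

        int identityHash() {
            if (headerHash == 0) {
                // First request: derive a value from the current address.
                int h = (int) (address ^ (address >>> 32));
                headerHash = (h == 0) ? 1 : h; // keep 0 reserved as "unset"
            }
            return headerHash; // later requests ignore the address
        }
    }

    // Simulated compaction: the object gets a new address.
    static void relocate(SimObject o, long newAddress) {
        o.address = newAddress;
    }

    public static void main(String[] args) {
        SimObject o = new SimObject(0x7f00_1000L);
        int before = o.identityHash();
        relocate(o, 0x7f00_8000L);
        int after = o.identityHash();
        System.out.println(before == after); // prints "true"

        // The real API behaves the same way from the caller's point of view.
        Object real = new Object();
        int h1 = System.identityHashCode(real);
        System.gc(); // only a hint; the object may or may not move
        int h2 = System.identityHashCode(real);
        System.out.println(h1 == h2); // prints "true"
    }
}
```

In this model an object that is relocated before its first `identityHash()` call would simply get a value based on its new address, which is why the move is harmless until the hash has been observed.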
In .NET, the GetHashCode() method will be impacted by the GC, and therefore it's recommended that developers use their own hash implementations. I can't find the link to the internal implementation at the moment. I will post it later if I find it.
Found the link... This question was answered here
The identityHashCode does not change for an object. So any moving is done beneath that level.
A rudimentary implementation would have a logical address --> physical address mapping for every object.
More sophisticated implementations would only have the mapping at a page level, so perhaps the last 6 bits are the memory offset, and the rest are the page id. The indirection would happen at the page id --> actual page address level.
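The page-level indirection described above can be sketched roughly as follows. This is only an illustration of the idea in this answer, not how any particular JVM or CLR is known to work: `IndirectionSketch`, `mapPage`, and `resolve` are hypothetical names, and the 6-bit offset is taken from the example figure above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical page-level mapping from logical to physical addresses.
public class IndirectionSketch {
    static final int OFFSET_BITS = 6;                        // low 6 bits = offset in page
    static final long OFFSET_MASK = (1L << OFFSET_BITS) - 1;

    // page id -> current physical base address of that page
    private final Map<Long, Long> pageTable = new HashMap<>();

    // Record (or update, after a move) where a page currently lives.
    void mapPage(long pageId, long physicalBase) {
        pageTable.put(pageId, physicalBase);
    }

    // Translate a stable logical address into the current physical address.
    long resolve(long logicalAddress) {
        long pageId = logicalAddress >>> OFFSET_BITS;
        long offset = logicalAddress & OFFSET_MASK;
        Long base = pageTable.get(pageId);
        if (base == null) {
            throw new IllegalStateException("unmapped page " + pageId);
        }
        return base + offset;
    }

    public static void main(String[] args) {
        IndirectionSketch heap = new IndirectionSketch();
        heap.mapPage(3, 0x1000);
        long logical = (3L << OFFSET_BITS) | 5;    // page 3, offset 5
        System.out.println(heap.resolve(logical)); // prints 4101

        heap.mapPage(3, 0x9000);                   // the "GC" moved page 3
        System.out.println(heap.resolve(logical)); // prints 36869
    }
}
```

The logical address never changes when the page moves, so a hash code derived from it would stay constant; only the table entry is updated.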