.Net equivalent of Java's System.identityHashC

Posted 2019-07-13 20:21

Question:

Java's System.identityHashCode()

Returns the same hash code for the given object as would be returned by the default method hashCode(), whether or not the given object's class overrides hashCode().

That hash code is based on the object identity, so it will always be the same for the same object, no matter if the object is mutated between calls to identityHashCode().
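The difference between an overridden hashCode() and the identity hash code can be seen in a small sketch. The `Point` class and its field `x` are hypothetical, made up only for this illustration:

```java
// Hypothetical class for illustration: hashCode() depends on a mutable field.
final class Point {
    int x;

    Point(int x) { this.x = x; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x;
    }

    @Override
    public int hashCode() { return Integer.hashCode(x); }
}

class IdentityHashDemo {
    public static void main(String[] args) {
        Point p = new Point(1);
        int identityBefore = System.identityHashCode(p);
        int hashBefore = p.hashCode();

        p.x = 2; // mutate the object between calls

        // The overridden hashCode() follows the field value: prints true
        System.out.println(p.hashCode() != hashBefore);
        // The identity hash code ignores the override and the mutation: prints true
        System.out.println(System.identityHashCode(p) == identityBefore);

        // An equal but distinct object shares the overridden hash code: prints true
        Point q = new Point(2);
        System.out.println(p.equals(q) && p.hashCode() == q.hashCode());
        // The identity hash codes of p and q usually differ, but that is not guaranteed.
    }
}
```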

In addition, with some Java runtimes there will supposedly be no hash collisions between any two living objects. (This is an inaccurate statement by Oracle in the source quoted below, as Jai's answer shows and as another bug report also points out, which basically invalidates my original question.)

[...] garbage objects are readily reclaimed and the address space is reused. The collisons result from address space reuse. If the original object remains live (not GCed) then you will not encounter this problem.

Source

In .NET, there is RuntimeHelpers.GetHashCode(), which fulfills the first condition, but not the second:

Note that GetHashCode always returns identical hash codes for equal object references. However, the reverse is not true: equal hash codes do not indicate equal object references. A particular hash code value is not unique to a particular object reference; different object references can generate identical hash codes.

So is there anything like Java's identityHashCode() in .NET?

Edit:

It was suggested that this is the same as "Memory address of an object in C#". It is not: the memory address alone cannot be used here, because memory management moves objects around, so the address may change during the lifetime of an object.

Answer 1:

Currently, Java's Object#hashCode() and System#identityHashCode() do not guarantee that unique values are returned. There are already questions about this, and this is an example.

You mentioned a bug report which states that the collisions occurred because objects were garbage collected and the same memory address was reused. However, modifying the same test case proves otherwise:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IdentityHashCollisionTest
{
    public static void main(String[] args)
    {
        List<Object> allObjs = new ArrayList<>(); // Used to prevent GC
        Set<Integer> hashes = new HashSet<>(1024); // identity hash codes seen so far

        int colls = 0;
        for (int n = 0; n < 100000; n++)
        {
            // new Integer(int) is deprecated since Java 9; it is used here on purpose
            // because it always allocates a distinct object, unlike Integer.valueOf(88).
            Integer obj = new Integer(88);
            allObjs.add(obj); // keep a strong reference to prevent GC
            int ihash = System.identityHashCode(obj);
            // Set.add returns false if the value was already present
            if (!hashes.add(ihash))
            {
                System.err.println("System.identityHashCode() collision!");
                colls++;
            }
        }

        System.out.println("created 100000 different objects - "
                + colls
                + " times with the same value for System.identityHashCode()");

        System.out.println("Size of all objects is " + allObjs.size());
        System.out.println("Size of hashset of hash values is " + hashes.size());
    }
}

Result:

System.identityHashCode() collision!
System.identityHashCode() collision!
System.identityHashCode() collision!
created 100000 different objects - 3 times with the same value for System.identityHashCode()
Size of all objects is 100000
Size of hashset of hash values is 99997

The linked SO question also mentions that in some JRE implementations the collision rate is greatly reduced. However, it seems that no implementation has managed to prevent all collisions, so there is no way to ensure unique hash codes even in Java.

So don't rely on a single source. The person who wrote that comment is just one member of the Oracle team, and most likely not the person who designed this.

In both C# and Java, you would have to create your own unique number generator of some kind. The solution provided by NPras seems to do that for .NET.
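On the Java side, such a generator could look roughly like the following. This is only a minimal sketch: `IdentityIdGenerator` and `idOf` are made-up names, not JDK classes or methods. It relies on java.util.IdentityHashMap, which compares keys with == rather than equals(), so two objects that happen to share an identity hash code still receive different IDs.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper (not part of the JDK): hands out one unique ID per object.
final class IdentityIdGenerator {
    // IdentityHashMap compares keys by reference (==), so hash collisions are harmless.
    private final Map<Object, Long> ids = new IdentityHashMap<>();
    private long nextId = 1;

    synchronized long idOf(Object obj) {
        if (obj == null) {
            throw new NullPointerException("obj");
        }
        Long id = ids.get(obj);
        if (id == null) {
            id = nextId++;    // first time this object is seen: assign a fresh ID
            ids.put(obj, id); // note: this keeps a strong reference to obj
        }
        return id;
    }
}

class IdentityIdDemo {
    public static void main(String[] args) {
        IdentityIdGenerator gen = new IdentityIdGenerator();
        String a = new String("same");
        String b = new String("same"); // equal to a, but a different object

        System.out.println(a.equals(b));                // prints true
        System.out.println(gen.idOf(a) == gen.idOf(a)); // prints true: ID is stable
        System.out.println(gen.idOf(a) != gen.idOf(b)); // prints true: IDs are distinct
    }
}
```

The trade-off is that the map holds every object it has seen, so nothing registered with the generator can be garbage collected while the generator is alive.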



Answer 2:

I would refer you to the following answer from Eric Lippert (who was part of the C# language design & compiler team), in which he suggested using ObjectIDGenerator.

To generate unique ids for objects you could use the aptly named ObjectIDGenerator that we conveniently provide for you

Looking at the reference source (good thing they open-sourced the framework now), it does use RuntimeHelpers.GetHashCode() but also handles the potential collision by storing the references separately.

Do note his warning about object lifetime. If you need it for transient objects, he suggested reimplementing the generator, which is much easier now that you have access to the source.
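For readers coming from the Java side of the question, a generator that does not keep transient objects alive might be sketched as below. This is an assumption-laden illustration and not a port of ObjectIDGenerator: `WeakIdentityIdGenerator`, `Key` and `idOf` are hypothetical names. The idea is the same as described above (bucket by identity hash code, then compare references to resolve collisions), but the objects are held only through weak references.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of the JDK): unique IDs without keeping objects alive.
final class WeakIdentityIdGenerator {

    // Map key: hashes by identity hash code, compares referents by reference (==).
    private static final class Key extends WeakReference<Object> {
        private final int hash;

        Key(Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.hash = System.identityHashCode(referent); // cached, survives clearing
        }

        @Override
        public int hashCode() { return hash; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            Object mine = get();
            return mine != null && mine == ((Key) other).get();
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Map<Key, Long> ids = new HashMap<>();
    private long nextId = 1;

    synchronized long idOf(Object obj) {
        if (obj == null) {
            throw new NullPointerException("obj");
        }
        expungeStale();
        Long id = ids.get(new Key(obj, null)); // lookup-only key, not registered
        if (id == null) {
            id = nextId++;
            ids.put(new Key(obj, queue), id);
        }
        return id;
    }

    // Drops entries whose objects have already been garbage collected.
    private void expungeStale() {
        for (Object stale = queue.poll(); stale != null; stale = queue.poll()) {
            ids.remove(stale);
        }
    }
}
```

IDs are never reused here because the counter only grows, so an ID handed out for a collected object cannot later be confused with a new one. When stale entries actually disappear depends on the garbage collector.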



Tags: java c# .net hash