I have some code utilizing standard Java collections: arrays, ArrayDeques, HashMaps, Lists, HashSets.
My code is expected to be deterministic: the hash codes of all the elements, the initial contents of the collections, and etc. are expected to depend only on the input data. The final output of the code is produced from a keySet()
of some HashMap.
I've noticed that the order of the elements in the resulting keySet sometimes changes from run to run. What does it mean?
- Some standard collections have non-deterministic behavior (e.g., hash codes of some internal objects are non-deterministic)?
- My own code is actually non-deterministic (e.g., I forgot to properly override hashCode() for some class)? This might mean I have a bug somewhere (that wasn't caught by the tests yet), that's why I'm concerned.
- Something else?
This happens quite consistently with JDK 1.7.0_60, x86 on Windows 7 x64. Allegedly, this does not happen (or happens rarely) with JDK 1.8.0_05. Also, the order of elements in the resulting keySet (and the overall order in which the data items are processed) changes when switching between the above JDKs.
I suspect, this is some feature of a HashSet, but could not trace it to a specific line of code.
UPD 1 I do not really need deterministic sets, and I know that HashSet provides no guarantees. I'm just trying to find the reason of non-deterministic behavior. If it is in library code - fine. But if it is in my code - I'll likely have to fix it. Hence the question.
UPD 2 Of course, I managed to find the answer right after posting the question. Just in the beginning of 1.7's HashMap.java:
/**
* A randomizing value associated with this instance that is applied to
* hash code of keys to make hash collisions harder to find. If 0 then
* alternative hashing is disabled.
*/
transient int hashSeed = 0;
In 1.8, this randomization doesn't seem to be there any more.
From the documentation of HashSet:
Just copying here what is already included in the question. A quick look through the sources showed that in 1.7, HashMap indeed has non-deterministic behavior, and each instance seeds the hashes of the elements with some random value. In 1.8, the implementation was changed, and the randomization doesn't seem to be there any more.
If you need a stable ordered HashSet, then you should be using
LinkedHashSet
per the javadoc,And per the
HashSet
javadoc,