What values should I pass to create an efficient HashMap
/ HashMap
based structures for N items?
In an ArrayList
, the efficient number is N (N already assumes future grow). What should be the parameters for a HashMap
? ((int)(N * 0.75d), 0.75d)? More? Less? What is the effect of changing the load factor?
The answer Yuval gave is only correct for Hashtable. HashMap uses power-of-two buckets, so for HashMap, Zarkonnen is actually correct. You can verify this from the source code:
So, although the load factor of 0.75f is still the same between Hashtable and HashMap, you should use an initial capacity n*2 where n is the number of elements you plan on storing in the HashMap. This will ensure the fastest get/put speeds.
Regarding the load factor, I'll simply quote from the HashMap javadoc:
Meaning, the load factor should not be changed from
.75
, unless you have some specific optimization you are going to do. Initial capacity is the only thing you want to change, and set it according to yourN
value - meaning(N / 0.75) + 1
, or something in that area. This will ensure that the table will always be large enough and no rehashing will occur.For very large HashMaps in critical systems, where getting the initial capacity wrong can be very problematic, you may need empirical information to determine how best to initialize your Map.
CollectionSpy (collectionspy.com) is a new Java profiler which lets you see in the blink of an eye which HashMaps are close to needing rehashing, how many times they have been rehashed in the past, and more. An ideal tool to determine safe initial capacity arguments to capacity-based container constructors.