Why does HashMap not have an ensureCapacity() method?

Posted 2019-04-12 12:03

Question:

ArrayList and HashMap both have constructors that set the initial capacity, yet only ArrayList provides ensureCapacity() to grow the internal array ahead of time when a large number of elements is about to be inserted. The same situation can arise with a HashMap. So why does HashMap not have an ensureCapacity() method that would get the buckets ready in advance?

Answer 1:

The short answer is, it's not very useful.

Structures like ArrayList and HashMap have a concept of capacity, which is the length of some internal array that isn't directly visible to users. The capacity is distinct from the size, which is the number of elements or entries logically contained in the structure.
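As a minimal sketch of that distinction (the class and method names here are only illustrative), a freshly constructed collection can have a large capacity while its size is still zero:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CapacityVsSize {
    // Backing array has room for 1000 elements, but the list holds none yet.
    static List<String> presizedList() {
        return new ArrayList<>(1_000);
    }

    // Bucket table is sized from the requested capacity, but there are no entries yet.
    static Map<String, Integer> presizedMap() {
        return new HashMap<>(1_000);
    }

    public static void main(String[] args) {
        System.out.println(presizedList().size()); // prints 0
        System.out.println(presizedMap().size());  // prints 0
    }
}
```

Neither class exposes a public method to read the capacity back, which fits the point that it is an internal detail.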

The word "capacity" is really a misnomer, as it doesn't actually represent any limit that's significant to users. It's an implementation detail. As elements or entries are added, internal arrays will be resized automatically and transparently. There are no semantics to changing the capacity. You can't tell whether a call to ensureCapacity() has actually changed the capacity, and if it did change the capacity, the list or map is still equal to anything it was equal to before.
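A small sketch of what "no semantics" means in practice (the class name is made up for this example): after calling ensureCapacity(), the list still has the same size, the same hash code, and is equal to an untouched copy.

```java
import java.util.ArrayList;
import java.util.Arrays;

class EnsureCapacityIsInvisible {
    // Returns true if ensureCapacity() left every observable property unchanged.
    static boolean unchangedAfterEnsureCapacity() {
        ArrayList<Integer> original = new ArrayList<>(Arrays.asList(1, 2, 3));
        ArrayList<Integer> copy = new ArrayList<>(original);

        // Grows the internal array (if needed); adds no elements.
        original.ensureCapacity(10_000);

        return original.equals(copy)
                && original.size() == 3
                && original.hashCode() == copy.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(unchangedAfterEnsureCapacity()); // prints true
    }
}
```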

The reason for having the notion of capacity in the API at all is to improve performance: it helps avoid the overhead of repeated resizing in cases where the user knows that a lot of elements are going to be added. The most common case for this is at construction time, where it's most likely that you'll know how many elements you're going to add.
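For HashMap, presizing at construction time might look like the sketch below. It assumes the default load factor of 0.75 and relies on the documented rule that the table is rehashed once the number of entries exceeds capacity times load factor. The helper capacityFor is hypothetical, not a JDK method; on JDK 19 and later, HashMap.newHashMap(int) serves the same purpose.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMap {
    // Hypothetical helper: an initial capacity large enough that a HashMap
    // with the default load factor (0.75) can hold expectedEntries without resizing.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    // Builds a map that is sized once, up front, for the known number of entries.
    static Map<Integer, String> build(int expectedEntries) {
        Map<Integer, String> map = new HashMap<>(capacityFor(expectedEntries));
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "value-" + i);
        }
        return map;
    }
}
```

Passing the expected entry count directly as the initial capacity would still work, but the map would typically resize once more before reaching that count.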

Note that the batch addition methods (addAll or putAll) will look at the size of what's about to be added, and do any necessary resizing of the destination once.
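A brief sketch of the batch style, with illustrative names. The destination works out any growth from the size of the incoming collection, so the caller does not manage capacity at all:

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;

class BatchAdd {
    // Appends everything in one call; the list sizes itself for the whole batch.
    static List<Integer> appendBatch(List<Integer> destination, Collection<Integer> extra) {
        destination.addAll(extra);
        return destination;
    }

    // Copies all mappings in one call; the map accounts for the batch size itself.
    static Map<String, Integer> mergeBatch(Map<String, Integer> destination,
                                           Map<String, Integer> extra) {
        destination.putAll(extra);
        return destination;
    }
}
```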

You'd call ArrayList.ensureCapacity() if you have an existing list that you want to add a lot of elements to; you have a good idea of how many are going to be added; you have to add them one at a time, instead of in a batch; and your application is so performance-sensitive that you have to avoid multiple resizings. This seems pretty rare.
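That rare case could be sketched as follows. The method name appendSquares and the squares themselves are placeholders for whatever produces elements one at a time:

```java
import java.util.ArrayList;

class IncrementalAppend {
    // Adds `count` elements one by one to an existing list, growing the
    // backing array at most once beforehand instead of repeatedly during the loop.
    static ArrayList<Integer> appendSquares(ArrayList<Integer> list, int count) {
        list.ensureCapacity(list.size() + count);
        for (int i = 0; i < count; i++) {
            list.add(i * i);
        }
        return list;
    }
}
```

Removing the ensureCapacity() call changes nothing about the result, only how many times the internal array may be reallocated.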

One could imagine an API HashMap.ensureCapacity(). If necessary, it would resize the internal table, and then rehash all the elements into the buckets of this table. This would help avoid repeated resizes/rehashes if a lot of entries were added in the future. This is semantically a reasonable thing to do, but the number of cases where it's really useful seems quite small.
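Since no such method exists, the closest approximation with the public API is to copy the entries into a new, larger map. The sketch below is only that, an approximation: withRoomFor is a hypothetical helper, it assumes the default load factor of 0.75, it returns a new map rather than resizing the existing one in place, and it ignores overflow for very large sizes.

```java
import java.util.HashMap;
import java.util.Map;

class MapEnsureCapacity {
    // Hypothetical stand-in for the imagined HashMap.ensureCapacity():
    // returns a copy of `source` sized so that `additionalEntries` more
    // mappings can be added without another resize (default load factor assumed).
    static <K, V> HashMap<K, V> withRoomFor(Map<K, V> source, int additionalEntries) {
        int expected = source.size() + additionalEntries;
        int initialCapacity = (int) Math.ceil(expected / 0.75);
        HashMap<K, V> copy = new HashMap<>(initialCapacity);
        copy.putAll(source); // rehashes every existing entry once, into the larger table
        return copy;
    }
}
```

The copy costs one full rehash, which is the same work the imagined method would have to do internally.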

The bottom line is that HashMap.ensureCapacity() could be added, but there's little enough use for it that it's never been a priority to add it.



Answer 2:

HashMap is fundamentally different to ArrayList.

The number of 'buckets' in an ArrayList is just the size of the backing array; when it's full, it's full.

The number of 'buckets' in a HashMap is not really a good indication of how many objects it can store, given that multiple objects can hash to the same bucket, and the collision resolution strategy that Java employs is chaining (i.e. creating a linked list, or similar, for the bucket if multiple objects hash there). (citation required!)
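To illustrate keys sharing a bucket, here is a small sketch using the strings "Aa" and "BB", which have the same hashCode() (2112) in Java. Both are stored and retrievable even though they necessarily land in the same bucket. The class name is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeys {
    // Stores two distinct keys whose hash codes are identical.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true
        System.out.println(build().size());                     // prints 2
    }
}
```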

Ensuring a certain number of buckets in a HashMap doesn't ensure that you will fill them all before reaching a load factor that reduces your performance intolerably. So the load factor is a better way to get the performance you want.
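The load factor can be chosen through the two-argument constructor HashMap(int initialCapacity, float loadFactor). In the sketch below, the values 64 and 0.5f are arbitrary illustrations: with them, the map resizes once it holds more than 32 entries, trading memory for sparser buckets.

```java
import java.util.HashMap;
import java.util.Map;

class LoadFactorChoice {
    // A map that resizes earlier than the default (load factor 0.5 instead of 0.75).
    static Map<Integer, String> sparseMap() {
        return new HashMap<>(64, 0.5f);
    }

    // Fills the map past its first resize threshold; contents are unaffected.
    static Map<Integer, String> filled(int entries) {
        Map<Integer, String> map = sparseMap();
        for (int i = 0; i < entries; i++) {
            map.put(i, "v" + i);
        }
        return map;
    }
}
```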