Recently, in an interview, I was asked what exactly a bucket in a HashMap is: is it an array, an ArrayList, or something else?
I got confused. I know HashMaps are backed by arrays. So can I say that a bucket is an array with an initial capacity of 16 that stores hash codes, and that each element points to the start of a linked list?
I know how a HashMap works internally; I just want to know what exactly a bucket is in terms of data structures.
No, a bucket is each element of the array you are referring to. In earlier Java versions, each bucket held a linked list of map entries. Since Java 8, each bucket holds either a linked list of entries or, once it grows too large, a tree structure of entries.
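To make the "bucket = one array element" idea concrete, here is a minimal sketch of a chained hash table. TinyBucketMap and its methods are hypothetical names made up for illustration; this is not the java.util.HashMap source, and it leaves out resizing and tree bins entirely.

```java
import java.util.Objects;

// Hypothetical teaching class, NOT the real java.util.HashMap.
class TinyBucketMap<K, V> {

    // One entry in a bucket's singly linked list.
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // The backing array. Each element table[i] is one bucket:
    // null when empty, otherwise the head node of a chain.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Maps a key to a bucket index; works because the length is a power of two.
    private int indexFor(Object key) {
        int h = Objects.hashCode(key);
        return (h ^ (h >>> 16)) & (table.length - 1);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Simplification: prepend the new node to the chain.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    // Number of entries sharing the bucket that this key maps to.
    int chainLength(Object key) {
        int count = 0;
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            count++;
        }
        return count;
    }
}
```

For example, "Aa" and "BB" have the same String hash code, so after putting both keys, chainLength("Aa") reports two entries in a single bucket while get() still returns the right value for each.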
From the implementation notes in Java 8:
/*
* Implementation notes.
*
* This map usually acts as a binned (bucketed) hash table, but
* when bins get too large, they are transformed into bins of
* TreeNodes, each structured similarly to those in
* java.util.TreeMap. Most methods try to use normal bins, but
* relay to TreeNode methods when applicable (simply by checking
* instanceof a node). Bins of TreeNodes may be traversed and
* used like any others, but additionally support faster lookup
* when overpopulated. However, since the vast majority of bins in
* normal use are not overpopulated, checking for existence of
* tree bins may be delayed in the course of table methods.
...
I hope this helps you understand the HashMap implementation better.
To be exact, the buckets are the elements of an array of Nodes (java.util.HashMap.Node). Each array slot is a single bucket: it holds a reference to the first Node of that bucket, or null if the bucket is empty. The Nodes in a bucket form a singly linked list or, since Java 8, a red-black tree of TreeNodes similar to the one used by TreeMap. HashMap decides by itself which is better for performance. A bucket is converted to a tree only when many entries land in it (8 or more in the OpenJDK implementation, and only once the table has at least 64 buckets), which typically happens with a poorly designed hashCode() function.
This is how the bucket array is declared in HashMap:
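As a rough illustration of the poorly designed hashCode() case, the sketch below uses a hypothetical key class (BadKey, invented for this example) whose instances all return the same hash code, so every entry lands in one bucket. The map still behaves correctly. Whether that bucket is a list or a tree is an internal detail that cannot be observed through the public API, so the example only checks the results.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeys {

    // Hypothetical key with a deliberately poor hashCode(): every instance collides.
    // Implementing Comparable lets a tree bin order the colliding keys.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // constant on purpose
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Fills a real java.util.HashMap with n keys that all share one bucket.
    static Map<BadKey, String> build(int n) {
        Map<BadKey, String> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), "value-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, String> map = build(1000);
        System.out.println(map.size());
        System.out.println(map.get(new BadKey(777)));
    }
}
```

The design point is that correctness never depends on hash quality; only lookup cost does, and tree bins limit how bad that cost can get.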
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;
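Because the table length is always a power of two, a bucket index can be computed with a bit mask instead of a modulo. The sketch below mirrors the bit-spreading step used by HashMap in OpenJDK 8; the class name BucketIndex and the method names are made up for this example.

```java
class BucketIndex {

    // Same idea as the hash spreading in OpenJDK 8's HashMap:
    // XOR the high 16 bits into the low 16 bits so they influence the index.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Valid only when tableLength is a power of two:
    // (tableLength - 1) is then a mask of the low bits.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" have equal hash codes, so they share a bucket.
        System.out.println(indexFor("Aa", 16));
        System.out.println(indexFor("BB", 16));
        System.out.println(indexFor("hello", 16));
    }
}
```

Two keys end up in the same bucket whenever indexFor returns the same value for both, even if their full hash codes differ.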
In layman's terms, a bucket is a slot that groups together all the items whose hash leads to the same position. The term is not specific to Java: the same idea is used elsewhere, for example in hashed page tables in operating systems.
All entries whose hash codes map to the same index are stored in the same bucket. You can think of the bucket as the head of a linked list that is selected by the hash code value.
The linked list stores references to the key and value objects, not copies of them, and its head is the bucket that the hash code selects.
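The buckets are not created by the JVM itself: HashMap allocates the array on first use. Its size depends on the initial capacity (16 by default) and doubles whenever the number of entries exceeds the capacity times the load factor (0.75 by default). It does not depend directly on the memory allocated to the JVM.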
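As far as the size goes, the number of buckets in the OpenJDK implementation follows from the capacity and the load factor. The sketch below only models that arithmetic, assuming the defaults of 16 and 0.75; ResizeSchedule and its methods are hypothetical names, and the model ignores the extra resizes that treeification can trigger on small tables.

```java
class ResizeSchedule {

    static final int DEFAULT_CAPACITY = 16;      // default initial number of buckets
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // The table is doubled once the entry count exceeds this value.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Simplified model: capacity after inserting `entries` distinct keys
    // into a map created with the default settings.
    static int capacityFor(int entries) {
        int capacity = DEFAULT_CAPACITY;
        while (entries > threshold(capacity, DEFAULT_LOAD_FACTOR)) {
            capacity *= 2; // doubling keeps the length a power of two
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 12, 13, 100}) {
            System.out.println(n + " entries -> " + capacityFor(n) + " buckets");
        }
    }
}
```

Under this model the 13th entry is the one that pushes a default map from 16 to 32 buckets.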