If I want to use extendible hashing to store a maximum of 100 records, then what is the minimum array size that I need?
I am guessing that an array of 100 would be sufficient, but I could be wrong. I also suspect that I can use a smaller array.
If I want to use extendible hashing to store a maximum of 100 records, then what is the minimum array size that I need?
I am guessing that an array of 100 would be sufficient, but I could be wrong. I also suspect that I can use a smaller array.
What do you know about your hash function?
You mentioned extendible hashing.
With extendible hashing you look at your hash as a bit string and typically implement the bucket lookup via a trie. Instead of a trie based lookup though I assume you are converting this to an index into your array.
You mentioned you will have at most 100 elements. If you wanted all distinct hashes you'd have 128 possibilities since that's the closest combination of bits with 7 bits.
If your hashing function can hash each element to have 7 of 7 (or more) different bits, then you have the most optimal solution with a bucket size of 1. Leaving 128 leaf nodes, or an array of size 128.
If your hashing function can hash each element to have 6 of 7 (or more) different bits, then you have a bucket size of 2. You would have 64 leaf nodes/combinations/array size.
If your hashing function can hash each element to have 5 of 7 (or more) different bits, then you have a bucket size of 4. You would have 32 leaf nodes/combinations/array size.
Since you said you want a bucket size of 4 I think your answer would be 32 and you have a hard requirement that you have a good hashing function that can give you at least 5 of the first bits as distinct.
I think it depends on whether you need high performance or saving storage. You can just save elements into an array of 100. I don't know a lot about extendible hashing, but my general understanding about hashing is that it will have some kinds of collision, and if you use a bigger array to store it, the number of collision can reduce and the performance in adding/deleting and querying will also be faster. I think you should use at least 128 (just to be 2^k, I am not an expert in hashing):)