I'm working with a large ArrayList<HashMap<A,B>>
, and I would repeatedly need to select a random key from a random HashMap (and do some stuff with it). Selecting the random HashMap is trivial, but how should I select a random key from within this HashMap?
Speed is important (as I need to do this 10000 times and the hashmaps are large), so just selecting a random number k in [0,9999] and then doing .next()
on the iterator k times, is really not an option. Similarly, converting the HashMap to an array or ArrayList on every random pick is really not an option. Please, read this before replying.
Technically I feel that this should be possible, since the HashMap stores its keys in an Entry[]
internally, and selecting at random from an array is easy, but I can't figure out how to access this Entry[]
. So any ideas to access the internal Entry[]
are more than welcome. Other solutions (as long as they don't consume linear time in the hashmap size) are also welcome of course.
Note: heuristics are fine, so if there's a method that excludes 1% of the elements (e.g. because of multi-filled buckets) that's no problem at all.
I'm assuming you are using
HashMap
as you need to look something up at a later date?If not the case, then just change your
HashMap
to anArray
/ArrayList
.If this is the case, why not store your objects in a
Map
AND anArrayList
so you can look up randomly or by key.Alternatively, could you use a
TreeMap
instead ofHashMap
? I don't know what type your key is but you useTreeMap.floorKey()
in conjunction with some key randomizer.You need access to the underlying entry table.
This still has to traverse the entries to find one which is there so the worst case is O(n) but the typical behaviour is O(1).
I managed to find a solution without performance loss. I will post it here since it may help other people -- and potentially answer several open questions on this topic (I'll search for these later).
What you need is a second custom
Set
-like data structure to store the keys -- not a list as some suggested here. Lists-like data structures are to expensive to remove items from. The operations needed are adding/removing elements in constant time (to keep it up-to-date with the HashMap) and a procedure to select the random element. The following classMySet
does exactly thisSounds like you should consider either an ancillary List of keys or a real object, not a Map, to store in your list.
Since Java 8, there is an O(log(N)) approach with O(log(N)) additional memory: create a
Spliterator
viamap.entrySet().spliterator()
, make log(map.size())trySplit()
calls and choose either the first or the second half randomly. When there are say less than 10 elements left in aSpliterator
, dump them into a list and make a random pick.from top of my head
then just