可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I'm working with a large ArrayList<HashMap<A,B>>
, and I would repeatedly need to select a random key from a random HashMap (and do some stuff with it). Selecting the random HashMap is trivial, but how should I select a random key from within this HashMap?
Speed is important (as I need to do this 10000 times and the hashmaps are large), so just selecting a random number k in [0,9999] and then doing .next()
on the iterator k times, is really not an option. Similarly, converting the HashMap to an array or ArrayList on every random pick is really not an option. Please, read this before replying.
Technically I feel that this should be possible, since the HashMap stores its keys in an Entry[]
internally, and selecting at random from an array is easy, but I can't figure out how to access this Entry[]
. So any ideas to access the internal Entry[]
are more than welcome. Other solutions (as long as they don't consume linear time in the hashmap size) are also welcome of course.
Note: heuristics are fine, so if there's a method that excludes 1% of the elements (e.g. because of multi-filled buckets) that's no problem at all.
回答1:
from top of my head
List<A> keysAsArray = new ArrayList<A>(map.keySet())
Random r = new Random()
then just
map.get(keysAsArray.get(r.nextInt(keysAsArray.size()))
回答2:
You need access to the underlying entry table.
// defined staticly
Field table = HashMap.class.getDeclaredField("table");
table.setAccessible(true);
Random rand = new Random();
public Entry randomEntry(HashMap map) {
Entry[] entries = (Entry[]) table.get(map);
int start = rand.nextInt(entries.length);
for(int i=0;i<entries.length;i++) {
int idx = (start + i) % entries.length;
Entry entry = entries[idx];
if (entry != null) return entry;
}
return null;
}
This still has to traverse the entries to find one which is there so the worst case is O(n) but the typical behaviour is O(1).
回答3:
Sounds like you should consider either an ancillary List of keys or a real object, not a Map, to store in your list.
回答4:
I managed to find a solution without performance loss. I will post it here since it may help other people -- and potentially answer several open questions on this topic (I'll search for these later).
What you need is a second custom Set
-like data structure to store the keys -- not a list as some suggested here. Lists-like data structures are to expensive to remove items from. The operations needed are adding/removing elements in constant time (to keep it up-to-date with the HashMap) and a procedure to select the random element. The following class MySet
does exactly this
class MySet<A> {
ArrayList<A> contents = new ArrayList();
HashMap<A,Integer> indices = new HashMap<A,Integer>();
Random R = new Random();
//selects random element in constant time
A randomKey() {
return contents.get(R.nextInt(contents.size()));
}
//adds new element in constant time
void add(A a) {
indices.put(a,contents.size());
contents.add(a);
}
//removes element in constant time
void remove(A a) {
int index = indices.get(a);
contents.set(index,contents.get(contents.size()-1));
contents.remove(contents.size()-1);
indices.set(contents.get(contents.size()-1),index);
indices.remove(a);
}
}
回答5:
I'm assuming you are using HashMap
as you need to look something up at a later date?
If not the case, then just change your HashMap
to an Array
/ArrayList
.
If this is the case, why not store your objects in a Map
AND an ArrayList
so you can look up randomly or by key.
Alternatively, could you use a TreeMap
instead of HashMap
? I don't know what type your key is but you use TreeMap.floorKey()
in conjunction with some key randomizer.
回答6:
After spending some time, I came to the conclusion that you need to create a model which can be backed by a List<Map<A, B>>
and a List<A>
to maintain your keys. You need to keep the access of your List<Map<A, B>>
and List<A>
, just provide the operations/methods to the caller. In this way, you will have the full control over implementation, and the actual objects will be safer from external changes.
Btw, your questions lead me to,
- Why does the java.util.Set<V> interface not provide a get(Object o) method?, and
- Bimap: I was trying to be clever but, of course, its
values()
method also returns Set
.
This example, IndexedSet, may give you an idea about how-to.
[edited]
This class, SetUniqueList, might help you if you decide to create your own model. It explicitly states that it wraps the list
, not copies. So, I think, we can do something like,
List<A> list = new ArrayList(map.keySet());
SetUniqueList unikList = new SetUniqueList(list, map.keySet);
// Now unikList should reflect all the changes to the map keys
...
// Then you can do
unikList.get(i);
Note: I didn't try this myself. Will do that later (rushing to home).
回答7:
Since Java 8, there is an O(log(N)) approach with O(log(N)) additional memory: create a Spliterator
via map.entrySet().spliterator()
, make log(map.size()) trySplit()
calls and choose either the first or the second half randomly. When there are say less than 10 elements left in a Spliterator
, dump them into a list and make a random pick.
回答8:
If you absolutely need to access the Entry array in HashMap, you can use reflection. But then your program will be dependent on that concrete implementation of HashMap.
As proposed, you can keep a separate list of keys for each map. You would not keep deep copies of the keys, so the actual memory denormalisation wouldn't be that big.
Third approach is to implement your own Map implementation, the one that keeps keys in a list instead of a set.
回答9:
How about wrapping HashMap in another implementation of Map? The other map maintains a List, and on put() it does:
if (inner.put(key, value) == null) listOfKeys.add(key);
(I assume that nulls for values aren't permitted, if they are use containsKey, but that's slower)