I would like to flatten a Map which associates an Integer key to a list of Strings, without losing the key mapping. I am curious as to whether it is possible and useful to do so with streams and lambdas.
We start with something like this:
Map<Integer, List<String>> mapFrom = new HashMap<>();
Let's assume that mapFrom is populated somewhere, and looks like:
1: a,b,c
2: d,e,f
etc.
Let's also assume that the values in the lists are unique.
Now, I want to "unfold" it to get a second map like:
a: 1
b: 1
c: 1
d: 2
e: 2
f: 2
etc.
I could do it like this (or very similarly, using forEach):
Map<String, Integer> mapTo = new HashMap<>();
for (Map.Entry<Integer, List<String>> entry: mapFrom.entrySet()) {
for (String s: entry.getValue()) {
mapTo.put(s, entry.getKey());
}
}
Now let's assume that I want to use lambdas instead of nested for loops. I would probably do something like this:
Map<String, Integer> mapTo = mapFrom.entrySet().stream().map(e -> {
e.getValue().stream().?
// Here I can iterate on each List,
// but my best try would only give me a flat map for each key,
// that I wouldn't know how to flatten.
}).collect(Collectors.toMap(/*A String value*/,/*An Integer key*/))
I also gave a try to flatMap
, but I don't think that it is the right way to go, because although it helps me get rid of the dimensionality issue, I lose the key in the process.
In a nutshell, my two questions are:
- Is it possible to use streams and lambdas to achieve this?
- Is it useful (performance, readability) to do so?
You should use flatMap as follows. SimpleImmutableEntry is a nested class in AbstractMap. This should work. Please note that you may lose some entries: if the same String value appears under more than one key, the duplicates collide as keys in the new map.
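The code for this answer was not preserved; a minimal sketch of the flatMap approach it describes, using the question's sample data, might look like this:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlattenExample {
    public static void main(String[] args) {
        Map<Integer, List<String>> mapFrom = new HashMap<>();
        mapFrom.put(1, Arrays.asList("a", "b", "c"));
        mapFrom.put(2, Arrays.asList("d", "e", "f"));

        // Flatten each (key, list) entry into a stream of (value, key) pairs,
        // then collect the pairs into the inverted map.
        Map<String, Integer> mapTo = mapFrom.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> new SimpleImmutableEntry<>(v, e.getKey())))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

        System.out.println(mapTo);
    }
}
```

Note that Collectors.toMap without a merge function throws an IllegalStateException on duplicate keys, which is why unique list values matter here.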
You need to use flatMap to flatten the values into a new stream, but since you still need the original keys for collecting into a Map, you have to map to a temporary object holding both key and value. The Map.Entry is a stand-in for the nonexistent tuple type; any other type capable of holding two objects of different types would be sufficient. An alternative not requiring these temporary objects is a custom collector.
This differs from toMap in overwriting duplicate keys silently, whereas toMap without a merge function will throw an exception if there is a duplicate key. Basically, this custom collector is a parallel-capable variant of the simple nested forEach solution shown in the question. But note that this task wouldn't benefit from parallel processing, even with a very large input map. Only if there were additional computationally intensive steps within the stream pipeline that could benefit from SMP would there be a chance of gaining anything from parallel streams. So perhaps the concise, sequential Collection API solution is preferable.
Hope this does it in the simplest way. :))
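The code for this last answer is also missing; assuming it refers to the concise Collection API approach mentioned in the previous answer, a sketch would be:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimplestFlatten {
    public static void main(String[] args) {
        Map<Integer, List<String>> mapFrom = new HashMap<>();
        mapFrom.put(1, Arrays.asList("a", "b", "c"));
        mapFrom.put(2, Arrays.asList("d", "e", "f"));

        // Two nested forEach calls with lambdas: no streams,
        // no temporary entry objects.
        Map<String, Integer> mapTo = new HashMap<>();
        mapFrom.forEach((k, list) -> list.forEach(v -> mapTo.put(v, k)));

        System.out.println(mapTo);
    }
}
```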