Java 8 stream.collect( … groupingBy ( … mapping( …

Posted 2020-07-07 10:53

Question:

I played around with a solution using groupingBy, mapping and reducing to the following question: Elegantly create map with object fields as key/value from object stream in Java 8. In short, the goal was to get a map with the age as key and the hobbies of all persons of that age as a Set.

One of the solutions I came up with (not nice, but that's not the point) had a strange behaviour.

With the following list as input:

List<Person> personList = Arrays.asList(
     new Person(/* name */ "A", /* age */ 23, /* hobbies */ asList("a")),
     new Person("BC", 24, asList("b", "c")),
     new Person("D", 23, asList("d")),
     new Person("E", 23, asList("e"))
);

and the following solution:

Collector<List<String>, ?, Set<String>> listToSetReducer = Collectors.reducing(new HashSet<>(), HashSet::new, (strings, strings2) -> {
  strings.addAll(strings2);
  return strings;
});
Map<Integer, Set<String>> map = personList.stream()
                                          .collect(Collectors.groupingBy(o -> o.age, 
                                                                         Collectors.mapping(o -> o.hobbies, listToSetReducer)));
System.out.println("map = " + map);

I got:

map = {23=[a, b, c, d, e], 24=[a, b, c, d, e]}

That is clearly not what I was expecting. I expected this instead:

map = {23=[a, d, e], 24=[b, c]}

Now if I just swap the parameters of the binary operator (of the reducing collector) from (strings, strings2) to (strings2, strings), I get the expected result. So what did I miss here? Did I misinterpret the reducing collector? Or which piece of documentation did I miss that makes it obvious that my usage would not work as expected?

Java version is 1.8.0_121 if that matters.

Answer 1:

A reduction must never modify the incoming objects. In your case, you are modifying the incoming HashSet that is supposed to be the identity value and returning it, so all groups end up with the same HashSet instance as their result, containing all values.
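To make the shared instance visible, here is a minimal, self-contained sketch. It is not code from the question: the Person class is a hypothetical stand-in, and group, MUTATING and NON_MUTATING are names made up for illustration. It runs the same reducing collector once with a merge function that mutates its left argument and once with one that returns a fresh set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class SharedIdentityDemo {

    // Hypothetical stand-in for the Person class from the question.
    static class Person {
        final String name;
        final int age;
        final List<String> hobbies;

        Person(String name, int age, List<String> hobbies) {
            this.name = name;
            this.age = age;
            this.hobbies = hobbies;
        }
    }

    static List<Person> people() {
        return Arrays.asList(
            new Person("A", 23, Arrays.asList("a")),
            new Person("BC", 24, Arrays.asList("b", "c")),
            new Person("D", 23, Arrays.asList("d")),
            new Person("E", 23, Arrays.asList("e")));
    }

    // Mutates its left argument, which is the identity object on the first call per group.
    static final BinaryOperator<Set<String>> MUTATING = (left, right) -> {
        left.addAll(right);
        return left;
    };

    // Leaves both arguments untouched and returns a fresh set.
    static final BinaryOperator<Set<String>> NON_MUTATING = (left, right) -> {
        Set<String> merged = new HashSet<>(left);
        merged.addAll(right);
        return merged;
    };

    // Groups hobbies by age with Collectors.reducing and the given merge function.
    // A single identity HashSet is created here and handed to every group.
    static Map<Integer, Set<String>> group(List<Person> people,
                                           BinaryOperator<Set<String>> merge) {
        Collector<List<String>, ?, Set<String>> reducer =
            Collectors.reducing(new HashSet<>(), HashSet::new, merge);
        return people.stream().collect(
            Collectors.groupingBy(p -> p.age, Collectors.mapping(p -> p.hobbies, reducer)));
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> broken = group(people(), MUTATING);
        // With the mutating merge, both keys refer to the very same HashSet instance.
        System.out.println("same instance: " + (broken.get(23) == broken.get(24)));
        System.out.println("broken = " + broken);

        Map<Integer, Set<String>> fixed = group(people(), NON_MUTATING);
        System.out.println("fixed  = " + fixed);
    }
}
```

This also suggests why swapping the parameters appeared to help: the set being mutated is then the fresh copy created by HashSet::new rather than the identity. That variant still modifies one of its arguments, so it only works because of the way the collector happens to call the function and should not be relied on.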

What you need is a mutable reduction, which can be implemented via Collector.of(…), in the same way the prebuilt collectors Collectors.toList(), Collectors.toSet(), etc. already perform one.

Map<Integer, Set<String>> map = personList.stream()
    .collect(Collectors.groupingBy(o -> o.age,
        Collector.of(HashSet::new, (s,p) -> s.addAll(p.hobbies), (s1,s2) -> {
            s1.addAll(s2);
            return s1;
        })));
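If the same pattern is needed in more than one place, the Collector.of(…) call could be wrapped in a small helper. The following is only a sketch under stated assumptions: flatteningToSet is a hypothetical name (it is not a JDK method), and Person is a stand-in for the class from the question. The point it illustrates is that the supplier runs once per group, so every group gets its own HashSet.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MutableReductionDemo {

    // Hypothetical stand-in for the Person class from the question.
    static class Person {
        final String name;
        final int age;
        final List<String> hobbies;

        Person(String name, int age, List<String> hobbies) {
            this.name = name;
            this.age = age;
            this.hobbies = hobbies;
        }
    }

    // Hypothetical helper (not part of the JDK): collects the elements of every
    // mapped collection into a single HashSet via mutable reduction.
    static <T, U> Collector<T, ?, Set<U>> flatteningToSet(
            Function<? super T, ? extends Collection<? extends U>> mapper) {
        return Collector.of(
            HashSet::new,                                            // fresh container per group
            (set, item) -> set.addAll(mapper.apply(item)),           // fold one element in
            (left, right) -> { left.addAll(right); return left; });  // merge partial results
    }

    static Map<Integer, Set<String>> hobbiesByAge(List<Person> people) {
        // Explicitly typed local variable keeps type inference simple.
        Collector<Person, ?, Set<String>> hobbiesToSet = flatteningToSet(p -> p.hobbies);
        return people.stream().collect(Collectors.groupingBy(p -> p.age, hobbiesToSet));
    }

    public static void main(String[] args) {
        List<Person> personList = Arrays.asList(
            new Person("A", 23, Arrays.asList("a")),
            new Person("BC", 24, Arrays.asList("b", "c")),
            new Person("D", 23, Arrays.asList("d")),
            new Person("E", 23, Arrays.asList("e")));
        System.out.println("map = " + hobbiesByAge(personList));
    }
}
```

Mutating the container is fine here because a collector's accumulator and combiner are expected to modify the container created by the supplier, unlike the function passed to Collectors.reducing.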

The reason we need a custom collector at all is that Java 8 doesn’t have the flatMapping collector, which was introduced in Java 9. With that, the solution looks like:

Map<Integer, Set<String>> map = personList.stream()
    .collect(Collectors.groupingBy(o -> o.age,
        Collectors.flatMapping(p -> p.hobbies.stream(), Collectors.toSet())));