Limit groupBy in Java 8

Posted 2019-04-04 11:00

Question:

How can I limit the size of each group when grouping with groupingBy?

For example (based on this example: stream groupBy):

studentClasses.add(new StudentClass("Kumar", 101, "Intro to Web"));
studentClasses.add(new StudentClass("White", 102, "Advanced Java"));
studentClasses.add(new StudentClass("Kumar", 101, "Intro to Cobol"));
studentClasses.add(new StudentClass("White", 101, "Intro to Web"));
studentClasses.add(new StudentClass("White", 102, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 106, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 103, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 104, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 105, "Advanced Web"));

This returns a simple grouping:

   Map<String, List<StudentClass>> groupByTeachers = studentClasses
            .stream().collect(
                    Collectors.groupingBy(StudentClass::getTeacher));

What if I want to limit the returned collections? Let's assume I want only the first N classes for every teacher. How can it be done?

Answer 1:

It would be possible to introduce a new collector that limits the number of elements in the resulting list.

This collector retains the first elements of each list (in encounter order). Once the limit has been reached, the accumulator ignores any further elements, and the combiner copies over only as many elements as still fit. The combiner code is a little tricky, but it has the advantage that no additional elements are added only to be thrown away later.

private static <T> Collector<T, ?, List<T>> limitingList(int limit) {
    return Collector.of(
                ArrayList::new, 
                (l, e) -> { if (l.size() < limit) l.add(e); }, 
                (l1, l2) -> {
                    l1.addAll(l2.subList(0, Math.min(l2.size(), Math.max(0, limit - l1.size()))));
                    return l1;
                }
           );
}

And then use it like this:

Map<String, List<StudentClass>> groupByTeachers = 
       studentClasses.stream()
                     .collect(groupingBy(
                          StudentClass::getTeacher,
                          limitingList(2)
                     ));
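To try this end to end, here is a minimal self-contained sketch. The StudentClass below is a hypothetical stand-in (the question only shows its constructor and getTeacher(), so the field names and the meaning of the int are assumptions), and firstPerTeacher and sampleData are just illustrative helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class LimitingListDemo {

    // Hypothetical stand-in for the question's StudentClass.
    static final class StudentClass {
        private final String teacher;
        private final int number;   // assumed meaning: some class or room number
        private final String title;

        StudentClass(String teacher, int number, String title) {
            this.teacher = teacher;
            this.number = number;
            this.title = title;
        }

        String getTeacher() { return teacher; }
        int getNumber() { return number; }
        String getTitle() { return title; }
    }

    // Keeps at most `limit` elements per list, in encounter order.
    static <T> Collector<T, ?, List<T>> limitingList(int limit) {
        return Collector.<T, List<T>>of(
                ArrayList::new,
                (l, e) -> { if (l.size() < limit) l.add(e); },
                (l1, l2) -> {
                    // Copy only as many elements of l2 as still fit into l1.
                    l1.addAll(l2.subList(0, Math.min(l2.size(), Math.max(0, limit - l1.size()))));
                    return l1;
                });
    }

    // The sample data from the question.
    static List<StudentClass> sampleData() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("White", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Web"),
                new StudentClass("Sargent", 106, "Advanced Web"),
                new StudentClass("Sargent", 103, "Advanced Web"),
                new StudentClass("Sargent", 104, "Advanced Web"),
                new StudentClass("Sargent", 105, "Advanced Web"));
    }

    // Groups by teacher and keeps only the first n classes of each teacher.
    static Map<String, List<StudentClass>> firstPerTeacher(List<StudentClass> classes, int n) {
        return classes.stream()
                .collect(Collectors.groupingBy(StudentClass::getTeacher, limitingList(n)));
    }

    public static void main(String[] args) {
        firstPerTeacher(sampleData(), 2)
                .forEach((teacher, list) -> System.out.println(teacher + " -> " + list.size() + " classes"));
    }
}
```

With a limit of 2, every teacher in the sample data ends up with exactly two entries, and Sargent keeps the two that appear first in the source list.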


Answer 2:

You could use collectingAndThen to define a finisher operation on the resulting list. This way you can limit, filter, sort, ... each list once it has been collected:

int limit = 2;

Map<String, List<StudentClass>> groupByTeachers =
    studentClasses.stream()
                  .collect(
                       groupingBy(
                           StudentClass::getTeacher,
                           collectingAndThen(
                               toList(),
                               l -> l.stream().limit(limit).collect(toList()))));
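Since the finisher is an ordinary function on the collected list, sorting and limiting can be combined. The following is only a sketch under assumptions: the StudentClass is a hypothetical stand-in with an assumed getTitle() accessor, and firstByTitle is an illustrative name.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FinisherDemo {

    // Hypothetical stand-in for the question's StudentClass.
    static final class StudentClass {
        private final String teacher;
        private final int number;   // assumed meaning: some class or room number
        private final String title;

        StudentClass(String teacher, int number, String title) {
            this.teacher = teacher;
            this.number = number;
            this.title = title;
        }

        String getTeacher() { return teacher; }
        int getNumber() { return number; }
        String getTitle() { return title; }
    }

    // The sample data from the question.
    static List<StudentClass> sampleData() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("White", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Web"),
                new StudentClass("Sargent", 106, "Advanced Web"),
                new StudentClass("Sargent", 103, "Advanced Web"),
                new StudentClass("Sargent", 104, "Advanced Web"),
                new StudentClass("Sargent", 105, "Advanced Web"));
    }

    // Per teacher: sort the collected classes by title, then keep the first `limit`.
    static Map<String, List<StudentClass>> firstByTitle(List<StudentClass> classes, int limit) {
        return classes.stream().collect(
                Collectors.groupingBy(
                        StudentClass::getTeacher,
                        Collectors.collectingAndThen(
                                Collectors.toList(),
                                l -> l.stream()
                                      .sorted(Comparator.comparing(StudentClass::getTitle))
                                      .limit(limit)
                                      .collect(Collectors.toList()))));
    }

    public static void main(String[] args) {
        firstByTitle(sampleData(), 2).forEach((teacher, list) ->
                list.forEach(c -> System.out.println(teacher + ": " + c.getTitle())));
    }
}
```

Keep in mind that the finisher only runs after each teacher's full list has been built, so this trades some memory for flexibility.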


Answer 3:

For this you need to stream the entries of the Map you get from the grouping. You can do it like this:

// Part that comes from your example
Map<String, List<StudentClass>> groupByTeachers = studentClasses
            .stream().collect(
                    Collectors.groupingBy(StudentClass::getTeacher));

// Create a new stream and limit the result
groupByTeachers =
    groupByTeachers.entrySet().stream()
        .limit(N) // The actual limit
        .collect(Collectors.toMap(
            e -> e.getKey(),
            e -> e.getValue()
        ));

This isn't the most efficient way to do it, but if you call .limit() on the initial list, the grouping results will be incorrect. Limiting after the grouping is the safest way to guarantee the limit.
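To illustrate why limiting the initial list goes wrong, here is a small sketch that compares both orders of operation on the question's data. The StudentClass is a hypothetical stand-in, and the two method names are made up for the comparison.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class LimitBeforeGroupingDemo {

    // Hypothetical stand-in for the question's StudentClass.
    static final class StudentClass {
        private final String teacher;
        private final int number;   // assumed meaning: some class or room number
        private final String title;

        StudentClass(String teacher, int number, String title) {
            this.teacher = teacher;
            this.number = number;
            this.title = title;
        }

        String getTeacher() { return teacher; }
    }

    // The sample data from the question.
    static List<StudentClass> sampleData() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("White", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Web"),
                new StudentClass("Sargent", 106, "Advanced Web"),
                new StudentClass("Sargent", 103, "Advanced Web"),
                new StudentClass("Sargent", 104, "Advanced Web"),
                new StudentClass("Sargent", 105, "Advanced Web"));
    }

    // Wrong for this purpose: cuts the source to n elements in total, then groups.
    static Map<String, List<StudentClass>> limitSourceFirst(List<StudentClass> classes, int n) {
        return classes.stream()
                .limit(n)
                .collect(Collectors.groupingBy(StudentClass::getTeacher));
    }

    // Groups everything first, then trims each teacher's list to n elements.
    static Map<String, List<StudentClass>> limitPerTeacher(List<StudentClass> classes, int n) {
        return classes.stream().collect(
                Collectors.groupingBy(
                        StudentClass::getTeacher,
                        Collectors.collectingAndThen(
                                Collectors.toList(),
                                l -> l.stream().limit(n).collect(Collectors.toList()))));
    }

    public static void main(String[] args) {
        System.out.println("limit first:  " + limitSourceFirst(sampleData(), 3).keySet());
        System.out.println("group first:  " + limitPerTeacher(sampleData(), 3).keySet());
    }
}
```

With n = 3, limiting the source keeps only the first three rows overall, so Sargent disappears entirely and White is left with a single class, whereas grouping first keeps up to three classes for every teacher.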

EDIT:

As stated in the comments, this limits the number of teachers, not the number of classes per teacher. In that case you can do:

groupByTeachers =
        groupByTeachers.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> e.getValue().stream().limit(N).collect(Collectors.toList()) // Limit the classes PER teacher
            ));


Answer 4:

This would give you the desired result, but it still collects every element of the stream into its group before truncating:

final int N = 10;
final HashMap<String, List<StudentClass>> groupByTeachers = 
        studentClasses.stream().collect(
            groupingBy(StudentClass::getTeacher, HashMap::new,
                collectingAndThen(toList(), list -> list.subList(0, Math.min(list.size(), N)))));
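One thing to be aware of: subList returns a view backed by the full list, so the truncated elements stay reachable for as long as the view is. If that matters, a possible variant (my own sketch, not part of the answer above) copies the prefix into a fresh ArrayList. The StudentClass here is a hypothetical stand-in and firstN is an illustrative name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SubListCopyDemo {

    // Hypothetical stand-in for the question's StudentClass.
    static final class StudentClass {
        private final String teacher;
        private final int number;   // assumed meaning: some class or room number
        private final String title;

        StudentClass(String teacher, int number, String title) {
            this.teacher = teacher;
            this.number = number;
            this.title = title;
        }

        String getTeacher() { return teacher; }
        int getNumber() { return number; }
    }

    // The sample data from the question.
    static List<StudentClass> sampleData() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("White", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Web"),
                new StudentClass("Sargent", 106, "Advanced Web"),
                new StudentClass("Sargent", 103, "Advanced Web"),
                new StudentClass("Sargent", 104, "Advanced Web"),
                new StudentClass("Sargent", 105, "Advanced Web"));
    }

    // Groups by teacher, then copies the first n elements of each group
    // into a new list so the full group list is no longer referenced.
    static Map<String, List<StudentClass>> firstN(List<StudentClass> classes, int n) {
        return classes.stream().collect(
                Collectors.groupingBy(
                        StudentClass::getTeacher,
                        Collectors.collectingAndThen(
                                Collectors.toList(),
                                list -> new ArrayList<StudentClass>(
                                        list.subList(0, Math.min(list.size(), n))))));
    }

    public static void main(String[] args) {
        firstN(sampleData(), 2)
                .forEach((teacher, list) -> System.out.println(teacher + " -> " + list.size()));
    }
}
```

The copy costs one extra allocation per group, so for small groups the plain subList view is probably fine.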