Counting and order with Java 8 Stream API

Posted 2019-05-10 21:07

Question:

I wonder how this could be ordered by count (descending) and then alphabetically (ascending).

Stream<String> fruits = Stream.of("apple", "orange", "ananas");

Map<String, Long> letters =
    fruits.map(w -> w.split(""))
          .flatMap(Arrays::stream)
          .collect(groupingBy(identity(), counting()));

Output:

{p=2, a=5, r=1, s=1, e=2, g=1, l=1, n=3, o=1}

Desired output:

{a=5, n=3, e=2, p=2, g=1, l=1, r=1, s=1, o=1}

Answer 1:

It’s unavoidable to do it in two mapping steps as you need the counts first, before you can sort according to the counts:

Map<String, Long> letters = fruits
    .flatMap(Pattern.compile("")::splitAsStream)
    .collect(groupingBy(identity(), counting()))
    .entrySet().stream().sorted(Map.Entry.comparingByValue(reverseOrder()))
    .collect(LinkedHashMap::new, (m,e) -> m.put(e.getKey(), e.getValue()), Map::putAll);

If you assume that there are only ASCII lowercase letters (or any other small, fixed-size set of characters), you can try an alternative approach which might be more efficient. It processes the characters and counts as primitive values, stored in a fixed-size array. Objects are only created for the final sorting and Map generation:

long[] histogram = fruits.flatMapToInt(String::chars)
    .filter(c -> c >= 'a' && c <= 'z') // just to be sure, remove if you prefer exceptions
    .collect(() -> new long[26], (a, c) -> a[c - 'a']++, (a, b) -> Arrays.setAll(a, ix -> a[ix] + b[ix]));
Map<String, Long> letters = IntStream.range(0, 26).filter(i -> histogram[i] != 0)
    .boxed().sorted(comparingLong(i -> -histogram[i]))
    .collect(LinkedHashMap::new, (m, i) -> m.put("" + (char)(i + 'a'), histogram[i]), Map::putAll);
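For anyone who wants to run the histogram idea as-is, here is a minimal self-contained sketch. The class name LetterHistogram and the method name count are made up for illustration, and it assumes that anything outside 'a'..'z' should simply be ignored. Since sorted() is stable on an ordered stream, letters with equal counts stay in alphabetical order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class LetterHistogram {

    // Counts 'a'..'z' in a primitive array, then builds a map ordered by
    // descending count; ties keep their alphabetical (index) order.
    static Map<String, Long> count(Stream<String> words) {
        long[] histogram = words.flatMapToInt(String::chars)
            .filter(c -> c >= 'a' && c <= 'z') // assumption: ignore all other characters
            .collect(() -> new long[26],
                     (a, c) -> a[c - 'a']++,
                     (a, b) -> Arrays.setAll(a, ix -> a[ix] + b[ix]));

        Map<String, Long> letters = new LinkedHashMap<>();
        IntStream.range(0, 26)
            .filter(i -> histogram[i] != 0)
            .boxed()
            .sorted(Comparator.comparingLong((Integer i) -> histogram[i]).reversed())
            .forEachOrdered(i -> letters.put(String.valueOf((char) ('a' + i)), histogram[i]));
        return letters;
    }

    public static void main(String[] args) {
        // prints {a=5, n=3, e=2, p=2, g=1, l=1, o=1, r=1, s=1}
        System.out.println(count(Stream.of("apple", "orange", "ananas")));
    }
}
```

The sketch uses comparingLong(...).reversed() instead of negating the count; both give the same order here, the reversed form just avoids thinking about negation.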


Answer 2:

You can't order a map by its values. I think the best you can achieve is to store the sorted entries in a LinkedHashMap, so that when you iterate over it you'll get the expected result (since you add the mappings in the desired sorted order).

For this you need a first group by operation to know how to build the mapping 'Letter -> occurrences' (you might also consider a Map<Character, Long>).
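The Map<Character, Long> variant mentioned above could look roughly like this. It is only a sketch: the class name CharCounts and the method name count are hypothetical, and it counts UTF-16 char values, so it assumes the input has no characters outside the Basic Multilingual Plane.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CharCounts {

    // Groups by Character instead of by one-letter String keys.
    static Map<Character, Long> count(Stream<String> words) {
        return words
            .flatMap(w -> w.chars().mapToObj(c -> (char) c))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Character, Long> counts = count(Stream.of("apple", "orange", "ananas"));
        System.out.println(counts.get('a')); // prints 5
    }
}
```

The result is still an unordered map at this point; the sorting step is the same as for String keys.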

Then you have to iterate over the entry set again and sort the stream so that the entries are sorted first by their values in descending order and then by the natural ordering of the keys. So the comparator will look like:

//need to provide explicit type parameters due to limited type inference at the moment
Comparator<Map.Entry<String, Long>> cmp = 
    Map.Entry.<String, Long>comparingByValue(reverseOrder()).thenComparing(Map.Entry.comparingByKey());
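To see what this comparator does in isolation, here is a small sketch that applies it to a hand-written list of entries. The class name EntryOrder, the constant BY_COUNT_THEN_KEY and the method name sorted are invented for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class EntryOrder {

    // Highest count first; entries with equal counts are ordered by key.
    static final Comparator<Map.Entry<String, Long>> BY_COUNT_THEN_KEY =
        Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                 .thenComparing(Map.Entry.comparingByKey());

    // Returns a sorted copy and leaves the input list untouched.
    static List<Map.Entry<String, Long>> sorted(List<Map.Entry<String, Long>> entries) {
        List<Map.Entry<String, Long>> copy = new ArrayList<>(entries);
        copy.sort(BY_COUNT_THEN_KEY);
        return copy;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> entries = Arrays.asList(
            new SimpleEntry<String, Long>("p", 2L),
            new SimpleEntry<String, Long>("a", 5L),
            new SimpleEntry<String, Long>("e", 2L),
            new SimpleEntry<String, Long>("o", 1L));
        System.out.println(sorted(entries)); // prints [a=5, e=2, p=2, o=1]
    }
}
```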

Putting the pieces all together, it yields:

Map<String, Long> letters =
    fruits.flatMap(w -> Arrays.stream(w.split("")))
          .collect(groupingBy(identity(), counting()))
          .entrySet()
          .stream()
          .sorted(Map.Entry.<String, Long>comparingByValue(reverseOrder()).thenComparing(Map.Entry.comparingByKey()))
          .collect(toMap(Map.Entry::getKey, Map.Entry::getValue, (a, b) -> {throw new IllegalStateException();}, LinkedHashMap::new));

which produces:

{a=5, n=3, e=2, p=2, g=1, l=1, o=1, r=1, s=1}
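The snippets in this thread rely on static imports that are not shown. Below is a complete sketch of the pipeline with those imports spelled out, assuming Java 8 or later (where "apple".split("") yields no leading empty string). The class name LetterCounts and the method name countAndSort are placeholders.

```java
import static java.util.Comparator.reverseOrder;
import static java.util.function.Function.identity;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toMap;

import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

class LetterCounts {

    // Step 1: count each letter. Step 2: sort the entries by count
    // (descending), then by letter, and keep that order in a LinkedHashMap.
    static Map<String, Long> countAndSort(Stream<String> words) {
        return words
            .flatMap(w -> Arrays.stream(w.split("")))
            .collect(groupingBy(identity(), counting()))
            .entrySet()
            .stream()
            .sorted(Map.Entry.<String, Long>comparingByValue(reverseOrder())
                             .thenComparing(Map.Entry.comparingByKey()))
            .collect(toMap(Map.Entry::getKey, Map.Entry::getValue,
                           // keys are already unique, so a merge should never happen
                           (a, b) -> { throw new IllegalStateException("duplicate key"); },
                           LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // prints {a=5, n=3, e=2, p=2, g=1, l=1, o=1, r=1, s=1}
        System.out.println(countAndSort(Stream.of("apple", "orange", "ananas")));
    }
}
```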