I have a stream of words and I would like to sort them according to the occurrence of same elements (=words).
e.g.: {hello, world, hello}
to
Map<String, List<String>>
hello, {hello, hello}
world, {world}
What I have so far:
Map<Object, List<String>> list = streamofWords.collect(Collectors.groupingBy(???));
Problem 1: The stream seems to lose the information that it is processing Strings, so the compiler forces me to change the type to Map<Object, List<String>>.
Problem 2: I don't know what to put inside the parentheses to group the words by equal occurrences. I know that I can process single elements within the lambda expression, but I have no idea how to reach "outside" each element to check for equality.
Thank You
The key extractor you are looking for is the identity function:
Map<String, List<String>> list = streamofWords.collect(Collectors.groupingBy(Function.identity()));
EDIT added explanation:
Function.identity() returns a Function whose single method does nothing more than return the argument it receives.
Collectors.groupingBy(Function<S, K> keyExtractor) (signature simplified here) provides a collector that collects all elements of the stream into a Map<K, List<S>>. It uses the keyExtractor it is given to inspect each stream element of type S and derive a key of type K from it. That key is then used to look up (or create) the list in the result map to which the element is added.
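As a minimal, self-contained sketch of the above, using the three words from the question (the class and method names are made up for illustration):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupByIdentityDemo {
    // Groups equal words together; each word serves as its own key.
    static Map<String, List<String>> groupWords(Stream<String> words) {
        return words.collect(Collectors.groupingBy(Function.identity()));
    }

    public static void main(String[] args) {
        Map<String, List<String>> grouped =
                groupWords(Stream.of("hello", "world", "hello"));
        System.out.println(grouped.get("hello")); // [hello, hello]
        System.out.println(grouped.get("world")); // [world]
    }
}
```

Because the source is declared as Stream<String>, the result can be assigned to Map<String, List<String>> without falling back to Object.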
To get a Map<String, List<String>>, you just need to tell the groupingBy collector that you want to group the values by identity, i.e. with the function x -> x.
Map<String, List<String>> occurrences =
streamOfWords.collect(groupingBy(str -> str));
However, this is a bit useless: as you can see, you store the same information twice. You should look into a Map<String, Long>, where the value indicates the number of occurrences of the String in the Stream.
Map<String, Long> occurrences =
streamOfWords.collect(groupingBy(str -> str, counting()));
Basically, instead of a groupingBy that returns the values as a List, you use the downstream collector counting() to say that you want to count the number of times each value appears.
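A runnable version of the counting variant might look like this (the class and method names are illustrative only; the snippets above assume static imports of groupingBy and counting, which are spelled out here):

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordCountDemo {
    // Maps each distinct word to the number of times it occurs in the stream.
    static Map<String, Long> countWords(Stream<String> words) {
        return words.collect(Collectors.groupingBy(str -> str, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> occurrences = countWords(Stream.of("hello", "world", "hello"));
        System.out.println(occurrences.get("hello")); // 2
        System.out.println(occurrences.get("world")); // 1
    }
}
```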
Your sort requirement implies that you should end up with a Map<Long, List<String>> (what if different Strings appear the same number of times?). The default groupingBy collector gives you a HashMap in practice (the API makes no guarantee about the map type), which has no notion of ordering, but you could store the elements in a TreeMap instead.
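One possible way to build that Map<Long, List<String>> is sketched below: count first, then regroup the resulting entries by their count into a TreeMap. This two-step approach and the names used are my own illustration, not something prescribed by the API.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FrequencyOrderDemo {
    // Step 1: count occurrences per word.
    // Step 2: group the words by their count; the TreeMap keeps counts in ascending order.
    static Map<Long, List<String>> wordsByFrequency(Stream<String> words) {
        Map<String, Long> counts =
                words.collect(Collectors.groupingBy(str -> str, Collectors.counting()));
        return counts.entrySet().stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        TreeMap::new,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(wordsByFrequency(Stream.of("hello", "world", "hello")));
        // {1=[world], 2=[hello]}
    }
}
```

The order of words inside each list follows the iteration order of the intermediate map, so it should not be relied upon. For most-frequent-first ordering, a map factory such as () -> new TreeMap<>(Comparator.reverseOrder()) could be passed instead of TreeMap::new.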
I've tried to summarize a bit what I've said in the comments.
You seem to have trouble seeing how str -> str can tell that "hello" and "world" are different.
First of all, str -> str is a function, that is, for an input x it yields a value f(x). For example, f(x) = x + 2 is a function that returns x + 2 for any value x.
Here we are using the identity function, that is f(x) = x. When you collect the elements from the pipeline into the Map, this function is called first to obtain the key from each value. So in your example, you have 3 elements, for which the identity function yields:
f("hello") = "hello"
f("world") = "world"
So far so good.
Now when collect() is called, the function is applied to every value in the stream and the result is used as the key in the Map. If the key is not present yet, a new List containing the value is created; if the key already exists, the value the function was just applied to is added to the List already mapped to that key. That's why you get a Map<String, List<String>> at the end.
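To make those steps concrete, here is a hand-written loop that does roughly what the description above says. It is only a sketch of the idea, not the JDK's actual implementation, and the helper name groupManually is made up:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ManualGroupingDemo {
    // For each value: compute its key, fetch (or create) the list for that key, add the value.
    static <T, K> Map<K, List<T>> groupManually(Iterable<T> values,
                                                Function<? super T, ? extends K> keyFunction) {
        Map<K, List<T>> result = new HashMap<>();
        for (T value : values) {
            K key = keyFunction.apply(value);
            // computeIfAbsent creates the list only the first time a key is seen.
            result.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> grouped =
                groupManually(List.of("hello", "world", "hello"), str -> str);
        System.out.println(grouped.get("hello")); // [hello, hello]
        System.out.println(grouped.get("world")); // [world]
    }
}
```

Equality of keys is decided by the map through equals and hashCode, which is why the lambda never has to look "outside" the single element it is given.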
Let's take another example. Now the stream contains the values "hello", "world" and "hey", and the function we apply to group the elements is str -> str.substring(0, 2), that is, the function that takes the first two characters of the String.
Similarly, we have:
f("hello") = "he"
f("world") = "wo"
f("hey") = "he"
Here you see that both "hello" and "hey" yield the same key when the function is applied, hence they are grouped in the same List when collected, so that the final result is:
"he" -> ["hello", "hey"]
"wo" -> ["world"]
To draw an analogy with mathematics, you could take any non-injective function, such as x^2. For x = -2 and x = 2 we have f(x) = 4. So if we grouped integers by this function, -2 and 2 would end up in the same "bag".
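Both examples (the two-character prefix and the square) can be tried out directly. The following is a small sketch with illustrative names; it assumes every word has at least two characters, since substring(0, 2) throws otherwise:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class KeyFunctionDemo {
    // Groups words by their first two characters.
    // Assumption: every word has at least two characters.
    static Map<String, List<String>> groupByPrefix(Stream<String> words) {
        return words.collect(Collectors.groupingBy(str -> str.substring(0, 2)));
    }

    // Groups integers by their square, so x and -x share the same key.
    static Map<Integer, List<Integer>> groupBySquare(Stream<Integer> numbers) {
        return numbers.collect(Collectors.groupingBy(x -> x * x));
    }

    public static void main(String[] args) {
        Map<String, List<String>> byPrefix =
                groupByPrefix(Stream.of("hello", "world", "hey"));
        System.out.println(byPrefix.get("he")); // [hello, hey]
        System.out.println(byPrefix.get("wo")); // [world]

        Map<Integer, List<Integer>> bySquare = groupBySquare(Stream.of(-2, 1, 2));
        System.out.println(bySquare.get(4)); // [-2, 2]
        System.out.println(bySquare.get(1)); // [1]
    }
}
```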
Looking at the source code won't help you to understand what's going on at first. It's useful if you want to know how it's implemented under the hood. But try first to think of the concept with a higher level of abstraction and then maybe things will become clearer.
Hope it helps! :)
If you want to group by some fields of an object rather than the whole object, and you don't want to change its equals and hashCode methods, I'd create a class holding a set of keys for grouping purposes:
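import java.util.Arrays;
import lombok.Getter; // @Getter is assumed to be Lombok's annotation

// Composite key: two MultiKeys are equal when their key arrays are element-wise equal.
@Getter
public class MultiKey {
    private final Object[] keys;

    public MultiKey(Object... keys) {
        this.keys = keys;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        MultiKey multiKey = (MultiKey) o;
        return Arrays.equals(keys, multiKey.keys);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals, so it is also based on the array contents.
        return Arrays.hashCode(keys);
    }
}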
And the groupingBy itself:
Map<MultiKey, List<VhfEventView>> groupedList = list
        .stream()
        .collect(Collectors.groupingBy(
                e -> new MultiKey(e.getGroupingKey1(), e.getGroupingKey2())));
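For anyone who wants to try this without Lombok or the VhfEventView class, here is a self-contained sketch. The Event type and its fields are hypothetical stand-ins, and MultiKey is re-declared as a nested class without the getter:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MultiKeyGroupingDemo {
    // Lombok-free copy of the MultiKey idea: equality is based on the array contents.
    static final class MultiKey {
        private final Object[] keys;

        MultiKey(Object... keys) {
            this.keys = keys;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Arrays.equals(keys, ((MultiKey) o).keys);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(keys);
        }
    }

    // Hypothetical element type, used only for this illustration.
    static final class Event {
        final String region;
        final int year;
        final String name;

        Event(String region, int year, String name) {
            this.region = region;
            this.year = year;
            this.name = name;
        }
    }

    // Groups events that share both region and year.
    static Map<MultiKey, List<Event>> groupByRegionAndYear(List<Event> events) {
        return events.stream()
                .collect(Collectors.groupingBy(e -> new MultiKey(e.region, e.year)));
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("EU", 2020, "a"),
                new Event("EU", 2020, "b"),
                new Event("US", 2020, "c"));
        Map<MultiKey, List<Event>> grouped = groupByRegionAndYear(events);
        System.out.println(grouped.get(new MultiKey("EU", 2020)).size()); // 2
        System.out.println(grouped.get(new MultiKey("US", 2020)).size()); // 1
    }
}
```

Looking up a group works with a freshly constructed MultiKey because equals and hashCode depend only on the key values, not on object identity.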