Transform and filter a Java Map with streams

Posted 2020-05-14 15:01

Question:

I have a Java Map that I'd like to transform and filter. As a trivial example, suppose I want to convert all values to Integers then remove the odd entries.

Map<String, String> input = new HashMap<>();
input.put("a", "1234");
input.put("b", "2345");
input.put("c", "3456");
input.put("d", "4567");

Map<String, Integer> output = input.entrySet().stream()
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> Integer.parseInt(e.getValue())
        ))
        .entrySet().stream()
        .filter(e -> e.getValue() % 2 == 0)
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));


System.out.println(output.toString());

This is correct and yields: {a=1234, c=3456}

However, I can't help but wonder if there's a way to avoid calling .entrySet().stream() twice.

Is there a way I can perform both transform and filter operations and call .collect() only once at the end?

Answer 1:

Yes, you can map each entry to a temporary entry that holds the key and the parsed integer value. Then you can filter the entries based on that value.

Map<String, Integer> output =
    input.entrySet()
         .stream()
         .map(e -> new AbstractMap.SimpleEntry<>(e.getKey(), Integer.valueOf(e.getValue())))
         .filter(e -> e.getValue() % 2 == 0)
         .collect(Collectors.toMap(
             Map.Entry::getKey,
             Map.Entry::getValue
         ));

Note that I used Integer.valueOf instead of parseInt since we actually want a boxed int.


If you have the luxury of using the StreamEx library, you can do it quite simply:

Map<String, Integer> output =
    EntryStream.of(input).mapValues(Integer::valueOf).filterValues(v -> v % 2 == 0).toMap();


Answer 2:

One way to solve the problem with much less overhead is to move the mapping and filtering down to the collector.

Map<String, Integer> output = input.entrySet().stream().collect(
    HashMap::new,
    (map,e)->{ int i=Integer.parseInt(e.getValue()); if(i%2==0) map.put(e.getKey(), i); },
    Map::putAll);

This does not require the creation of intermediate Map.Entry instances and even better, will postpone the boxing of int values to the point when the values are actually added to the Map, which implies that values rejected by the filter are not boxed at all.

Compared to what Collectors.toMap(…) does, the operation is also simplified by using Map.put rather than Map.merge as we know beforehand that we don’t have to handle key collisions here.
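To illustrate the difference, here is a minimal sketch (the class DuplicateKeyDemo and its helper methods are made up for this example). It feeds two entries with the same key to both approaches: the two-argument Collectors.toMap throws an IllegalStateException, while an accumulator that calls Map.put simply keeps the last value. Entries taken from a Map's entry set never collide like this, which is why put is sufficient in the code above.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DuplicateKeyDemo {

    // Two entries sharing the key "a" (hypothetical data; a Map's own entry set never looks like this).
    static Stream<Map.Entry<String, Integer>> entries() {
        return Stream.<Map.Entry<String, Integer>>of(
                new AbstractMap.SimpleEntry<String, Integer>("a", 1),
                new AbstractMap.SimpleEntry<String, Integer>("a", 2));
    }

    // The two-argument toMap rejects duplicate keys with an IllegalStateException.
    static boolean toMapThrows() {
        try {
            entries().collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // A plain put in the accumulator silently keeps the last value seen for a key.
    static Map<String, Integer> collectWithPut() {
        return entries().collect(
                HashMap::new,
                (map, e) -> map.put(e.getKey(), e.getValue()),
                Map::putAll);
    }

    public static void main(String[] args) {
        System.out.println(toMapThrows());    // true
        System.out.println(collectWithPut()); // {a=2}
    }
}
```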

However, as long as you don’t need parallel execution, you may also consider an ordinary loop:

HashMap<String,Integer> output=new HashMap<>();
for(Map.Entry<String, String> e: input.entrySet()) {
    int i = Integer.parseInt(e.getValue());
    if(i%2==0) output.put(e.getKey(), i);
}

or the internal iteration variant:

HashMap<String,Integer> output=new HashMap<>();
input.forEach((k,v)->{ int i = Integer.parseInt(v); if(i%2==0) output.put(k, i); });

The latter is quite compact and at least on par with all other variants in single-threaded performance.
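For completeness, here is a sketch of what the parallel case could look like, assuming the same input as in the question (ParallelCollectDemo and evenValues are hypothetical names). The three-argument collect stays correct with parallelStream() because each worker fills its own HashMap and the partial maps are then merged with putAll. The loop and forEach variants write to a single HashMap and should stay sequential. For a map this small, going parallel is unlikely to be faster.

```java
import java.util.HashMap;
import java.util.Map;

class ParallelCollectDemo {

    // Same accumulator as the sequential version; only the stream source changes.
    static Map<String, Integer> evenValues(Map<String, String> input) {
        return input.entrySet().parallelStream().collect(
                HashMap::new,                 // each worker gets its own HashMap
                (map, e) -> {
                    int i = Integer.parseInt(e.getValue());
                    if (i % 2 == 0) map.put(e.getKey(), i);
                },
                Map::putAll);                 // partial results are merged here
    }

    public static void main(String[] args) {
        Map<String, String> input = new HashMap<>();
        input.put("a", "1234");
        input.put("b", "2345");
        input.put("c", "3456");
        input.put("d", "4567");
        System.out.println(evenValues(input));
    }
}
```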



Answer 3:

Guava's your friend:

Map<String, Integer> output = Maps.filterValues(Maps.transformValues(input, Integer::valueOf), i -> i % 2 == 0);

Keep in mind that output is a transformed, filtered view of input. You'll need to make a copy if you want to operate on them independently.



Answer 4:

You could use the Stream.collect(supplier, accumulator, combiner) method to transform the entries and conditionally accumulate them:

Map<String, Integer> even = input.entrySet().stream().collect(
    HashMap::new,
    (m, e) -> Optional.ofNullable(e)
            .map(Map.Entry::getValue)
            .map(Integer::valueOf)
            .filter(i -> i % 2 == 0)
            .ifPresent(i -> m.put(e.getKey(), i)),
    Map::putAll);

System.out.println(even); // {a=1234, c=3456}

Here, inside the accumulator, I'm using Optional methods to apply both the transformation and the predicate, and, if the optional value is still present, I'm adding it to the map being collected.



Answer 5:

Another way to do this is to remove the values you don't want from the transformed Map:

Map<String, Integer> output = input.entrySet().stream()
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> Integer.parseInt(e.getValue()),
                (a, b) -> { throw new AssertionError(); },
                HashMap::new
         ));
output.values().removeIf(v -> v % 2 != 0);

This assumes you want a mutable Map as the result. If not, you can create an immutable one from output.
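One possible way to get a read-only result is sketched below, assuming that wrapping with Collections.unmodifiableMap is acceptable (ImmutableResultDemo and evenValuesReadOnly are hypothetical names). The wrapper is only a view, but since no other reference to the underlying HashMap escapes the method, callers cannot change the result.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class ImmutableResultDemo {

    static Map<String, Integer> evenValuesReadOnly(Map<String, String> input) {
        Map<String, Integer> output = input.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> Integer.parseInt(e.getValue()),
                        (a, b) -> { throw new AssertionError(); },
                        HashMap::new));
        // Filter while the map is still mutable.
        output.values().removeIf(v -> v % 2 != 0);
        // Writes through the returned wrapper throw UnsupportedOperationException.
        return Collections.unmodifiableMap(output);
    }
}
```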


If you are transforming the values into the same type and want to modify the Map in place, this could be a lot shorter with replaceAll:

input.replaceAll((k, v) -> v + " example");
input.values().removeIf(v -> v.length() > 10);

This also assumes input is mutable.
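To make the mutability requirement concrete, here is a small sketch (InPlaceDemo and its methods are hypothetical names): the same two calls succeed on a HashMap, but a map wrapped with Collections.unmodifiableMap rejects replaceAll with an UnsupportedOperationException.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class InPlaceDemo {

    // Appends a suffix to every value, then drops values longer than 10 characters.
    static void transformInPlace(Map<String, String> map) {
        map.replaceAll((k, v) -> v + " example");
        map.values().removeIf(v -> v.length() > 10);
    }

    // Returns true if the map refused the in-place update.
    static boolean rejectsInPlaceUpdate(Map<String, String> map) {
        try {
            transformInPlace(map);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> mutable = new HashMap<>();
        mutable.put("x", "ab");   // becomes "ab example" (10 chars), kept
        mutable.put("y", "abc");  // becomes "abc example" (11 chars), removed
        System.out.println(rejectsInPlaceUpdate(mutable));                              // false
        System.out.println(mutable);                                                    // {x=ab example}
        System.out.println(rejectsInPlaceUpdate(Collections.unmodifiableMap(mutable))); // true
    }
}
```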


I don't recommend doing this, because it will not work for all valid Map implementations and may stop working for HashMap in the future, but you can currently use replaceAll and cast a HashMap to change the type of the values:

((Map)input).replaceAll((k, v) -> Integer.parseInt((String)v));
Map<String, Integer> output = (Map)input;
output.values().removeIf(v -> v % 2 != 0);

This will also give you type safety warnings, and if you try to retrieve a value from the Map through a reference of the old type like this:

String ex = input.get("a");

it will throw a ClassCastException.


You could move the first transform part into a method to avoid the boilerplate if you expect to use it a lot:

public static <K, VO, VN, M extends Map<K, VN>> M transformValues(
        Map<? extends K, ? extends VO> old, 
        Function<? super VO, ? extends VN> f, 
        Supplier<? extends M> mapFactory){
    return old.entrySet().stream().collect(Collectors.toMap(
            Map.Entry::getKey,
            e -> f.apply(e.getValue()), 
            (a, b) -> { throw new IllegalStateException("Duplicate keys for values " + a + " " + b); },
            mapFactory));
}

And use it like this:

    Map<String, Integer> output = transformValues(input, Integer::parseInt, HashMap::new);
    output.values().removeIf(v -> v % 2 != 0);

Note that the duplicate key exception can be thrown if, for example, the old Map is an IdentityHashMap and the mapFactory creates a HashMap.
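A minimal sketch of that situation follows. To keep it self-contained it uses a concrete-typed stand-in for transformValues (the names IdentityKeyDemo, intoHashMap and the sample data are hypothetical). An IdentityHashMap can hold two keys that are equal but not the same object. Copying them into a HashMap makes them collide, so the merge function runs and throws.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class IdentityKeyDemo {

    // Concrete-typed stand-in for transformValues(old, Integer::parseInt, HashMap::new).
    static Map<String, Integer> intoHashMap(Map<String, String> old) {
        return old.entrySet().stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> Integer.parseInt(e.getValue()),
                (a, b) -> { throw new IllegalStateException("Duplicate keys for values " + a + " " + b); },
                HashMap::new));
    }

    // Two distinct String objects with equal content: two entries under identity comparison.
    static Map<String, String> identityMapWithEqualKeys() {
        Map<String, String> map = new IdentityHashMap<>();
        map.put(new String("k"), "1");
        map.put(new String("k"), "2");
        return map;
    }

    // Returns true if copying into a HashMap triggers the duplicate-key exception.
    static boolean throwsOnEqualKeys() {
        try {
            intoHashMap(identityMapWithEqualKeys());
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(identityMapWithEqualKeys().size()); // 2
        System.out.println(throwsOnEqualKeys());               // true
    }
}
```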



Answer 6:

Here is the code using AbacusUtil:

Map<String, String> input = N.asMap("a", "1234", "b", "2345", "c", "3456", "d", "4567");

Map<String, Integer> output = Stream.of(input)
                          .groupBy(e -> e.getKey(), e -> N.asInt(e.getValue()))
                          .filter(e -> e.getValue() % 2 == 0)
                          .toMap(Map.Entry::getKey, Map.Entry::getValue);

N.println(output.toString());

Disclosure: I'm the developer of AbacusUtil.