Java Stream: is there a way to iterate taking two

Posted 2019-01-24 01:20

Question:

Let's say we have this stream

Stream.of("a", "b", "err1", "c", "d", "err2", "e", "f", "g", "h", "err3", "i", "j");

and I want to save in a map the pairs of adjacent strings in which the first one starts with "err".

What I thought of is something like this

Map<String, String> map = new HashMap<>();

Stream.of("a", "b", "err1", "c", "d", "err2", "e", "f", "g", "h", "err3", "i", "j")
.reduce((acc, next) -> {
    if (acc.startsWith("err"))
        map.put(acc,next);
    if (next.startsWith("err"))
        return next;
    else
        return "";
});

But I'm not totally satisfied with it for two main reasons

  1. I'm "misusing" the reduce function. In the Stream API every function has a clear, well-defined purpose: max is supposed to calculate the maximum value, filter is supposed to filter based on a condition, reduce is supposed to produce an incrementally accumulated value, and so on.
  2. Doing that prevents me from using the Stream API's powerful mechanisms: what if I wanted to limit my search to the first two results?

Here I used reduce because (as far as I know) it's the only function that lets you compare a pair of values that can, somehow, be mapped onto something like the concepts of "current value" and "next value".

Is there a more straightforward way? Something that allows you to iterate the stream considering more than one value for each iteration?

EDIT

What I'm thinking about is some mechanism that, given the current element, allows you to define a "window of elements" to consider, for each iteration.

Something like

<R> Stream<R> mapMoreThanOne(
    int elementsBeforeCurrent,
    int elementsAfterCurrent,
    Function<List<? super T>, ? extends R> mapper);

instead of

<R> Stream<R> map(Function<? super T, ? extends R> mapper);

That would be a powerful "upgrade" to current API.

EDIT2

I appreciate the effort of people proposing their solution, but the issue is not the algorithm per se. There are different ways to achieve my goal by putting together streams, indexes, temp variables to store previous values... but I was wondering if there was some method in Stream API that was designed for the task of dealing with elements other than the current without breaking the "stream paradigm". Something like this

List<String> list =
        Stream.of("a", "b", "err1", "c", "d", "err2", "e", "f", "g", "h", "err3", "i", "j")
        .filterFunctionImWonderingIfExist(/*filters couples of elements*/)
        .limit(2)
        .collect(Collectors.toList());

Given the answers, I think there's no "clear and quick" solution, short of using the StreamEx library.

Answer 1:

You can build a custom Collector for this task.

Map<String, String> map = 
    Stream.of("a", "b", "err1", "c", "d", "err2", "e", "f", "g", "h", "err3", "i", "j")
          .collect(MappingErrors.collector());

with:

private static final class MappingErrors {

    private Map<String, String> map = new HashMap<>();

    private String first, second;

    public void accept(String str) {
        first = second;
        second = str;
        if (first != null && first.startsWith("err")) {
            map.put(first, second);
        }
    }

    public MappingErrors combine(MappingErrors other) {
        throw new UnsupportedOperationException("Parallel Stream not supported");
    }

    public Map<String, String> finish() {
        return map;
    }

    public static Collector<String, ?, Map<String, String>> collector() {
        return Collector.of(MappingErrors::new, MappingErrors::accept, MappingErrors::combine, MappingErrors::finish);
    }

}

In this collector, two running elements are kept. Each time a String is accepted, they are updated and if the first starts with "err", the two elements are added to a map.


Another solution is to use the StreamEx library, which provides a pairMap method that applies a given function to every adjacent pair of elements of the stream. In the following code, the operation returns a String array consisting of the first and second element of the pair if the first element starts with "err", and null otherwise. The null elements are then filtered out and the stream is collected into a map.

Map<String, String> map = 
    StreamEx.of("a", "b", "err1", "c", "d", "err2", "e", "f", "g", "h", "err3", "i", "j")
            .pairMap((s1, s2) -> s1.startsWith("err") ? new String[] { s1, s2 } : null)
            .nonNull()
            .toMap(a -> a[0], a -> a[1]);

System.out.println(map);


Answer 2:

You can write a custom collector, or use the much simpler approach of streaming over the list's indexes:

// Assumes data is a List<String> and a static import of java.util.stream.Collectors.toMap
Map<String, String> result = IntStream.range(0, data.size() - 1)
        .filter(i -> data.get(i).startsWith("err"))
        .boxed()
        .collect(toMap(data::get, i -> data.get(i+1)));

This assumes that your data is in a random-access-friendly list or that you can temporarily dump it into one.
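
If the data only exists as a stream, dumping it into a list first might look like the sketch below. It also shows that limit can cap the number of matches, which is one of the concerns raised in the question. The class name IndexPairsDemo, the helper errorPairs and its maxPairs parameter are assumptions made for illustration, and the LinkedHashMap is only there to keep the result in encounter order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class IndexPairsDemo {

    // Hypothetical helper: buffers the stream into a list, then pairs each
    // element starting with "err" with its successor, keeping at most maxPairs matches.
    static Map<String, String> errorPairs(Stream<String> source, int maxPairs) {
        List<String> data = source.collect(Collectors.toList());
        return IntStream.range(0, data.size() - 1)      // the last element has no successor
                .filter(i -> data.get(i).startsWith("err"))
                .limit(maxPairs)
                .boxed()
                .collect(Collectors.toMap(
                        i -> data.get(i),
                        i -> data.get(i + 1),
                        (first, second) -> first,       // on duplicate keys, keep the first value
                        LinkedHashMap::new));           // preserve encounter order
    }

    public static void main(String[] args) {
        Stream<String> input = Stream.of("a", "b", "err1", "c", "d", "err2",
                "e", "f", "g", "h", "err3", "i", "j");
        System.out.println(errorPairs(input, 2));       // prints {err1=c, err2=e}
    }
}
```

The price is that the whole stream is buffered before any pair is produced, so this only suits inputs that fit in memory.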

If you cannot randomly access the data or load it into a list or array for processing, you can always make a custom pairing collector so you can write

Map<String, String> result = data.stream()
        .collect(pairing(
                (a, b) -> a.startsWith("err"), 
                AbstractMap.SimpleImmutableEntry::new,
                toMap(Map.Entry::getKey, Map.Entry::getValue)
        ));

Here's the source for the collector. It's parallel-friendly and might come in handy in other situations:

public static <T, V, A, R> Collector<T, ?, R> pairing(BiPredicate<T, T> filter, BiFunction<T, T, V> map, Collector<? super V, A, R> downstream) {

    class Pairing {
        T left, right;
        A middle = downstream.supplier().get();
        boolean empty = true;

        void add(T t) {
            if (empty) {
                left = t;
                empty = false;
            } else if (filter.test(right, t)) {
                downstream.accumulator().accept(middle, map.apply(right, t));
            }
            right = t;
        }

        Pairing combine(Pairing other) {
            if (!other.empty) {
                this.add(other.left);
                this.middle = downstream.combiner().apply(this.middle, other.middle);
                this.right = other.right;
            }
            return this;
        }

        R finish() {
            return downstream.finisher().apply(middle);
        }
    }

    return Collector.of(Pairing::new, Pairing::add, Pairing::combine, Pairing::finish);
}
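
As one example of such another situation, the sketch below reuses the collector to list every adjacent pair in which the value drops, this time on a parallel stream. The collector is copied from above (with comments added) so the example compiles on its own; PairingDemo, drops and the sample numbers are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class PairingDemo {

    // Same collector as above, repeated so this sketch is self-contained.
    static <T, V, A, R> Collector<T, ?, R> pairing(BiPredicate<T, T> filter,
            BiFunction<T, T, V> map, Collector<? super V, A, R> downstream) {

        class Pairing {
            T left, right;                            // first and last element seen by this container
            A middle = downstream.supplier().get();   // downstream accumulation of accepted pairs
            boolean empty = true;

            void add(T t) {
                if (empty) {
                    left = t;
                    empty = false;
                } else if (filter.test(right, t)) {
                    downstream.accumulator().accept(middle, map.apply(right, t));
                }
                right = t;
            }

            Pairing combine(Pairing other) {
                if (!other.empty) {
                    this.add(other.left);             // covers the pair that straddles two chunks
                    this.middle = downstream.combiner().apply(this.middle, other.middle);
                    this.right = other.right;
                }
                return this;
            }

            R finish() {
                return downstream.finisher().apply(middle);
            }
        }

        return Collector.of(Pairing::new, Pairing::add, Pairing::combine, Pairing::finish);
    }

    // Hypothetical use: describe each adjacent pair whose second value is smaller than the first.
    static List<String> drops(List<Integer> values) {
        BiPredicate<Integer, Integer> isDrop = (a, b) -> b < a;
        BiFunction<Integer, Integer, String> describe = (a, b) -> a + ">" + b;
        return values.parallelStream()
                .collect(pairing(isDrop, describe, Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(drops(Arrays.asList(3, 5, 4, 8, 2, 2, 9)));   // prints [5>4, 8>2]
    }
}
```

Because each container remembers the first and last element of its chunk, combine can still test the pair that falls on a chunk boundary, which is what keeps the result the same for sequential and parallel streams.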


Answer 3:

Things are easier if your input is in a random-access list. Then you can use the good old List.subList method like this:

List<String> list = Arrays.asList("a", "b", "err1", "c", "d", "err2", "e", 
     "f", "g", "h", "err3", "i", "j");

Map<String, String> map = IntStream.range(0, list.size()-1)
    .mapToObj(i -> list.subList(i, i+2))
    .filter(l -> l.get(0).startsWith("err"))
    .collect(Collectors.toMap(l -> l.get(0), l -> l.get(1)));

The same can be done a little more concisely with the already mentioned StreamEx library (which I wrote):

List<String> list = Arrays.asList("a", "b", "err1", "c", "d", "err2", "e", 
     "f", "g", "h", "err3", "i", "j");

Map<String, String> map = StreamEx.ofSubLists(list, 2, 1)
    .mapToEntry(l -> l.get(0), l -> l.get(1))
    .filterKeys(key -> key.startsWith("err"))
    .toMap();

Though if you don't want a third-party dependency, the plain Stream API solution doesn't look too bad either.



Answer 4:

Here's a simple one-liner using an off-the-shelf collector:

Stream<String> stream = Stream.of("a", "b", "err1", "c", "d", "err2", "e", "f", "g", "h", "err3", "i", "j");

Map<String, String> map = Arrays.stream(stream
        .collect(Collectors.joining(",")).split(",(?=(([^,]*,){2})*[^,]*$)"))
    .filter(s -> s.startsWith("err"))
    .map(s -> s.split(","))
    .collect(Collectors.toMap(a -> a[0], a -> a[1]));

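The "trick" here is to first join all terms into a single String, then split it into Strings of pairs, e.g. "a,b", "err1,c", etc. Once you have a stream of pairs, processing is straightforward. Note that the split yields non-overlapping pairs, so an "err" element is only matched when it lands first in its pair: with this input "err2" ends up second in "d,err2" and is missing from the map. The approach also assumes that no element contains a comma.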
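
To make the chunking concrete, here is a small sketch that applies the same regex to the joined input and lists the resulting chunks. The class and method names (JoinSplitDemo, chunks, errorChunks) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoinSplitDemo {

    // Splits at every comma that is followed by an even number of further commas,
    // i.e. after every second element counted from the start of this 13-element input.
    static List<String> chunks(String joined) {
        return Arrays.asList(joined.split(",(?=(([^,]*,){2})*[^,]*$)"));
    }

    // Keeps only the chunks whose first element starts with "err".
    static List<String> errorChunks(String joined) {
        return chunks(joined).stream()
                .filter(s -> s.startsWith("err"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String joined = "a,b,err1,c,d,err2,e,f,g,h,err3,i,j";
        System.out.println(chunks(joined));        // prints [a,b, err1,c, d,err2, e,f, g,h, err3,i, j]
        System.out.println(errorChunks(joined));   // prints [err1,c, err3,i]
    }
}
```

The chunks are consecutive and do not overlap, so which elements share a chunk depends on their position in the input.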