Java Stream: find an element with a min/max value

2019-01-26 03:36发布

问题:

I have a stream of objects and I would like to find the one with a maximal value of some attribute that's expensive to calculate.

As a specific simple example, say that we have a list of strings and we want to find the coolest one, given a coolnessIndex function.

The following should work:

String coolestString = stringList
        .stream()
        .max((s1, s2) -> Integer.compare(coolnessIndex(s1), coolnessIndex(s2)))
        .orElse(null);

Now, there are two problems with this. First, assuming the coolnessIndex is expensive to calculate, this probably won't be very efficient. I suppose the max method will need to use the comparator repeatedly, which in turn will call the coolnessIndex repeatedly and at the end it will be called more than once for each string.

Second, having to provide the comparator leads to some redundancy in the code. I would much prefer syntax like this:

String coolestString = stringList
        .stream()
        .maxByAttribute(s -> coolnessIndex(s))
        .orElse(null);

However, I haven't been able to find a matching method in the Stream API. This surprises me, since finding min/max by an attribute seems like a common pattern. I wonder if there's a better way than using the comparator (other than a for loop).

回答1:

Here's a variant using an Object[] as a tuple, not the prettiest code but concise

String coolestString = stringList
        .stream()
        .map(s -> new Object[] {s, coolnessIndex(s)})
        .max(Comparator.comparingInt(a -> (int)a[1]))
        .map(a -> (String)a[0])
        .orElse(null);


回答2:

Stream<String> stringStream = stringList.stream();
String coolest = stringStream.reduce((a,b)-> 
    coolnessIndex(a) > coolnessIndex(b) ? a:b;
).get()


回答3:

Thanks everyone for suggestions. At last I found the solution I like the most at Efficiency of the way comparator works -- the answer from bayou.io:

Have a general purpose cache method:

public static <K,V> Function<K,V> cache(Function<K,V> f, Map<K,V> cache)
{
    return k -> cache.computeIfAbsent(k, f);
}

public static <K,V> Function<K,V> cache(Function<K,V> f)
{
    return cache(f, new IdentityHashMap<>());
}

This could then be used as follows:

String coolestString = stringList
        .stream()
        .max(Comparator.comparing(cache(CoolUtil::coolnessIndex)))
        .orElse(null);


回答4:

How about using two streams, one to create a map with the pre-calculated values and a second using the map's entry set to find the max value:

        String coolestString = stringList
            .stream()
            .collect(Collectors.toMap(Function.identity(), Test::coolnessIndex))
            .entrySet()
            .stream()
            .max((s1, s2) -> Integer.compare(s1.getValue(), s2.getValue()))
            .orElse(null)
            .getKey();


回答5:

I would create a local class (a class defined inside a method—rare, but perfectly legal), and map your objects to that, so the expensive attribute is computed exactly once for each:

class IndexedString {
    final String string;
    final int index;

    IndexedString(String s) {
        this.string = Objects.requireNonNull(s);
        this.index = coolnessIndex(s);
    }

    String getString() {
        return string;
    }

    int getIndex() {
        return index;
    }
}

String coolestString = stringList
    .stream()
    .map(IndexedString::new)
    .max(Comparator.comparingInt(IndexedString::getIndex))
    .map(IndexedString::getString)
    .orElse(null);


回答6:

You can utilize the idea of collecting the results from the stream appropriately. The constraint of expensive coolness calculation function makes you consider calling that function exactly once for each element of the stream.

Java 8 provides the collect method on the Stream and a variety of ways in which you can use collectors. It appears that if you used the TreeMap to collect your results, you can retain the expressiveness and at the same time remain considerate of efficiency:

public class Expensive {
    static final Random r = new Random();
    public static void main(String[] args) {
        Map.Entry<Integer, String> e =
        Stream.of("larry", "moe", "curly", "iggy")
                .collect(Collectors.toMap(Expensive::coolness,
                                          Function.identity(),
                                          (a, b) -> a,
                                          () -> new TreeMap<>
                                          ((x, y) -> Integer.compare(y, x))
                        ))
                .firstEntry();
        System.out.println("coolest stooge name: " + e.getKey() + ", coolness: " + e.getValue());
    }

    public static int coolness(String s) {
        // simulation of a call that takes time.
        int x = r.nextInt(100);
        System.out.println(x);
        return x;
    }
}

This code prints the stooge with maximum coolness and the coolness method is called exactly once for each stooge. The BinaryOperator that works as the mergeFunction ((a, b) ->a) can be further improved.



回答7:

This is a reduction problem. Reducing a list down to a specific value. In general reduce works down the list operating on a partial solution and an item in the list. In this case, that would mean comparing the previous 'winning' value to the new value from the list which will calculate the expensive operation twice on each comparison.

According to https://docs.oracle.com/javase/tutorial/collections/streams/reduction.html an alternative is to use collect instead of reduce.

A custom consumer class will allow keeping track of the expensive operations as it reduces the list. Consumer can get around the multiple calls to the expensive calculation by working with mutable state.

    class Cooler implements Consumer<String>{

    String coolestString = "";
    int coolestValue = 0;

    public String coolest(){
        return coolestString;
    }
    @Override
    public void accept(String arg0) {
        combine(arg0, expensive(arg0));
    }

    private void combine (String other, int exp){
        if (coolestValue < exp){
            coolestString = other;
            coolestValue = exp;
        }
    }
    public void combine(Cooler other){
        combine(other.coolestString, other.coolestValue);
    }
}

This class accepts a string and if it is cooler than the previous winner, it replaces it and saves the expensive calculated value.

Cooler cooler =  Stream.of("java", "php", "clojure", "c", "lisp")
                 .collect(Cooler::new, Cooler::accept, Cooler::combine);
System.out.println(cooler.coolest());


回答8:

just create your (object,metric) pairs first:

public static <T> Optional<T> maximizeOver(List<T> ts, Function<T,Integer> f) {
    return ts.stream().map(t -> Pair.pair(t, f.apply(t)))
        .max((p1,p2) -> Integer.compare(p1.second(), p2.second()))
        .map(Pair::first);
}

(those are com.googlecode.totallylazy.Pair's)