How to choose one of the largest elements in a col

2019-04-09 13:35发布

问题:

I have a list of tuples, and I want to find the tuples with the largest x value. In the case where there are multiple largest x values, I want to choose one at random. I can't figure out how to implement this random selection functionality. Below is the code I have so far:

public void testSelectRandomFromLargestVals() {
    List<Tuple<Integer, String>> list = new ArrayList<>();
    list.add(new Tuple<>(5, "five-1"));
    list.add(new Tuple<>(2, "two"));
    list.add(new Tuple<>(3, "three"));
    list.add(new Tuple<>(5, "five-2"));
    list.add(new Tuple<>(5, "five-3"));

    Optional<Tuple<Integer, String>> largestTuple = list.stream().max((t1, t2) -> Integer.compare(t1.x, t2.x));
    System.out.println("Largest tuple is: " + largestTuple.get().x + " value is: " + largestTuple.get().y);
}

public class Tuple<X, Y> {
    public final X x;
    public final Y y;
    public Tuple(X x, Y y) {
        this.x = x;
        this.y = y;
    }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Tuple<?, ?> tuple = (Tuple<?, ?>) o;
        if (!x.equals(tuple.x)) return false;
        return y.equals(tuple.y);
    }
    @Override
    public int hashCode() {
        int result = x.hashCode();
        result = 31 * result + y.hashCode();
        return result;
    }
}

回答1:

It turns out that Misha's one-pass random selector (nice work, +1) can be combined with my one-pass max-value collector from this other answer into a single collector. This allows a random element from the set of maximal ones to be selected in a single pass.

Here's the merged collector:

static <T> Collector<T, ?, Optional<T>> rndMax(Comparator<? super T> cmp) {
    class RndMax {
        T val;
        int cnt;

        void add(T t) {
            int c;
            if (cnt == 0 || (c = cmp.compare(t, val)) > 0) {
                cnt = 1;
                val = t;
            } else if (c == 0) {
                cnt++;
                if (ThreadLocalRandom.current().nextInt(cnt) == 0) {
                    val = t;
                }
            }
        }

        RndMax merge(RndMax other) {
            if (cnt == 0) {
                return other;
            }

            if (other.cnt == 0) {
                return this;
            }

            int c = cmp.compare(val, other.val);
            if (c < 0) {
                return other;
            } else if (c > 0) {
                return this;
            } else {
                cnt += other.cnt;
                if (ThreadLocalRandom.current().nextInt(cnt) < other.cnt) {
                    val = other.val;
                }
                return this;
            }
        }

        Optional<T> finish() {
            return cnt == 0 ? Optional.empty() : Optional.of(val);
        }
    }

    return Collector.of(RndMax::new, RndMax::add, RndMax::merge, RndMax::finish);
}

You'd invoke it like this:

List<Tuple<Integer,String>> list = ... ;
Optional<Tuple<Integer,String>> max =
    list.stream().collect(rndMax(Comparator.comparingInt(t -> t.x)));


回答2:

For most practical purposes, @Bohemian's shuffle-based solution is the best, but since you expressed an interest in an approach that's constant in memory, you can do it with a custom Collector that chooses a random element:

public static <T> Collector<T, ?, Optional<T>> random() {
    class Rnd {

        T val;
        int cnt;

        void add(T t) {
            cnt++;
            if (ThreadLocalRandom.current().nextInt(cnt) == 0) {
                val = t;
            }
        }

        Rnd merge(Rnd other) {
            cnt += other.cnt;
            if (ThreadLocalRandom.current().nextInt(cnt) < other.cnt) {
                val = other.val;
            }
            return this;
        }

        Optional<T> finish() {
            return cnt == 0 ? Optional.empty() : Optional.of(val);
        }

    }

    return Collector.of(Rnd::new, Rnd::add, Rnd::merge, Rnd::finish);
}

With that, you can make one pass to find the largest x and another to pick a random matching tuple:

int largestX = list.stream().mapToInt(t -> t.x).max()
        .getAsInt();  // or throw if list is empty

Tuple<Integer, String> randomLargestTuple = list.stream()
        .filter(t -> largestX == t.x)
        .collect(random())
        .get();


回答3:

Simple answer is to shuffle first:

List<Tuple<Integer, String>> copy = new ArrayList<>(list);
Collections.shuffle(copy);
Tuple<Integer, String> largestTuple = Collections.max(copy, Comparator.comparingInt(t -> t.x)); // credit @Holger

The element returned by max() is influenced by encounter order, so shuffling effectively makes it random.

If the list is not too big (not thousands of elements), the shuffle will be pretty fast.

I made a copy of the list so as to not affect the order of the original list. If that isn't important, just shuffle as use the original list.



回答4:

Might not be an optimal solution

    final Optional<Tuple<Integer, String>> largestTuple = list.stream().max((t1, t2) -> Integer.compare(t1.x, t2.x));
    final List<Tuple<Integer, String>> allMaxElements = list.stream().filter(z -> z.x == largestTuple.get().x).collect(Collectors.toList());
    final Tuple<Integer, String> randomMaxTuple = allMaxElements.get(new SecureRandom().nextInt(allMaxElements.size()));
    System.out.println("Largest tuple is: " + randomMaxTuple.x + " value is: " + randomMaxTuple.y);

On the equals and hashCode methods you can utilize guava

@Override
public int hashCode()
{
    return com.google.common.base.Objects.hashCode(x, y);
}

@Override
public boolean equals(final Object object)
{
    if (object instanceof Tuple) {
        final Tuple<?, ?> that = (Tuple<?, ?>) object;
        return com.google.common.base.Objects.equal(this.x, that.x) && com.google.common.base.Objects.equal(this.y, that.y);
    }
    return false;
}   


回答5:

You could collect your tuples to a TreeMap, with the key being each possible x value from the tuples and the values being a list of all tuples that have that x value:

TreeMap<Integer, List<Tuple<Integer, String>>> map = list.stream()
    .collect(Collectors.groupingBy(
        t -> t.x, 
        TreeMap::new, 
        Collectors.toList()));

Then, you could grab the entry with the maximum key by using the TreeMap.lastEntry() method:

List<Tuple<Integer, String>> largestTuples = map.lastEntry().getValue();

Then, simply pick one element randomly from the largestTuples list.



回答6:

@Misha's custom Collector can also be implemented with an anonymous type, instead of a local class:

public static <T> Collector<T, ?, Optional<T>> random() {
  return Collector.of(
    () -> new Object() {
      T val;
      int cnt;
    },
    (this_, t) -> {
      this_.cnt++;
      if (ThreadLocalRandom.current().nextInt(this_.cnt) == 0) {
        this_.val = t;
      }
    },
    (this_, other) -> {
      this_.cnt += other.cnt;
      if (ThreadLocalRandom.current().nextInt(this_.cnt) < other.cnt) {
        this_.val = other.val;
      }
      return this_;
    },
    this_ -> this_.cnt == 0
      ? Optional.empty()
      : Optional.of(this_.val)
  );
}