Using Collections.binarySearch() for predicate sea

2019-08-28 21:31发布

问题:

I have a list of timestamps sorted in ascending order:

List<Instant> timestamps = ...; // note: sorted in ascending order

Now, given an input timestamp Instant inputTs, I want to find an entry t in timestamps that satisfies t.isBefore(inputTs) && inputTs.isBefore(t.plusMillis(SOME_CONSTANT)), i.e., I am simply looking for a t such that inputTs lies within the bounds of some fixed-length duration starting at t. I acknowledge that there can theoretically be multiple such ts, so the search is allowed to arbitrarily choose between these.

The Collections.binarySearch(...) overloads expect a key, indicating that the common/intended usecase is to search for a "complete match"/identical entry (in lack of better words, sorry). However, in my case inputTs will be different from the entries present in timestamps as inputTs is expected to be a point in time shortly after some entry t in timestamps.

My idea is to simply make the Comparator<Instant> that I provide to Collections.binarySearch(...) return 0 when the predicate holds:

public class TimestampFinder {
    private static final long SOME_CONSTANT = 10_000;
    private List<Instant> timestamps = ... ; // initialize and sort

    public Instant find(Instant inputTs) {
        int index = Collections.binarySearch(timestamps, inputTs, (t, key) -> {
            if (t.isBefore(key) && key.isBefore(t.plusMillis(SOME_CONSTANT))) {
                // inputTs is part of the duration after t
                // return 0 to indicate that we've found a match
                return 0;
            }
            // inputTs not in interval
            // use Instant's implementation of Comparator to indicate to binarySearch if it should continue the search in the upper or lower half of the list
            return t.compareTo(key);
        });
        return index >= 0 ? timestamps.get(index) : null;
    }
} 

Is this a proper (efficient) way to solve this problem, or is there a better alternative that I've overlooked? Note that the number of calls to find(Instant) will vastly outnumber the number of elements in timestamps, which is why I consider the overhead incurred by sorting timestamps to be warranted.

回答1:

Collections.binarySearch doesn't have to be used for exact matches. As specified in the documentation, if an exact match isn't found, it'll return -1 - i where i is the index of the next-higher element in the list.

Just do a search for inputTs with the natural ordering. If it isn't found, then you can derive the index of the next-higher Instant from inputTs (just do -1 - resultOfBinarySearch). If the Instant at that index is before(inputTs.plusMillis(CONSTANT)), then you're done, otherwise, no such Instant exists.

I do think your existing solution abuses binarySearch in concerning ways, for what it's worth.