Is there a numpy-thonic way, e.g., a function, to find the nearest value in an array?
Example:
np.find_nearest(array, value)
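There is no built-in np.find_nearest; the usual numpy idiom is an argmin over the absolute differences. A minimal sketch:

```python
import numpy as np

def find_nearest(array, value):
    # index of the element with the smallest absolute difference to `value`
    array = np.asarray(array)
    idx = np.abs(array - value).argmin()
    return array[idx]
```

For example, find_nearest(np.array([0, 0.7, 2.1]), 1.95) returns 2.1.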
Here is a fast vectorized version of @Demitri's solution if you have many values to search for (values can be a multi-dimensional array); see the sketch below.

Benchmarks: more than 100 times faster than using a for loop with @Demitri's solution.
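The original code block is missing from this extract; one plausible form, assuming the array is sorted in ascending order, handles the whole batch of queries with a single np.searchsorted call (function name and edge-case handling are assumptions):

```python
import numpy as np

def find_nearest_batch(array, values):
    # `array` must be a sorted 1-d array; `values` can have any shape.
    array = np.asarray(array)
    values = np.asarray(values)

    # insertion point of every query value
    idxs = np.searchsorted(array, values, side="left")

    # step back by one wherever the left neighbour is closer
    # (or wherever the insertion point ran past the last element)
    prev_is_closer = (idxs == len(array)) | (
        (idxs > 0)
        & (np.abs(values - array[np.maximum(idxs - 1, 0)])
           < np.abs(values - array[np.minimum(idxs, len(array) - 1)]))
    )
    idxs = idxs - prev_is_closer.astype(int)
    return array[idxs]
```

The speedup comes from replacing the per-value Python loop with one vectorized searchsorted pass.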
With a slight modification, the answer above works with arrays of arbitrary dimension (1d, 2d, 3d, ...):
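That answer's code is also lost here; presumably it is the argmin-based search shown earlier, where flattening makes it dimension-agnostic. A sketch:

```python
import numpy as np

def find_nearest(a, a0):
    # Element in the nd array `a` closest to the scalar value `a0`.
    idx = np.abs(a - a0).argmin()   # argmin over the flattened array
    return a.flat[idx]              # index the flattened view
```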
Or, written as a single line:
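Under the same assumed names, the one-liner would be:

```python
a.flat[np.abs(a - a0).argmin()]
```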
Maybe helpful for ndarrays:
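The code for this answer was stripped in extraction; a hedged reconstruction that returns the nd index as well as the value (names are assumptions):

```python
import numpy as np

def find_nearest_nd(a, value):
    flat_idx = np.abs(a - value).argmin()        # index into the flattened array
    idx = np.unravel_index(flat_idx, a.shape)    # convert to an nd index tuple
    return idx, a[idx]
```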
Summary of answer: If one has a sorted array, then the bisection code (given below) performs the fastest: ~100-1000 times faster for large arrays, and ~2-100 times faster for small arrays. It does not require numpy either. If you have an unsorted array and it is large, one should consider first using an O(n log n) sort and then bisection; if it is small, then method 2 seems the fastest.

First you should clarify what you mean by nearest value. Often one wants the interval in an abscissa, e.g. array=[0, 0.7, 2.1], value=1.95; the answer would be idx=1. This is the case I suspect you need (otherwise the following can be modified very easily with a follow-up conditional statement once you find the interval). I will note that the optimal way to perform this is with bisection, which I will provide first; note it does not require numpy at all, and it is faster than using numpy functions because they perform redundant operations. Then I will provide a timing comparison against the methods presented here by other users.
Bisection:
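The bisection code itself is missing from this extract; a pure-Python sketch consistent with the description (it returns the interval index j such that value lies between array[j] and array[j+1], with -1 or n signalling out-of-range below or above) is:

```python
def bisection(array, value):
    """Given `array` sorted in increasing order, return the index j such
    that value is between array[j] and array[j+1]. Returns -1 or n to
    indicate that value is out of range below or above, respectively."""
    n = len(array)
    if value < array[0]:
        return -1
    elif value > array[n - 1]:
        return n
    jl, ju = 0, n - 1                 # lower and upper limits
    while ju - jl > 1:                # until the interval is pinned down,
        jm = (ju + jl) >> 1           # compute a midpoint with a bit shift
        if value >= array[jm]:
            jl = jm                   # and replace either the lower limit
        else:
            ju = jm                   # or the upper limit
    if value == array[0]:             # edge case at the bottom
        return 0
    elif value == array[n - 1]:       # and at the top
        return n - 1
    else:
        return jl
```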
Now I'll define the code from the other answers; each returns an index:
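The original definitions are also missing; below are hypothetical reconstructions of a representative subset, with the numbering assumed to match the notes that follow (methods 2 and 4 were presumably further variants of the nearest-point search):

```python
import numpy as np
from bisect import bisect_left

def method1(array, value):
    # nearest point by argmin of absolute differences (rounds to nearest)
    return np.abs(array - value).argmin()

def method3(array, value):
    # interval by boolean mask: last index where array <= value
    return np.argwhere(array <= value)[-1, 0]

def method5(array, value):
    # insertion point: always rounds up to the right-hand neighbour
    return np.searchsorted(array, value, side="left")

def method6(array, value):
    # interval via the standard library's bisection
    # (the per-call tolist() conversion is part of its cost here)
    return bisect_left(array.tolist(), value) - 1
```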
Now I'll time the codes. Note that methods 1, 2, 4, and 5 don't correctly give the interval: methods 1, 2, and 4 round to the nearest point in the array (e.g. >=1.5 -> 2), and method 5 always rounds up (e.g. 1.45 -> 2). Only methods 3 and 6, and of course bisection, give the interval properly.
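The timing script is not shown; a minimal harness reusing the sketches above (array size assumed) could be:

```python
import timeit
import numpy as np

array = np.sort(np.random.random(100000))   # a large sorted test array
value = np.random.random()

for fn in (bisection, method1, method3, method5, method6):
    t = timeit.timeit(lambda: fn(array, value), number=1000)
    print(f"{fn.__name__}: {t / 1000 * 1e6:.2f} us per call")
```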
For a large array, bisection gives ~4 µs, compared to 180 µs for the next best method and 1.21 ms for the slowest (~100-1000 times faster). For smaller arrays it's ~2-100 times faster.
All of the answers above are helpful for gathering the information needed to write efficient code. However, I have written a small Python script to time the various cases. The best case is when the provided array is sorted. If one searches for the index of the nearest point to a single specified value, then the bisect module is the most time-efficient; when one searches for the indices corresponding to an array of values, numpy's searchsorted is the most efficient, as the timings below show.
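The setup for these timings is not included in this extract; a plausible reconstruction (the sizes are assumptions, chosen to be consistent with the ~83x arithmetic below) is:

```python
import bisect
import numpy as np

xlist = sorted(np.random.uniform(0, 1, 100000).tolist())  # sorted Python list
xar = np.array(xlist)                                      # sorted numpy array
randpts = np.random.uniform(0, 1, 1000)                    # batch of query points
```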
In [63]: %time bisect.bisect_left(xlist, 0.3)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 22.2 µs

In [64]: %time np.searchsorted(xar, 0.3, side="left")
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 98.9 µs

%time np.searchsorted(xar, randpts, side="left")
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.2 ms
If we follow the multiplicative rule (one searchsorted call per query point), numpy should take ~100 ms in total, so the single vectorized searchsorted call at 1.2 ms implies it is ~83x faster.