I have a gridded dataset, with data available at the following locations:
lon <- seq(-179.75,179.75, by = 0.5)
lat <- seq(-89.75,89.75, by = 0.5)
I would like to find all of the data points that are within 500 km of the location:
mylat <- 47.9625
mylon <- -87.0431
I aim to use the geosphere package in R, but the method I've currently written does not seem very efficient:
require(geosphere)
dd2 <- array(dim = c(length(lon),length(lat)))
for(i in 1:length(lon)){
  for(ii in 1:length(lat)){
    clon <- lon[i]
    clat <- lat[ii]
    dd <- as.numeric(distm(c(mylon, mylat), c(clon, clat), fun = distHaversine))
    dd2[i,ii] <- dd <= 500000
  }
}
Here, I loop through each grid cell and check whether its distance from the location is at most 500 km. I store TRUE or FALSE for each cell, which I can then use as a mask to average another variable. From this method, I want a matrix of TRUE/FALSE values marking the locations within 500 km of the lat and lon shown. Is there a more efficient method for doing this?
Timings:
Comparing @nicola's and my version gives:
My original solution (IMHO @nicola's second version is much cleaner and faster):
You can do the following (explanation below)
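A minimal sketch of such a loop, reconstructed from the explanation below rather than taken verbatim from the answer. It assumes the grid and location from the question, and it assumes the circle neither crosses the antimeridian nor contains a pole, so that the matching longitude rows form one contiguous run. The name radius is introduced here for illustration.

```r
library(geosphere)

lon <- seq(-179.75, 179.75, by = 0.5)
lat <- seq(-89.75, 89.75, by = 0.5)
mylat <- 47.9625
mylon <- -87.0431
radius <- 500000  # 500 km in metres (illustrative name)

dd2 <- matrix(FALSE, nrow = length(lon), ncol = length(lat))
outer_loop_state <- 0  # set to 1 once a row touching the circle has been seen

for (i in seq_along(lon)) {
  # one vectorized call per longitude row: distance to every latitude at lon[i]
  d <- distHaversine(c(mylon, mylat), cbind(lon[i], lat))
  dd2[i, ] <- d <= radius
  if (any(dd2[i, ])) {
    outer_loop_state <- 1
  } else if (outer_loop_state == 1) {
    # the circle has been passed; all remaining rows stay FALSE
    break
  }
}
```

Each iteration replaces the inner loop of the question with a single call over all latitudes, and the break skips every row east of the circle.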
Explanation:
For the loop I apply the following logic: outer_loop_state is initialized with 0. If a row with at least one raster point inside the circle is found, outer_loop_state is set to 1. Once a subsequent row i contains no more points within the circle, the loop breaks.
The distm call in @nicola's version basically does the same without this trick, so it calculates all rows.
Code for timings:
The dist* functions of the geosphere package are vectorized, so you only need to prepare your inputs better. Try this:
As @Floo0's answer showed, there are a lot of unnecessary calculations. We can follow another strategy: we first determine the lon and lat ranges that can be closer than the threshold, and then we use only those to calculate the distance:
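A sketch of this windowing strategy, assuming the grid and location from the question. The names longood and latgood follow the answer's text; res and radius are illustrative. Note one assumption: the longitude window is measured along the parallel of mylat, which is a close approximation for this location, but for larger radii or points near a pole a slightly wider window would be safer.

```r
library(geosphere)

lon <- seq(-179.75, 179.75, by = 0.5)
lat <- seq(-89.75, 89.75, by = 0.5)
mylat <- 47.9625
mylon <- -87.0431
radius <- 500000  # 500 km in metres (illustrative name)

# start with everything outside the circle
res <- matrix(FALSE, nrow = length(lon), ncol = length(lat))

# candidate longitudes: within the radius when moving along the parallel of mylat
longood <- which(distHaversine(c(mylon, mylat), cbind(lon, mylat)) <= radius)
# candidate latitudes: within the radius when moving along the meridian of mylon
latgood <- which(distHaversine(c(mylon, mylat), cbind(mylon, lat)) <= radius)

# full distance test only on the small candidate block
res[longood, latgood] <- outer(
  lon[longood], lat[latgood],
  function(x, y) distHaversine(c(mylon, mylat), cbind(x, y)) <= radius
)
```

Only the block spanned by longood and latgood is tested point by point; everything outside it keeps its initial FALSE.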
In this way, you calculate just lg+ln+lg*ln distances (lg and ln are the lengths of latgood and longood), i.e. 531 distances, as opposed to the 259200 with my previous method.
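For comparison, the fully vectorized approach that evaluates every grid point could look roughly like the sketch below. It is an illustration under the question's grid and location, not necessarily the code originally posted; the name grid is introduced here.

```r
library(geosphere)

lon <- seq(-179.75, 179.75, by = 0.5)
lat <- seq(-89.75, 89.75, by = 0.5)
mylat <- 47.9625
mylon <- -87.0431

# every (lon, lat) pair; lon varies fastest, which matches the column-major
# layout of a length(lon) x length(lat) matrix
grid <- as.matrix(expand.grid(lon = lon, lat = lat))
nrow(grid)  # 720 * 360 = 259200 distances to evaluate

# a single distm call replaces the double loop; the result is a 1 x 259200 matrix
d <- distm(c(mylon, mylat), grid, fun = distHaversine)
dd2 <- matrix(as.vector(d) <= 500000, nrow = length(lon), ncol = length(lat))
```

Here dd2[i, j] corresponds to lon[i] and lat[j], the same layout as the matrix built in the question.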