R: find nearest index

Posted 2019-01-19 03:05

I have two vectors with a few thousand points, but generalized here:

A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

How can I get the indices of A that are nearest to each element of b? The expected outcome would be c(1, 2, 2).

I know that findInterval can only find the first occurrence, and not the nearest, and I'm aware that which.min(abs(b[2] - A)) is getting warmer, but I can't figure out how to vectorize it to work with long vectors of both A and b.

3 answers
(This account has been banned)
#2 · 2019-01-19 03:42

findInterval gets you very close. You just have to pick between the index it returns and the next one:

# returns the index of the element of vec nearest to each x (vec must be sorted ascending)
nearest.vec <- function(x, vec)
{
    smallCandidate <- findInterval(x, vec, all.inside=TRUE)
    largeCandidate <- smallCandidate + 1
    #nudge is TRUE if large candidate is nearer, FALSE otherwise
    nudge <- 2 * x > vec[smallCandidate] + vec[largeCandidate]
    return(smallCandidate + nudge)
}

nearest.vec(b,A)

returns 1 2 2, and should be comparable to findInterval in performance.
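findInterval needs its second argument sorted in non-decreasing order, so nearest.vec as written assumes A is sorted. If that doesn't hold for your data, something like the following might work. This is only a sketch: nearest.unsorted and A2 are made-up names, it assumes vec has at least two elements and no NAs, and ties go to the smaller value, as in the function above.

```r
# Hypothetical wrapper (not part of the answer above): same idea, but for an unsorted vec.
nearest.unsorted <- function(x, vec) {
    ord <- order(vec)                 # permutation that sorts vec
    sorted <- vec[ord]
    # lower candidate, forced into 1..(length(vec) - 1) by all.inside
    lo <- findInterval(x, sorted, all.inside = TRUE)
    hi <- lo + 1
    # take the upper candidate only when x is strictly closer to it
    pick <- ifelse(2 * x > sorted[lo] + sorted[hi], hi, lo)
    ord[pick]                         # map back to positions in the original vec
}

A2 <- c(30, 10, 50, 20, 40)           # same values as A, shuffled
b <- c(13, 17, 20)
nearest.unsorted(b, A2)
# [1] 2 4 4
```

A2[2] is 10 and A2[4] is 20, so these are the same nearest values as before, just at their positions in the shuffled vector.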

够拽才男人
#3 · 2019-01-19 03:44

You can just wrap your code in sapply. I think this runs at about the same speed as a for loop, so it isn't technically vectorized:

sapply(b,function(x)which.min(abs(x - A)))
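If you go this route, vapply is a slightly stricter variant of the same loop: you declare that each call returns a single integer, and it errors out if that ever isn't true (for example, which.min returns a zero-length result when every distance is NA). A small sketch using the vectors from the question; idx is just an illustrative name:

```r
A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

# Same per-element search as the sapply call, but with the result type declared.
idx <- vapply(b, function(x) which.min(abs(x - A)), integer(1))
idx
# [1] 1 2 2
```

It is still one R-level function call per element of b, so don't expect it to be faster than sapply.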
forever°为你锁心
#4 · 2019-01-19 03:46

Here's a solution that uses R's often overlooked outer function. Not sure if it'll perform better, but it does avoid sapply.

A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

dist <- abs(outer(A, b, '-'))
result <- apply(dist, 2, which.min)

# [1] 1 2 2
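If you also want to drop the apply call, base R's max.col might do it: negate the distances so the nearest element becomes the row maximum, and put b along the rows. This is a sketch rather than a drop-in replacement. d and result2 are illustrative names, and as far as I understand the documentation, max.col treats entries within a small relative tolerance of the row maximum as ties, so near-equal distances may resolve differently than with which.min.

```r
A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

# length(b) x length(A) matrix of absolute differences (note the argument order)
d <- abs(outer(b, A, '-'))

# the row-wise maximum of -d is the row-wise minimum of d; "first" mimics which.min on ties
result2 <- max.col(-d, ties.method = "first")
result2
# [1] 1 2 2
```

Either way, outer builds the full length(A) by length(b) matrix, so memory use grows with the product of the two lengths.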