I got this code which create two vectors and for each element from a
I want to get the closest element in b
:
a = rnorm(100)
b = rnorm(100)
c = vapply(a, function(x) which.min(abs(b - x)), 1)
table(duplicated(c))
FALSE TRUE
61 39
As you can see this method is prompt to give a lot of duplicates which is normal but I would like to not have duplicates. I thought of deleting occurence from b
once an index has been selected but I don't know how to do it under vapply
.
The closest match you are going to get is by sorting the vectors and then pairing them off. The following permuation on b
should allow you to do that.
p <- order(b)[order(order(a))] # order on b and then back transform the ordering of a
sum(abs(a-b[p]))
[1] 20.76788
Clearly, allowing duplicates does make things much closer:
sum(abs(a-b[c]))
[1] 2.45583
This is very bad programming, but may work and is vectorized...
a <- rnorm(100)
b <- rnorm(100)
#make a copy of b (you'll see why)
b1<-b
res<- vapply(a, function(x) {ret<-which.min(abs(b1 - x));b1[ret]<<-NA;return(ret)}, 1)
I believe this is the best you can get: sum(abs(sort(a) - sort(b)))
I am using data.table
to preserve the original sorting of a
:
require(data.table)
set.seed(1)
a <- rnorm(100)
b <- rnorm(100)
sum(abs(a - b))
sum(abs(sort(a) - sort(b)))
dt <- data.table(a = a, b = b)
dt[, id := .I]
# sort dt by a
setkey(dt, a)
# sort b
dt[, b := sort(b)]
# return to original order
setkey(dt, id)
dt
dt[, sum(abs(a - b))]
This solution gives better result if compared to Chase's solution:
dt2 <- as.data.table(foo(a,b))
dt2[, sum(abs(a - bval))]
dt[, sum(abs(a - b))]
Result:
> dt2[, sum(abs(a - bval))]
[1] 24.86731
> dt[, sum(abs(a - b))]
[1] 20.76788
This can almost certainly be improved upon through vectorization, but appears to work and may get the job done:
set.seed(1)
a = rnorm(5)
b = rnorm(5)
foo <- function(a,b) {
out <- cbind(a, bval = NA)
for (i in seq_along(a)) {
#which value of B is closest?
whichB <- which.min(abs(b - a[i]))
#Assign that value to the bval column
out[i, "bval"] <- b[whichB]
#Remove that value of B from being chosen again
b <- b[-whichB]
}
return(out)
}
#In action
foo(a,b)
---
a bval
[1,] -0.6264538 -0.8204684
[2,] 0.1836433 0.4874291
[3,] -0.8356286 -0.3053884
[4,] 1.5952808 0.7383247
[5,] 0.3295078 0.5757814