Calculating Euclidean Distance for Large Datasets

Posted 2019-07-07 02:55

I have to calculate the Euclidean distance between train and test data. The train data has a total length of 1389 and the test data 364. It is the data from handwritten ZIP codes on envelopes from U.S. postal mail, downloaded from the website of "The Elements of Statistical Learning".

I am a beginner and have only just read the data into R. I don't know how to start calculating the distance between the train and test data. Can anyone give me an idea of how to write a loop for this data?

I would be thankful.

Tags: r distance
1 answer
forever°为你锁心
#2 · 2019-07-07 03:32

For Euclidean distances, I like using rdist from the fields package. One advantage over dist from the stats package is that it can take two matrices as input:

train.data <- matrix(runif(1389*2), ncol = 2)
test.data  <- matrix(runif(364*2),  ncol = 2)

library(fields)
distances <- rdist(train.data, test.data)
dim(distances)
# [1] 1389  364
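
Since the question asks about a loop, here is a minimal base R sketch of the same train-by-test distance matrix without the fields package. It assumes both data sets are numeric matrices with one observation per row and the same number of columns; the random matrices are stand-ins for the real ZIP code data, and the names loop.dist, sq and vec.dist are made up for illustration.

```r
# Stand-in data (an assumption): rows are observations, columns are features.
set.seed(1)
train.data <- matrix(runif(1389 * 2), ncol = 2)
test.data  <- matrix(runif(364 * 2),  ncol = 2)

# Explicit loop: fill one column of the result per test observation.
loop.dist <- matrix(0, nrow = nrow(train.data), ncol = nrow(test.data))
for (j in seq_len(nrow(test.data))) {
  # Subtract test row j from every train row, column by column.
  diffs <- sweep(train.data, 2, test.data[j, ])
  # Row-wise Euclidean norm of the differences.
  loop.dist[, j] <- sqrt(rowSums(diffs^2))
}

# Vectorised alternative based on |a - b|^2 = |a|^2 + |b|^2 - 2 * (a . b)
sq <- outer(rowSums(train.data^2), rowSums(test.data^2), "+") -
  2 * tcrossprod(train.data, test.data)
vec.dist <- sqrt(pmax(sq, 0))  # clamp tiny negatives caused by rounding

dim(loop.dist)
all.equal(loop.dist, vec.dist)
```

Entry [i, j] of either matrix is the distance from train row i to test row j, which is the same layout that rdist(train.data, test.data) returns. The loop is easier to follow; the vectorised form avoids the loop and is likely faster when the data have many columns.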