I have many points and I want to build distance matrix i.e. distance of every point with all of other points but I want to don't use from loop because take too time...
Is a better way for building this matrix?
this is my loop: for a setl with size: 10000x3 this method take a lot of my time :(
for i=1:size(setl,1)
for j=1:size(setl,1)
dist = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+...
(zl(i)-zl(j))^2);
distanceMatrix(i,j) = dist;
end
end
How about using some linear algebra? The distance of two points can be computed from the inner product of their position vectors,
D(x, y) = ∥y – x∥ = √ (
xT x + yT y – 2 xT y ),
and the inner product for all pairs of points can be obtained through a simple matrix operation.
x = [xl(:)'; yl(:)'; zl(:)'];
IP = x' * x;
d = sqrt(bsxfun(@plus, diag(IP), diag(IP)') - 2 * IP);
For 10000 points, I get the following timing results:
- ahmad's loop + shoelzer's preallocation: 7.8 seconds
- Dan's vectorized indices: 5.3 seconds
- Mohsen's bsxfun: 1.5 seconds
- my solution: 1.3 seconds
You can use bsxfun
which is generally a faster solution:
s = [xl(:) yl(:) zl(:)];
d = sqrt(sum(bsxfun(@minus, permute(s, [1 3 2]), permute(s, [3 1 2])).^2,3));
You can do this fully vectorized like so:
n = numel(xl);
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:)
Iy = Y(:)
reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
If you look at Ix
and Iy
(try it for like a 3x3 dataset), they make every combination of linear indexes possible for each of your matrices. Now you can just do each subtraction in one shot!
However mixing the suggestions of shoelzer and Jost will give you an almost identical performance performance boost:
n = 50;
xl = rand(n,1);
yl = rand(n,1);
zl = rand(n,1);
tic
for t = 1:100
distanceMatrix = zeros(n); %// Preallocation
for i=1:n
for j=min(i+1,n):n %// Taking advantge of symmetry
distanceMatrix(i,j) = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+(zl(i)-zl(j))^2);
end
end
d1 = distanceMatrix + distanceMatrix'; %'
end
toc
%// Vectorized solution that creates linear indices using meshgrid
tic
for t = 1:100
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:);
Iy = Y(:);
d2 = reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
end
toc
Returns:
Elapsed time is 0.023332 seconds.
Elapsed time is 0.024454 seconds.
But if I change n
to 500
then I get
Elapsed time is 1.227956 seconds.
Elapsed time is 2.030925 seconds.
Which just goes to show that you should always bench mark solutions in Matlab before writing off loops as slow! In this case, depending on the scale of your solution, loops could be significantly faster.
Be sure to preallocate distanceMatrix
. Your loops will run much, much faster and vectorization probably isn't needed. Even if you do it, there may not be any further speed increase.
The latest versions (Since R2016b) of MATLAB support Implicit Broadcasting (See also noted on bsxfun()
).
Hence the fastest way for distance matrix is:
function [ mDistMat ] = CalcDistanceMatrix( mA, mB )
mDistMat = sum(mA .^ 2).' - (2 * mA.' * mB) + sum(mB .^ 2);
end
Where the points are along the columns of the set.
In your case mA = mB
.
Have a look on my Calculate Distance Matrix Project.