optimization of pairwise L2 distance computations

2019-08-04 14:07发布

I need help optimizing this loop. matrix_1 is a (nx 2) int matrix and matrix_2 is a (m x 2), m & n very.

index_j = 1;
for index_k = 1:size(Matrix_1,1)
    for index_l = 1:size(Matrix_2,1)
        M2_Index_Dist(index_j,:) = [index_l, sqrt(bsxfun(@plus,sum(Matrix_1(index_k,:).^2,2),sum(Matrix_2(index_l,:).^2,2)')-2*(Matrix_1(index_k,:)*Matrix_2(index_l,:)'))];
        index_j = index_j + 1;
    end
 end

I need M2_Index_Dist to provide a ((n*m) x 2) matrix with the index of matrix_2 in the first column and the distance in the second column.

Output example:

M2_Index_Dist = [ 1, 5.465
                  2, 56.52
                  3, 6.21
                  1, 35.3
                  2, 56.52
                  3, 0
                  1, 43.5
                  2, 9.3
                  3, 236.1
                  1, 8.2
                  2, 56.52
                  3, 5.582]

2条回答
干净又极端
2楼-- · 2019-08-04 14:21

If I understand correctly, this does what you want:

ind = repmat((1:size(Matrix_2,1)).',size(Matrix_1,1),1); %'// first column: index
d = pdist2(Matrix_2,Matrix_1); %// compute distance between each pair of rows
d = d(:); %// second column: distance
result = [ind d]; %// build result from first column and second column

As you see, this code calls pdist2 to compute the distance between every pair of rows of your matrices. By default this function uses Euclidean distance.

If you don't have pdist2 (which is part of the the Statistics Toolbox), you can replace line 2 above with bsxfun:

d = squeeze(sqrt(sum(bsxfun(@minus,Matrix_2,permute(Matrix_1, [3 2 1])).^2,2)));
查看更多
三岁会撩人
3楼-- · 2019-08-04 14:29

Here's how to apply bsxfun with your formula (||A-B|| = sqrt(||A||^2 + ||B||^2 - 2*A*B)):

d = real(sqrt(bsxfun(@plus, dot(Matrix_1,Matrix_1,2), ...
    bsxfun(@minus, dot(Matrix_2,Matrix_2,2).', 2 * Matrix_1*Matrix_2.')))).';

You can avoid the final transpose if you change your interpretation of the matrix.

Note: There shouldn't be any complex values to handle with real but it's there in case of very small differences that may lead to tiny negative numbers.


Edit: It may be faster without dot:

d = sqrt(bsxfun(@plus, sum(Matrix_1.*Matrix_1,2), ...
    bsxfun(@minus, sum(Matrix_2.*Matrix_2,2)', 2 * Matrix_1*Matrix_2.'))).';

Or with just one call to bsxfun:

d = sqrt(bsxfun(@plus, sum(Matrix_1.*Matrix_1,2), sum(Matrix_2.*Matrix_2,2)') ...
    - 2 * Matrix_1*Matrix_2.').';

Note: This last order of operations gives identical results to you, rather than with an error ~1e-14.


Edit 2: To replicate M2_Index_Dist:

II = ndgrid(1:size(Matrix_2,1),1:size(Matrix_2,1));
M2_Index_Dist = [II(:) d(:)];
查看更多
登录 后发表回答