Remove for loop from clustering algorithm in MATLAB

Posted 2019-01-28 09:34

I am trying to improve the performance of the OPTICS clustering algorithm. The open-source implementation I've found uses a for loop over every sample and can run for hours...

I believe the repmat() function may help improve its performance when the system has enough RAM. You are more than welcome to suggest other ways of improving the implementation.

Here is the code:

x is the data: an m-by-n array, where m is the number of samples and n is the feature dimensionality, which is usually significantly greater than one.

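[m,n] = size(x);
D = zeros(m,m);  % preallocate so D does not grow inside the loop

for i = 1:m
    % squared Euclidean distance from sample i to every sample
    D(i,:) = sum(((repmat(x(i,:),m,1)-x).^2),2).';
end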

Many thanks.

Tags: algorithm performance matlab cluster-analysis vectorization

1 Answer

Melony?

Reply #2 · 2019-01-28 10:20

With enough RAM to play with, you can use a few approaches here.

Approach #1: With bsxfun & permute -

D = squeeze(sum(bsxfun(@minus,permute(x,[3 2 1]),x).^2,2))
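The "enough RAM" caveat matters most for this approach, because the bsxfun call builds an m-by-n-by-m intermediate before summing. A rough sketch of the arithmetic, assuming double-precision data; the sizes below are hypothetical and chosen only to show the scaling:

```matlab
% Hypothetical sizes, chosen only to illustrate the scaling.
m = 10000;           % number of samples
n = 10;              % number of features
bytesPerDouble = 8;  % assumes x is stored as double

tmpBytes = bytesPerDouble * m * n * m;  % m-by-n-by-m intermediate in approach #1
outBytes = bytesPerDouble * m * m;      % final m-by-m matrix D

fprintf('intermediate: %.2f GB, result: %.2f GB\n', tmpBytes/1e9, outBytes/1e9);
```

With these numbers the intermediate is n times larger than D itself, so the approach is most comfortable when m and n are both modest.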

Approach #2: With pdist & squareform (both from the Statistics and Machine Learning Toolbox) -

D = squareform(pdist(x).^2)

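Approach #3: With matrix-multiplication-based Euclidean distance calculations -

xt = x.';          % n-by-m transpose of x
[m,n] = size(x);
% D(i,j) = sum(x(i,:).^2) + sum(x(j,:).^2) - 2*x(i,:)*x(j,:).'
D = [x.^2 ones(size(x)) -2*x ]*[ones(size(xt)) ; xt.^2 ; xt];
D(1:m+1:end) = 0;  % force exact zeros on the diagonal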
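For readers wondering why a single matrix product gives squared distances, the identity behind approach #3 can be written out as follows. The index names i, j, k are introduced here only for illustration: i and j run over samples, k over features.

```latex
\begin{aligned}
D_{ij} &= \sum_{k=1}^{n} (x_{ik} - x_{jk})^2 \\
       &= \sum_{k=1}^{n} x_{ik}^2 \;+\; \sum_{k=1}^{n} x_{jk}^2 \;-\; 2\sum_{k=1}^{n} x_{ik}\,x_{jk}
\end{aligned}
```

The three sums correspond to the three column blocks x.^2, ones(size(x)) and -2*x multiplied by the row blocks ones(size(xt)), xt.^2 and xt. In exact arithmetic the diagonal is zero, but floating-point cancellation can leave tiny nonzero values there, which is presumably why the last line resets it.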

For performance, my bet would be on approach #3!
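Before acting on that bet, it may be worth confirming on a small problem that the vectorized versions agree with the loop, and then timing them on your own machine. The following is only a sketch: the data is random, the sizes are arbitrary assumptions, and approach #2 is left out because it depends on a toolbox.

```matlab
% Small random test problem (sizes are arbitrary assumptions).
rng(0);
x = rand(200, 5);
[m, n] = size(x);

% Reference: the loop from the question, with D preallocated.
Dref = zeros(m, m);
for i = 1:m
    Dref(i,:) = sum((repmat(x(i,:), m, 1) - x).^2, 2).';
end

% Approach #1: bsxfun & permute.
D1 = squeeze(sum(bsxfun(@minus, permute(x, [3 2 1]), x).^2, 2));

% Approach #3: matrix multiplication.
xt = x.';
D3 = [x.^2, ones(m, n), -2*x] * [ones(n, m); xt.^2; xt];
D3(1:m+1:end) = 0;

% Largest absolute deviation from the loop result.
err1 = max(abs(D1(:) - Dref(:)));
err3 = max(abs(D3(:) - Dref(:)));
fprintf('max error: approach #1 = %g, approach #3 = %g\n', err1, err3);

% Rough timings; timeit calls each handle several times and returns seconds.
t1 = timeit(@() squeeze(sum(bsxfun(@minus, permute(x, [3 2 1]), x).^2, 2)));
t3 = timeit(@() [x.^2, ones(m, n), -2*x] * [ones(n, m); xt.^2; xt]);
fprintf('time: approach #1 = %g s, approach #3 = %g s\n', t1, t3);
```

Approach #3 can differ from the loop by small rounding errors, since it subtracts large terms from each other, so comparing against a tolerance rather than testing for exact equality is the safer check.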
