Remove for loop from clustering algorithm in MATLAB

Posted 2019-01-28 10:03

啃猪蹄的小仙女

Question:

I am trying to improve the performance of the OPTICS clustering algorithm. The open-source implementation I found uses a for loop over every sample and can run for hours.

I believe the repmat() function could help improve its performance when the system has enough RAM. You are more than welcome to suggest other ways of improving the implementation.

Here is the code:

x is the data: an m-by-n array, where m is the number of samples and n is the feature dimensionality, which is usually much greater than one.

[m,n] = size(x);

for i = 1:m
    D(i,:) = sum(((repmat(x(i,:),m,1)-x).^2),2).';
end

Many thanks.

Answer 1:

With enough RAM to play with, there are a few approaches you can use here.

Approach #1: With bsxfun & permute -

D = squeeze(sum(bsxfun(@minus,permute(x,[3 2 1]),x).^2,2))
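To see how this one-liner works, it can help to trace the array shapes on a tiny input. The sketch below uses a made-up 3-by-2 matrix (the values are illustrative, not from the question) and checks the result against the loop from the question. Keep in mind that the intermediate array is m-by-n-by-m, so in double precision it needs roughly 8*m^2*n bytes.

```matlab
% Toy data: 3 samples, 2 features (illustrative values only)
x = [0 0; 3 4; 6 8];
[m, n] = size(x);

xp = permute(x, [3 2 1]);        % 1-by-n-by-m: samples moved to dim 3
d  = bsxfun(@minus, xp, x);      % m-by-n-by-m: d(j,k,i) = x(i,k) - x(j,k)
D  = squeeze(sum(d.^2, 2));      % m-by-m squared Euclidean distances

% Reference result from the loop in the question
Dloop = zeros(m);
for i = 1:m
    Dloop(i,:) = sum((repmat(x(i,:), m, 1) - x).^2, 2).';
end

disp(D)                          % [0 25 100; 25 0 25; 100 25 0]
disp(isequal(D, Dloop))          % 1 (true)
```

In R2016b and later, implicit expansion should let you write permute(x,[3 2 1]) - x directly, without bsxfun.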

Approach #2: With pdist & squareform -

D = squareform(pdist(x).^2)
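Note that pdist and squareform come with the Statistics and Machine Learning Toolbox, so this approach assumes the toolbox is installed. pdist also accepts a 'squaredeuclidean' distance option, which avoids taking square roots only to square them again. A minimal sketch on illustrative data (the matrix below is made up for the example):

```matlab
% Requires the Statistics and Machine Learning Toolbox
x = [0 0; 3 4; 6 8];                            % illustrative data

D1 = squareform(pdist(x).^2);                   % form used in the answer
D2 = squareform(pdist(x, 'squaredeuclidean'));  % squared distances directly

% Both should give the same m-by-m matrix up to rounding error
disp(max(abs(D1(:) - D2(:))))
```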

Approach #3: With matrix-multiplication based Euclidean distance calculations -

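xt = x.';                  % n-by-m transpose of the data
[m,n] = size(x);
% Row i of the left factor times column j of the right factor gives
% sum(x(i,:).^2) + sum(x(j,:).^2) - 2*x(i,:)*x(j,:).'
D = [x.^2 ones(size(x)) -2*x ]*[ones(size(xt)) ; xt.^2 ; xt];
D(1:m+1:end) = 0;          % force exact zeros on the diagonal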
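The block product follows from expanding the squared distance. Writing x_{ik} for element (i,k) of x (notation introduced here only for illustration), entry (i,j) of D is:

```latex
\begin{aligned}
D_{ij} &= \sum_{k=1}^{n} \left( x_{ik} - x_{jk} \right)^2 \\
       &= \sum_{k=1}^{n} x_{ik}^2 \cdot 1
        \;+\; \sum_{k=1}^{n} 1 \cdot x_{jk}^2
        \;-\; 2 \sum_{k=1}^{n} x_{ik}\, x_{jk}
\end{aligned}
```

The three sums line up with the three column blocks [x.^2, ones, -2*x] multiplied by the three row blocks [ones; xt.^2; xt]. Because the terms are added in floating point, entries that should be exactly zero can come out as tiny nonzero values, which is presumably why the diagonal is reset to 0 afterwards.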

For performance, my bet would be on approach #3!
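Rather than take the bet on faith, you could time the candidates on data shaped like your own. The following is a rough sketch using timeit; the sizes m and n and the random data are assumptions chosen for illustration, and the pdist variant is left out so that no toolbox is needed.

```matlab
% Synthetic data; the sizes are assumptions, adjust them to your problem
rng(0);
m = 300; n = 20;
x = rand(m, n);

% Approach #1: bsxfun + permute (builds an m-by-n-by-m intermediate array)
f1 = @() squeeze(sum(bsxfun(@minus, permute(x, [3 2 1]), x).^2, 2));

% Approach #3: one matrix multiplication
xt = x.';
f3 = @() [x.^2, ones(size(x)), -2*x] * [ones(size(xt)); xt.^2; xt];

t1 = timeit(f1);
t3 = timeit(f3);
fprintf('bsxfun/permute: %.4f s, matrix multiply: %.4f s\n', t1, t3);

% Sanity check: both results should agree up to rounding error
D1 = f1();
D3 = f3();
D3(1:m+1:end) = 0;
fprintf('max abs difference: %.2e\n', max(abs(D1(:) - D3(:))));
```

Timings depend on m, n, the MATLAB release and the machine, so treat the numbers as a guide for your own setup rather than a general ranking.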

Tags: algorithm performance matlab cluster-analysis vectorization
