Removing the for loop from a clustering algorithm in MATLAB

I am trying to improve the performance of the OPTICS clustering algorithm. The open-source implementation I found uses a for loop over the samples and can run for hours ...

I believe that using the repmat() function can help improve its performance when the system has enough RAM. You are more than welcome to suggest other ways to improve the implementation.

Here is the code:

x is the data: an [m x n] array, where m is the number of samples and n is the number of feature dimensions, which in most cases is much larger than one.

[m,n] = size(x);

for i = 1:m
    D(i,:) = sum(((repmat(x(i,:),m,1)-x).^2),2).';
end

      

Many thanks.



1 answer


With enough RAM to play with, there are several approaches you can take here.

Approach #1: with bsxfun and permute

D = squeeze(sum(bsxfun(@minus,permute(x,[3 2 1]),x).^2,2))

      

Approach #2: with pdist and squareform (both from the Statistics and Machine Learning Toolbox)



D = squareform(pdist(x).^2)

      

Approach #3: with matrix-multiplication-based Euclidean distance calculations

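xt = x.';                  % transpose once and reuse it below
[m,n] = size(x);
% D(i,j) = sum(x(i,:).^2) + sum(x(j,:).^2) - 2*x(i,:)*x(j,:).'
D = [x.^2 ones(size(x)) -2*x ]*[ones(size(xt)) ; xt.^2 ; xt];
D(1:m+1:end) = 0;          % set the diagonal (self-distances) to exactly zero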
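
Approach #3 rests on expanding the squared Euclidean distance. As a sketch, writing x_{ik} for element (i,k) of x (this notation is introduced here for illustration and is not part of the original answer):

```latex
\[
D_{ij} \;=\; \sum_{k=1}^{n} (x_{ik} - x_{jk})^2
       \;=\; \sum_{k=1}^{n} x_{ik}^2 \cdot 1
       \;+\; \sum_{k=1}^{n} 1 \cdot x_{jk}^2
       \;-\; 2 \sum_{k=1}^{n} x_{ik}\, x_{jk}
\]
```

The three sums correspond to the three column blocks of the left factor (x.^2, ones, -2*x) multiplied by the three row blocks of the right factor (ones, xt.^2, xt). Because the terms are accumulated separately in floating point, the diagonal may come out as a tiny nonzero value instead of exactly zero, which is presumably why the last line resets it.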

      

For performance, my bet would be on approach #3!
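
One way to check that bet on your own machine is a small script like the following. It is a minimal sketch, not a benchmark: the random test data, the sizes m = 500 and n = 20, and the variable names D_loop, D_bsx and D_mm are all assumptions made for illustration and do not appear in the original post. Approach #2 is left out so the script runs without the Statistics and Machine Learning Toolbox.

```matlab
% Sanity check and rough timing for the loop, approach #1 and approach #3.
% The data and sizes below are made up for illustration.
rng(0);                         % reproducible random data
m = 500; n = 20;
x = rand(m, n);

% Original loop, with D preallocated
tic;
D_loop = zeros(m, m);
for i = 1:m
    D_loop(i,:) = sum((repmat(x(i,:), m, 1) - x).^2, 2).';
end
t_loop = toc;

% Approach #1: bsxfun and permute (builds an m-by-n-by-m intermediate array)
tic;
D_bsx = squeeze(sum(bsxfun(@minus, permute(x, [3 2 1]), x).^2, 2));
t_bsx = toc;

% Approach #3: matrix multiplication
tic;
xt = x.';
D_mm = [x.^2, ones(m, n), -2*x] * [ones(n, m); xt.^2; xt];
D_mm(1:m+1:end) = 0;
t_mm = toc;

% Largest disagreement with the loop result (should be at rounding level)
err_bsx = max(abs(D_loop(:) - D_bsx(:)));
err_mm  = max(abs(D_loop(:) - D_mm(:)));
fprintf('loop %.4fs, bsxfun %.4fs, matmul %.4fs\n', t_loop, t_bsx, t_mm);
fprintf('max abs difference: bsxfun %.2e, matmul %.2e\n', err_bsx, err_mm);
```

Single tic/toc measurements are noisy, so repeat the runs (or use timeit) before drawing conclusions, and test with sizes close to your real data, since the intermediate array in approach #1 grows as m*n*m.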
