Matlab remove for loop in matrix calculation

I am working on a MATLAB problem involving matrices. I think my code could be improved by removing the for loop, but I don't know how to do it. Can anyone help? Code:

K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
for i = 1:K
    ids = (idx == i);
    centroids(i,:) = sum(bsxfun(@times, X, ids))./ sum(ids);
end

      

In this code, X holds the data points and is 4x2. There are K = 3 centroids, so centroids is a 3x2 matrix. This code is part of a K-means function that uses the data points and their nearest centroids to compute the new centroid positions. I want to write this without a for loop, perhaps starting with something like this:

ids = bsxfun(@eq, idx, 1:K);
centroids = ..............

      



2 answers


You can avoid bsxfun by using logical indexing, which seems to give a worthwhile performance increase, at least for small matrices X. This is best for small K and a small number of rows in X.

K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
centroids=zeros(K,2);
for i = 1:K
    ids = (idx == i);
    centroids(i,:) = sum(X(ids,:),1)./sum(ids);
end

      

If X has a large number of rows, this method is faster:



K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
t = bsxfun(@eq, idx, 1:K);                         % N-by-K logical membership matrix
centroids = bsxfun(@rdivide, t.'*X, sum(t,1).');   % per-cluster sums divided by per-cluster counts
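To see why the matrix product works, it may help to split the one-liner into named steps. The following is only a sketch on the same toy data; the variable names counts and sums are introduced here for illustration, and the comparison idx == 1:K assumes implicit expansion (MATLAB R2016b or newer) in place of bsxfun.

```matlab
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];

% N-by-K membership matrix: t(n,k) is true when point n belongs to cluster k.
% This comparison relies on implicit expansion (assumes R2016b or newer).
t = (idx == 1:K);

counts = sum(t, 1).';        % K-by-1, number of points in each cluster
sums   = double(t).' * X;    % K-by-D, row k is the sum of the points in cluster k
centroids = sums ./ counts;  % divide each row by its count (implicit expansion again)
```

For this input, clusters 2 and 3 each hold one point and cluster 1 holds rows 1 and 4, so the first centroid is their average. A cluster with no points has a count of 0, and its row comes out as NaN.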

      

And if K is very large, Luis's accumarray method is the fastest.
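Which method wins depends on the sizes involved and on the MATLAB version, so it is probably worth measuring on your own data. Below is a rough timing sketch using tic/toc; the sizes N and K and the random data are hypothetical, chosen only for illustration, and single tic/toc runs are noisy, so treat the numbers as a rough guide.

```matlab
% Hypothetical problem size, for illustration only
N = 20000;
K = 50;
rng(0);                        % fixed seed so the run is repeatable
X = rand(N, 2);
idx = randi(K, N, 1);

% Method 1: loop over clusters with logical indexing
tic
C1 = zeros(K, size(X, 2));
for i = 1:K
    ids = (idx == i);
    C1(i,:) = sum(X(ids,:), 1) ./ sum(ids);
end
tLoop = toc;

% Method 2: membership matrix and one matrix product
tic
t = bsxfun(@eq, idx, 1:K);
C2 = bsxfun(@rdivide, double(t).' * X, sum(t, 1).');
tMat = toc;

% Method 3: accumarray, one call per column
tic
C3 = NaN(K, size(X, 2));
for col = 1:size(X, 2)
    C3(:,col) = accumarray(idx, X(:,col), [K 1], @mean);
end
tAcc = toc;

fprintf('loop: %.4f s, matrix product: %.4f s, accumarray: %.4f s\n', tLoop, tMat, tAcc);
```

All three variants should agree up to floating-point rounding, which is a useful check before comparing their speed.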



You can apply accumarray. Note that accumarray only works when the values are passed as a single column, so it handles one column of X at a time. So, if X has two columns, you can call accumarray twice:

centroids(:,1) = accumarray(idx, X(:,1), [], @mean)
centroids(:,2) = accumarray(idx, X(:,2), [], @mean)

      

Alternatively, if X contains two columns of real numbers, you can use complex to "pack" the two columns into one complex column, and then unpack the results:



centroids = accumarray(idx, complex(X(:,1),X(:,2)), [], @mean);
centroids = [ real(centroids) imag(centroids)];

      

If X has an arbitrary number of columns, possibly containing complex numbers, you can iterate over the columns:

centroids = NaN(K, size(X,2)); %// preallocate
for col = 1:size(X,2);
    centroids(:,col) = accumarray(idx, X(:,col), [], @mean);
end
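If you would rather avoid the loop over columns as well, one possible variant (not part of the answer above, just a sketch) is to pass accumarray two-column subscripts, so that every element of X is routed to a (cluster, column) cell in a single call. The names rows and cols are introduced here for illustration.

```matlab
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
[N, D] = size(X);

% X(:) stacks the columns of X, so the subscripts are built in the same order.
rows = repmat(idx, D, 1);            % cluster index for every element of X(:)
cols = kron((1:D).', ones(N, 1));    % column index for every element of X(:)

% Fix the output size to K-by-D; cells that receive no values are filled with NaN.
centroids = accumarray([rows cols], X(:), [K D], @mean, NaN);
```

Passing the size [K D] explicitly keeps the result K-by-D even when the highest-numbered cluster is empty, and the NaN fill value marks empty clusters instead of the default 0.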

      
