MATLAB: removing a for loop in a matrix calculation
I am working on a MATLAB problem involving matrices. I think my code could be improved by removing the for loop, but I don't know how to do it. Can anyone help? Code:
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
for i = 1:K
    ids = (idx == i);
    centroids(i,:) = sum(bsxfun(@times, X, ids))./ sum(ids);
end
In this code, X holds the data points and is 4x2. There are K = 3 centroids, so centroids is a 3x2 matrix. This code is part of a k-means function that uses the data points and the index of each point's nearest centroid to compute the new centroid positions. I would like to write this without a for loop, perhaps starting with something like this:
ids = bsxfun(@eq, idx, 1:K);
centroids = ..............
You can avoid bsxfun by using logical indexing. It seems to give a worthwhile performance gain, at least for small matrices X. This approach works best when K is small and X has few rows.
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
centroids=zeros(K,2);
for i = 1:K
    ids = (idx == i);
    centroids(i,:) = sum(X(ids,:),1)./sum(ids);
end
If X has a large number of rows, this method is faster:
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
centroids=zeros(K,2);
t=bsxfun(@eq,idx,1:K); %// N-by-K logical membership matrix
centroids=bsxfun(@rdivide,t.'*X,sum(t,1).'); %// per-cluster sums divided by per-cluster counts
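To see why the matrix product works, here is a step-by-step sketch on the same example data. The intermediate names S and counts are introduced only for illustration, and double() is added just to make the numeric type explicit:

```matlab
% Worked breakdown of the matrix-product approach on the example data.
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];

% Membership matrix: t(n,k) is true when point n belongs to cluster k.
t = bsxfun(@eq, idx, 1:K);        % [1 0 0; 0 1 0; 0 0 1; 1 0 0]

% Per-cluster column sums; double() makes the numeric type explicit.
S = double(t).' * X;              % [8 10; 3 4; 5 6]

% Number of points in each cluster, as a K-by-1 column.
counts = sum(t, 1).';             % [2; 1; 1]

% Divide each row of sums by its count to get the means.
centroids = bsxfun(@rdivide, S, counts);   % [4 5; 3 4; 5 6]

% On R2016b or later, implicit expansion gives the same result:
% t = (idx == 1:K);  centroids = (double(t).' * X) ./ sum(t, 1).';
```

Note that a cluster with no assigned points has a count of 0, so its row comes out as NaN (0/0).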
And if K is very large, Luis's accumarray method is the fastest.
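Which method wins depends on N, K, and the MATLAB version, so it is worth measuring on your own data. A minimal timing sketch using timeit follows; the sizes N and K and the random test data are assumptions chosen purely for illustration:

```matlab
% Hypothetical benchmark sizes; adjust N and K to match your own data.
N = 1e5;  K = 10;
X = rand(N, 2);
idx = randi(K, N, 1);             % random cluster label for each point

% Matrix-product version.
f_mat = @() bsxfun(@rdivide, double(bsxfun(@eq, idx, 1:K)).' * X, ...
                   sum(bsxfun(@eq, idx, 1:K), 1).');

% accumarray version, one call per column.
f_acc = @() [accumarray(idx, X(:,1), [K 1], @mean), ...
             accumarray(idx, X(:,2), [K 1], @mean)];

t_mat = timeit(f_mat);
t_acc = timeit(f_acc);
fprintf('matrix product: %.4g s, accumarray: %.4g s\n', t_mat, t_acc);
```

timeit calls each function handle several times and returns a typical run time in seconds, which is more reliable than a single tic/toc measurement.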
You can apply accumarray. Note that accumarray only works when the data passed to it is a single column (a vector). So, if X has two columns, you can call accumarray twice:
centroids(:,1) = accumarray(idx, X(:,1), [], @mean)
centroids(:,2) = accumarray(idx, X(:,2), [], @mean)
Alternatively, if X contains two columns of real numbers, you can use complex to "pack" the two columns into one complex column, and then unpack the results:
centroids = accumarray(idx, complex(X(:,1),X(:,2)), [], @mean);
centroids = [ real(centroids) imag(centroids)];
If X has an arbitrary number of columns, possibly containing complex numbers, you can iterate over the columns:
centroids = NaN(K, size(X,2)); %// preallocate
for col = 1:size(X,2);
centroids(:,col) = accumarray(idx, X(:,col), [], @mean);
end
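As a possible loop-free variant of the above (a different technique from the per-column calls, sketched here for real-valued X only), accumarray also accepts two-column subscripts, so every element of X can be routed to its (cluster, column) cell in a single call. The helper names rows and cols are introduced for illustration:

```matlab
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
[N, D] = size(X);

rows = repmat(idx, D, 1);            % cluster label of every element of X(:)
cols = kron((1:D).', ones(N, 1));    % column index of every element of X(:)

% One call: average all values that share the same (cluster, column) pair.
% The trailing NaN is the fill value used for clusters with no points.
centroids = accumarray([rows cols], X(:), [K D], @mean, NaN);
```

For the example data this gives centroids = [4 5; 3 4; 5 6]. Without the explicit fill value, accumarray would put 0 in the rows of empty clusters, whereas the division-based methods give NaN there.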