MATLAB - Pass columns of a matrix given by a matrix of indices
Given a matrix X, I want to set the k smallest elements of each column to zero. For a vector x, the following works:
[~, ind] = sort(x)
x(ind(1:k)) = 0
For a matrix X, however, the same approach doesn't work:
[~, ind] = sort(X)
X(ind(1:k)) = 0
It only sets the k smallest elements of the first column to 0. How do I index it correctly?
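A minimal sketch of why the vector code fails on a matrix, using a hypothetical 3x2 input chosen only for illustration: ind(1:k) reads ind in column-major order, so it returns row numbers from the first column only, and those are then interpreted as linear indices into X.

```matlab
% Hypothetical input, not from the original question
X = [5 1; 3 9; 4 2];
k = 2;

[~, ind] = sort(X);      % sorts each column; ind holds row numbers per column
% ind = [2 1; 3 3; 1 2]

first_k = ind(1:k);      % linear indexing into ind: [2 3], from column 1 only

Y = X;
Y(first_k) = 0;          % treated as linear indices, so only column 1 changes
% Y = [5 1; 0 9; 0 2]
```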
Solution code: One approach uses sort (to get column-sorted indices) and then bsxfun (to get linear indices from them) -
%// Get 2D array of column-sorted indices for input X
[~,sorted_idx] = sort(X,1)
%// Get linear indices for the first k rows of sorted indices
lin_idx = bsxfun(@plus,sorted_idx(1:k,:),[0:size(X,2)-1]*size(X,1))
%// Use those indices to set them in X as zeros
X(lin_idx) = 0;
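As a variant, on MATLAB R2016b or later the bsxfun call can be replaced by a plain + thanks to implicit expansion. The sketch below assumes that release or newer; the variable names offsets and Xz are illustrative, and the input is the sample matrix used in the run that follows.

```matlab
% Sample input (same values as in the run example)
X = [61 67 86 54 49 40 13
     48 91 28 70 34 98 87
     79  7 27 86 71 58 52
     16 10 45 60 79  4  3
     56 36 49 50 31 48 87];
k = 3;

% Row indices of each column's elements in ascending order
[~, sorted_idx] = sort(X, 1);

% Offset of each column's first element in linear (column-major) indexing
offsets = (0:size(X,2)-1) * size(X,1);

% Implicit expansion (R2016b+): the 1-by-n row vector is added to every row
lin_idx = sorted_idx(1:k,:) + offsets;

% Zero the k smallest entries of every column, keeping X itself unchanged
Xz = X;
Xz(lin_idx) = 0;
```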
Sample run
1) Inputs:
X =
61 67 86 54 49 40 13
48 91 28 70 34 98 87
79 7 27 86 71 58 52
16 10 45 60 79 4 3
56 36 49 50 31 48 87
k =
3
2) Sorted indexes:
>> [~,sorted_idx] = sort(X,1)
sorted_idx =
4 3 3 5 5 4 4
2 4 2 1 2 1 1
5 5 4 4 1 5 3
1 1 5 2 3 3 2
3 2 1 3 4 2 5
3) Select only the first k indices from each column:
>> sorted_idx(1:k,:)
ans =
4 3 3 5 5 4 4
2 4 2 1 2 1 1
5 5 4 4 1 5 3
4) We need to convert these per-column row indices into linear indices into the 2D array X. Because MATLAB indexes arrays in column-major order, the first column stays as it is, the second column needs an offset equal to the number of rows in X, the third column needs an offset of 2*number of rows in X, and so on until all columns are covered.
In numbers, the offsets are [0 5 10 15 20 25 30], i.e. [0:6]*5, or in the general case [0:size(X,2)-1]*size(X,1), added to sorted_idx(1:k,:). Since the same offsets must be added to every row of sorted_idx(1:k,:), we can let bsxfun do the expansion and the summation (with @plus) automatically. This happens in a vectorized manner: the row vector [0:size(X,2)-1]*size(X,1) is expanded along the rows and then added elementwise to sorted_idx(1:k,:). This gives the linear indices we need -
>> lin_idx = bsxfun(@plus,sorted_idx(1:k,:),[0:size(X,2)-1]*size(X,1))
lin_idx =
4 8 13 20 25 29 34
2 9 12 16 22 26 31
5 10 14 19 21 30 33
5) Finally, we use these indices to set the corresponding elements of X to zero with X(lin_idx) = 0.
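The offset arithmetic in step 4 can be cross-checked against sub2ind, which performs the same subscript-to-linear conversion. This is only a sketch for verification; the names rows, cols, lin_idx_off and lin_idx_sub are illustrative and not part of the answer's code.

```matlab
% Sample input from step 1
X = [61 67 86 54 49 40 13
     48 91 28 70 34 98 87
     79  7 27 86 71 58 52
     16 10 45 60 79  4  3
     56 36 49 50 31 48 87];
k = 3;

[~, sorted_idx] = sort(X, 1);
rows = sorted_idx(1:k, :);                 % row subscripts of the k smallest
cols = repmat(1:size(X,2), k, 1);          % matching column subscripts

% Linear indices via explicit column offsets (as in the answer)
lin_idx_off = bsxfun(@plus, rows, (0:size(X,2)-1) * size(X,1));

% Linear indices via sub2ind, for comparison
lin_idx_sub = sub2ind(size(X), rows, cols);

same = isequal(lin_idx_off, lin_idx_sub);  % true if both conversions agree

% Apply the indices and count the zeros per column
Xz = X;
Xz(lin_idx_off) = 0;
zeros_per_col = sum(Xz == 0, 1);           % k in every column for this input
```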
Use quantile (Statistics Toolbox):
X = X .* bsxfun(@ge, X, quantile(X, k/size(X,1)));
How it works:
1. quantile(X, k/size(X,1)) gives one number (the quantile) for each column, such that a fraction k/size(X,1) of the entries in that column are less than that number. This means that each column has exactly k entries smaller than its quantile, provided there are no ties at the threshold.
2. Comparing each column with its quantile (bsxfun(@ge, ...)) yields a matrix containing 0 for entries smaller than the quantile and 1 otherwise.
3. Elementwise multiplication of X by the result of step 2 sets the desired entries of X to 0.
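The three steps can be traced one at a time on a small input. This is a sketch with a hypothetical 4x2 matrix and illustrative variable names (q, mask, Xz); like the one-liner above, it needs the toolbox that provides quantile.

```matlab
% Hypothetical input chosen so that the quantiles are easy to read off
X = [4 10
     1 30
     3 20
     2 40];
k = 2;

% Step 1: one threshold per column; here k/size(X,1) = 0.5, i.e. the median
q = quantile(X, k / size(X,1));      % q = [2.5 25]

% Step 2: logical mask, 1 where the entry is at or above its column threshold
mask = bsxfun(@ge, X, q);            % mask = [1 0; 0 1; 1 0; 0 1]

% Step 3: elementwise product zeroes the k smallest entries per column
Xz = X .* mask;                      % Xz = [4 0; 0 30; 3 0; 0 40]
```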
Example:
>> X = rand(5,3)
X =
0.088188645571510 0.907109055220371 0.805984932289666
0.683710335821638 0.860456667336885 0.868488116302772
0.120400876857723 0.338451384118250 0.669646599875533
0.010699003144174 0.027158829325862 0.807778862315076
0.557268230074914 0.800859355130033 0.897498282302820
>> k=2;
>> X = X.*bsxfun(@ge, X, quantile(X,k/size(X,1)))
X =
0 0.907109055220371 0
0.683710335821638 0.860456667336885 0.868488116302772
0.120400876857723 0 0
0 0 0.807778862315076
0.557268230074914 0.800859355130033 0.897498282302820