Find whether each row of a matrix is repeated in MATLAB
I have a matrix A in MATLAB of dimension MxN. I want to create a vector B of dimension Mx1 where B(i)=1 if A(i,:) appears only once in A (it is never repeated) and 0 otherwise.
For example:
A=[1 2 3;
4 5 6;
1 2 3;
7 8 9];
B=[0;1;0;1];
This code
[vu,vx,vx]=unique(A,'rows'); n=accumarray(vx,1); C=[vu n]
helps to find the number of occurrences of each row. Hence, by adding a loop, for example, I should be able to get B from C. However, in my actual case M is very large (80,000). Is there anything faster that I could use?
The problem with your code is that the second output of unique only returns the index of the first occurrence of each unique row, while the third output returns, for each row of the input, the index of its matching row in the first output. Neither of them alone gives you B directly, but if you combine them you can get what you want.
[~, a, b] = unique(A, 'rows');
B = accumarray(a(b), 1, [size(A, 1), 1]) == 1;
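To see why combining the two outputs works, it may help to trace the intermediate values on the example matrix from the question. This is only a sketch: the names firstIdx and counts are introduced here for illustration and are not part of the original answer.

```matlab
A = [1 2 3;
     4 5 6;
     1 2 3;
     7 8 9];

% a: for each unique row, the index of one occurrence of it in A
% b: for each row of A, the index of its matching unique row
[~, a, b] = unique(A, 'rows');

% firstIdx (illustrative name): for each row of A, the index of the
% representative row in A that equals it
firstIdx = a(b);

% Count how many rows of A point at each representative index
counts = accumarray(firstIdx, 1, [size(A, 1), 1]);

% A row is unrepeated only if exactly one row (itself) points at it
B = counts == 1;
```

For this A, rows 1 and 3 share one representative, so one of them collects a count of 2 and the other a count of 0. Neither equals 1, which gives B = [0;1;0;1] as requested in the question.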
Another alternative is to use the second output of ismember with the 'rows' option to find the indices of the shared rows. Then you can use accumarray to determine how many times each row is repeated and compare the result with 1.
[~, bi] = ismember(A, A, 'rows');
B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
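As a quick sanity check, the two approaches above can be run side by side on random data to confirm that they flag the same rows. This is a minimal sketch: the matrix size, value range, seed, and the names B1 and B2 are arbitrary choices made here, not part of the original answer.

```matlab
rng(0);                    % fixed seed so the check is repeatable
A = randi(10, 50, 2);      % small value range, so some rows repeat

% Approach 1: combine the second and third outputs of unique
[~, a, b] = unique(A, 'rows');
B1 = accumarray(a(b), 1, [size(A, 1), 1]) == 1;

% Approach 2: second output of ismember against A itself
[~, bi] = ismember(A, A, 'rows');
B2 = accumarray(bi, 1, [size(A, 1), 1]) == 1;

% Both approaches should mark exactly the same rows as unrepeated
assert(isequal(B1, B2));
```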
In a simple test, the unique option tends to perform better:
function comparison()
    % Time both options for a range of matrix sizes
    nRows = round(linspace(100, 100000, 30));
    times1 = nan(size(nRows));
    times2 = nan(size(nRows));
    for k = 1:numel(nRows)
        A = randi(10, nRows(k), 4);
        times1(k) = timeit(@()option1(A));
        times2(k) = timeit(@()option2(A));
    end
    % Plot execution time (in ms) against the number of rows
    figure
    p(1) = plot(nRows, times1 * 1000, 'DisplayName', 'unique');
    hold on
    p(2) = plot(nRows, times2 * 1000, 'DisplayName', 'ismember');
    ylabel('Execution time (ms)')
    xlabel('Rows in A')
    legend(p)
end
% Option 1: combine the second and third outputs of unique
function B = option1(A)
    [~, a, b] = unique(A, 'rows');
    B = accumarray(a(b), 1, [size(A, 1), 1]) == 1;
end
% Option 2: second output of ismember against A itself
function B = option2(A)
    [~, bi] = ismember(A, A, 'rows');
    B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
end