Unique rows after removing columns from matrix

Question

Unique rows after removing columns from matrix

Given a binary matrix of columns n

and columns m

. n

and m

maybe up to 2000.

We need to tell if the column can be removed so that the rows of the remaining matrix are unique.

Example for n = 3

and m = 4

:

1010
0001
1001

The answer is yes. We can remove the second column, and the remaining rows ( 110

, 001

, 101

) will be unique.

Example for n = 4

and m = 2

:

The answer is no. No matter which column we choose, line 0

and 0

will remain.

I has a brute-force O (m * m * n) algorithm. I check the uniqueness of the rows by deleting each column.

Do you know a faster algorithm?

+3

language-agnostic algorithm

code test 30 Aug '14 at 11:00

source to share

2 answers

libik · Answer 1 · 2014-08-30T11:19:13+0000

! EDIT: My solution is unfortunately only half way to resolve this, sorry.

Well, I'm sure I can do it in O (m * n) time.

You can create a tree at n * m time. We just go over one line by one and update this structure:

Node{
 int accessed;
 Node nextZero;
 Node nextOne;
}

If you create this tree, you only need to check the last line if it has "zeros" and "ones" equal or greater than two or not.

There is a visual example of what it looks like after processing two numbers.

distinct rows

You just step by line, always start at root.

For example, when you start processing the second line, you start at root. The number on the second line is "101". You take the first number, which is "1", so you go to the nextOne node. Then you get "0", so you get to nextZero. Then you get a "1" which doesn't exist, so you create it.

After all, you are only interested in the "available" number in most depth nodes, if they all have "access" of 1, they are all different, otherwise they are not.

pseudocode

Node{
  int accessed;
  Node nextZero;
  Node nextOne;
}

bool isDistinct(){
  Node root = new Node();
  Node next;
  for (int i=0;i<arr.length;i++){
    Node actual = root;
    for (int j=0;j<arr[i].length;j++){
      if (arr[i][j] == 0){
        next = actual.nextZero;
        if (next == null){
          next = new Node();
          acutal.nextZero = next;
        }
      } else {
        next = actual.nextOne;
        if (next == null){
          next = new Node();
          acutal.nextOne = next;
        }
      }

      actual = next;
      actual.accessed++;
      if ((j == arr[i].length - 1) && (actual >= 2)){
        return false;
      }
    }
  }
  return true;
}

Sorry, it's really just a "midfielder" to do this, I didn't read correctly what exactly I should be doing. But with some thinking, perhaps you can remove the node from the tree and balance it effectively ...

Dejan · Answer 2 · 2014-08-30T12:58:10+0000

Each line represents some number in base 10.

We can calculate all these numbers in O(n*m)

.
We will get an array of a

length n

.
We can create an array b

where the position b[i]

will be the number of times the number i

in the array a

inO(n)
If for some i

b[i]>1

, the answer is No
No, we will try to delete the columns one at a time, which will change the numbers accordingly. For example, if we remove the k

th column, we need to make an array c

that has the same value as the array b

, but without k

th column. To do this, we initialize c[i]=b[i]

if i<2^k

, and if b[i]=1

for i>=2^k

, then we update c[i-2^k]++

. If we get c[i]>1

for some i

, the answer will be No and continue with the next column. Otherwise, the answer is "Yes". This can be done in O(n*m)

.

Edit:

Difficulty for the whole solution then O(n*m)

.

Since the numbers will be large, you can represent the array b

as a sparse array using a dictionary, and for numbers, you can use some library for large numbers. The whole decision must be faster than brute force.

Unique rows after removing columns from matrix

More articles: