Algorithm for distinguishing n images

Suppose you give N bit arrays of length k. each vector v i has the form {1,0,0,1 ... 1} (k components), etc.

now I want to find the minimum set of indices S = {I 0, I am 1..., I am q-1}, where 0 <= i i <= k-1 such that

when v i is evaluated on S, which gives the binary sequence {v i [I 0], v i [I 1] ..} is unique for each v i...

So, for example, if there are 64 vectors {v 0... v 63}, the minimum set of indices | S | can have size 6 if v irather long and quite varied. Since 2 6 = 64.

The app is that with a fixed list of black and white images 0 ... 63 and you know the input would be one of these images, with a large enough image you would only need to check 6 pixels of the input to figure out which pattern.

This problem seems to have to be NP proven already or solved somewhere.

Any hint? Thank you.

+3


source to share


1 answer


I haven't been able to find an NP-hard problem that matches this, although I'm pretty sure it's NP-hard. Hopefully someone can dig it up, as I think I ran into this issue in several different guises. If not, I can try to decrease it later if I get the time.

Each column splits a set of images into two parts; we start with one set containing all the images and our goal is to reach a section in which each part (defined in the section) consists of one image and that uses the fewest possible columns.

One approach is to solve and solve the Minimum Set Cover problem in which the main set consists of all k (k-1) / 2 pairs of images, and for each v_r column, we create a set consisting of all pairs of images that this column distinguishes, i.e. All pairs of images (i, j) are such that v_r [i]! = V_r [j]. Then any cover will distinguish each image from any other image, and the minimum size cover will do so using the smallest possible columns.



The fastest way to solve this set cover problem would probably be to formulate and solve an ILP , but branch and bound will work as well, and can also be easily turned into a (fast, but possibly suboptimal) heuristic.

Anyway, here are some simple pruning rules that you can apply before trying to fix the problem (and after adding each column if you are using B&B) that will often shrink the size of the problem:

  • The column shows the same “distinctive power” if you change all of your 1s to 0 and vice versa. Therefore, "canonicalize" columns by inverting them if their first element is, say, 1.
  • Collapse all the same columns into one column, as any of them is just as good as any other. This can be easily done by first sorting them: then all the same columns form a single continuous block. Doing this after the first contraction will result in more contraction.
  • You can, of course, remove any "permanent columns" - they may not be useful for pattern recognition.
  • If you have a pair of identical lines, then there is no solution: that is, even including all k bits will not allow you to distinguish between the two lines (images).
+2


source







All Articles