Algorithm to differentiate n pictures

2020-07-23 04:52发布

Suppose you are give N bit arrays of length k. meaning each vector vi is of form {1,0,0,1...1} (k components) etc.

now I want to find the minimal index set S = {I0, I1..., Iq-1}, with 0 <= Ii <= k-1 such that

when vi evaluated on S, which gives a binary sequence {vi[I0], vi[I1] ..} is unique for each vi.

So for example if there are 64 vectors {v0...v63}, the minimal index set |S| could have size 6 given vi are pretty long and sufficiently different. Because 26 = 64.

The application is that given a fixed list of black-white pictures, 0...63, and you know the input would be one of those pictures, with picture sufficiently large, you only need to check 6 pixel of the input to figure out which picture it is.

This problem seems should already be proved NP or solved somewhere.

Any hint? thanks.

标签: c++ algorithm
1条回答
甜甜的少女心
2楼-- · 2020-07-23 05:31

I wasn't able to find an NP-hard problem that corresponds to this, though I'm certain it is NP-hard. I hope someone can dig one up, as I think I have run into this problem in a few different guises. If not, I might attempt a reduction later if I get time.

Each column partitions the set of images into two parts; we start with a single set containing all images, and our goal is to reach a partition in which every part (set in the partition) consists of a single image, and which uses the fewest possible columns.

One approach would be to make and solve a Minimum Set Cover problem in which the ground set consists of all k(k-1)/2 pairs of images, and for each column v_r we make a set consisting of all image pairs that this column distinguishes -- i.e. all pairs of images (i, j) such that v_r[i] != v_r[j]. Then any cover will distinguish every image from every other image, and a cover of minimum size will do so using the fewest possible columns.

The fastest way to actually solve this set cover problem is likely to be by formulating and solving an ILP, but branch and bound would also work, and can also be easily turned into a (fast but possibly suboptimal) heuristic.

Either way, here are a few simple reduction rules you can apply before trying to solve the problem (and after each column is added, if you use B&B) that will often reduce the problem size a bit:

  1. A column provides the same "distinguishing power" if you change all its 1s to 0s and vice versa. Therefore "canonicalise" the columns by inverting them if their first element is, say, 1.
  2. Collapse all identical columns into a single column, since any of them is exactly as good as any other. This can easily be done by first sorting them: then all identical columns form a single contiguous block. Doing this after performing the first reduction will result in more reduction.
  3. You can of course delete any "constant columns" -- there is no way they can be of use in distinguishing images.
  4. If you have any pair of identical rows, then there is no solution: that is, even including all k bits would not enable you to distinguish these two rows (images).
查看更多
登录 后发表回答