Binary subset expression

Given a list of 256 numbers in order (0-255), I want to express a subset of 128 numbers from that list. Each number will be unique and not repeated.

What is the most compact way to express this subset?

So far I have a 256-bit array with the bits at the chosen indices set to 1. This obviously requires 256 bits to represent 128 values, but is there any other, more compact way?

Thanks!

+3




2 answers


There are 256! / (128! * (256 - 128)!) unique combinations of 128 items from a set of 256 items when order doesn't matter (see the Wikipedia article on combinations).

If you calculate this number and take the base-2 logarithm, you get about 251.67. This means that you need at least 252 bits to represent a unique selection of 128 items out of 256. Since .NET cannot address individual bits anyway (only whole bytes), 252 bits still occupy 32 bytes, exactly as much as your 256-bit array, so there is no real reason to work out how to do this.
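As a rough check of the figures above, here is a minimal sketch that computes the binomial coefficient exactly with `System.Numerics.BigInteger` and counts the bits needed to number every possible selection. The class and method names (`SubsetBits`, `Binomial`, `BitsNeeded`) are made up for this illustration and are not part of any library.

```csharp
using System;
using System.Numerics;

public static class SubsetBits
{
    // Exact C(n, k). Each intermediate value is itself a binomial
    // coefficient, so the integer division never loses anything.
    public static BigInteger Binomial(int n, int k)
    {
        if (k < 0 || k > n) return BigInteger.Zero;
        BigInteger result = BigInteger.One;
        for (int i = 1; i <= k; i++)
        {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Bits needed to store an index in the range 0 .. C(n, k) - 1.
    public static int BitsNeeded(int n, int k)
    {
        BigInteger largestIndex = Binomial(n, k) - 1;
        int bits = 0;
        while (largestIndex > 0)
        {
            largestIndex >>= 1;
            bits++;
        }
        return bits;
    }

    public static void Main()
    {
        BigInteger count = Binomial(256, 128);
        Console.WriteLine(BigInteger.Log(count, 2.0)); // about 251.67
        Console.WriteLine(BitsNeeded(256, 128));       // 252
    }
}
```

Rounded up to whole bytes, 252 bits is 32 bytes, which is why the plain 256-bit array loses nothing in practice.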



128 is the worst case in this regard, because the number of combinations peaks when you pick exactly half of the items. If you chose 5 elements, or 251, out of 256, the selection could be represented with 34 bits, and it would then be worthwhile to look for such an efficient representation.
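One known way to get such a representation is the combinatorial number system, which maps each k-element subset to a distinct integer between 0 and C(n, k) - 1. The answer does not name a technique, so treat the following as a sketch under that assumption; `SubsetCodec`, `Rank` and `Unrank` are hypothetical names chosen for the example.

```csharp
using System;
using System.Linq;
using System.Numerics;

public static class SubsetCodec
{
    // Exact C(n, k); returns 0 when k is outside 0..n.
    public static BigInteger Binomial(int n, int k)
    {
        if (k < 0 || k > n) return BigInteger.Zero;
        BigInteger result = BigInteger.One;
        for (int i = 1; i <= k; i++)
        {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Encode: sort ascending as c_1 < c_2 < ... < c_k,
    // then sum C(c_i, i) over all positions i.
    public static BigInteger Rank(int[] subset)
    {
        int[] sorted = subset.OrderBy(x => x).ToArray();
        BigInteger rank = BigInteger.Zero;
        for (int i = 0; i < sorted.Length; i++)
        {
            rank += Binomial(sorted[i], i + 1);
        }
        return rank;
    }

    // Decode: from the largest position down, pick the largest c
    // with C(c, i) <= remaining rank, then subtract it.
    public static int[] Unrank(BigInteger rank, int n, int k)
    {
        int[] result = new int[k];
        int c = n - 1;
        for (int i = k; i >= 1; i--)
        {
            while (Binomial(c, i) > rank) c--;
            result[i - 1] = c;
            rank -= Binomial(c, i);
            c--; // the next element must be strictly smaller
        }
        return result;
    }

    public static void Main()
    {
        int[] subset = { 200, 3, 255, 42, 17 };
        BigInteger rank = Rank(subset);          // fits in 34 bits for 5 of 256
        int[] decoded = Unrank(rank, 256, 5);    // ascending order
        Console.WriteLine(rank);
        Console.WriteLine(string.Join(", ", decoded));
    }
}
```

The same two methods work for 128 of 256, but there the rank needs 252 bits, so nothing is gained over the bit array once it is rounded up to bytes.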

0




Since you don't care about the order of the subset and you don't need to restore each item to its position in the original array, this is just a case of creating a random subset of the array, which is similar to drawing cards from a deck.

To take unique elements from an array, you can simply shuffle the original array and then take the first X elements:



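using System;
using System.Linq;

int[] srcArray = Enumerable.Range(0, 256).ToArray();
Random r = new Random();
// Sort by a random key to shuffle, then keep the first 128 elements.
var subset = srcArray.OrderBy(i => r.Next()).Take(128).ToArray();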

      

Note: I am using the above randomization method to keep the example concise. For a more robust approach to shuffling, I recommend the Fisher-Yates algorithm, as described in this post.
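For reference, a Fisher-Yates shuffle can be sketched as below. This is an illustrative version and not the code from the linked post; `ShuffleDemo` and `Shuffle` are names invented for the example.

```csharp
using System;
using System.Linq;

public static class ShuffleDemo
{
    // In-place Fisher-Yates shuffle: walk from the end and swap each
    // slot with a uniformly chosen slot at or before it.
    public static void Shuffle(int[] items, Random rng)
    {
        for (int i = items.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            (items[i], items[j]) = (items[j], items[i]);
        }
    }

    public static void Main()
    {
        int[] srcArray = Enumerable.Range(0, 256).ToArray();
        Shuffle(srcArray, new Random());

        // The first 128 slots of the shuffled array form the subset.
        int[] subset = srcArray.Take(128).ToArray();
        Console.WriteLine(subset.Distinct().Count()); // 128
    }
}
```

Unlike sorting by a random key, this runs in linear time and does not depend on how the sort handles equal keys.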

0

