Calculating and placing all 36 nCr 10 possibilities in a list in Python

I need to display a set of 1200 random combinations of 36 nCr 10. Since there are 254,186,856 combinations of 36 nCr 10, I guess I can't put all of them in the Python list.

How can I solve this problem? Should I be using something other than Python or looking for a different algorithm? (I'm using this right now: http://docs.python.org/library/itertools.html?highlight=itertools.combinations#itertools.combinations )

EDIT: the combinations shouldn't be duplicated, as this won't be an nCr problem. Just thought I'd clarify this.

Here is the code so far ...

def combinations(iterable, r):
# combinations('ABCD', 2) --> AB AC AD BC BD CD
# combinations(range(4), 3) --> 012 013 023 123
pool = tuple(iterable)
n = len(pool)
if r > n:
    return
indices = range(r)
yield tuple(pool[i] for i in indices)
while True:
    for i in reversed(range(r)):
        if indices[i] != i + n - r:
            break
    else:
        return
    indices[i] += 1
    for j in range(i+1, r):
        indices[j] = indices[j-1] + 1
    yield tuple(pool[i] for i in indices)

if __name__ == '__main__':
    teamList = list(combinations(range(36), 10))

      

After that, Python uses 2 gigabytes of my RAM, but never completes the computation.

+3


source to share


6 answers


Try the following implementation.

>>> def nCr(data,r,size):
    result=set()
    while len(result) < size:
        result.add(''.join(random.sample(data,r)))
    return list(result)

      



To create 1200 unique samples 36 C r for data of length 36, you can do something like this

>>> data = string.ascii_letters[:36]
>>> print nCr(data,10,1200)

      

0


source


Don't I think about it?



from random import sample

dataset = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
for i in xrange(1200):
  print sample(dataset,10)

      

+2


source


You cannot use random.sample directly in a combination iterator, but you can use it to generate random indices:

indices = random.sample(xrange(number_of_combinations), 1200)
comb = itertools.combinations(range(36), 10)

prev = 0
for n in sorted(indices):
    print next(itertools.islice(comb, n-prev, None))
    prev = n

      

random.sample will only select each index once, so you don't have to worry about duplicates. There is also no need to generate 254 186 856 indexes to select 1200 of them.

If you have SciPy, you can easily calculate the number of combinations using scipy.misc.comb , which is a fast and efficient way to calculate it:

number_of_combinations = scipy.misc.comb(36, 10, exact=True)

      

Otherwise, you can use this snippet :

def number_of_combinations(n, k):
    if k < 0 or k > n:
        return 0
    if k > n - k: # take advantage of symmetry
        k = n - k
    c = 1
    for i in range(k):
        c = c * (n - (k - (i+1)))
        c = c // (i+1)
    return c

      

+1


source


You can use a technique in which you repeat the entire sequence, but slightly increase the likelihood that the item will be selected for each step. This will give an unbiased random sample, with 1200 (depending on which probability you choose).

This will force you to generate more or less the entire sequence, but you don't have to put items in memory. See for example: http://www.javamex.com/tutorials/random_numbers/random_sample.shtml

0


source


I think this will give you the answer you are looking for:

I iterate over a generator containing all possible combinations and randomly choose n from them.

from itertools import combinations
import random as rn
from math import factorial
import time

def choose_combos(n,r,n_chosen):

    total_combs = factorial(n)/(factorial(n-r)*factorial(r))

    combos = combinations(range(n),r)

    chosen_indexes = rn.sample(xrange(total_combs),n_chosen)

    random_combos = []

    for i in xrange(total_combs):
        ele = combos.next()
        if i in chosen_indexes:
            random_combos.append(ele)

    return random_combos

start_time = time.time()

print choose_combos(36,10,5) #you would have to use 1200 instead of 5

print 'time taken', time.time() - start_time

      

It will take a while, but it's not crazy:

[(0, 6, 10, 13, 22, 27, 28, 29, 30, 35), (3, 4, 11, 12, 13, 16, 19, 26, 31, 33), (3, 7, 8, 9, 11, 19, 20, 23, 28, 30), (3, 10, 15, 19, 23, 24, 29, 30, 33, 35), (7, 14, 16, 20, 22, 25, 29, 30, 33, 35)]

time taken 111.286000013

      

0


source


Seems like applying my recent answer to another question

<sub> psub> From <sub> ksub>:

def choose(n, k):
    '''Returns the number of ways to choose k items from n items'''
    reflect = n - k
    if k > reflect:
        if k > n:
            return 0
        k = reflect
    if k == 0:
        return 1
    for nMinusIPlus1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
        n = n * nMinusIPlus1 // i
    return n

      

Combination by index in a lexicographically sorted list of all combinations n C k:

def iterCombination(index, n, k):
    '''Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in 
    descending order. Yields in descending order.
    '''
    nCk = 1
    for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
        nCk *= nMinusI
        nCk //= iPlus1
    curIndex = nCk
    for k in range(k, 0, -1):
        nCk *= k
        nCk //= n
        while curIndex - nCk > index:
            curIndex -= nCk
            nCk *= (n - k)
            nCk -= nCk % k
            n -= 1
            nCk //= n
        n -= 1
        yield n

      

Random sampling of combinations without creating a list of combinations:

import random

def iterRandomSampleOfCombinations(items, combinationSize, sampleSize):
    '''Yields a random sample of sampleSize combinations, each composed of
    combinationSize elements chosen from items.
    The sample is as per random.sample, thus any sub-slice will also be a valid
    random sample.
    Each combination will be a reverse ordered list of items - one could reverse
    them or shuffle them post yield if need be.
    '''
    n = len(items)
    if n < 1 or combinationSize < 1 or combinationSize > n:
        return
    nCk = choose(n, combinationSize)
    if sampleSize > nCk:
        return
    for sample in random.sample(range(nCk), sampleSize):
        yield [items[i] for i in iterCombination(sample, n, combinationSize)]

      

Example: a sample of 29 combinations of length 10, selected from 36 elements [AZ] + [aj]:

>>> items = [chr(i) for i in range(65, 91)] + [chr(i) for i in range(97, 107)]
>>> len(items)
36
>>> for combination in combinations.iterRandomSampleOfCombinations(items, 10, 29):
...     sampledCombination
...
['i', 'e', 'b', 'Z', 'U', 'Q', 'N', 'M', 'H', 'A']
['j', 'i', 'h', 'g', 'f', 'Z', 'P', 'I', 'G', 'E']
['e', 'a', 'Z', 'U', 'Q', 'L', 'G', 'F', 'C', 'B']
['i', 'h', 'f', 'Y', 'X', 'W', 'V', 'P', 'I', 'H']
['g', 'Y', 'V', 'S', 'R', 'N', 'M', 'L', 'K', 'I']
['j', 'i', 'f', 'e', 'd', 'b', 'Z', 'X', 'W', 'L']
['g', 'f', 'e', 'Z', 'T', 'S', 'P', 'L', 'J', 'E']
['d', 'c', 'Z', 'X', 'V', 'U', 'S', 'I', 'H', 'C']
['f', 'e', 'Y', 'U', 'I', 'H', 'D', 'C', 'B', 'A']
['j', 'd', 'b', 'W', 'Q', 'P', 'N', 'M', 'F', 'B']
['j', 'a', 'V', 'S', 'P', 'N', 'L', 'J', 'H', 'G']
['g', 'f', 'e', 'a', 'W', 'V', 'O', 'N', 'J', 'D']
['a', 'Z', 'Y', 'W', 'Q', 'O', 'N', 'F', 'B', 'A']
['i', 'g', 'a', 'X', 'V', 'S', 'Q', 'P', 'H', 'D']
['c', 'b', 'a', 'T', 'P', 'O', 'M', 'I', 'D', 'B']
['i', 'f', 'b', 'Y', 'X', 'W', 'V', 'U', 'M', 'A']
['j', 'd', 'U', 'T', 'S', 'K', 'G', 'F', 'C', 'B']
['c', 'Z', 'X', 'U', 'T', 'S', 'O', 'M', 'F', 'D']
['g', 'f', 'X', 'S', 'P', 'M', 'F', 'D', 'C', 'B']
['f', 'Y', 'W', 'T', 'P', 'M', 'J', 'H', 'D', 'C']
['h', 'b', 'Y', 'X', 'W', 'Q', 'K', 'F', 'C', 'B']
['j', 'g', 'Z', 'Y', 'T', 'O', 'L', 'G', 'E', 'D']
['h', 'Z', 'Y', 'S', 'R', 'Q', 'H', 'G', 'F', 'E']
['i', 'c', 'X', 'V', 'R', 'P', 'N', 'L', 'J', 'E']
['f', 'b', 'Z', 'Y', 'W', 'V', 'Q', 'N', 'G', 'D']
['f', 'd', 'c', 'b', 'V', 'T', 'S', 'R', 'Q', 'B']
['i', 'd', 'W', 'U', 'S', 'O', 'N', 'M', 'K', 'G']
['g', 'f', 'a', 'W', 'V', 'T', 'S', 'R', 'H', 'B']
['g', 'f', 'a', 'W', 'T', 'S', 'O', 'L', 'K', 'G']

      

Note: the combinations themselves are (inverse) ordered, but the pattern is random (just like any slice of it, since it was random.sample

used for the object range

), if random order is also required, just do random.shuffle(combination)

.

It's pretty fast (though maybe a different order of combinations could be faster?):

>>> samples = 1000
>>> sampleSize = 1200
>>> combinationSize = 10
>>> len(items)
36
>>> while 1:
...     start = time.clock()
...     for i in range(samples):
...             for combination in iterRandomSampleOfCombinations(items, combinationSize, sampleSize):
...                     pass
...     end = time.clock()
...     print("{0} seconds per length {1} sample of {2}C{3}".format((end - start)/samples, sampleSize, len(items), combinationSize))
...     break
...
0.03162827446371375 seconds per length 1200 sample of 36C10

      

0


source







All Articles