Multidimensional array for random.choice in NumPy
I have a table and I need to use random.choice to sample from it with given probabilities, for example (taken from the docs):
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'],
dtype='|S11')
If I have a 3D array instead of aa_milne_arr, I get stuck. I need to draw random elements with a different probability for each of 3 arrays, but the same probability for the elements within each array. For example:
>>> arr0 = ['red', 'green', 'blue']
>>> arr1 = ['light', 'wind', 'sky']
>>> arr3 = ['chicken', 'wolf', 'dog']
>>> p = [0.5, 0.1, 0.4]
And I need the same probability for all elements in arr0 (0.5), arr1 (0.1) and arr3 (0.4), so that the result is some element of arr0 with probability 0.5, and so on.
Is there an elegant way to do this?
Divide the values in p by the lengths of the corresponding arrays, then repeat each result as many times as its array is long. Then draw from the concatenated array using these new probabilities:
arr = [arr0, arr1, arr3]
lens = [len(a) for a in arr]
p = [.5, .1, .4]
new_arr = np.concatenate(arr)
new_p = np.repeat(np.divide(p, lens), lens)
np.random.choice(new_arr, p=new_p)
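To sanity-check the idea, here is a minimal sketch that wraps the steps above in a helper and compares the empirical group frequencies against p. The helper name `grouped_choice`, the seed and the sample size are assumptions made for illustration, not part of the answer, and the sketch uses NumPy's newer `default_rng` generator instead of the legacy `np.random.choice`.

```python
import numpy as np

def grouped_choice(groups, p, size=None, rng=None):
    """Hypothetical helper: group i is picked with probability p[i],
    and the elements inside a group are equally likely."""
    rng = np.random.default_rng() if rng is None else rng
    lens = [len(g) for g in groups]
    flat = np.concatenate(groups)
    # Every element of group i gets probability p[i] / len(groups[i]).
    flat_p = np.repeat(np.divide(p, lens), lens)
    return rng.choice(flat, size=size, p=flat_p)

arr0 = ['red', 'green', 'blue']
arr1 = ['light', 'wind', 'sky']
arr3 = ['chicken', 'wolf', 'dog']
p = [0.5, 0.1, 0.4]

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
draws = grouped_choice([arr0, arr1, arr3], p, size=100_000, rng=rng)

# Fraction of draws that landed in each group; these should be close to p.
freqs = [np.isin(draws, g).mean() for g in (arr0, arr1, arr3)]
print(freqs)
```

Because the per-element probabilities are p[i] / len(group i), the groups do not need to have equal lengths for this to work.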
Here's what I came up with. It takes either a vector of probabilities or a matrix in which the weights are organized in columns. The weights are normalized so that each column sums to 1.
import numpy as np

def choice_vect(source, weights):
    # Draw N choices, each picked among K options
    # source: K x N ndarray
    # weights: K x N ndarray or K vector of probabilities
    weights = np.atleast_2d(weights)
    source = np.atleast_2d(source)
    N = source.shape[1]
    if weights.shape[0] == 1:
        # A single weight vector: use it for every column
        weights = np.tile(weights.transpose(), (1, N))
    # Normalized cumulative weights per column
    cum_weights = weights.cumsum(axis=0) / np.sum(weights, axis=0)
    unif_draws = np.random.rand(1, N)
    choices = (unif_draws < cum_weights)
    # Keep only the first True in each column
    bool_indices = choices > np.vstack((np.zeros((1, N), dtype='bool'), choices))[0:-1, :]
    # Boolean masks are read in row-major order, so index the transposes
    # to return the choices in column order
    return source.T[bool_indices.T]
It avoids loops and works like a vectorized version of random.choice.
Then you can use it like this:
source = [[1,2],[3,4],[5,6]]
weights = [0.5, 0.4, 0.1]
choice_vect(source,weights)
>> array([3, 2])
weights = [[0.5,0.1],[0.4,0.4],[0.1,0.5]]
choice_vect(source,weights)
>> array([1, 4])
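For comparison, the same per-column inverse-CDF idea can be sketched more compactly with `argmax`, which returns the first row where the cumulative weight exceeds the uniform draw. This is only an illustrative variant under my own assumptions: the name `choice_columns` and the use of `default_rng` are not from the answer above.

```python
import numpy as np

def choice_columns(source, weights, rng=None):
    """Hypothetical helper: for each column j of `source` (K x N), pick one
    row with probability proportional to weights[:, j]."""
    rng = np.random.default_rng() if rng is None else rng
    source = np.asarray(source)
    weights = np.asarray(weights, dtype=float)
    K, N = source.shape
    if weights.ndim == 1:
        # Same weight vector for every column.
        weights = np.broadcast_to(weights[:, None], (K, N))
    cdf = np.cumsum(weights, axis=0)
    cdf = cdf / cdf[-1]              # normalize so each column ends at 1
    u = rng.random(N)                # one uniform draw in [0, 1) per column
    # argmax on a boolean array gives the index of the first True per column.
    rows = (u < cdf).argmax(axis=0)
    return source[rows, np.arange(N)]

source = [[1, 2], [3, 4], [5, 6]]
# Degenerate weights make the result deterministic: column 0 must pick
# row 0 and column 1 must pick row 2.
weights = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
result = choice_columns(source, weights)
print(result)  # [1 6]
```

Indexing with `source[rows, np.arange(N)]` returns the picks in column order, one per column.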