Using random.choice() with preference and uniqueness
I have a list:
decisions = ['yes', 'no', 'unknown']
I am writing to a file using this list:
for x in range(0, 100):
    file.write(random.choice(decisions))
What would be the most efficient way to ensure that 70% of the written values were something like "unknown"?
I want some randomness, but I also want 70 of the values written to the file to be of a specific type. I plan on getting this percentage from the user so that it can change on every run.
If I had another, much larger list and wanted to enforce uniqueness (no duplicate values, but still randomly ordered), what would be the best method?
If you are using NumPy, this is pretty straightforward to implement:
np.random.choice(['yes', 'no', 'unknown'], p=[0.15, 0.15, 0.7])
The second array, p, must sum to one; each entry gives the probability that the corresponding entry in the first array will be selected.
With the values above, each draw returns "unknown" with 70% probability. Over 100 draws that gives about 70 "unknown" entries on average, not exactly 70.
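If NumPy is not available, the same weighted draw can be sketched with the standard library's random.choices. The helper name weighted_decisions, its seed parameter, and the even split of the remaining probability between "yes" and "no" are assumptions made for illustration, not part of the original answer.

```python
import random

def weighted_decisions(n, unknown_pct, seed=None):
    """Draw n values; each one is 'unknown' with probability unknown_pct / 100."""
    rng = random.Random(seed)  # seed is optional and only makes runs repeatable
    # Assumption: the remaining probability is split evenly between 'yes' and 'no'.
    rest = (100 - unknown_pct) / 2
    return rng.choices(['yes', 'no', 'unknown'],
                       weights=[rest, rest, unknown_pct],
                       k=n)

values = weighted_decisions(100, 70)
```

The weights are relative, so they do not need to sum to one here. The count of "unknown" in values will vary from run to run around 70.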
If you want 100 values with exactly 70 "unknown" entries and 30 that are "yes" or "no":
hundred_choices = ['unknown']*70 + [random.choice(['yes', 'no']) for _ in range(30)]
... and then shuffle hundred_choices with random.shuffle.
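Since the question mentions taking the percentage from the user, the exact-count approach could be wrapped in a small function. This is only a sketch: the name exact_decisions and the choice to round the count to the nearest integer are assumptions.

```python
import random

def exact_decisions(n, unknown_pct):
    """Return n shuffled values, a fixed share of which are 'unknown'."""
    # Assumption: round to the nearest whole count when n * pct is not an integer.
    n_unknown = round(n * unknown_pct / 100)
    values = ['unknown'] * n_unknown
    # Fill the rest with random 'yes' / 'no' picks.
    values += [random.choice(['yes', 'no']) for _ in range(n - n_unknown)]
    random.shuffle(values)  # shuffles in place
    return values

decisions = exact_decisions(100, 70)
```

Each element of decisions can then be written out in the loop from the question, for example with file.write(value + '\n').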
To get exactly a certain percentage of the written values, build a list that contains each item the number of times you need, then shuffle it with random.shuffle():
choices = ['unknown']*70 + ['yes']*15 + ['no']*15
random.shuffle(choices)
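For the second part of the question (a larger list with no duplicates, in random order), one possible approach is random.sample, which picks items without replacement. The sketch below assumes the values are hashable; the helper name unique_random_order is made up for illustration.

```python
import random

def unique_random_order(values, k=None):
    """Return up to k distinct values from `values` in random order."""
    # dict.fromkeys drops duplicates while keeping first occurrences.
    unique = list(dict.fromkeys(values))
    if k is None:
        k = len(unique)  # default: return every distinct value
    # random.sample draws without replacement, so nothing repeats.
    return random.sample(unique, k)

picked = unique_random_order(['a', 'b', 'a', 'c', 'b'])
```

If the list is already known to be free of duplicates, random.shuffle on a copy of it gives the same effect.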