R: calculate the probability of drawing at least 1 red marble
Suppose my population has n marbles and only 1% of them are red. In a sample of 30 draws, what is the probability that I will draw at least 1 red marble?
I know P (at least 1 red marble) = 1 - P (no red marble)
I wrote a function in R
pMarble = function(n){
  1 - (choose(n - ceiling(0.01*n), 30) / choose(n, 30))
}
The function takes 1 parameter, the number of marbles in the population, and I use sapply to iterate over different values of n:
n = 100:1000
toplot = sapply(n, pMarble)
plot(n, toplot)
Why is the plot discontinuous? I expected a continuous, decreasing function. Since the total number of marbles is increasing while I only draw 30, shouldn't the probability of drawing at least 1 red marble (present in the population at a frequency of 1%) decrease monotonically? Why do I see gaps?
You are correct that Pr(at least 1 red marble) = 1-Pr(no red marbles). For the binomial, the individual marbles within a draw are independent, so the probability of no red marble in n draws of 30 marbles is the same as the probability of no red marble in one draw of 30n marbles. Here n counts the 30-marble draws, not the population size, so we have 1-(1-p)^(30n).
p <- 0.01
par(las=1,bty="l") ## cosmetic
curve(1-(1-p)^(30*x),from=0,to=100,
xlab="Number of 30-marble draws",ylab="prob(>0 marbles)")
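The identity behind this curve can also be checked exactly rather than by simulation. This is a minimal sketch, assuming dbinom() from base R's stats package; the value n_draws = 3 and the variable names are chosen here only for illustration.

```r
p <- 0.01
n_draws <- 3  # illustrative value, not from the question

# Pr(no red marble) across n_draws independent 30-marble draws
p_none_separate <- dbinom(0, size = 30, prob = p)^n_draws

# Pr(no red marble) in a single draw of 30 * n_draws marbles
p_none_pooled <- dbinom(0, size = 30 * n_draws, prob = p)

# Both routes should agree with each other and with the closed form
all.equal(p_none_separate, p_none_pooled)
all.equal(1 - p_none_pooled, 1 - (1 - p)^(30 * n_draws))
```

Both comparisons rely on all.equal(), which tolerates floating-point rounding, rather than on exact equality with ==.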
Test this empirically for one case:
(1-(1-p)^(30*3)) ## 3 draws, 0.595
set.seed(101)
mean(replicate(100000,
any(rbinom(3,prob=0.01,size=30)>0)))
## 0.59717
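As for the gaps in the plot from the question: the formula there is a hypergeometric probability (drawing without replacement from a finite population), and the number of red marbles, ceiling(0.01*n), is a step function of n. The following is a minimal sketch, assuming that this rounding is what produces the jumps; nRed and pHyper are names introduced here for illustration, and dhyper() is from base R's stats package.

```r
pMarble <- function(n) {
  1 - choose(n - ceiling(0.01 * n), 30) / choose(n, 30)
}

n <- 100:1000
nRed <- ceiling(0.01 * n)  # number of red marbles for each population size

# Same probability via the hypergeometric distribution:
# 0 red marbles among 30 drawn from nRed red and n - nRed non-red marbles
pHyper <- 1 - dhyper(0, m = nRed, n = n - nRed, k = 30)
all.equal(sapply(n, pMarble), pHyper)

# Population sizes at which the red count increases by one
n[-1][diff(nRed) == 1]

# Population sizes at which the probability increases
n[-1][diff(pHyper) > 0]
```

While the red count stays fixed, adding non-red marbles lowers the probability; each time the count steps up by one, the probability jumps back up, which gives the plot its sawtooth shape.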