How do I test hypotheses in Python?
1 answer
The SciPy package has a whole module for statistics, scipy.stats, which includes many hypothesis tests and built-in probability distributions.
For example, you can check if a random sample is normally distributed using the Kolmogorov-Smirnov test:
from scipy.stats import norm, pareto, kstest

n = 1000
# generate a normally distributed random sample
sample_norm = norm.rvs(size=n)
# sample from some other distribution for comparison
sample_pareto = pareto.rvs(1.0, size=n)

# test whether sample_norm is normally distributed (true hypothesis)
d_norm, p_norm = kstest(sample_norm, norm.cdf)
# test whether sample_pareto is normally distributed (false hypothesis)
d_pareto, p_pareto = kstest(sample_pareto, norm.cdf)

print('Statistic values: %.4f, %.4f' % (d_norm, d_pareto))
print('P-values: %.4f, %.4f' % (p_norm, p_pareto))
As you can see, kstest returns the test statistic and the p-value. norm.cdf denotes the cumulative distribution function of a normal random variable (with the default parameters, the standard normal with mean 0 and standard deviation 1).
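To turn the p-value returned by kstest into a decision, it is usually compared against a significance level chosen in advance. The sketch below is only an illustration: the level alpha = 0.05, the fixed seed, and the names samples, results and decisions are my own assumptions, not part of SciPy or of the example above.

```python
import numpy as np
from scipy.stats import norm, pareto, kstest

rng = np.random.default_rng(0)  # fixed seed (assumption) so reruns give the same samples
alpha = 0.05                    # conventional significance level (assumption)

samples = {
    "normal": norm.rvs(size=1000, random_state=rng),
    "pareto": pareto.rvs(1.0, size=1000, random_state=rng),
}

results = {}
decisions = {}
for name, sample in samples.items():
    # H0: the sample comes from a standard normal distribution
    statistic, p_value = kstest(sample, norm.cdf)
    results[name] = (statistic, p_value)
    decisions[name] = bool(p_value < alpha)  # True means "reject H0"
    verdict = "reject H0" if decisions[name] else "fail to reject H0"
    print(f"{name}: D = {statistic:.4f}, p = {p_value:.4f} -> {verdict}")
```

A small p-value (below alpha) means the sample is unlikely under the hypothesized distribution, so the hypothesis is rejected. A large p-value does not prove the hypothesis; it only means the test found no evidence against it.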