How to programmatically compute discrete probabilities

I am using EnumeratedIntegerDistribution to generate selections from my keyset.

How can I programmatically compute an array of discrete probabilities? For example, I might want an approximate "normal" or Zipf distribution.

    int[] keys = keyDomain(domainMin, domainMax);
    double[] discreteProbabilities = new double[] { ?, ?, ?, ?, .... };

    EnumeratedIntegerDistribution distribution = new EnumeratedIntegerDistribution(keys, discreteProbabilities);

    int numSamples = 100;
    int[] samples = distribution.sample(numSamples);

      



1 answer


If your distribution is genuinely discrete and defined over the integers of your range (like a Poisson distribution), filling the discreteProbabilities array is straightforward: use whatever formula gives the probability of each integer value in your range. Because you are truncating the distribution to a limited range, those values will generally not add up to 1, so divide each one by their sum to get a proper distribution over your range.
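As a minimal sketch of the discrete case, the following computes Zipf-style weights and normalizes them. The class name `ZipfProbabilities`, the method `zipf`, and the `exponent` parameter are hypothetical names chosen for illustration, and it assumes the rank of a key is simply its position in the keys array (first key is the most likely).

```java
final class ZipfProbabilities {

    /**
     * Returns n probabilities proportional to 1 / rank^exponent for
     * rank = 1..n, normalized so they sum to 1.
     * Assumption: rank i + 1 corresponds to keys[i].
     */
    static double[] zipf(int n, double exponent) {
        double[] p = new double[n];
        double sum = 0.0;
        // Unnormalized weight for each rank.
        for (int i = 0; i < n; i++) {
            p[i] = 1.0 / Math.pow(i + 1, exponent);
            sum += p[i];
        }
        // Divide by the sum so the truncated distribution sums to 1.
        for (int i = 0; i < n; i++) {
            p[i] /= sum;
        }
        return p;
    }
}
```

The result of something like `ZipfProbabilities.zipf(keys.length, 1.0)` could then be passed as the discreteProbabilities argument in the question's code.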



However, if your distribution is "continuous", that is, the samples can be any real (floating-point) number, whether inside your range or not, then things are more complicated. You must decide how to convert this distribution into one defined on the integers of your range. One way is to simply evaluate the probability density function (for a standard normal distribution, essentially exp(-x^2/2)) at each integer value, and then divide by the sum over the integer range. However, this may not be very realistic if you assume, for example, that an integer value is obtained by rounding a continuous sample to the nearest integer. To model that, compute the integral of the continuous probability density between n - 0.5 and n + 0.5 for every integer n in your range (using numerical integration if you don't have a formula for the antiderivative). That integral is the probability for the integer n and, as in the discrete case, you divide by the sum over the integer range so that your probabilities sum to 1.
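The rounding approach can be sketched as follows for a normal distribution, integrating the density over [n - 0.5, n + 0.5] with composite Simpson's rule and then normalizing. The class `DiscretizedNormal`, its method names, the `mean` and `sd` parameters, and the choice of 20 Simpson intervals are all assumptions made for illustration; a library CDF could replace the numerical integration.

```java
final class DiscretizedNormal {

    /** Normal probability density with the given mean and standard deviation. */
    static double pdf(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Composite Simpson's rule on [a, b]; intervals must be even. */
    static double integrate(double a, double b, double mean, double sd, int intervals) {
        double h = (b - a) / intervals;
        double sum = pdf(a, mean, sd) + pdf(b, mean, sd);
        for (int i = 1; i < intervals; i++) {
            // Odd interior points get weight 4, even ones weight 2.
            sum += (i % 2 == 1 ? 4.0 : 2.0) * pdf(a + i * h, mean, sd);
        }
        return sum * h / 3.0;
    }

    /**
     * Probability for each integer key n is the integral of the density
     * over [n - 0.5, n + 0.5], divided by the total over all keys.
     * Note: if the keys lie so far from the mean that every integral
     * underflows to 0, the division yields NaN.
     */
    static double[] probabilities(int[] keys, double mean, double sd) {
        double[] p = new double[keys.length];
        double total = 0.0;
        for (int i = 0; i < keys.length; i++) {
            p[i] = integrate(keys[i] - 0.5, keys[i] + 0.5, mean, sd, 20);
            total += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= total;
        }
        return p;
    }
}
```

Here the mean and standard deviation are free choices; centering the mean on the middle of the key range and picking sd as a fraction of its width is one plausible option, not something the question specifies.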
