Apache Commons Math: independent gamma distributions

Hey. I noticed some strange behavior in the Apache Commons Math library (version 2.2), specifically in the class org.apache.commons.math.distribution.GammaDistributionImpl, although I think it probably applies to other distributions as well.

I wanted to take samples from different gamma distributions like this:

public static final double[] gammaSamples(final double[] shapeParameters)
{
    double[] samples = new double[shapeParameters.length];
    for (int i = 0; i < shapeParameters.length; i++)
    {
        GammaDistributionImpl gd = new GammaDistributionImpl(shapeParameters[i], 1.0d);
        try
        {
            samples[i] = gd.sample();
        }
        catch (MathException e)
        {
            e.printStackTrace();
        }
    }
    return samples;
}

      

However, when I run the code, I find all the samples are suspiciously similar, i.e. given

public static void main(String[] args)
{
    System.out.println(Arrays.toString(gammaSamples(new double[] { 2.0d, 2.0d, 2.0d})));
}

      

Some example outputs:

[0.8732612631078758, 0.860967116242789, 0.8676088095186796]
[0.6099133517568643, 0.5960661621756747, 0.5960661621756747]
[2.1266766239021364, 2.209383544840242, 2.209383544840242]
[0.4292184700011395, 0.42083613304362544, 0.42083613304362544]

      

I think the problem is that the default random number generator uses the same or very similar seeds for each distribution. I tested this as follows:

public static final double[] gammaSamples(final double[] shapeParameters, final Random random)
{
    double[] samples = new double[shapeParameters.length];
    for (int i = 0; i < shapeParameters.length; i++)
    {
        GammaDistributionImpl gd = new GammaDistributionImpl(shapeParameters[i], 1.0d);
        gd.reseedRandomGenerator(random.nextLong());
        try
        {
            samples[i] = gd.sample();
        }
        catch (MathException e)
        {
            e.printStackTrace();
        }
    }
    return samples;
}

      

This seems to fix the problem, i.e. given

public static void main(String[] args)
{
    System.out.println(Arrays.toString(gammaSamples(new double[] { 2.0d, 2.0d, 2.0d }, new Random())));
}

      

Some example outputs:

[2.7506981228470084, 0.49600951917542335, 6.841476090550152]
[1.7571444623500108, 1.941865982739116, 0.2611420777612158]
[6.043421570871683, 0.8852269293415297, 0.6921033738466775]
[1.3859078943455487, 0.8515111736461752, 3.690127105402944]

      

My question is:

What's happening? Is this a bug, or is Apache Commons Math designed to behave this way?

It seems odd to me that if I create separate distribution objects, I have to worry about which seeds are given to them and make sure they are different enough.

Another slight annoyance is that I cannot pass my own Random object to these distributions; they only allow the seed to be replaced through the reseedRandomGenerator(long seed) method. Being able to pass in my own Random object would be quite useful when trying to reproduce results.
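For reproducibility, the reseeding workaround above can be driven by a master generator with a fixed seed. A minimal sketch using only java.util.Random; the class SeedPlan and the method deriveSeeds are hypothetical names, and each returned seed is meant to be passed to reseedRandomGenerator(long) on its own distribution:

```java
import java.util.Arrays;
import java.util.Random;

final class SeedPlan {
    // Hypothetical helper: draws one seed per distribution from a master generator.
    // The same masterSeed always yields the same sequence of seeds.
    static long[] deriveSeeds(long masterSeed, int count) {
        Random master = new Random(masterSeed);
        long[] seeds = new long[count];
        for (int i = 0; i < count; i++) {
            seeds[i] = master.nextLong();
        }
        return seeds;
    }

    public static void main(String[] args) {
        long[] firstRun = deriveSeeds(12345L, 3);
        long[] secondRun = deriveSeeds(12345L, 3);
        // Two runs with the same master seed produce identical per-distribution seeds.
        System.out.println(Arrays.equals(firstRun, secondRun)); // prints true
    }
}
```

Recording the master seed (12345L here is an arbitrary value) is then enough to repeat a run.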

Thanks for any help.


1 answer


Looking at the Javadoc, I saw that there is a method public double[] sample(int sampleSize) throws MathException:

Generates a random sample from the distribution. The default implementation generates the sample by calling sample() in a loop.

Have you tried it?



double[] samples = gd.sample(shapeParameters.length); // gd is a single GammaDistributionImpl

      

Edit: Sorry, I see that you construct a new GammaDistributionImpl each time, with a new alpha parameter. I assume this happens because the initial seeds are taken from the system clock, which has finite resolution, so constructor calls made in quick succession will get the same seed and give the same results. Have a look at this SO question.
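The clock-seeding explanation can be illustrated with the JDK alone. A minimal sketch, assuming the default generator is seeded from System.currentTimeMillis() (an assumption about the library's internals, not something confirmed here); ClockSeedDemo and its methods are hypothetical names:

```java
import java.util.Random;

final class ClockSeedDemo {
    // First value produced by a java.util.Random created with the given seed.
    static double firstDraw(long seed) {
        return new Random(seed).nextDouble();
    }

    // Reads the millisecond clock n times back-to-back and counts how many
    // readings are equal to the reading immediately before them.
    static int countRepeatedClockSeeds(int n) {
        int repeats = 0;
        long previous = System.currentTimeMillis();
        for (int i = 1; i < n; i++) {
            long current = System.currentTimeMillis();
            if (current == previous) {
                repeats++;
            }
            previous = current;
        }
        return repeats;
    }

    public static void main(String[] args) {
        long seed = System.currentTimeMillis();
        // Two generators created with the same seed give the same first value.
        System.out.println(firstDraw(seed) == firstDraw(seed)); // prints true
        // On most machines nearly all consecutive clock readings coincide.
        System.out.println(countRepeatedClockSeeds(1000) + " of 999 consecutive readings repeated");
    }
}
```

If the assumption holds, distributions constructed within the same millisecond start from identical generator state, which is why reseeding each one explicitly makes the samples differ.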

