How to calculate the sum of two normal distributions

I have a value type that represents a Gaussian distribution:

struct Gauss {
    public double mean;
    public double variance;

    public Gauss(double mean, double variance) {
        this.mean = mean;
        this.variance = variance;
    }
}


I would like to perform an integral over a series of these values:

Gauss eulerIntegrate(double dt, Gauss iv, Gauss[] values) {
    Gauss r = iv;
    foreach (Gauss v in values) {
        r += v*dt;
    }
    return r;
}
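For concreteness, here is a hypothetical Python port of the integrator (a sketch, not the original C#), representing a Gaussian as a (mean, variance) pair and using the scaling and addition rules the answers converge on, assuming the values are independent:

```python
# Hypothetical Python sketch of the C# integrator: a Gaussian is a
# (mean, variance) pair. Scaling by dt multiplies the mean by dt and the
# variance by dt**2; adding two independent Gaussians sums the means and
# sums the variances.

def scale(g, d):
    mean, var = g
    return (mean * d, var * d * d)

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def euler_integrate(dt, iv, values):
    r = iv
    for v in values:
        r = add(r, scale(v, dt))
    return r

result = euler_integrate(0.1, (0.0, 1.0), [(1.0, 0.5), (2.0, 0.5)])
# mean:     0 + 1*0.1 + 2*0.1           = 0.3
# variance: 1 + 0.5*0.01 + 0.5*0.01     = 1.01
```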


My question is how to implement the addition operator for these normal distributions.

The scalar multiplication (by dt) seemed simple enough, but it wasn't! Thanks to FOOSHNICK for the help:

public static Gauss operator * (Gauss g, double d) {
    return new Gauss(g.mean * d, g.variance * d * d);
}


However, the addition eludes me. I guess I can just add the means; it's the variance that is causing me problems. Each of these definitions seems "logical" to me.

public static Gauss operator + (Gauss a, Gauss b) {
    double mean = a.mean + b.mean;
    // Is it this? (Yes, it is!)
    return new Gauss(mean, a.variance + b.variance);        
    // Or this? (nope)
    //return new Gauss(mean, Math.Max(a.variance, b.variance));
    // Or how about this? (nope)
    //return new Gauss(mean, (a.variance + b.variance)/2);
}
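A quick Monte Carlo check settles which candidate matches (an illustrative sketch in Python rather than C#): sample two independent normals, add the samples, and look at the empirical variance of the sum.

```python
import random

# Simulate X + Y for independent X ~ N(1, 4) and Y ~ N(2, 9) and see
# which candidate variance rule the sample variance agrees with.
random.seed(1)
va, vb = 4.0, 9.0                 # variances of the two inputs
n = 200_000
sums = [random.gauss(1.0, va ** 0.5) + random.gauss(2.0, vb ** 0.5)
        for _ in range(n)]

m = sum(sums) / n
var = sum((x - m) ** 2 for x in sums) / n
# var lands near va + vb = 13.0, not max(va, vb) = 9.0
# and not (va + vb) / 2 = 6.5
```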


Can anyone help identify a statistically correct, or at least a "reasonable", version of operator + ?

I suppose I could switch the code to use interval arithmetic, but I was hoping to stay in the world of probability and statistics.

+1


source


7 replies


The sum of two normal distributions is itself a normal distribution:

N(mean1, variance1) + N(mean2, variance2) ~ N(mean1 + mean2, variance1 + variance2)

It's all on the Wikipedia page.



Be careful that these are indeed variances and not standard deviations.

// X + Y
public static Gauss operator + (Gauss a, Gauss b) {
    //NOTE: this is valid if X,Y are independent normal random variables
    return new Gauss(a.mean + b.mean, a.variance + b.variance);
}

// X*b
public static Gauss operator * (Gauss a, double b) {
    return new Gauss(a.mean*b, a.variance*b*b);
}
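As an empirical sanity check (sketched in Python rather than the C# above, assuming independence): the samples of X + Y should behave like a single normal with the summed mean and variance, e.g. about 68.3% of the mass should fall within one standard deviation of the mean.

```python
import random, math

# X ~ N(0, 1), Y ~ N(5, 4); then X + Y should be ~ N(5, 5).
random.seed(7)
m1, v1, m2, v2 = 0.0, 1.0, 5.0, 4.0
n = 100_000
z = [random.gauss(m1, math.sqrt(v1)) + random.gauss(m2, math.sqrt(v2))
     for _ in range(n)]

mean, sd = m1 + m2, math.sqrt(v1 + v2)   # predicted N(5, 5)
inside = sum(1 for x in z if abs(x - mean) <= sd) / n
# for a normal distribution, about 68.3% of the mass lies within one sd
```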


+7


source


More precisely:

If a random variable Z is defined as the linear combination of two uncorrelated Gaussian random variables X and Y, then Z is itself a Gaussian random variable, e.g.:

if Z = aX + bY, then mean(Z) = a·mean(X) + b·mean(Y), and variance(Z) = a²·variance(X) + b²·variance(Y).

If the random variables are correlated, you have to account for that. Variance(X) is defined as the expected value E([X − mean(X)]²). Working this out for Z = aX + bY gives:

variance(Z) = a²·variance(X) + b²·variance(Y) + 2ab·covariance(X, Y)
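A simulation confirms the covariance term (an illustrative Python sketch; the variables, correlation, and coefficients are made up for the example):

```python
import random, math

# Z = aX + bY with Var(X) = Var(Y) = 1 and corr(X, Y) = rho.
# Predicted: Var(Z) = a^2 + b^2 + 2ab*rho.
random.seed(3)
a, b, rho = 2.0, -1.0, 0.6
n = 200_000
z = []
for _ in range(n):
    x = random.gauss(0, 1)
    # construct Y correlated with X: corr(X, Y) = rho, Var(Y) = 1
    y = rho * x + math.sqrt(1 - rho * rho) * random.gauss(0, 1)
    z.append(a * x + b * y)

mz = sum(z) / n
var_z = sum((v - mz) ** 2 for v in z) / n
predicted = a * a + b * b + 2 * a * b * rho   # = 4 + 1 - 2.4 = 2.6
```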

If you sum two uncorrelated random variables that do not have Gaussian distributions, then the distribution of the sum is the convolution of the two component distributions.

If you are summing two correlated non-Gaussian random variables, you have to work out the corresponding integrals yourself.

+3


source


Hah, I thought you couldn't add Gaussian distributions together, but you can!

http://mathworld.wolfram.com/NormalSumDistribution.html

Indeed, the mean of the sum is the sum of the individual means, and the variance is the sum of the individual variances.

+2


source


Well, your scalar multiplication is wrong: you have to multiply the variance by the square of d. If you add a constant, just add it to the mean; the variance stays the same. If you add two distributions, add the means and add the variances.
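Both rules are easy to verify on a concrete sample (a deterministic Python sketch; the numbers are arbitrary):

```python
# Scaling a sample by d multiplies its variance by d**2;
# shifting it by a constant c leaves the variance unchanged.
xs = [1.0, 2.0, 4.0, 7.0]          # any sample will do

def var(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / len(v)

d, c = 3.0, 10.0
scaled  = [d * x for x in xs]       # variance multiplied by d**2
shifted = [x + c for x in xs]       # variance unchanged, mean shifted by c
# var(scaled) == d*d*var(xs); var(shifted) == var(xs)
```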

+2


source


Can anyone help identify a statistically correct, or at least a "reasonable", version of operator + ?

Probably not, since adding two distributions can mean different things. Working in reliability and maintainability, my first reaction to the title would be the distribution of a system's MTBF: if the MTBF of each part is normally distributed and the system has no redundancy, you are talking about the distribution of the sum of two normally distributed independent variables, rather than the (logical) sum of the effects of the two normal distributions. Operator overloading very often ends up with surprising semantics. I would leave it as a function and call it "normalSumDistribution" unless your code has a very specific target audience.

+2


source


I'm not sure I like what you are calling "integration" over a series of values. Do you mean it in the calculus sense? Are you trying to do numerical integration? There are other, better ways to do that. Your approach doesn't look correct to me, let alone optimal.

The Gaussian distribution is a nice, smooth function. I think a good quadrature approach or Runge-Kutta would be a much better idea.
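To illustrate the point on the means alone (a Python sketch, not the poster's C#; the integrand is an arbitrary smooth example), compare forward Euler with the composite trapezoidal rule on an integral with a known value:

```python
# Integrate f(t) = t**2 over [0, 1]; the exact value is 1/3.
# Forward Euler is a left Riemann sum; the trapezoidal rule averages
# the endpoints of each step and converges much faster for smooth f.

def f(t):
    return t * t

n, dt = 100, 0.01
euler = sum(f(i * dt) * dt for i in range(n))
trap  = sum((f(i * dt) + f((i + 1) * dt)) / 2 * dt for i in range(n))
# trap is far closer to 1/3 than euler for the same step count
```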

+1


source


I would think it depends on what kind of addition you are doing. If you just want a normal distribution with properties (mean, standard deviation, etc.) equal to the sums over the two distributions, then adding the properties as described in the other answers is fine. This is the assumption used in something like PERT, where if you add up a large number of normal probability distributions, the resulting probability distribution is another normal probability distribution.

The problem arises when the two distributions being added are not alike. Take, for example, combining a probability distribution with a mean of 2 and a standard deviation of 1 and a distribution with a mean of 10 and a standard deviation of 2. If you overlay these two distributions (in the mixture sense), you get a probability distribution with two peaks, one at around 2 and one at around 10, so the result is not a normal distribution. The assumption of adding distributions is only really valid if the original distributions are either very similar, or if you have so many original distributions that the peaks and troughs even out.
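The two-peaked object described here is the mixture of the two densities, as opposed to the distribution of the sum of the two random variables (which stays normal). A hedged Python sketch of the mixture, with the example numbers above:

```python
import math

def npdf(x, mean, sd):
    # density of N(mean, sd**2) at x
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

# 50/50 mixture of N(2, 1) and N(10, 4): the density is bimodal,
# with peaks near x = 2 and x = 10 and a trough in between near x = 6.
def mixture(x):
    return 0.5 * npdf(x, 2, 1) + 0.5 * npdf(x, 10, 2)
# by contrast, the density of the sum, N(12, 5), has a single peak at 12
```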

+1


source

