Normalize histogram of multiple data

Question


I have several arrays that I am plotting as histograms, for example:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)

plt.hist((x, y), density=True)  # density=True replaces the removed normed=True

Of course, this normalizes each array individually, so both histograms have the same peak. I want to normalize them to the total number of elements, so that the histogram of y will be noticeably taller than that of x. Is there a convenient way to do this in matplotlib, or do I have to mess around with numpy? I haven't found anything about this.

Another way of putting it: if I made a cumulative plot of the two arrays instead, they should not each reach 1, but should add up to 1 together.


python numpy matplotlib histogram normalize

Alex 11 Aug 14 at 21:09


1 answer

Davidmh · Accepted Answer · 2014-08-11T21:25:25+0000

Yes, you can calculate the histogram using numpy and renormalize it.

x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)

# density=True scales each histogram so that its area is 1
xhist, xbins = np.histogram(x, density=True)
yhist, ybins = np.histogram(y, density=True)

Now apply your own normalization. For example, if you want x to be normalized to 1 and y scaled proportionally:

yhist *= len(y) / len(x)
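
As a quick check of what this scaling does, the sketch below recomputes both histograms and measures the area under each one. It assumes a recent NumPy (density=True rather than normed=True); the seeded generator rng and the names x_area and y_area are introduced here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for repeatability
x = rng.normal(0, 0.5, 1000)
y = rng.normal(0, 0.5, 100000)

xhist, xbins = np.histogram(x, density=True)
yhist, ybins = np.histogram(y, density=True)

# Scale y by the ratio of sample counts, as in the line above.
yhist *= len(y) / len(x)

# Area under each histogram = sum of (bar height * bin width).
x_area = np.sum(xhist * np.diff(xbins))
y_area = np.sum(yhist * np.diff(ybins))
print(x_area, y_area)
```

The area under x stays at about 1, while the area under y becomes about len(y) / len(x), which is 100 here.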

Now, to plot the histogram:

def plot_histogram(data, edge_bins, **kwargs):
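    # Bin centers: the midpoint of each pair of adjacent edges
    centers = (edge_bins[:-1] + edge_bins[1:]) / 2
    # where='mid' centers each step on its bin
    plt.step(centers, data, where='mid', **kwargs)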

plot_histogram(xhist, xbins, c='b')
plot_histogram(yhist, ybins, c='g')
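
If the goal is instead for the two histograms to add up to 1 together, as the cumulative-plot remark in the question suggests, one possible approach is to weight every sample by one over the total sample count and use shared bin edges. This is a sketch under those assumptions, not necessarily what the asker had in mind; the names edges, widths, total, xdens and ydens are hypothetical, and np.histogram_bin_edges needs NumPy 1.15 or later.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for repeatability
x = rng.normal(0, 0.5, 1000)
y = rng.normal(0, 0.5, 100000)

# Shared bin edges, so the two histograms are directly comparable.
edges = np.histogram_bin_edges(np.concatenate([x, y]), bins=20)
widths = np.diff(edges)

# Weight each sample by 1 / (total number of samples in both arrays).
total = len(x) + len(y)
xhist, _ = np.histogram(x, bins=edges, weights=np.full(len(x), 1.0 / total))
yhist, _ = np.histogram(y, bins=edges, weights=np.full(len(y), 1.0 / total))

# Divide by the bin widths to get densities whose areas add up to 1.
xdens = xhist / widths
ydens = yhist / widths
```

Here xhist.sum() + yhist.sum() is 1 up to floating-point error, and the y histogram is much taller than the x one because y contributes 100 times as many samples.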
