Normalize histograms of multiple datasets

I have several arrays that I am plotting as histograms, for example:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)

plt.hist((x, y), density=True)  # density replaces the removed normed argument

      

Of course, this normalizes each array individually, so both histograms have the same peak. I want to normalize them to the total number of elements, so that the histogram of y will be noticeably taller than that of x. Is there a convenient way to do this in matplotlib, or do I have to mess around with numpy? I haven't found anything about this.

Another way of putting it: if I made a cumulative plot of the two arrays instead, they should not both top out at 1, but should add up to 1.



1 answer


Yes, you can calculate the histogram using numpy and renormalize it.

x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)

# density=True replaces the removed normed argument; each histogram integrates to 1
xhist, xbins = np.histogram(x, density=True)
yhist, ybins = np.histogram(y, density=True)

      

Now apply your normalization. For example, if you want x to be normalized to 1 and y scaled proportionally:

yhist *= len(y) / len(x)
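
As a quick sanity check on this rescaling, the area under each histogram can be computed from the bin heights and widths. This is only a sketch: the seeded generator and the names x_area and y_area are illustrative assumptions, not part of the answer above.

```python
import numpy as np

# Seeded generator so the check is reproducible (an assumption for illustration)
rng = np.random.default_rng(0)
x = rng.normal(0, 0.5, 1000)
y = rng.normal(0, 0.5, 100000)

# Each histogram integrates to 1 on its own
xhist, xbins = np.histogram(x, density=True)
yhist, ybins = np.histogram(y, density=True)

# Rescale y so its area reflects its sample count relative to x
yhist = yhist * (len(y) / len(x))

# Area = sum of (bin height * bin width)
x_area = np.sum(xhist * np.diff(xbins))
y_area = np.sum(yhist * np.diff(ybins))
print(x_area, y_area)
```

With these sample sizes, the area under x stays at 1 and the area under y becomes len(y) / len(x).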

      



Now, to plot the histogram:

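def plot_histogram(data, edge_bins, **kwargs):
    # Bin centers: midpoint of each pair of adjacent edges
    bins = (edge_bins[:-1] + edge_bins[1:]) / 2
    # where='mid' centers each step on its bin center
    plt.step(bins, data, where='mid', **kwargs)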

plot_histogram(xhist, xbins, c='b')
plot_histogram(yhist, ybins, c='g')
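
If the goal is the joint normalization described in the question (the two histograms together adding up to 1), one possible alternative is to give every sample a weight of 1 / (total number of samples) and use shared bin edges. This is a hedged sketch rather than the method used above; the names total, wx, wy, xfrac and yfrac, the seeded generator, and the choice of 20 bins are assumptions for illustration.

```python
import numpy as np

# Seeded generator so the example is reproducible (an assumption for illustration)
rng = np.random.default_rng(0)
x = rng.normal(0, 0.5, 1000)
y = rng.normal(0, 0.5, 100000)

total = len(x) + len(y)

# Shared bin edges spanning both arrays, so the bars are directly comparable
bins = np.histogram_bin_edges(np.concatenate([x, y]), bins=20)

# Every sample contributes 1/total, so all bars together sum to 1
wx = np.full(len(x), 1.0 / total)
wy = np.full(len(y), 1.0 / total)

xfrac, _ = np.histogram(x, bins=bins, weights=wx)
yfrac, _ = np.histogram(y, bins=bins, weights=wy)

print(xfrac.sum(), yfrac.sum(), xfrac.sum() + yfrac.sum())
```

Here the bar heights are fractions of all samples rather than densities, so x contributes len(x) / total and y contributes the rest. plt.hist also accepts a weights argument, so plt.hist((x, y), bins=bins, weights=(wx, wy)) should draw the same bars directly in matplotlib.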

      

