Normalize histograms of multiple data sets
I have several arrays that I am plotting as histograms, for example:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)
plt.hist((x, y), density=True)  # density=True is the current name for the older normed=True
However, this normalizes each array individually, so both histograms end up with the same peak height. I want to normalize them by the total number of elements, so that the histogram of y is noticeably taller than that of x. Is there a convenient way to do this in matplotlib, or do I have to mess around with numpy? I haven't found anything about this.
Another way of saying this: if I made a cumulative plot of the two arrays instead, neither should reach 1 on its own, but together they should add up to 1.
Yes, you can calculate the histogram using numpy and renormalize it.
x = np.random.normal(0, .5, 1000)
y = np.random.normal(0, .5, 100000)
# density=True is the current name for the older normed=True
xhist, xbins = np.histogram(x, density=True)
yhist, ybins = np.histogram(y, density=True)  # histogram of y, not x
Now apply your own normalization. For example, if you want x to be normalized to 1 and y scaled in proportion to its size:
yhist *= len(y) / len(x)
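As a quick check of what this rescaling does, here is a minimal sketch. It assumes shared bin edges for both arrays (the snippet above lets np.histogram pick separate bins for each) and a fixed seed; the names `rng`, `edges`, `widths`, `x_area`, and `y_area` are illustrative and not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
x = rng.normal(0, .5, 1000)
y = rng.normal(0, .5, 100000)

# Assumption: shared bin edges so the two histograms are directly comparable
edges = np.histogram_bin_edges(np.concatenate([x, y]), bins=20)
xhist, _ = np.histogram(x, bins=edges, density=True)
yhist, _ = np.histogram(y, bins=edges, density=True)

# Rescale y relative to x, as in the line above
yhist = yhist * (len(y) / len(x))

# Area under each histogram = sum of (height * bin width)
widths = np.diff(edges)
x_area = np.sum(xhist * widths)  # stays at 1
y_area = np.sum(yhist * widths)  # becomes len(y) / len(x)
```

After the rescaling, the area under the x histogram is still 1, while the area under the y histogram equals len(y) / len(x), which is what makes y look proportionally taller.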
Now, to plot the histogram:
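def plot_histogram(data, edge_bins, **kwargs):
    # Use the bin centers (midpoints of adjacent edges) as x positions
    bins = (edge_bins[:-1] + edge_bins[1:]) / 2
    # where='mid' centers each step on its bin center
    plt.step(bins, data, where='mid', **kwargs)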
plot_histogram(xhist, xbins, c='b')
plot_histogram(yhist, ybins, c='g')
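If the goal is for the two histograms to add up to 1 together, as the question describes, one alternative is to use per-sample weights. The following is only a sketch under stated assumptions, not part of the original answer: every sample gets the weight 1 / (len(x) + len(y)), the bins are shared, and the names `total`, `wx`, `wy`, `edges`, `xfrac`, and `yfrac` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
x = rng.normal(0, .5, 1000)
y = rng.normal(0, .5, 100000)

# Each sample contributes an equal share of the combined total
total = len(x) + len(y)
wx = np.full(len(x), 1.0 / total)
wy = np.full(len(y), 1.0 / total)

# Assumption: shared bin edges covering both arrays
edges = np.histogram_bin_edges(np.concatenate([x, y]), bins=20)
xfrac, _ = np.histogram(x, bins=edges, weights=wx)
yfrac, _ = np.histogram(y, bins=edges, weights=wy)

# xfrac and yfrac hold the fraction of all samples in each bin;
# summed over both arrays they add up to 1
```

The same weights should also work with matplotlib directly, for example plt.hist((x, y), bins=edges, weights=(wx, wy)), so that the bar heights of both data sets together add up to 1.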