Histogram configuration

Question

I have a dataset. I want to compute its histogram and plot it on a log scale. I use the following code:

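import numpy as np
import matplotlib.pyplot as p
y, binEdges = np.histogram(hist_data, bins=200)
bincenters = 0.5*(binEdges[1:]+binEdges[:-1])  # midpoint of each bin
p.plot(bincenters, y, '-')
# Matplotlib 3.3 and later spell this keyword nonpositive='clip'
p.yscale('log', nonposy='clip')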

Result: Figure of bins = 200

However, when I increase the number of bins (i.e. from bins = 200 to bins = 600), the result is: Figure of bins = 600

How can I keep only the line and not the whole spectrum of spikes in each histogram?

+3

python numpy matplotlib histogram

DimKoim Dec 14 '14 at 17:28

2 answers

If some of the bins are empty, you can filter them out using boolean indexing:

p.plot(bincenters[y>0],y[y>0],'-')
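
For completeness, here is a self-contained sketch of the same idea. The synthetic hist_data and the name nonempty are assumptions for illustration only, since the original data is not shown:

```python
import numpy as np
import matplotlib.pyplot as p

# Synthetic stand-in for hist_data (an assumption; the real data is not shown)
rng = np.random.default_rng(0)
hist_data = rng.exponential(scale=1.0, size=2000)

y, binEdges = np.histogram(hist_data, bins=600)
bincenters = 0.5 * (binEdges[1:] + binEdges[:-1])

# Boolean mask: True for every bin that holds at least one sample
nonempty = y > 0

fig, ax = p.subplots()
# Plot only the non-empty bins, so the line never has to drop to zero
ax.plot(bincenters[nonempty], y[nonempty], '-')
ax.set_yscale('log')
```

Call p.show() or fig.savefig(...) afterwards to see the result. Note that the line then joins neighbouring non-empty bins directly, so gaps in the data are drawn as straight segments.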

+1

wwii Dec 14 '14 at 18:29

Hooked · Accepted Answer · 2014-12-14T18:37:26+0000

What you are seeing is that some of the bins are empty, so the plotted line goes f(y) -> 0 -> f(y+delta) -> 0 -> f(y+2*delta). A common trick to get around this is not to use a sharp cutoff as your bin shape (this shape is called the kernel). You can use, for example, kernel density estimation (KDE) to smooth the histogram. In this case, you place a Gaussian centered on each data point, and their sum approximates the underlying probability distribution. You can use scipy to perform a KDE, or the seaborn package, which does this and plots the result automatically. The image from the linked seaborn example gives a good illustration of this:

enter image description here
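
As a rough sketch of the scipy route, assuming scipy.stats.gaussian_kde with its default bandwidth and synthetic data standing in for hist_data (both are my assumptions, not part of the question):

```python
import numpy as np
import matplotlib.pyplot as p
from scipy.stats import gaussian_kde

# Synthetic stand-in for hist_data (an assumption; the real data is not shown)
rng = np.random.default_rng(0)
hist_data = rng.normal(loc=0.0, scale=1.0, size=2000)

# Fit a Gaussian KDE; the default bandwidth follows Scott's rule
kde = gaussian_kde(hist_data)

# Evaluate the smoothed density on a regular grid spanning the data
x = np.linspace(hist_data.min(), hist_data.max(), 500)
density = kde(x)

fig, ax = p.subplots()
ax.plot(x, density, '-')
ax.set_yscale('log')
```

The curve is a probability density (it integrates to roughly 1), not a bin count. To compare it with the histogram counts, you would scale it by the number of samples times the bin width.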

To use matplotlib's hist without the fill, drawing only the outline, pass histtype="step".
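
A minimal sketch of that call, again with synthetic data standing in for hist_data (an assumption):

```python
import numpy as np
import matplotlib.pyplot as p

# Synthetic stand-in for hist_data (an assumption; the real data is not shown)
rng = np.random.default_rng(0)
hist_data = rng.exponential(scale=1.0, size=2000)

fig, ax = p.subplots()
# histtype='step' draws only the unfilled outline; log=True puts the counts on a log axis
counts, edges, patches = ax.hist(hist_data, bins=600, histtype='step', log=True)
```

This only changes the drawing style. Empty bins may still show up as drops in the outline, so it can be combined with fewer bins or with the KDE approach.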
