Export histogram from Python to Excel

I am new to coding and I need help with exporting binned histogram data, or just printing it to the Python shell. Code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import openpyxl

data = pd.read_excel('/Users/user/Desktop/Data/Book1.xlsx')
df = data.hist(bins=40)
plt.xlim([0,1000])
plt.title('Data')
plt.xlabel('Neuron')
plt.ylabel('# of Spikes')
plt.show()

      

So the code makes a histogram that sorts the data into 40 bins; the data range is 0 to 1558.5 or so. What I am trying to do is export the data AFTER binning, because when I write:

writer = pd.ExcelWriter('/Users/user/Desktop/Data/output.xlsx')
df1.to_excel(writer,'Sheet2')
writer.save()

      

it saves the original data, not the data that was plotted with the histogram, and the bins are superimposed. Also, I would like some help on how to change the bins to the ranges 0 to 5, 5 to 10, etc., so that the data is basically binned at an interval of 5 all the way to the end, with the last bin holding the last bit of data. Any help is appreciated and it doesn't have to be specifically pandas. Thanks. BTW, I think what I made was a DataFrame from the imported data; again, I'm a newbie, so I'm not so sure.



1 answer


The line df = data.hist(bins=40) does not actually create a DataFrame of binned data. df ends up holding a numpy ndarray that contains matplotlib.axes._subplots.AxesSubplot objects.

One way to save the binned data is to create the histogram through matplotlib's hist(). Add the following lines immediately after the read_excel line:

import matplotlib.pyplot as plt
counts, bins, bars = plt.hist(data.values, bins=40)
df = pd.DataFrame({'bin_leftedge': bins[:-1], 'count': counts})

      

Then, as pointed out in the comment, be sure to change df1.to_excel(writer,'Sheet2') to df.to_excel(writer,'Sheet2').

bins contains the edges of each bin, so the bins array will have one more element than the counts array. Keep in mind that the code above associates each count with the left edge of its bin and does not preserve the right edge of the last bin.

There might be a better or pandas-idiomatic way to do this, but hopefully this suits your needs.
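As one possible variant of the approach above, here is a minimal sketch that skips the plot and bins the values with np.histogram, keeping both edges of every bin so the last right edge is not lost. The names binned_table and save_table, the column names, and the sample values are made up for illustration and are not from the question. It assumes a single column of numbers; a multi-column DataFrame would be flattened into one series. save_table uses pd.ExcelWriter as a context manager instead of writer.save(), since recent pandas versions no longer provide save().

```python
import numpy as np
import pandas as pd


def binned_table(values, bins=40):
    """Return one row per bin: left edge, right edge, and count."""
    # Flatten so a one-column DataFrame's .values (2-D) is treated as a
    # single series, and drop NaNs, which cannot be placed in a bin.
    flat = np.asarray(values, dtype=float).ravel()
    flat = flat[~np.isnan(flat)]
    counts, edges = np.histogram(flat, bins=bins)
    # edges has len(counts) + 1 entries, so split it into left/right edges.
    return pd.DataFrame({
        "bin_left": edges[:-1],
        "bin_right": edges[1:],
        "count": counts,
    })


def save_table(table, path, sheet_name="Sheet2"):
    """Write the table to an Excel sheet (needs an engine such as openpyxl)."""
    with pd.ExcelWriter(path) as writer:
        table.to_excel(writer, sheet_name=sheet_name, index=False)


# Made-up sample values standing in for the spreadsheet column.
sample = pd.DataFrame({"spikes": [0, 3, 7, 12, 12, 480, 999, 1558.5]})
table = binned_table(sample.values, bins=40)
```

With the question's data this would be called as binned_table(data.values) followed by save_table(table, '/Users/user/Desktop/Data/output.xlsx'), and print(table) shows the same numbers in the Python shell.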




EDIT: Integer bin widths

You can pass a list of bin edges as bins= to either data.hist() or plt.hist(). To create bins 5 wide that start at 0 and include the maximum data value, this should work:

# range() needs an integer stop, so round the maximum up to a whole number first
counts, bins, patches = plt.hist(data.values, bins=range(0, int(np.ceil(data.values.max())) + 5, 5))

      

Explanation: Python's built-in range(start, stop, step) only accepts integers and returns a sequence that includes the left endpoint (start) but excludes the right endpoint (stop). (In mathematical notation, range(start, stop, step) returns evenly spaced integers in the half-open interval [start, stop).) The +5 in the line above ensures that the right edge of the last bin lands at or to the right of the maximum data value.
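The same idea can be sketched without range(), which helps when the data contains non-integer values such as 1558.5. This is only an illustration: fixed_width_edges is a hypothetical helper name, the sample values are made up, and math.ceil is used here in place of the +5 trick to work out how many bins are needed.

```python
import math

import numpy as np


def fixed_width_edges(values, width=5, start=0):
    """Return bin edges start, start + width, ... that cover max(values)."""
    top = float(np.nanmax(np.asarray(values, dtype=float)))
    # Number of whole bins needed so the last right edge is >= top.
    n_bins = max(1, math.ceil((top - start) / width))
    return start + width * np.arange(n_bins + 1)


# Made-up sample values standing in for the spreadsheet column.
sample = [0, 3, 7, 12, 12, 480, 999, 1558.5]
edges = fixed_width_edges(sample, width=5)
counts, _ = np.histogram(sample, bins=edges)
```

The resulting edges array can be passed as bins= to plt.hist() or data.hist() in the same way as the range() call above.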
