Python histogram of split() data

I'm trying to plot a histogram of a text file that contains floats:

import matplotlib.pyplot as plt

c1_file = open('densEst1.txt','r')
c1_data =  c1_file.read().split()    
c1_sum = float(c1_data.__len__())

plt.hist(c1_data)
plt.show()

      

Calling c1_data.__len__() works fine, but hist() throws:

C:\Python27\python.exe "C:/x.py"
Traceback (most recent call last):
  File "C:/x.py", line 7, in <module>
    plt.hist(c1_data)
  File "C:\Python27\lib\site-packages\matplotlib\pyplot.py", line 2958, in hist
    stacked=stacked, data=data, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\__init__.py", line 1812, in inner
    return func(ax, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes\_axes.py", line 5995, in hist
    if len(xi) > 0:
TypeError: len() of unsized object

      



2 answers


The main reason plt.hist fails is that the argument c1_data is a list of strings. When you open a file and read it, the result is a single string containing the contents of the file:

To read a file's contents, call f.read(size), which reads some quantity of data and returns it as a string (in text mode) or bytes object (in binary mode).

Emphasis mine.

When you then split this long string, you get a list of strings:

Return a list of the words in the string, using sep as the delimiter string.
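As a minimal sketch of the point above, the following uses io.StringIO as a stand-in for the opened file (the sample values are made up, not taken from densEst1.txt) and shows that read().split() produces strings, not numbers:

```python
import io

# Stand-in for open('densEst1.txt', 'r'); the sample values are hypothetical.
fake_file = io.StringIO("0.5 1.25\n2.0\n")

# split() with no argument splits on any run of whitespace, including newlines.
tokens = fake_file.read().split()

print(tokens)                                   # ['0.5', '1.25', '2.0']
print(all(isinstance(t, str) for t in tokens))  # True
```

Each element looks like a number but is still a str, which is why len() on the list works while numeric operations on its elements do not.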



However, a list of strings is not a valid value for plt.hist:

>>> import matplotlib.pyplot as plt
>>> plt.hist(['1', '2'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
      1 import matplotlib.pyplot as plt
----> 2 plt.hist(['1', '2'])

C:\...\lib\site-packages\matplotlib\pyplot.py in hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, hold, data, **kwargs)
   3079                       histtype=histtype, align=align, orientation=orientation,
   3080                       rwidth=rwidth, log=log, color=color, label=label,
-> 3081                       stacked=stacked, data=data, **kwargs)
   3082     finally:
   3083         ax._hold = washold

C:\...\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
   1895                     warnings.warn(msg % (label_namer, func.__name__),
   1896                                   RuntimeWarning, stacklevel=2)
-> 1897             return func(ax, *args, **kwargs)
   1898         pre_doc = inner.__doc__
   1899         if pre_doc is None:

C:\...\lib\site-packages\matplotlib\axes\_axes.py in hist(***failed resolving arguments***)
   6178             xmax = -np.inf
   6179             for xi in x:
-> 6180                 if len(xi) > 0:
   6181                     xmin = min(xmin, xi.min())
   6182                     xmax = max(xmax, xi.max())

TypeError: len() of unsized object

      

Solution:

You can simply convert it to a float array:

>>> import numpy as np
>>> plt.hist(np.array(c1_data, dtype=float))
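If you would rather not depend on numpy for the conversion, a plain list comprehension should do the same job. The sketch below is one possible way to put it together; read_floats and plot_histogram are hypothetical helper names, and the temporary file with made-up values only stands in for densEst1.txt:

```python
import os
import tempfile


def read_floats(path):
    """Read whitespace-separated numbers from a text file as a list of floats."""
    with open(path, "r") as handle:  # the with-block closes the file afterwards
        return [float(token) for token in handle.read().split()]


def plot_histogram(values):
    """Plot the values; matplotlib is imported here so read_floats works without it."""
    import matplotlib.pyplot as plt

    plt.hist(values)
    plt.show()


# Small demonstration with a temporary file standing in for densEst1.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0.5 1.25\n2.0\n")
demo_path = tmp.name

demo_values = read_floats(demo_path)
os.remove(demo_path)
print(demo_values)  # [0.5, 1.25, 2.0]
```

With the real file you would call plot_histogram(read_floats('densEst1.txt')). Note that float() raises ValueError on any token that is not a valid number, so a malformed file fails early instead of inside hist().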

      



Here is an example using numpy; the code is below, followed by the results.

pandas would work too: it lets you specify the separator and data type when reading (even for column data), and you can also read the data as a single vector, depending on the data size.

#!/usr/bin/env python
# IPython/Jupyter magic; remove the next line when running as a plain script.
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

# It is better to read the file with numpy directly, since the data are floats:
# a = np.fromfile(open('from_file', 'r'), sep='\n')

from_file = np.array([1, 2, 2.5])  # sample data
c1_data = from_file.astype(float)  # convert the data to float

plt.hist(c1_data)  # plt.hist passes its arguments to np.histogram
plt.title("Histogram without 'auto' bins")
plt.show()

      



Figure: histogram without 'auto' bins

plt.hist(c1_data, bins='auto')  # plt.hist passes its arguments to np.histogram
plt.title("Histogram with 'auto' bins")
plt.show()

      

Figure: histogram using 'auto' bins
