My DataFrame has NaN values but shouldn't

I cannot access the first (non-index) column of my data; all the other columns are fine:

import pandas as pd

df = pd.read_csv('stock_conf_GT_50.csv')
df.head()

      

The data looks great:

     close     eqId     date    IntDate expiry delta    ivMid   conf
0   37.380005   7   2008-01-02    39447    1    50  0.3850  0.8663
1   37.380005   7   2008-01-02    39447    1    90  0.5053  0.7876
2   36.960007   7   2008-01-03    39448    1    50  0.3915  0.8597
3   36.960007   7   2008-01-03    39448    1    90  0.5119  0.7438
4   35.179993   7   2008-01-04    39449    1    50  0.4055  0.8454

      

Column names look great too:

df.columns
Index([' close', 'eqId', 'date', 'IntDate', 'expiry', 'delta', 'ivMid',
   'conf'],
  dtype='object')

      

I see some data:

df['eqId'].head()
0    7
1    7
2    7
3    7
4    7
Name: eqId, dtype: int64

      

But not the first (non-index) column:

df['close'].head()

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-118-f7ce330a88a7> in <module>()
----> 1 df['close'].head()

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1789             return self._getitem_multilevel(key)
   1790         else:
-> 1791             return self._getitem_column(key)
   1792 
   1793     def _getitem_column(self, key):

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   1796         # get column
   1797         if self.columns.is_unique:
-> 1798             return self._get_item_cache(key)
   1799 
   1800         # duplicate columns & possible reduce dimensionaility

    C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1082         res = cache.get(item)
   1083         if res is None:
-> 1084             values = self._data.get(item)
   1085             res = self._box_item_values(item, values)
   1086             cache[item] = res

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   2849 
   2850             if not isnull(item):
-> 2851                 loc = self.items.get_loc(item)
   2852             else:
   2853                 indexer = np.arange(len(self.items))[isnull(self.items)]

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\index.py in get_loc(self, key, method)
   1576         """
   1577         if method is None:
-> 1578             return self._engine.get_loc(_values_from_object(key))
   1579 
   1580         indexer = self.get_indexer([key], method=method)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3811)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3691)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item  (pandas\hashtable.c:12336)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12287)()

KeyError: 'close'

      

And this is what I get when I run this code:

pd.DataFrame(df, columns=['close', 'ivMid', 'eqId'], index=None)

    close   ivMid   eqId
    0   NaN 0.3850  7
    1   NaN 0.5053  7
    2   NaN 0.3915  7
    3   NaN 0.5119  7
    4   NaN 0.4055  7
    5   NaN 0.5183  7
    6   NaN 0.4464  7
    7   NaN 0.5230  7
    8   NaN 0.4453  7
    9   NaN 0.4826  7
    10  NaN 0.5668  7

      

1 answer


You can see that the column name ' close' has a leading space in the Index:

Index([' close', 'eqId', 'date', 'IntDate', 'expiry', 'delta', 'ivMid',

      

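Hence the KeyError when you try to access the column 'close': no column has exactly that name. You must access it via df[' close']. The NaN values have the same cause: when you build a DataFrame from df with columns=['close', ...], the label 'close' matches no existing column, so pandas creates a new column filled with NaN.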
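To make the cause easier to see, here is a minimal sketch that reproduces both symptoms. It assumes a small in-memory CSV (the csv_text string is made up for illustration and only mimics the header of the real file) in which the first header field starts with a space:

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: note the space before "close".
csv_text = " close,eqId,ivMid\n37.380005,7,0.3850\n36.960007,7,0.3915\n"
df = pd.read_csv(io.StringIO(csv_text))

# Printing the labels as a plain list makes the stray space visible.
print(df.columns.tolist())

# The exact label is ' close', so 'close' is not a valid key.
print('close' in df.columns)   # False
print(' close' in df.columns)  # True

# Selecting a label that does not exist yields a column of NaN.
subset = pd.DataFrame(df, columns=['close', 'ivMid', 'eqId'])
print(subset)
```

Printing df.columns.tolist() (or repr of each label) is a quick check whenever a column that appears in df.head() raises a KeyError.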



An alternative is to apply strip to the column names so that they have no leading or trailing spaces:

df.columns = df.columns.str.strip()
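As a rough sketch of that clean-up, assuming the same kind of header as in the question (the csv_text string below is a made-up stand-in for the real file), the whitespace is stripped from the column labels rather than from the row index:

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: note the space before "close".
csv_text = " close,eqId,ivMid\n37.380005,7,0.3850\n36.960007,7,0.3915\n"
df = pd.read_csv(io.StringIO(csv_text))

# Strip leading/trailing whitespace from every column label.
df.columns = df.columns.str.strip()

# The column is now reachable under its clean name.
print(df['close'].head())

# Equivalent alternative that returns a new frame instead of assigning:
cleaned = df.rename(columns=str.strip)
print(cleaned.columns.tolist())
```

Doing this once, right after read_csv, means the rest of the code can use the clean names without remembering which labels carry stray spaces.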

      
