Error: Cannot convert float NaN to integer in pandas

I have a dataframe:

   a            b     c      d
0 nan           Y     nan   nan
1  1.27838e+06  N      3     96
2 nan           N      2    nan
3  284633       Y     nan    44

      

I am trying to convert the non-null values to integer type, to avoid exponential notation (1.27838e+06):

f=lambda x : int(x)
df['a']=np.where(df['a']==None,np.nan,df['a'].apply(f))

      

But I am getting an error, even though I thought I was only changing the dtype of the non-null values. Can anyone please point out my mistake? Thanks.
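For reference, the failure can be reproduced with a minimal sketch. It assumes column `a` is a float64 column whose missing entries are NaN (the frame below is a hypothetical reconstruction, not the original data). It shows two things: comparing against `None` does not detect NaN, and `apply` runs `int()` on every row before `np.where` gets a chance to select anything.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of column 'a' from the question.
df = pd.DataFrame({"a": [np.nan, 1278380.0, np.nan, 284633.0]})

# Missing values are NaN, not None, so this mask never matches any row.
none_mask = df["a"] == None  # noqa: E711
print(none_mask.any())

# apply() calls int() on every element, including the NaN rows,
# so the conversion fails before np.where could filter them out.
try:
    df["a"].apply(lambda x: int(x))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Using `df['a'].isnull()` or `df['a'].notnull()` is the usual way to build such a mask.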

+3




2 answers


Pandas has no way to store NaN values in a NumPy-backed integer column. Strictly speaking, you can have a column of mixed data types, but that can be computationally inefficient. If you insist, you can do:



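# Object dtype can hold Python ints and NaN side by side
df['a'] = df['a'].astype('O')
# Convert only the non-null entries to int
df.loc[df['a'].notnull(), 'a'] = df.loc[df['a'].notnull(), 'a'].astype(int)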
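As an alternative to an object column, here is a minimal sketch using the nullable integer extension dtype. It assumes pandas 0.24 or later, and that the float values are whole numbers; the frame and the `a_int` column name are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of column 'a' from the question.
df = pd.DataFrame({"a": [np.nan, 1278380.0, np.nan, 284633.0]})

# "Int64" (capital I) is the nullable extension dtype,
# distinct from NumPy's plain "int64".
df["a_int"] = df["a"].astype("Int64")

print(df["a_int"].dtype)
print(df["a_int"].tolist())
```

Missing entries are kept as missing and the rest are stored as integers, so they no longer print in exponential notation. If the floats carry fractional parts, the cast is expected to raise rather than truncate, so rounding first may be needed.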

      

+2




As far as I can tell from the pandas documentation, it is not possible to represent an integer NaN:

"In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays."



As explained later in the documentation, this is due to memory and performance considerations, and to keep the resulting Series "numeric". One possibility is to use dtype=object arrays instead.
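The "keep the Series numeric" behavior can be seen in a small sketch: when a missing value is introduced into a NumPy integer Series, pandas upcasts it to float instead of switching to object. The Series below is made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])
print(s.dtype)  # a NumPy integer dtype

# Reindexing introduces a missing entry at label 3.
s_with_gap = s.reindex([0, 1, 2, 3])
print(s_with_gap.dtype)
print(s_with_gap.tolist())
```

This is the same reason column `a` in the question is stored as float in the first place.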

+1

