How to replace non-integer values in a pandas DataFrame?

I have a dataframe consisting of two columns: Age and Salary

Age   Salary
21    25000
22    30000
22    Fresher
23    2,50,000
24    25 LPA
35    400000
45    10,00,000

      

How can I handle the non-numeric values in the Salary column and replace them with an integer?


2 answers


If you need to replace non-numeric values, use to_numeric with the parameter errors='coerce':



# Wrap the chain in parentheses so it can span several lines.
df['new'] = (pd.to_numeric(df.Salary.astype(str).str.replace(',', ''), errors='coerce')
               .fillna(0)
               .astype(int))
print(df)
   Age     Salary      new
0   21      25000    25000
1   22      30000    30000
2   22    Fresher        0
3   23   2,50,000   250000
4   24     25 LPA        0
5   35     400000   400000
6   45  10,00,000  1000000
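
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the question's table, assuming Salary is stored as strings; the intermediate names cleaned, numeric and unparsed are only illustrative. It also lists the rows that could not be parsed, which may be worth checking before overwriting them with 0.

```python
import pandas as pd

# Sample data from the question (Salary assumed to be stored as strings).
df = pd.DataFrame({
    "Age": [21, 22, 22, 23, 24, 35, 45],
    "Salary": ["25000", "30000", "Fresher", "2,50,000",
               "25 LPA", "400000", "10,00,000"],
})

# Strip the thousands separators, then let to_numeric turn
# anything it cannot parse into NaN.
cleaned = df["Salary"].astype(str).str.replace(",", "", regex=False)
numeric = pd.to_numeric(cleaned, errors="coerce")

# Original values that could not be parsed as numbers.
unparsed = df.loc[numeric.isna(), "Salary"].tolist()
print(unparsed)

# Replace the NaN values with 0 and convert to an integer column.
df["new"] = numeric.fillna(0).astype(int)
print(df)
```

Replacing with 0 is only one option; the same fillna call could take a different placeholder if 0 would distort later statistics.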

      

+8


source


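Use numpy.where: keep the values that consist only of digits and replace everything else with '0'. Note that the result is still a column of strings, and that values containing commas, such as 2,50,000, also fail the isdigit check and become '0'.

import numpy as np
df['New'] = np.where(df.Salary.str.isdigit(), df.Salary, '0')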
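
As a rough sketch of how this numpy approach could be extended to keep the comma-formatted salaries, the commas can be stripped before the digit check and the result cast to int. This assumes Salary holds strings; the name stripped is only illustrative.

```python
import numpy as np
import pandas as pd

# Salary column from the question (assumed to be stored as strings).
df = pd.DataFrame({
    "Salary": ["25000", "30000", "Fresher", "2,50,000",
               "25 LPA", "400000", "10,00,000"],
})

# Remove the commas first; otherwise "2,50,000" fails the isdigit() check.
stripped = df["Salary"].str.replace(",", "", regex=False)

# Keep all-digit strings, replace everything else with "0", then cast to int.
df["New"] = np.where(stripped.str.isdigit(), stripped, "0").astype(int)
print(df)
```

Because the check is done on the whole column at once, no apply or lambda is needed.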
