Filtering string / float / integer values in columns (Pandas)

How can I filter only the string / integer / float values in one column of a pandas DataFrame like the one below?

                         SIC
1                      246804
2                      135272
3                      898.01
4                     3453.33
5                       shine  
6                        add
7                         522
8                         Nan
9                      string
10                      29.11
11                        20    

      



3 answers


You can use the output of pd.to_numeric together with boolean indexing.

To get only the strings:

df[pd.to_numeric(df.SIC, errors='coerce').isnull()]

      

Output:

      SIC
5   shine
6     add
8     Nan
9  string

      



To get only the numbers:

df[pd.to_numeric(df.SIC, errors='coerce').notnull()]

      

Output:

        SIC
1    246804
2    135272
3    898.01
4   3453.33
7       522
10    29.11
11       20
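
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the SIC column holds strings (as it would after reading the data from a text file) and rebuilds the question's frame by hand; the variable names `numeric`, `strings_only` and `numbers_only` are just illustrative.

```python
import pandas as pd

# Rebuild the frame from the question; every value is stored as a string.
df = pd.DataFrame(
    {"SIC": ["246804", "135272", "898.01", "3453.33", "shine", "add",
             "522", "Nan", "string", "29.11", "20"]},
    index=range(1, 12),
)

# Values that cannot be parsed as numbers become NaN with errors="coerce".
numeric = pd.to_numeric(df["SIC"], errors="coerce")

# Use the NaN pattern as a boolean mask on the ORIGINAL frame,
# so the kept values are not converted.
strings_only = df[numeric.isna()]
numbers_only = df[numeric.notna()]

print(strings_only)
print(numbers_only)
```

Because the mask is applied to the original frame, the numeric rows keep their original text ("246804" rather than 246804.0).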

      



You can use the apply() method along with the isinstance() function. str can be replaced with int, float, etc.:



import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 4.5, np.nan, 'asdf', 5, 'string'], columns=['SIC'])
print(df)
      SIC
0       1
1       2
2     4.5
3     NaN
4    asdf
5       5
6  string

print(df[df['SIC'].apply(lambda x: isinstance(x,str))])
      SIC
4    asdf
6  string
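
As a rough sketch of swapping in other types: the helper `rows_of_type` below is a hypothetical name, not a pandas function. Two things to keep in mind: np.nan is itself a float, so it shows up in the float filter unless you drop it, and this approach only works when the column holds real Python objects of different types. If the column was read from a file and every value is a string, isinstance(x, str) will match every row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"SIC": [1, 2, 4.5, np.nan, "asdf", 5, "string"]})

def rows_of_type(frame, column, types):
    """Keep rows whose value in `column` is an instance of `types` (hypothetical helper)."""
    mask = frame[column].apply(lambda x: isinstance(x, types))
    return frame[mask]

ints = rows_of_type(df, "SIC", int)
floats = rows_of_type(df, "SIC", float)

# np.nan is a float, so filter it out explicitly if it is not wanted.
floats_no_nan = floats[floats["SIC"].notna()]

print(ints)
print(floats_no_nan)
```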

      



An alternative with str.isalpha:

In [658]: df[df.SIC.str.isalpha()]
Out[658]: 
      SIC
5   shine
6     add
8     Nan
9  string
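
One caveat worth checking on your own data: str.isalpha is True only when every character is alphabetic, so text containing spaces or digits is not matched. The sketch below uses made-up values ("shine on", "a1") that are not in the question, purely to illustrate the difference from the pd.to_numeric mask.

```python
import pandas as pd

# Illustrative values only; "shine on" and "a1" are not from the question.
df = pd.DataFrame({"SIC": ["shine", "shine on", "a1", "522", "29.11"]})

# True only for purely alphabetic strings.
alpha_mask = df["SIC"].str.isalpha()

# True for anything that cannot be parsed as a number.
non_numeric_mask = pd.to_numeric(df["SIC"], errors="coerce").isna()

print(df[alpha_mask])        # misses "shine on" and "a1"
print(df[non_numeric_mask])  # keeps all three text values
```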

      

For ints / floats, a slightly more robust solution is needed, using pd.to_numeric:

In [679]: pd.to_numeric(df.SIC, errors='coerce').dropna()
Out[679]: 
1     246804.00
2     135272.00
3        898.01
4       3453.33
7        522.00
10        29.11
11        20.00
Name: SIC, dtype: float64

      

Disadvantage: this converts the ints to floats as well. Workaround (Scott's solution): df[pd.to_numeric(df.SIC, errors='coerce').notnull()]
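
If you also need to tell integers apart from floats in a column of strings, one possible sketch is to test whether the parsed value has a fractional part. This is an assumption about what "integer" should mean here: a string such as "20.0" would count as a whole number under this test. The names `is_number` and `is_whole` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"SIC": ["246804", "898.01", "shine", "522", "Nan", "29.11", "20"]})

numeric = pd.to_numeric(df["SIC"], errors="coerce")

is_number = numeric.notna()
# NaN % 1 is NaN, and NaN == 0 is False, so non-numeric rows drop out here.
is_whole = is_number & (numeric % 1 == 0)

int_rows = df[is_whole]                  # original strings, not converted
float_rows = df[is_number & ~is_whole]

print(int_rows)
print(float_rows)
```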







