How do I use .le() and .ge() together when filtering columns of a pandas DataFrame?

Here is an example pandas DataFrame:

import pandas as pd
import numpy as np

data = {"first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
        "second_column": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2"],
        "third_column": [5, 1, 8, 3, 731, 189, 9]}

df = pd.DataFrame(data)

df
     first_column second_column  third_column
0        item1          cat1             5
1        item2          cat1             1
2        item3          cat1             8
3        item4          cat2             3
4        item5          cat2           731
5        item6          cat2           189
6        item7          cat2             9

      

I would like to "filter" the third column on the condition 10 <= x <= 1000.

For greater than or equal to 10, I can do this:

df['greater_than_ten'] = df.third_column.ge(10).astype(np.uint8)

      

For less than or equal to 1000, this:

df['less_than_1K'] = df.third_column.le(1000).astype(np.uint8)

      

But I cannot do both operations at the same time, i.e.

df['both'] = df.third_column.le(1000).ge(10).astype(np.uint8)

      

Chaining the two calls one after the other like this does not give the result I expect.

How can I use .ge() and .le() together?



3 answers


You can use between() on the series of interest instead.

df['both'] = df.third_column.between(10, 1000).astype(np.uint8)

      



Yielding

>>> df

  first_column second_column  third_column  both
0        item1          cat1             5     0
1        item2          cat1             1     0
2        item3          cat1             8     0
3        item4          cat2             3     0
4        item5          cat2           731     1
5        item6          cat2           189     1
6        item7          cat2             9     0
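As a small sketch of the boundary behaviour: by default, between() includes both endpoints, which matches the 10 <= x <= 1000 condition in the question. The series `s` below is made-up data chosen to sit on the edges of the range, not part of the original example.

```python
import pandas as pd

# Hypothetical values placed just outside and exactly on the boundaries.
s = pd.Series([9, 10, 1000, 1001])

# With default arguments, both 10 and 1000 count as "between".
mask = s.between(10, 1000)
print(mask.tolist())
```

Recent pandas versions also accept an `inclusive` argument on between() to change this; check the documentation for the version you have installed before relying on it.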

      



Use & to combine the conditions:



In [28]:
df['both'] = df['third_column'].ge(10) & df['third_column'].le(1000)
df

Out[28]:
  first_column second_column  third_column   both
0        item1          cat1             5  False
1        item2          cat1             1  False
2        item3          cat1             8  False
3        item4          cat2             3  False
4        item5          cat2           731   True
5        item6          cat2           189   True
6        item7          cat2             9  False
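It may help to see why the chained call from the question does not work. A minimal sketch, assuming the same third_column values: .le(1000) returns a boolean Series, so the following .ge(10) compares True/False against 10 rather than the original numbers. The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([5, 1, 8, 3, 731, 189, 9])

# Chained: .le(1000) yields booleans, and no boolean is >= 10,
# so every entry ends up False.
chained = s.le(1000).ge(10)

# Combined: each comparison runs against the original values,
# then the two boolean Series are ANDed element-wise.
combined = s.ge(10) & s.le(1000)

# Equivalent operator form; the parentheses are required because
# & binds more tightly than >= and <=.
with_ops = (s >= 10) & (s <= 1000)

print(chained.tolist())
print(combined.tolist())
```

Append .astype(np.uint8) to `combined` if you want 0/1 instead of True/False, as in the question.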

      



In [11]: df['both'] = df.eval("10 <= third_column <= 1000").astype(np.uint8)

In [12]: df
Out[12]:
  first_column second_column  third_column  both
0        item1          cat1             5     0
1        item2          cat1             1     0
2        item3          cat1             8     0
3        item4          cat2             3     0
4        item5          cat2           731     1
5        item6          cat2           189     1
6        item7          cat2             9     0

      

UPDATE:

In [13]: df.eval("second_column in ['cat2'] and 10 <= third_column <= 1000").astype(np.uint8)
Out[13]:
0    0
1    0
2    0
3    0
4    1
5    1
6    0
dtype: uint8
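If the goal is to keep only the matching rows rather than add a flag column, the same condition can be used as a row filter. A sketch under that assumption, using the question's data; `filtered` and `filtered_q` are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
    "second_column": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2"],
    "third_column": [5, 1, 8, 3, 731, 189, 9],
})

# Boolean-mask indexing: keep rows where the mask is True.
filtered = df[df["third_column"].between(10, 1000)]

# query() takes the same expression string used with eval() above
# and returns the matching rows directly.
filtered_q = df.query("10 <= third_column <= 1000")

print(filtered)
```

Both forms should select the same rows; query() tends to read better when several conditions are combined in one string.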

      
