How do I use .le() and .ge() together when filtering a column of a pandas DataFrame?
Here is an example pandas DataFrame:
import pandas as pd
import numpy as np
data = {"first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
"second_column": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2"],
"third_column": [5, 1, 8, 3, 731, 189, 9]}
df = pd.DataFrame(data)
df
first_column second_column third_column
0 item1 cat1 5
1 item2 cat1 1
2 item3 cat1 8
3 item4 cat2 3
4 item5 cat2 731
5 item6 cat2 189
6 item7 cat2 9
I would like to "filter" the third column on the condition 10 <= x <= 1000.
For greater than or equal to 10, I can do this:
df['greater_than_ten'] = df.third_column.ge(10).astype(np.uint8)
For less than or equal to 1000, this:
df['less_than_1K'] = df.third_column.le(1000).astype(np.uint8)
but I cannot do both operations at the same time, i.e.
df['both'] = df.third_column.le(1000).ge(10).astype(np.uint8)
Nor can I apply these operations one after the other.
How can I use .ge() and .le() together?
3 answers
You can use between() on the Series of interest instead. It is inclusive on both ends by default, so it matches 10 <= x <= 1000.
df['both'] = df.third_column.between(10, 1000).astype(np.uint8)
Yielding
>>> df
first_column second_column third_column both
0 item1 cat1 5 0
1 item2 cat1 1 0
2 item3 cat1 8 0
3 item4 cat2 3 0
4 item5 cat2 731 1
5 item6 cat2 189 1
6 item7 cat2 9 0
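If the goal is to keep only the matching rows rather than to add a 0/1 flag column, the same between() result can be used as a boolean mask. This is a minimal sketch on a trimmed copy of the question's data; the names mask and filtered are illustrative and not part of the original answer.

```python
import pandas as pd

# Trimmed copy of the question's data (second_column omitted for brevity)
df = pd.DataFrame({
    "first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
    "third_column": [5, 1, 8, 3, 731, 189, 9],
})

# between() is inclusive on both ends by default: 10 <= x <= 1000
mask = df["third_column"].between(10, 1000)

# Index the DataFrame with the boolean mask to keep only the matching rows
filtered = df[mask]
print(filtered)
```

Here filtered should contain only the rows for item5 and item6.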
Use & to combine the conditions:
In [28]:
df['both'] = df['third_column'].ge(10) & df['third_column'].le(1000)
df
Out[28]:
first_column second_column third_column both
0 item1 cat1 5 False
1 item2 cat1 1 False
2 item3 cat1 8 False
3 item4 cat2 3 False
4 item5 cat2 731 True
5 item6 cat2 189 True
6 item7 cat2 9 False
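As for why the chained version in the question does not work: as far as I can tell, .le(1000) returns a boolean Series, so the following .ge(10) compares True/False values with 10 instead of the original numbers. The sketch below illustrates this on the question's values; the names s, chained, combined and with_operators are illustrative only.

```python
import pandas as pd

s = pd.Series([5, 1, 8, 3, 731, 189, 9], name="third_column")

# s.le(1000) is a boolean Series, so .ge(10) then tests True/False >= 10,
# which is False for every row
chained = s.le(1000).ge(10)

# Two independent comparisons on the original values, combined element-wise
combined = s.ge(10) & s.le(1000)

# Operator form: the parentheses are needed because & binds more tightly
# than >= and <=
with_operators = (s >= 10) & (s <= 1000)

print(chained.tolist())
print(combined.tolist())
```

The chained result is all False, while combined and with_operators mark only 731 and 189 as True.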
In [11]: df['both'] = df.eval("10 <= third_column <= 1000").astype(np.uint8)
In [12]: df
Out[12]:
first_column second_column third_column both
0 item1 cat1 5 0
1 item2 cat1 1 0
2 item3 cat1 8 0
3 item4 cat2 3 0
4 item5 cat2 731 1
5 item6 cat2 189 1
6 item7 cat2 9 0
UPDATE:
In [13]: df.eval("second_column in ['cat2'] and 10 <= third_column <= 1000").astype(np.uint8)
Out[13]:
0 0
1 0
2 0
3 0
4 1
5 1
6 0
dtype: uint8
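The same chained-comparison expression should also work with DataFrame.query() if you want the matching rows themselves instead of a flag column. A minimal sketch on a trimmed copy of the question's data; the name result is illustrative.

```python
import pandas as pd

# Trimmed copy of the question's data (second_column omitted for brevity)
df = pd.DataFrame({
    "first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
    "third_column": [5, 1, 8, 3, 731, 189, 9],
})

# query() evaluates the expression and returns only the rows where it is True
result = df.query("10 <= third_column <= 1000")
print(result)
```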