DataFrame: filtering rows by column values
I have a DataFrame df:
       Num1  Num2
one       1     0
two       3     2
three     5     4
four      7     6
five      9     8
I want to filter rows that have a value greater than 3 in Num1 and less than 8 in Num2.
I tried this:
df = df[df['Num1'] > 3 and df['Num2'] < 8]
but an error occurred:
ValueError: The truth value of a Series is ambiguous.
So I used
df = df[df['Num1'] > 3]
df = df[df['Num2'] < 8]
I think the code could be shorter.
Is there another way?
You need to add () around each condition because of operator precedence with the bitwise operator &:
df1 = df[(df['Num1'] > 3) & (df['Num2'] < 8)]
print(df1)
       Num1  Num2
three     5     4
four      7     6
The best explanation is here.
Or, if you want the shortest code, use query:
df1 = df.query("Num1 > 3 and Num2 < 8")
print(df1)
       Num1  Num2
three     5     4
four      7     6
df1 = df.query("Num1 > 3 & Num2 < 8")
print(df1)
       Num1  Num2
three     5     4
four      7     6
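For a self-contained check, the sketch below rebuilds the example frame and confirms that the boolean mask and both query strings select the same rows. The way df is constructed here is an assumption based on the table printed in the question.

```python
import pandas as pd

# Rebuild the frame from the question (construction assumed from the printed table).
df = pd.DataFrame(
    {"Num1": [1, 3, 5, 7, 9], "Num2": [0, 2, 4, 6, 8]},
    index=["one", "two", "three", "four", "five"],
)

# Boolean mask: each comparison is parenthesised and combined with &.
by_mask = df[(df["Num1"] > 3) & (df["Num2"] < 8)]

# query: the condition is a string, and both `and` and `&` are accepted in it.
by_query_and = df.query("Num1 > 3 and Num2 < 8")
by_query_amp = df.query("Num1 > 3 & Num2 < 8")

print(by_mask)
```

All three results should contain only the rows three and four.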
Yes, you can use the & operator:
df = df[(df['Num1'] > 3) & (df['Num2'] < 8)]
#                        ^ & operator
This is because and works on the truth value of its two operands, whereas the & operator can be defined on arbitrary data structures.
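To make the difference concrete, here is a minimal pure-Python sketch. The Mask class is hypothetical, a toy stand-in and not how pandas implements a Series: it defines & element-wise through __and__ and refuses to give a single truth value in __bool__, which is roughly the behaviour behind the ValueError in the question.

```python
class Mask:
    """Hypothetical toy container of booleans (not the pandas implementation)."""

    def __init__(self, values):
        self.values = list(values)

    def __and__(self, other):
        # `&` dispatches here, so it can be defined element-wise.
        return Mask(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # `and` needs one truth value for the whole object; refuse to guess.
        raise ValueError("The truth value of a Mask is ambiguous.")


left = Mask([False, False, True, True, True])
right = Mask([True, True, True, True, False])

combined = (left & right).values  # element-wise result

try:
    left and right  # evaluates bool(left) first, which raises
    error = None
except ValueError as exc:
    error = str(exc)

print(combined)
print(error)
```

The & expression returns a new Mask, while the and expression fails before the right operand is even looked at.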
The parentheses are required here because & binds more tightly than > and <; therefore, without parentheses, Python will read the expression as df['Num1'] > (3 & df['Num2']) < 8.
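One way to check this parse without pandas is to ask Python's own parser. The sketch below uses the standard ast module on the placeholder names a and b (standing in for df['Num1'] and df['Num2']) and inspects how the expression is grouped with and without parentheses.

```python
import ast

# Parse the expression without parentheses; a and b are placeholder names.
tree = ast.parse("a > 3 & b < 8", mode="eval").body

# The top-level node is a chained comparison: a > (...) < 8
is_chained = isinstance(tree, ast.Compare) and len(tree.ops) == 2

# The middle operand of the chain is the bitwise-and expression 3 & b.
middle = tree.comparators[0]
middle_is_bitand = isinstance(middle, ast.BinOp) and isinstance(middle.op, ast.BitAnd)

# With parentheses, & becomes the top-level operation instead.
fixed = ast.parse("(a > 3) & (b < 8)", mode="eval").body
fixed_is_bitand = isinstance(fixed, ast.BinOp) and isinstance(fixed.op, ast.BitAnd)

print(is_chained, middle_is_bitand, fixed_is_bitand)
```

Because a chained comparison is evaluated as two comparisons joined by an implicit and, the unparenthesised form ends up asking for the truth value of a Series, just like the original attempt.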
Note that you can likewise use the | operator as an element-wise boolean or.
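As a small illustration of |, the sketch below keeps rows where either condition holds. The frame is rebuilt from the table in the question, and the thresholds 7 and 2 are chosen only for this example.

```python
import pandas as pd

# Rebuild the frame from the question (construction assumed from the printed table).
df = pd.DataFrame(
    {"Num1": [1, 3, 5, 7, 9], "Num2": [0, 2, 4, 6, 8]},
    index=["one", "two", "three", "four", "five"],
)

# Element-wise or: keep rows where Num1 > 7 or Num2 < 2 (illustrative thresholds).
either = df[(df["Num1"] > 7) | (df["Num2"] < 2)]

print(either)
```

The parentheses are needed here for the same precedence reason as with &.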