Pandas COUNTIF based on column value
I am trying to essentially do COUNTIF in pandas to count how many items in a row match the number in the first column.
Dataframe:
a b c d
1 2 3 1
2 3 4 2
3 5 6 3
So, for each row, I want to count the values in columns b, c, d that match a. The first row, for example, should give 1, since only d matches a.
I searched a bit for this, but so far I have only found examples that count over the whole dataframe (e.g. counting all values greater than 0), not ones based on a dataframe column. I am guessing it's some form of logic that masks based on the column, but df == df.a doesn't seem to work.
You can use eq, which accepts an axis parameter to indicate the direction of the comparison. Then sum across each row to count the number of matching values, subtracting 1 because column a always matches itself:
df.eq(df.a, axis=0).sum(1) - 1
#0 1
#1 1
#2 1
#dtype: int64
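For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The dataframe is rebuilt from the question, and the variable names matches and counts are just illustrative. As far as I understand, the plain df == df.a fails because pandas tries to line up the Series index (0, 1, 2) with the column labels (a, b, c, d), whereas axis=0 tells eq to line it up with the row index instead.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 3, 5], "c": [3, 4, 6], "d": [1, 2, 3]})

# Compare every column with column "a"; axis=0 aligns the Series on the row index.
matches = df.eq(df["a"], axis=0)

# Column "a" always equals itself, so subtract 1 from each row total.
counts = matches.sum(axis=1) - 1
print(counts)
```

The boolean frame matches has True wherever a cell equals the value of a in the same row, so summing along axis=1 gives the per-row count.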
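Alternatively, compare row by row with apply (x.iloc[0] is the value in the first column, a; a plain x[0] relies on positional fallback, which newer pandas versions deprecate):
df.apply(lambda x: (x == x.iloc[0]).sum() - 1, axis=1)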
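If the "- 1" feels fragile, one possible variation (not from the answers above, just a sketch) is to drop column a before comparing, so that only b, c and d are counted. The names others, vectorized and row_wise are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 3, 5], "c": [3, 4, 6], "d": [1, 2, 3]})

# Vectorized: compare only the other columns against "a", so no "- 1" is needed.
others = df.drop(columns="a")
vectorized = others.eq(df["a"], axis=0).sum(axis=1)

# Row-wise equivalent: compare everything after the first position to the first value.
row_wise = df.apply(lambda row: (row.iloc[1:] == row.iloc[0]).sum(), axis=1)

print(vectorized.tolist(), row_wise.tolist())
```

Both give the same counts here. The eq version stays vectorized, while apply with axis=1 calls the lambda once per row, so it will likely be slower on large frames.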