Pandas: add a calculated row to the bottom of a DataFrame

Below is a small sample of a dataframe that I have and I want to add a calculated row to the bottom of it:

sch     q1     q2     q3
acc     Yes    Yes    No
acc     Yes    No     No
acc     Yes    No     No
acc     Yes    Yes    Yes

      

I want to add a row at the bottom that gives the percentage of "Yes" values for each column (expressed as a fraction), so that it looks like this:

sch     q1     q2     q3
acc     Yes    Yes    No
acc     Yes    No     No
acc     Yes    No     No
acc     Yes    Yes    Yes
acc     1.00   0.5    0.25

      

Any help would be greatly appreciated.

+3



4 answers


I see your lambda and raise you a pure pandas solution:

pd.concat([df, df.eq('Yes').mean().to_frame().T], ignore_index=True)

      



You did not specify what should happen to the column sch, so I ignored it. In my current solution, this column will get the value 0.
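To see why sch ends up as 0, it can help to look at the intermediate result. The sketch below rebuilds the sample frame from the question (the variable names is_yes and fractions are only illustrative) and shows that the mean of a boolean column is the fraction of True values:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "sch": ["acc", "acc", "acc", "acc"],
    "q1": ["Yes", "Yes", "Yes", "Yes"],
    "q2": ["Yes", "No", "No", "Yes"],
    "q3": ["No", "No", "No", "Yes"],
})

# Boolean frame: True wherever a cell equals "Yes"
is_yes = df.eq("Yes")

# The mean of a boolean column is the fraction of True values.
# No cell in sch equals "Yes", so its fraction is 0.0.
fractions = is_yes.mean()
print(fractions)
```

Here fractions holds 0.0 for sch, 1.0 for q1, 0.5 for q2 and 0.25 for q3.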

+2



Consider the following approach:



In [11]: df.loc[len(df)] = ['acc'] + df.filter(regex=r'^q\d+') \
                                       .eq('Yes').mean().values.tolist()

In [12]: df
Out[12]:
   sch   q1   q2    q3
0  acc  Yes  Yes    No
1  acc  Yes   No    No
2  acc  Yes   No    No
3  acc  Yes  Yes   Yes
4  acc    1  0.5  0.25

      

+2



pd.concat([df, df.apply(lambda x: len(x[x == 'Yes']) / len(x)).to_frame().T], ignore_index=True)

      

Output:

    q1   q2    q3
0  Yes  Yes    No
1  Yes   No    No
2  Yes   No    No
3  Yes  Yes   Yes
4    1  0.5  0.25

      

+1



Use pd.concat, mean, to_frame, and T for transposition.

pd.concat([df, df.drop(columns='sch').eq('Yes').mean().to_frame().T.assign(sch='acc')], sort=True)

      

Output:

    q1   q2    q3  sch
0  Yes  Yes    No  acc
1  Yes   No    No  acc
2  Yes   No    No  acc
3  Yes  Yes   Yes  acc
0    1  0.5  0.25  acc
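
Note that the appended row keeps the index label 0, so that label now appears twice. If a continuous index is preferred, passing ignore_index=True to pd.concat renumbers the rows. A minimal sketch, assuming the sample frame from the question (the names row and result are only illustrative):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "sch": ["acc", "acc", "acc", "acc"],
    "q1": ["Yes", "Yes", "Yes", "Yes"],
    "q2": ["Yes", "No", "No", "Yes"],
    "q3": ["No", "No", "No", "Yes"],
})

# One-row frame: fraction of "Yes" in the question columns, labelled with sch='acc'
row = df.drop(columns="sch").eq("Yes").mean().to_frame().T.assign(sch="acc")

# ignore_index=True discards the old labels and numbers the rows 0..4
result = pd.concat([df, row], ignore_index=True)
print(result)
```

Because the q columns now mix strings and floats, they are stored with object dtype, so the last row is best treated as a display-only summary.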

      

+1

