Pandas: add crosstab totals

How do I add an extra row and an extra column for totals to my crosstab?

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.random.randint(0, 2, 100), "B": np.random.randint(0, 2, 100)})
ct = pd.crosstab(df.A, df.B)
ct

      


I thought I would add a new column (obtained by row summation)

ct["Total"] = ct.0 + ct.1

      

but it doesn't work.

+3




3 answers


This is because attribute-style column access does not work with integer column names. Use standard indexing instead:

In [122]: ct["Total"] = ct[0] + ct[1]

In [123]: ct
Out[123]:
B   0   1  Total
A
0  26  24     50
1  30  20     50

      

See warnings at the end of this section in the docs: http://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access
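As a more general variant, the row totals can be computed without naming each column, by summing across the columns with `DataFrame.sum(axis=1)`. This is only a sketch: the fixed seed and the name `total_by_name` are illustrative choices, not part of the original question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})
ct = pd.crosstab(df["A"], df["B"])

# Integer column labels need bracket indexing; ct.0 is a syntax error.
total_by_name = ct[0] + ct[1]

# Summing across columns (axis=1) gives the same result without listing them.
ct["Total"] = ct.sum(axis=1)
```

This also keeps working if the crosstab has more than two columns.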



When you want to work with rows, you can use .loc:

In [126]: ct.loc["Total"] = ct.loc[0] + ct.loc[1]

      

In this case, ct.loc["Total"] is equivalent to ct.loc["Total", :].
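Putting the column and the row together, a minimal sketch might look like this. The seed is arbitrary, and `sum(axis=0)` is used here in place of adding the rows by name.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})
ct = pd.crosstab(df["A"], df["B"])

# Add the column of row totals first ...
ct["Total"] = ct.sum(axis=1)

# ... then the row of column totals, so it also covers the new "Total" column
# and its last cell becomes the grand total.
ct.loc["Total"] = ct.sum(axis=0)
```

The order matters: adding the row first and the column second gives the same table, but adding the row before the column without re-summing would leave the corner cell missing.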

+3




In fact, pandas.crosstab already provides a margins option that does exactly what you want.

> df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.random.randint(0,2,100)})
> pd.crosstab(df.A, df.B, margins=True)
B     0   1  All
A               
0    26  21   47
1    25  28   53
All  51  49  100

      



Basically, with margins=True the resulting frequency table gains an "All" column and an "All" row that hold the row and column totals.
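If the label "Total" is preferred over "All", `crosstab` also accepts a `margins_name` argument. A short sketch, with an arbitrary seed for reproducibility:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})

# margins=True appends the totals; margins_name renames the "All" label.
ct = pd.crosstab(df["A"], df["B"], margins=True, margins_name="Total")
```

The bottom-right cell, `ct.loc["Total", "Total"]`, then holds the overall number of observations.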

+9




For this, you have to pass margins=True to crosstab. It should work!

0

