Python (Pandas): add a subtotal per level of multi-index data
Assuming I have the following data file:
a b c Sce1 Sce2 Sce3 Sce4 Sce5 Sc6
Animal Ground Dog 0.0 0.9 0.5 0.0 0.3 0.4
Animal Ground Cat 0.6 0.5 0.3 0.5 1.0 0.2
Animal Air Eagle 1.0 0.1 0.1 0.6 0.9 0.1
Animal Air Owl 0.3 0.1 0.5 0.3 0.5 0.9
Object Metal Car 0.3 0.3 0.8 0.6 0.5 0.6
Object Metal Bike 0.5 0.1 0.4 0.7 0.4 0.2
Object Wood Chair 0.9 0.6 0.1 0.9 0.2 0.8
Object Wood Table 0.9 0.6 0.6 0.1 0.9 0.7
I want to create a MultiIndex that contains the sum at each level. The result should look like this:
a      b      c       Sce1  Sce2  Sce3  Sce4  Sce5   Sc6
Animal                 1.9   1.6   1.4   1.4   2.7   1.6
       Ground          0.6   1.4   0.8   0.5   1.3   0.6
              Dog      0.0   0.9   0.5   0.0   0.3   0.4
              Cat      0.6   0.5   0.3   0.5   1.0   0.2
       Air             1.3   0.2   0.6   0.9   1.4   1.0
              Eagle    1.0   0.1   0.1   0.6   0.9   0.1
              Owl      0.3   0.1   0.5   0.3   0.5   0.9
Object                 2.6   1.6   1.9   2.3   2.0   2.3
       Metal           0.8   0.4   1.2   1.3   0.9   0.8
              Car      0.3   0.3   0.8   0.6   0.5   0.6
              Bike     0.5   0.1   0.4   0.7   0.4   0.2
       Wood            1.8   1.2   0.7   1.0   1.1   1.5
              Chair    0.9   0.6   0.1   0.9   0.2   0.8
              Table    0.9   0.6   0.6   0.1   0.9   0.7
I am currently using a loop to create three separate frames, one per level, and then combining them by hand in Excel, as shown below. I would like to do the whole calculation in Python if possible.
keys = list(df.columns)[0:3]  # the key columns a, b, c
frames = []
for lvl in range(1, 4):
    # one frame per level: sums over the first `lvl` key columns
    frames.append(df.groupby(keys[:lvl], as_index=False).sum(numeric_only=True))
Thank you very much in advance.
2 answers
With some liberal use of MAGIC:
pd.concat([
    df.assign(
        **{x: 'Total' for x in 'abc'[i:]}
    ).groupby(list('abc')).sum() for i in range(4)
]).sort_index()
                      Sce1  Sce2  Sce3  Sce4  Sce5   Sc6
a      b      c
Animal Air    Eagle    1.0   0.1   0.1   0.6   0.9   0.1
              Owl      0.3   0.1   0.5   0.3   0.5   0.9
              Total    1.3   0.2   0.6   0.9   1.4   1.0
       Ground Cat      0.6   0.5   0.3   0.5   1.0   0.2
              Dog      0.0   0.9   0.5   0.0   0.3   0.4
              Total    0.6   1.4   0.8   0.5   1.3   0.6
       Total  Total    1.9   1.6   1.4   1.4   2.7   1.6
Object Metal  Bike     0.5   0.1   0.4   0.7   0.4   0.2
              Car      0.3   0.3   0.8   0.6   0.5   0.6
              Total    0.8   0.4   1.2   1.3   0.9   0.8
       Total  Total    2.6   1.6   1.9   2.3   2.0   2.3
       Wood   Chair    0.9   0.6   0.1   0.9   0.2   0.8
              Table    0.9   0.6   0.6   0.1   0.9   0.7
              Total    1.8   1.2   0.7   1.0   1.1   1.5
Total  Total  Total    4.5   3.2   3.3   3.7   4.7   3.9
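For anyone who wants to reproduce the table above, here is a minimal sketch. It assumes the data is parsed from whitespace-separated text so that a, b and c are ordinary columns rather than an index; the names raw, keys and pieces are mine, not the answer's. The comprehension is unrolled into a loop so each step is visible.

```python
import io
import pandas as pd

# The question's data as posted (last column really is named "Sc6").
raw = """a b c Sce1 Sce2 Sce3 Sce4 Sce5 Sc6
Animal Ground Dog 0.0 0.9 0.5 0.0 0.3 0.4
Animal Ground Cat 0.6 0.5 0.3 0.5 1.0 0.2
Animal Air Eagle 1.0 0.1 0.1 0.6 0.9 0.1
Animal Air Owl 0.3 0.1 0.5 0.3 0.5 0.9
Object Metal Car 0.3 0.3 0.8 0.6 0.5 0.6
Object Metal Bike 0.5 0.1 0.4 0.7 0.4 0.2
Object Wood Chair 0.9 0.6 0.1 0.9 0.2 0.8
Object Wood Table 0.9 0.6 0.6 0.1 0.9 0.7
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

keys = ["a", "b", "c"]
pieces = []
for i in range(len(keys) + 1):
    # Overwrite the key columns from position i onward with 'Total', so the
    # groupby collapses those levels into one subtotal row per remaining group.
    # i == 0 gives the grand total; i == 3 changes nothing (the detail rows).
    labelled = df.assign(**{k: "Total" for k in keys[i:]})
    pieces.append(labelled.groupby(keys).sum())

# Stack all levels and let the sorted MultiIndex interleave them.
result = pd.concat(pieces).sort_index()
print(result.round(1))
```

Because 'Total' is an ordinary label, it sorts alphabetically with the rest, which is why the Object subtotal lands between Metal and Wood.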
I can get exactly what you asked for with
pd.concat([
    df.assign(
        **{x: '' for x in 'abc'[i:]}
    ).groupby(list('abc')).sum() for i in range(1, 4)
]).sort_index()
                      Sce1  Sce2  Sce3  Sce4  Sce5   Sc6
a      b      c
Animal                 1.9   1.6   1.4   1.4   2.7   1.6
       Air             1.3   0.2   0.6   0.9   1.4   1.0
              Eagle    1.0   0.1   0.1   0.6   0.9   0.1
              Owl      0.3   0.1   0.5   0.3   0.5   0.9
       Ground          0.6   1.4   0.8   0.5   1.3   0.6
              Cat      0.6   0.5   0.3   0.5   1.0   0.2
              Dog      0.0   0.9   0.5   0.0   0.3   0.4
Object                 2.6   1.6   1.9   2.3   2.0   2.3
       Metal           0.8   0.4   1.2   1.3   0.9   0.8
              Bike     0.5   0.1   0.4   0.7   0.4   0.2
              Car      0.3   0.3   0.8   0.6   0.5   0.6
       Wood            1.8   1.2   0.7   1.0   1.1   1.5
              Chair    0.9   0.6   0.1   0.9   0.2   0.8
              Table    0.9   0.6   0.6   0.1   0.9   0.7
As for how, I'll leave that as an exercise for the reader.
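One way to work through that exercise, as a sketch rather than the answerer's own explanation: for each i, assign overwrites the key columns from position i onward with a constant label, so rows that differed only in those columns now share the same (a, b, c) and sum() folds them into a subtotal. The example below uses just two rows of the question's data and one value column to keep the printout short; keys and levels are illustrative names.

```python
import pandas as pd

# Two rows from the question's data, one scenario column only.
df = pd.DataFrame({
    "a": ["Animal", "Animal"],
    "b": ["Air", "Air"],
    "c": ["Eagle", "Owl"],
    "Sce1": [1.0, 0.3],
})

keys = ["a", "b", "c"]
levels = {}
for i in range(1, 4):
    # Blank out every key column from position i onward. Rows that now share
    # the same (a, b, c) labels fall into one group, so sum() is a subtotal.
    blanked = df.assign(**{k: "" for k in keys[i:]})
    levels[i] = blanked.groupby(keys).sum()
    print(f"i={i}, blanked columns: {keys[i:]}")
    print(levels[i], end="\n\n")

# '' sorts before any non-empty string, so after sort_index() each subtotal
# row sits directly above the rows it summarises.
result = pd.concat(list(levels.values())).sort_index()
print(result)
```

The same reasoning covers the 'Total' variant: only the label changes, and with it the position where the subtotal rows sort.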