Summing more than two DataFrames with the same indexes in Pandas

I want to add the values of 4 DataFrames that share the same indexes in Pandas. If there are two DataFrames, df1 and df2, we can write:

df1.add(df2)

      

and for 3 data frames:

df3.add(df2.add(df1))

      

I wonder if there is a more general way to do this in Python.



1 answer


Option 1
Use sum

sum([df1, df2, df3, df4])
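This works because the built-in sum starts from 0 and applies + repeatedly, and 0 + df1 is handled by the DataFrame's reflected addition. A minimal sketch of that behaviour, assuming small integer frames (the names a, b, c and dfs are hypothetical and not part of the original answer):

```python
import pandas as pd

# Hypothetical frames sharing the same index and columns.
a = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
b = pd.DataFrame({"x": [10, 20], "y": [30, 40]})
c = pd.DataFrame({"x": [100, 200], "y": [300, 400]})
dfs = [a, b, c]

# sum() evaluates ((0 + a) + b) + c, aligning on index and columns each time.
total = sum(dfs)

# Passing the first frame as the start value skips the initial "0 + a" step.
total_from_first = sum(dfs[1:], dfs[0])

print(total)
```

Both forms should give the same frame here. The second one may be preferable if the scalar 0 does not make sense as a starting value for the data.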

      

Option 2
Use reduce

from functools import reduce

reduce(pd.DataFrame.add, [df1, df2, df3, df4])
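One possible reason to reach for reduce with DataFrame.add rather than + is that add accepts a fill_value argument, which matters when the frames do not share every label. The following is only a sketch under that assumption; the frames left and right and the helper add_filling_zeros are hypothetical and not from the original answer:

```python
from functools import reduce

import pandas as pd

# Hypothetical frames whose indexes only partly overlap.
left = pd.DataFrame({"x": [1.0, 2.0]}, index=["a", "b"])
right = pd.DataFrame({"x": [10.0, 20.0]}, index=["b", "c"])
frames = [left, right, left]

# Plain addition leaves NaN wherever a label is missing from either side.
strict = reduce(pd.DataFrame.add, frames)


def add_filling_zeros(acc, df):
    """Add two frames, treating a label missing from one side as 0."""
    return acc.add(df, fill_value=0)


lenient = reduce(add_filling_zeros, frames)

print(strict)
print(lenient)
```

When all frames really do share the same index and columns, as in the question, the two results coincide and the plain reduce(pd.DataFrame.add, ...) form is enough.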

      

Option 3
Use pd.concat and pd.DataFrame.sum with level=1

This only works when the DataFrames' index has a single level; handling a MultiIndex would need something fancier. I recommend the other options.

pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)
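On pandas 2.0 and later the level argument of DataFrame.sum is no longer available (it was deprecated in 1.3), so the same idea is usually written with groupby(level=1). A sketch of that variant, assuming two hypothetical frames a and b with a single-level index:

```python
import pandas as pd

# Hypothetical frames with the same single-level index.
a = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
b = pd.DataFrame({"x": [10, 20], "y": [30, 40]})
frames = [a, b]

# The dict keys (0, 1, ...) become level 0 of a MultiIndex;
# the original row labels become level 1.
stacked = pd.concat(dict(enumerate(frames)))

# Group rows by their original label and add them up.
total = stacked.groupby(level=1).sum()

print(total)
```

Note that groupby sorts the group keys by default, so the row order of the result may differ from the inputs when the original index is not already sorted.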

      


Setup

df = pd.DataFrame([[1, -1], [complex(0, 1), complex(0, -1)]])
df1, df2, df3, df4 = [df] * 4

      

Demo

sum([df1, df2, df3, df4])

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

      




from functools import reduce

reduce(pd.DataFrame.add, [df1, df2, df3, df4])

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

      


pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

      


Timing

small data

%timeit sum([df1, df2, df3, df4])
%timeit reduce(pd.DataFrame.add, [df1, df2, df3, df4])
%timeit pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

1000 loops, best of 3: 591 µs per loop
1000 loops, best of 3: 456 µs per loop
100 loops, best of 3: 3.61 ms per loop

      

big data

df = pd.DataFrame([[1, -1], [complex(0, 1), complex(0, -1)]])
df = pd.concat([df] * 1000, ignore_index=True)
df = pd.concat([df] * 100, axis=1, ignore_index=True)
df1, df2, df3, df4 = [df] * 4

%timeit sum([df1, df2, df3, df4])
%timeit reduce(pd.DataFrame.add, [df1, df2, df3, df4])
%timeit pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

100 loops, best of 3: 3.94 ms per loop
100 loops, best of 3: 2.9 ms per loop
1 loop, best of 3: 1min per loop
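%timeit is an IPython magic, so outside a notebook the comparison can be approximated with the standard-library timeit module. The sketch below assumes a float frame with the same 2000 x 200 shape as the big-data frame above and only times the two recommended options; the names frames, candidates and timings are hypothetical, and the numbers printed will depend on the machine and pandas version.

```python
import timeit
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical test data: four references to one 2000 x 200 float frame.
df = pd.DataFrame(np.ones((2000, 200)))
frames = [df] * 4

candidates = {
    "sum": lambda: sum(frames),
    "reduce": lambda: reduce(pd.DataFrame.add, frames),
}

# Run each candidate 5 times per repeat and keep the best of 3 repeats,
# similar in spirit to the "best of 3" reported by %timeit.
timings = {
    name: min(timeit.repeat(func, number=5, repeat=3)) / 5
    for name, func in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per loop")
```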

      
