Trying to create a new dataframe based on internal column sums from another dataframe using Python / pandas

Suppose I have a pandas dataframe df like this:

import pandas as pd
df = pd.DataFrame({'Col1':[1,2,3,4], 'Col2':[5,6,7,8]})

   Col1  Col2
0     1     5
1     2     6
2     3     7
3     4     8

      

Is there a way to replace each value in a column with the sum of that value and all the values below it in the same column?

For example, for "Col1" the result is:

   Col1  Col2
0    10     5
1     9     6
2     7     7
3     4     8

      

1 becomes 1 + 2 + 3 + 4 = 10
2 becomes 2 + 3 + 4 = 9
3 becomes 3 + 4 = 7
4 remains 4

If possible, is there a way to specify a cutoff index after which this behavior takes place? For example, if the cutoff index is 1, only the rows after index 1 change, and the result is:

   Col1  Col2
0     1     5
1     2     6
2     7     7
3     4     8

      

I suspect there is no way to do this other than with loops, but I thought there might be a way to use vectorized computation.

Thanks heaps



2 answers


Here's one way to avoid the loop.



import pandas as pd

your_df = pd.DataFrame({'Col1':[1,2,3,4], 'Col2':[5,6,7,8]})

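def your_func(df, column, cutoff):
    # reverse, take the cumulative sum, then reverse back
    x = df[column][::-1].cumsum()[::-1]
    # overwrite only the rows after the cutoff; .loc avoids chained assignment
    mask = df.index > cutoff
    df.loc[mask, column] = x[mask]
    return df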

# to use it
your_func(your_df, column='Col1', cutoff=1)

Out[68]: 
   Col1  Col2
0     1     5
1     2     6
2     7     7
3     4     8
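
If the original dataframe should stay untouched, the same idea can be written with Series.where. This is only a sketch: the name suffix_sum_after is made up for illustration, and it assumes a sortable index such as the default integer index from the question.

```python
import pandas as pd

def suffix_sum_after(df, column, cutoff):
    # Reverse, take the running total, reverse back: each row then holds
    # the sum of itself and every row below it.
    suffix = df[column][::-1].cumsum()[::-1]
    out = df.copy()
    # Keep the original value where index <= cutoff, use the suffix sum elsewhere.
    out[column] = df[column].where(df.index <= cutoff, suffix)
    return out

df = pd.DataFrame({'Col1': [1, 2, 3, 4], 'Col2': [5, 6, 7, 8]})
result = suffix_sum_after(df, 'Col1', cutoff=1)   # rows 2 and 3 are replaced
full = suffix_sum_after(df, 'Col1', cutoff=-1)    # every row is replaced
```

Passing a cutoff below the first index label (here -1) applies the sum to the whole column, which matches the first example in the question.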

      



Yes, you can use a loop, but a very cheap one:

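def sum_col(column, start=0):
    values = column.values
    # rows before start keep their value, so the result has the same length as the column
    return [values[i:].sum() if i >= start else values[i] for i in range(len(values))]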

      



And usage:

data['Col1'] = sum_col(data['Col1'],0)
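
As a quick sanity check, the looped sums can be compared with a reversed cumulative sum in NumPy. This is a sketch that assumes data is the dataframe from the question; the loop is inlined so the block runs on its own.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'Col1': [1, 2, 3, 4], 'Col2': [5, 6, 7, 8]})
values = data['Col1'].values

# Loop version: one slice sum per row.
looped = [values[i:].sum() for i in range(len(values))]

# Same numbers without a Python-level loop: cumulative sum over the reversed array.
vectorized = np.cumsum(values[::-1])[::-1]

data['Col1'] = vectorized
```

The loop sums a slice for every row, so its work grows quadratically with the column length, while the cumulative sum makes a single pass. For a four-row frame the difference does not matter.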

      
