How is reverse rolling?

I have a sliding sum calculated on a grouped dataframe, but adding it in the wrong way, it is the sum of the future when I need the sum of the past.

What am I doing wrong here?

I am importing data and sorting by size and date (I already tried to remove date sorting)

df = pd.read_csv('Input.csv', parse_dates=True)
df.sort_values(['Dimension','Date'])
print(df)

      

Then I create a new column which is a multi-index grouped by rolling window

new_column = df.groupby('Dimension').Value1.apply(lambda x: 
x.rolling(window=3).sum())

      

I, then reset the index will be the same as the original

df['Sum_Value1'] = new_column.reset_index(level=0, drop=True)
print(df)

      

I also tried changing the index before calculating, but that also failed.

Input

Dimension,Date,Value1,Value2
1,4/30/2002,10,20
1,1/31/2002,10,20
1,10/31/2001,10,20
1,7/31/2001,10,20
1,4/30/2001,10,20
1,1/31/2001,10,20
1,10/31/2000,10,20
2,4/30/2002,10,20
2,1/31/2002,10,20
2,10/31/2001,10,20
2,7/31/2001,10,20
2,4/30/2001,10,20
2,1/31/2001,10,20
2,10/31/2000,10,20
3,4/30/2002,10,20
3,1/31/2002,10,20
3,10/31/2001,10,20
3,7/31/2001,10,20
3,1/31/2001,10,20
3,10/31/2000,10,20

      

Output:

    Dimension        Date  Value1  Value2  Sum_Value1
0           1   4/30/2002      10      20         NaN
1           1   1/31/2002      10      20         NaN
2           1  10/31/2001      10      20        30.0
3           1   7/31/2001      10      20        30.0
4           1   4/30/2001      10      20        30.0
5           1   1/31/2001      10      20        30.0
6           1  10/31/2000      10      20        30.0
7           2   4/30/2002      10      20         NaN
8           2   1/31/2002      10      20         NaN
9           2  10/31/2001      10      20        30.0
10          2   7/31/2001      10      20        30.0
11          2   4/30/2001      10      20        30.0
12          2   1/31/2001      10      20        30.0
13          2  10/31/2000      10      20        30.0

      

Target output:

    Dimension        Date  Value1  Value2  Sum_Value1
0           1   4/30/2002      10      20        30.0
1           1   1/31/2002      10      20        30.0
2           1  10/31/2001      10      20        30.0
3           1   7/31/2001      10      20        30.0
4           1   4/30/2001      10      20        30.0
5           1   1/31/2001      10      20         NaN
6           1  10/31/2000      10      20         NaN
7           2   4/30/2002      10      20        30.0
8           2   1/31/2002      10      20        30.0
9           2  10/31/2001      10      20        30.0
10          2   7/31/2001      10      20        30.0
11          2   4/30/2001      10      20        30.0
12          2   1/31/2001      10      20         Nan
13          2  10/31/2000      10      20         NaN

      

+3


source to share


3 answers


You need a reverse sum, so reverse your series before copying it:



lambda x: x[::-1].rolling(window=3).sum()

      

+3


source


You can shift the result by window-1

to get the results aligned to the left:

df["sum_value1"] = (df.groupby('Dimension').Value1
                      .apply(lambda x: x.rolling(window=3).sum().shift(-2)))

      



enter image description here

+1


source


Moving backward is the same as fast forwarding and then changing the result:

x.rolling(window=3).sum().shift(-2)

      

+1


source







All Articles