Reset cumsum if over limit (python)

The following numpy snippet will return a cumsum of the input array, which is flushed every time NaN is encountered.

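import numpy as np

v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)                             # True at the positions that trigger a reset
a = ~n
c = np.cumsum(a)                            # running count of non-NaN entries
d = np.diff(np.concatenate(([0.], c[n])))   # number of non-NaN entries in each run ending at a NaN
v[n] = -d                                   # each NaN now subtracts the length of its run
result = np.cumsum(v)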
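
The snippet above subtracts the length of each run, so it matches a true cumulative sum only because every value is 1. A minimal sketch of a variant that subtracts the sum of each run instead is shown below; the function name `cumsum_reset_on_nan` is hypothetical, not something from the question.

```python
import numpy as np

def cumsum_reset_on_nan(values):
    """Cumulative sum that drops back to 0 at every NaN (hypothetical helper)."""
    v = np.asarray(values, dtype=float).copy()
    n = np.isnan(v)
    # Running sum with NaNs counted as 0, so it is defined everywhere.
    c = np.cumsum(np.where(n, 0.0, v))
    # Sum of each run that ends at a NaN.
    d = np.diff(np.concatenate(([0.0], c[n])))
    # Each NaN cancels exactly the run that precedes it.
    v[n] = -d
    return np.cumsum(v)

result_ones = cumsum_reset_on_nan([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
print(result_ones)   # [1. 2. 3. 0. 1. 2. 3. 4. 0. 1.]

result_mixed = cumsum_reset_on_nan([2., 3., np.nan, 4.])
print(result_mixed)  # [2. 5. 0. 4.]
```

With non-integer floats the cancellation may leave tiny rounding residue at the reset positions.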

      

Likewise, how can I compute a cumsum that is flushed whenever it exceeds a given limit, using pandas or numpy vectorized operations?

E.g., for limit = 5, in = [1,1,1,1,1,1,1,1,1,1], out = [1,2,3,4,5,1,2,3,4,5]



1 answer


If the numbers in your array are all positive, it might be easiest to use cumsum() and then the modulo operator:

>>> a = np.array([1,1,1,1,1,1,1,1,1,1])
>>> limit = 5
>>> x = a.cumsum() % limit
>>> x
array([1, 2, 3, 4, 0, 1, 2, 3, 4, 0])

      

Then you can set any zero values back to the limit to get the array you want:

>>> x[x == 0] = limit
>>> x
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
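
As a rough check of when the modulo trick applies, the sketch below wraps it in a function and compares it against a plain loop. The names `cumsum_mod_limit` and `cumsum_reset_loop` are hypothetical, and the loop's reset rule (start again from the current element whenever adding it would exceed the limit) is only one assumed reading of the question.

```python
import numpy as np

def cumsum_mod_limit(a, limit):
    """Modulo trick from the answer: wrap the cumsum, then map zeros to `limit`."""
    x = np.cumsum(a) % limit
    x[x == 0] = limit
    return x

def cumsum_reset_loop(a, limit):
    """Loop reference under an assumed rule: restart from the current
    element whenever adding it would push the running total past `limit`."""
    out = []
    total = 0
    for value in a:
        total = value if total + value > limit else total + value
        out.append(total)
    return np.array(out)

ones = np.ones(10, dtype=int)
print(cumsum_mod_limit(ones, 5))     # [1 2 3 4 5 1 2 3 4 5]
print(cumsum_reset_loop(ones, 5))    # [1 2 3 4 5 1 2 3 4 5]

threes = np.array([3, 3, 3])
print(cumsum_mod_limit(threes, 5))   # [3 1 4]
print(cumsum_reset_loop(threes, 5))  # [3 3 3]
```

The two agree when the running sum lands exactly on the limit, as with an array of ones, but they can differ once the sum steps over the limit, because the modulo version carries the overshoot forward.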

      


Here's one possible general solution using pandas' expanding_apply function. (I have not tested it extensively ...)

First, define the modified cumsum function:

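import numpy as np
import pandas as pd

def cumsum_limit(x):
    # x holds every value seen so far; the last entry is the newest one
    q = np.sum(x[:-1])
    if q > 0:
        q = q%5          # reduce the earlier total modulo the limit (5)
    r = x[-1]
    if q+r <= 5:
        return q+r
    elif (q+r)%5 == 0:
        return 5         # an exact multiple of the limit reports the limit, not 0
    else:
        return (q+r)%5

a = np.array([1,1,1,1,1,1,1,1,1,1]) # your example array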

      

Apply the function to the array like this:

>>> pd.expanding_apply(a, lambda x: cumsum_limit(x))
array([ 1.,  2.,  3.,  4.,  5.,  1.,  2.,  3.,  4.,  5.])

      

Here is the function applied to another, more interesting series:

>>> s = pd.Series([3, -8, 4, 5, -3, 501, 7, -100, 98, 3])
>>> pd.expanding_apply(s, lambda x: cumsum_limit(x)) 
0     3
1    -5
2    -1
3     4
4     1
5     2
6     4
7   -96
8     2
9     5
dtype: float64
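
The top-level `pd.expanding_apply` function was deprecated and later removed from pandas, so the calls above fail on current releases. A minimal sketch of the same idea with the `Series.expanding().apply()` method follows; the `limit` parameter is an addition for illustration (the original hard-codes 5), and `raw=True` is assumed so that the function receives a NumPy array and `x[-1]` indexes by position.

```python
import numpy as np
import pandas as pd

def cumsum_limit(x, limit=5):
    """Same logic as the answer's function, with the limit as a parameter."""
    q = np.sum(x[:-1])          # total of everything before the newest value
    if q > 0:
        q = q % limit
    r = x[-1]                   # newest value
    if q + r <= limit:
        return q + r
    elif (q + r) % limit == 0:
        return limit
    else:
        return (q + r) % limit

ones = pd.Series([1] * 10)
out_ones = ones.expanding().apply(cumsum_limit, raw=True)
print(out_ones.tolist())
# [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]

s = pd.Series([3, -8, 4, 5, -3, 501, 7, -100, 98, 3])
out_s = s.expanding().apply(cumsum_limit, raw=True)
print(out_s.tolist())
# [3.0, -5.0, -1.0, 4.0, 1.0, 2.0, 4.0, -96.0, 2.0, 5.0]
```

This re-sums the whole prefix at every step, so it is quadratic in the length of the input and not truly vectorized.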

      
