Reset cumsum if over limit (python)
The following numpy snippet returns a cumsum of the input array that is reset every time a NaN is encountered.
import numpy as np
v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)
a = ~n
c = np.cumsum(a)
d = np.diff(np.concatenate(([0.], c[n])))
v[n] = -d
result = np.cumsum(v)
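Note that the snippet above relies on every non-NaN entry being 1, because the offsets written into the NaN slots come from a count of valid entries rather than from their sum. Here is a minimal sketch for arbitrary values, assuming the goal is to restart the running sum at 0 on each NaN. The helper name cumsum_reset_at_nan is made up for illustration.

```python
import numpy as np

def cumsum_reset_at_nan(values):
    """Cumulative sum that restarts from 0 at every NaN (hypothetical helper)."""
    v = np.asarray(values, dtype=float)
    is_nan = np.isnan(v)
    # Running total with NaNs treated as 0.
    total = np.cumsum(np.where(is_nan, 0.0, v))
    # Index of the most recent NaN at or before each position (-1 if none yet).
    last_nan = np.maximum.accumulate(np.where(is_nan, np.arange(len(v)), -1))
    # Subtract the running total as it stood at that NaN.
    offset = np.where(last_nan >= 0, total[last_nan], 0.0)
    return total - offset

v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
print(cumsum_reset_at_nan(v))
```

For the array from the question this should give the same values as the original snippet, with 0 at the NaN positions.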
Likewise, how can I compute a cumsum that is reset whenever it exceeds a given limit, using pandas or numpy vectorized operations?
E.g. for limit = 5, in = [1,1,1,1,1,1,1,1,1,1], out = [1,2,3,4,5,1,2,3,4,5]
If the numbers in your array are all positive, it might be easiest to use cumsum() and then the modulo operator:
>>> a = np.array([1,1,1,1,1,1,1,1,1,1])
>>> limit = 5
>>> x = a.cumsum() % limit
>>> x
array([1, 2, 3, 4, 0, 1, 2, 3, 4, 0])
Then you can set any zero values back to the limit to get the array you want:
>>> x[x == 0] = limit
>>> x
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
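The modulo trick matches the example because every run lands exactly on the limit. For inputs where that does not hold, a plain loop is a useful reference to compare against. The sketch below assumes the reset rule implied by the example: when adding the current element would push the sum past the limit, a new run starts from that element. The name cumsum_with_reset is hypothetical, and the loop is not vectorized.

```python
import numpy as np

def cumsum_with_reset(values, limit):
    """Running sum that starts over whenever it would exceed `limit`.

    Assumed rule (inferred from the example in the question): the element
    that causes the overflow becomes the first element of the next run.
    """
    values = np.asarray(values, dtype=float)
    out = np.empty(len(values), dtype=float)
    running = 0.0
    for i, x in enumerate(values):
        running += x
        if running > limit:
            running = x  # start a new run from the current element
        out[i] = running
    return out

print(cumsum_with_reset([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 5))
print(cumsum_with_reset([3, 3, 3], 5))
```

For [3, 3, 3] with limit 5 the loop yields 3, 3, 3, whereas cumsum() % limit yields 3, 1, 4, so the two approaches only agree on inputs like the all-ones example.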
Here's one possible general solution using pandas' expanding_apply function. (I have not tested it extensively ...)
First, define a modified cumsum function:
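import numpy as np
import pandas as pd

def cumsum_limit(x):
    # x holds every value seen so far; the limit (5) is hard-coded
    q = np.sum(x[:-1])      # total of all earlier values
    if q > 0:
        q = q%5             # fold the earlier total back below the limit
    r = x[-1]               # current value
    if q+r <= 5:
        return q+r
    elif (q+r)%5 == 0:
        return 5            # landing exactly on a multiple reports the limit, not 0
    else:
        return (q+r)%5

a = np.array([1,1,1,1,1,1,1,1,1,1]) # your example array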
Apply the function to the array like this:
>>> pd.expanding_apply(a, lambda x: cumsum_limit(x))
array([ 1., 2., 3., 4., 5., 1., 2., 3., 4., 5.])
Here is the function applied to another, more interesting series:
>>> s = pd.Series([3, -8, 4, 5, -3, 501, 7, -100, 98, 3])
>>> pd.expanding_apply(s, lambda x: cumsum_limit(x))
0 3
1 -5
2 -1
3 4
4 1
5 2
6 4
7 -96
8 2
9 5
dtype: float64
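The top-level pd.expanding_apply function was deprecated and later removed from pandas, so the calls above will fail on a recent installation. A rough equivalent, assuming a pandas version that provides Series.expanding().apply() with the raw argument, is sketched below. The limit parameter is an addition for illustration; the answer's function hard-codes 5.

```python
import numpy as np
import pandas as pd

def cumsum_limit(x, limit=5):
    # Same logic as the function above, with the limit exposed as a
    # parameter (an assumption added for this sketch).
    q = np.sum(x[:-1])
    if q > 0:
        q = q % limit
    r = x[-1]
    if q + r <= limit:
        return q + r
    elif (q + r) % limit == 0:
        return limit
    else:
        return (q + r) % limit

a = pd.Series([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
# raw=True hands each expanding window over as a NumPy array,
# so positional indexing such as x[-1] works as intended.
result = a.expanding().apply(lambda x: cumsum_limit(x, limit=5), raw=True)
print(result.tolist())
```

Like the original, this recomputes a sum over the whole prefix for every element, so it runs in quadratic time and is best suited to short series.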