Digitizing a numpy array

I have two vectors:

  time_vec = np.array([0.2,0.23,0.3,0.4,0.5,...., 28....])
  values_vec = np.array([500,200,220,250,200,...., 218....])
  time_vec.shape == values_vec.shape 

      

Now I want to bin the values into 0.5-second intervals and take the average of each bin. For example:

  value_vec = np.array(mean_of(500,200,220,250,200), mean_of(next values in next 0.5 second interval))

      

Is there a NumPy method I have missed that does the binning and gives me one value per bin?



3 answers


You can use np.ufunc.reduceat. You just need to find where the breakpoints are, i.e. where floor(t / .5) changes.

Say we have:

>>> t
array([ 0.    ,  0.025 ,  0.2125,  0.2375,  0.2625,  0.3375,  0.475 ,  0.6875,  0.7   ,  0.7375,  0.8   ,  0.9   ,
        0.925 ,  1.05  ,  1.1375,  1.15  ,  1.1625,  1.1875,  1.1875,  1.225 ])
>>> b
array([ 0.8144,  0.3734,  1.4734,  0.6307, -0.611 , -0.8762,  1.6064,  0.3863, -0.0103, -1.6889, -0.4328, -0.7373,
        1.7856,  0.8938, -1.1574, -0.4029, -0.4352, -0.4412, -1.7819, -0.3298])

      

breakpoints:

>>> i = np.r_[0, 1 + np.nonzero(np.diff(np.floor(t / .5)))[0]]
>>> i
array([ 0,  7, 13])

      



and the sum for each interval:

>>> np.add.reduceat(b, i)
array([ 3.411 , -0.6975, -3.6545])

      

and the average will be the sum over the length of the interval:

>>> np.add.reduceat(b, i) / np.diff(np.r_[i, len(b)])
array([ 0.4873, -0.1162, -0.5221])
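The steps above can be collected into one small function. This is only a minimal sketch: it assumes the time vector is non-empty and already sorted in ascending order (reduceat works on contiguous runs), and the name bin_means_reduceat and its width parameter are made up for illustration.

```python
import numpy as np

def bin_means_reduceat(t, values, width=0.5):
    """Mean of `values` per `width`-wide time bin; `t` must be sorted ascending."""
    t = np.asarray(t, dtype=float)
    values = np.asarray(values, dtype=float)
    # Bin number of each sample: floor(t / width)
    bin_ids = np.floor(t / width)
    # A new run starts at index 0 and wherever the bin number changes
    starts = np.r_[0, 1 + np.nonzero(np.diff(bin_ids))[0]]
    # Sum of each run, divided by the length of each run
    sums = np.add.reduceat(values, starts)
    counts = np.diff(np.r_[starts, len(values)])
    return sums / counts

# Small made-up example: bins [0, 0.5), [0.5, 1.0), [1.0, 1.5)
t = np.array([0.0, 0.1, 0.4, 0.6, 0.9, 1.2])
v = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 10.0])
means = bin_means_reduceat(t, v)
```

Note that this returns one entry per non-empty interval only, so an interval with no samples simply does not appear in the result.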

      



You can pass the weights= parameter to np.histogram to compute the summed values in each time bin, and then normalize by the bin count:



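import numpy as np

# 0.5 second time bins to average within
tmin = time_vec.min()
tmax = time_vec.max()
# np.arange excludes its stop value, so stop one full bin past the bin that
# holds tmax; otherwise samples after the last multiple of 0.5 are dropped
bins = np.arange(tmin - (tmin % 0.5), tmax - (tmax % 0.5) + 1.0, 0.5)

# summed values within each bin
bin_sums, edges = np.histogram(time_vec, bins=bins, weights=values_vec)

# number of values within each bin
bin_counts, edges = np.histogram(time_vec, bins=bins)

# average value within each bin (NaN, with a warning, where a bin is empty)
bin_means = bin_sums / bin_counts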
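One thing to watch with bin_sums / bin_counts is that a bin containing no samples gives 0 / 0. The following is a hedged sketch of the same histogram idea that leaves such bins as NaN without a warning; the function name bin_means_histogram and the width parameter are hypothetical, and the edges are assumed to sit at multiples of width.

```python
import numpy as np

def bin_means_histogram(t, values, width=0.5):
    """Per-bin means via np.histogram; empty bins are returned as NaN."""
    t = np.asarray(t, dtype=float)
    values = np.asarray(values, dtype=float)
    # Edges at multiples of `width`, from the bin holding t.min()
    # to the upper edge of the bin holding t.max()
    first = np.floor(t.min() / width)
    last = np.floor(t.max() / width)
    edges = np.arange(first, last + 2) * width
    # Weighted histogram gives per-bin sums, unweighted gives per-bin counts
    sums, _ = np.histogram(t, bins=edges, weights=values)
    counts, _ = np.histogram(t, bins=edges)
    # Divide only where there is at least one sample
    means = np.full(len(counts), np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return edges, means

# Made-up data with a gap: nothing falls in [1.0, 1.5)
edges, means = bin_means_histogram([0.2, 0.4, 0.7, 1.6], [1, 3, 5, 7])
```

Unlike the reduceat approach, this does not require sorted input, and the returned edges tell you which interval each mean belongs to.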

      



You can use np.bincount, which is presumably quite efficient for such binning operations. Here's an implementation based on it for our case:

# Find the indices where one 0.5 interval ends and the next one begins
# (side='right' keeps a sample at exactly 0.5 in the first interval)
A = time_vec*2
idx = np.searchsorted(A,np.arange(1,int(np.ceil(A.max()))),'right')

# Set up an ID array so that all samples in the same 0.5 interval share one ID
out = np.zeros((A.size),dtype=int)
out[idx[idx < A.size]] = 1
ID = out.cumsum()

# Finally use bincount to sum and count the elements with the same ID,
# and thus get the mean value per ID
mean_vec = np.bincount(ID,values_vec)/np.bincount(ID)

      

Example run -

In [189]: time_vec
Out[189]: 
array([ 0.2 ,  0.23,  0.3 ,  0.4 ,  0.5 ,  0.7 ,  0.8 ,  0.92,  0.95,
        1.  ,  1.11,  1.5 ,  2.  ,  2.3 ,  2.5 ,  4.5 ])

In [190]: values_vec
Out[190]: array([36, 11, 93, 32, 72, 75, 26, 28, 77, 31, 60, 77, 76, 32,  6, 85])

In [191]: ID
Out[191]: array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 5], dtype=int32)

In [192]: mean_vec
Out[192]: array([ 48.8,  47.4,  68.5,  76. ,  19. ,  85. ])
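For comparison, the ID array can also be built with floor division instead of searchsorted, which does not need sorted input. This is a different technique from the one above, shown only as a sketch: it uses left-closed intervals [0, 0.5), [0.5, 1.0), ..., so the sample at exactly 0.5 lands in the second bin and the means differ from the example run. The names bin_no and uniq_bins are introduced here for illustration.

```python
import numpy as np

time_vec = np.array([0.2, 0.23, 0.3, 0.4, 0.5, 0.7, 0.8, 0.92, 0.95,
                     1.0, 1.11, 1.5, 2.0, 2.3, 2.5, 4.5])
values_vec = np.array([36, 11, 93, 32, 72, 75, 26, 28, 77, 31, 60, 77, 76, 32, 6, 85])

# Bin number of each sample, left-closed: [0, 0.5) -> 0, [0.5, 1.0) -> 1, ...
bin_no = np.floor(time_vec / 0.5).astype(int)

# Relabel to consecutive IDs so that empty intervals produce no output entry
uniq_bins, ID = np.unique(bin_no, return_inverse=True)

# Sum per ID divided by count per ID
mean_vec = np.bincount(ID, weights=values_vec) / np.bincount(ID)
```

Here uniq_bins records which 0.5 interval each mean belongs to (multiply by 0.5 to get the left edge), which is useful when some intervals are empty.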

      
