Split numpy array into multiple arrays using array of indices (Python)

I have an array:

a = [1, 3, 5, 7, 29 ... 5030, 6000]

      

This array is created from a previous process, and the length of the array can be different (depends on user input).

I also have an array:

b = [3, 15, 67, 78, 138]

      

(which could also be completely different)

I want to use array b to split array a into multiple arrays.

Specifically, I want the resulting arrays to be:

array1 = a[:3]
array2 = a[3:15]
...
arrayn = a[138:]

      

where n = len(b) + 1 (one slice before each index in b, plus the final slice after the last index).

My first thought was to create a 2D array slices with size (len(b), something). However, we do not know this something beforehand, so I set it to len(a), since that is the maximum number of elements a slice can hold.

I have this code:

 slices = np.zeros((len(b), len(a)))

 for i in range(1, len(b)):
     slices[i] = a[b[i-1]:b[i]]

      

But I am getting this error:

ValueError: could not broadcast input array from shape (518) into shape (2253412)

      



3 answers


You can use numpy.split:

np.split(a, b)

      



Example:

np.split(np.arange(10), [3,5])
# [array([0, 1, 2]), array([3, 4]), array([5, 6, 7, 8, 9])]
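To connect this to the question, here is a minimal sketch that applies np.split to the split indices from the question. The array a below is stand-in data (np.arange(200)), an assumption made only so the example runs, since the asker's real array is not shown. It illustrates that the result is a list of len(b) + 1 arrays of different lengths.

```python
import numpy as np

a = np.arange(200)            # stand-in for the asker's array (assumption)
b = [3, 15, 67, 78, 138]      # split indices from the question

parts = np.split(a, b)        # a list of arrays, not a 2D array

num_parts = len(parts)                   # one more than len(b)
part_lengths = [len(p) for p in parts]   # the pieces differ in length

# Each piece matches the corresponding slice requested in the question.
first_ok = np.array_equal(parts[0], a[:3])
second_ok = np.array_equal(parts[1], a[3:15])
last_ok = np.array_equal(parts[-1], a[138:])
```

Because the pieces differ in length, np.split returns a plain Python list of arrays rather than a single 2D array.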

      



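# Prepend 0 so that each consecutive pair of bounds delimits one slice.
# Building a new list leaves b unchanged and also works if b is a NumPy array.
bounds = [0] + list(b)
result = []
for i in range(1, len(bounds)):
    sub_list = a[bounds[i-1]:bounds[i]]
    result.append(sub_list)
# The last piece runs from the final index to the end of a.
result.append(a[bounds[-1]:])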

      





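You are getting the error because you are trying to create a ragged array, that is, a 2D array whose rows have different lengths. NumPy does not allow this: every row of slices has length len(a), so a slice of length 518 cannot be assigned to it.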
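The failure can be reproduced on a small scale. The sketch below uses made-up stand-in values for a and b (assumptions, not the asker's data) and shows that assigning a short slice to a full-width row raises the same kind of ValueError, while a plain Python list holds pieces of different lengths without complaint.

```python
import numpy as np

a = np.arange(10)    # stand-in data (assumption)
b = [3, 5]           # stand-in split indices (assumption)

slices = np.zeros((len(b), len(a)))   # every row has length len(a) == 10

try:
    slices[1] = a[b[0]:b[1]]          # a[3:5] has length 2, the row has length 10
    error_message = None
except ValueError as exc:
    error_message = str(exc)          # "could not broadcast input array ..."

# A plain Python list can hold pieces of different lengths.
pieces = [a[:b[0]], a[b[0]:b[1]], a[b[1]:]]
piece_lengths = [len(p) for p in pieces]
```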

Improving on @Bohdan's answer:

from itertools import zip_longest
result = [a[start:end] for start, end in zip_longest(np.r_[0, b], b)]

      

The trick here is that zip_longest pads the shorter input with None, so the final slice runs from b[-1] to None. That is equivalent to a[b[-1]:], which eliminates the need for special handling of the last element.
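To make the pairing visible, here is a small sketch on stand-in data (the values of a and b are assumptions chosen for brevity). It lists the (start, end) pairs that zip_longest produces before they are used as slice bounds.

```python
from itertools import zip_longest

import numpy as np

a = np.arange(10)    # stand-in data (assumption)
b = [3, 5]           # stand-in split indices (assumption)

starts = np.r_[0, b]                      # array([0, 3, 5]), one element longer than b
pairs = [(int(s), e) for s, e in zip_longest(starts, b)]
# pairs is [(0, 3), (3, 5), (5, None)]; the None appears because b runs out first

result = [a[s:e] for s, e in pairs]       # a[5:None] is the same as a[5:]
as_lists = [piece.tolist() for piece in result]
```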

Please do not accept this answer. It is just something I added for fun. The "correct" answer is @Psidom's.
