Split numpy array into multiple arrays using array of indices (Python)
I have an array:
a = [1, 3, 5, 7, 29 ... 5030, 6000]
This array is created from a previous process, and the length of the array can be different (depends on user input).
I also have an array:
b = [3, 15, 67, 78, 138]
(which could also be completely different)
I want to use array b to split array a into multiple arrays.
Specifically, I want the resulting arrays to be:
array1 = a[:3]
array2 = a[3:15]
...
arrayn = a[138:]
where n = len(b) + 1.
My first thought was to create a 2D array slices with size (len(b), something). However, we do not know this something beforehand, so I set it to len(a), as this is the maximum number of values a row could hold.
I have this code:
slices = np.zeros((len(b), len(a)))
for i in range(1, len(b)):
    slices[i] = a[b[i-1]:b[i]]
But I am getting this error:
ValueError: could not broadcast input array from shape (518) into shape (2253412)
You can use numpy.split:
np.split(a, b)
Example:
np.split(np.arange(10), [3,5])
# [array([0, 1, 2]), array([3, 4]), array([5, 6, 7, 8, 9])]
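Applied to the indices from the question, a minimal sketch might look like the following. The array a here is only a stand-in (np.arange(200)), since the real data is not shown, and the name pieces is just illustrative.

```python
import numpy as np

# Stand-in for the asker's data (assumption: the real `a` is a 1-D array).
a = np.arange(200)
b = [3, 15, 67, 78, 138]

# np.split cuts `a` at each index in `b` and returns a list of arrays.
pieces = np.split(a, b)

# len(b) cut points give len(b) + 1 pieces.
print(len(pieces))               # 6
print([len(p) for p in pieces])  # [3, 12, 52, 11, 60, 62]
```

The result is an ordinary Python list, so the pieces are free to have different lengths.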
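You are getting the error because you are trying to create a jagged (ragged) array, one whose rows have different lengths. A regular NumPy array must be rectangular, so a slice shorter than len(a) cannot be assigned to a row of length len(a).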
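As a rough illustration of that error, the sketch below reproduces it on small stand-in data (a and b here are made up, not the asker's values) and then stores the pieces in a plain list instead.

```python
import numpy as np

# Small stand-in data (assumption: not the asker's real values).
a = np.arange(20)
b = [3, 15]

# Preallocated 2-D array: every row has exactly len(a) columns.
slices = np.zeros((len(b), len(a)))

error = None
try:
    # The slice has 12 values, but the row expects 20.
    slices[1] = a[b[0]:b[1]]
except ValueError as exc:
    error = str(exc)
    print(error)

# A plain Python list can hold arrays of different lengths.
pieces = [a[:b[0]], a[b[0]:b[1]], a[b[1]:]]
print([len(p) for p in pieces])  # [3, 12, 5]
```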
Improving on @Bohdan's answer:
from itertools import zip_longest
import numpy as np
result = [a[start:end] for start, end in zip_longest(np.r_[0, b], b)]
The trick here is that zip_longest pads the shorter sequence with None, so the final slice runs from b[-1] to None. That is equivalent to a[b[-1]:], which eliminates the need for special handling of the last element.
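A quick way to convince yourself that this list comprehension gives the same pieces as np.split is to compare the two on stand-in data. The array a, the helper name starts, and the variable same are assumptions made for this sketch.

```python
from itertools import zip_longest

import numpy as np

# Stand-in data (assumption: the real `a` is a 1-D array).
a = np.arange(200)
b = [3, 15, 67, 78, 138]

# Start indices are 0 followed by every cut point in b.
starts = np.r_[0, b]

# zip_longest pads the shorter sequence (b) with None,
# so the last pair is (138, None) and the slice is a[138:].
result = [a[start:end] for start, end in zip_longest(starts, b)]

expected = np.split(a, b)
same = len(result) == len(expected) and all(
    np.array_equal(r, e) for r, e in zip(result, expected)
)
print(same)  # True
```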
Please do not accept this answer; I only added it for fun. The "correct" answer is @Psidom's.