Array of arrays of different sizes in numpy

I would like to work with "jagged arrays", and I would rather write "A + A" than "[x + y for x, y in zip(A, A)]".

For this, I would like to convert a list of arrays of different sizes to a generic numpy array, but ran into an error due to seemingly overzealous broadcasting (notice that the first three were successful, but the last one failed):

In[209]: A = array([ones([3,3]), array([1, 2])])
In[210]: A = array([ones([3,3]), array([1, 2])], dtype=object)
In[211]: A = array([ones([3,2]), array([1, 2])], dtype=object)
In[212]: A = array([ones([2,2]), array([1, 2])], dtype=object)
Traceback (most recent call last):
  File "/home/hzhang/.conda/envs/myenv/lib/python3.4/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-212-7297723106f9>", line 1, in <module>
    A = array([ones([2,2]), array([1, 2])], dtype=object)
ValueError: could not broadcast input array from shape (2,2) into shape (2)

      


1 answer


Your case is a variant of the third case in my answer to

How to keep numpy from broadcasting when creating an array of objects from different shaped arrays

np.array tries to create a multidimensional array of numbers from the input list. If the shapes of the components differ enough, it falls back to keeping the arrays separate and creates an object array instead. I think of such an array as a glorified/debased list.

How to store multiple numpy 1d arrays with different lengths and print them

In your problem case, the dimensions are close enough that it "thinks" it can create a 2d array, but when it starts filling in the values it discovers that it cannot broadcast them into place, and so it raises the error. One could argue that it should have backtracked and taken the "object array" route, but this decision tree is buried deep in the compiled code.
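The three situations described above can be probed with a small sketch. The helper name `describe` is made up for illustration, and only the matching-shapes result is stated here; what the other two calls report depends on the installed numpy version, so no output is claimed for them.

```python
import numpy as np

def describe(parts):
    """Report what np.array(parts, dtype=object) produces, or the error it raises."""
    try:
        out = np.array(parts, dtype=object)
    except ValueError as exc:
        return "ValueError: {}".format(exc)
    return "shape {}".format(out.shape)

# Identical shapes: np.array stacks them into one 3-d array, even with dtype=object.
same = describe([np.ones((2, 2)), np.zeros((2, 2))])

# Clearly different shapes, as in In[210] of the question.
different = describe([np.ones((3, 3)), np.array([1, 2])])

# The problem case from In[212]: the first dimensions match, the rest do not.
problem = describe([np.ones((2, 2)), np.array([1, 2])])

print(same, different, problem, sep="\n")
```

The first line printed is "shape (2, 2, 2)", which shows that dtype=object alone does not stop np.array from building a multidimensional array when it can.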

The problem case in that earlier SO question was

np.array([np.zeros((2, 2)), np.zeros((2,3))])

      

There the first dimensions match but the second do not. I'm not entirely sure why your In[211] works but In[212] does not. But the error message is the same, right down to the attempted (2,2) => (2).

Edit

Oops, I first read your problem example as:

np.array([np.ones([2,2]), np.ones([1, 2])], dtype=object)

That is, combining a (2,2) with a (1,2), which does produce a (2,) object array. In fact, you are combining a (2,2) with a (2,).

So it looks like it creates the target as np.empty((2,2), float) (or object) and then tries out[...] = [ones([2,2]), array([1,2])], which produces this error.


In any case, the most reliable way to create an array of objects is to initialize it and copy the arrays.

In [90]: arr = np.empty(2, dtype=object); arr
Out[90]: array([None, None], dtype=object)
In [91]: arr[:]=[ones([2,2]), array([1, 2])]
In [92]: arr
Out[92]: 
array([array([[ 1.,  1.],
       [ 1.,  1.]]), array([1, 2])], dtype=object)
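The initialize-and-copy approach can be wrapped in a small helper. This is a minimal sketch: the name `to_object_array` is hypothetical, and it assigns one element at a time instead of using a single slice assignment, on the assumption that per-element assignment is the least likely to trigger any broadcasting.

```python
import numpy as np

def to_object_array(arrays):
    """Return a 1-d object array whose elements are the given arrays."""
    arrays = list(arrays)
    out = np.empty(len(arrays), dtype=object)  # starts out filled with None
    for i, a in enumerate(arrays):
        out[i] = a  # stores the array itself as a single object element
    return out

# The problem case from the question: a (2,2) array next to a (2,) array.
A = to_object_array([np.ones((2, 2)), np.array([1, 2])])

# Identical shapes also stay separate instead of being stacked into (2, 2, 2).
B = to_object_array([np.ones((2, 2)), np.zeros((2, 2))])
```

Both A and B come out with shape (2,), and each element keeps its own shape.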

      


Be careful about doing math on object arrays like this. What works is hit-and-miss (A below is the array from In[211], holding a (3,2) and a (2,) array):

In [93]: A+A
Out[93]: 
array([array([[ 2.,  2.],
       [ 2.,  2.],
       [ 2.,  2.]]),
       array([2, 4])], dtype=object)

In [96]: np.min(A[1])
Out[96]: 1
In [97]: np.min(A)
....
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [98]: A.sum()
Out[98]: 
array([[ 2.,  3.],
       [ 2.,  3.],
       [ 2.,  3.]])

      

The sum works because A[0] + A[1] works: A[1] is (2,), which broadcasts to (3,2).

With object arrays, numpy resorts to some sort of list comprehension, iterating over the object elements. So you get the convenience of array notation, but not the same speed as with a true 2d array.
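To make the comparison with a list comprehension concrete, here is a short sketch. It rebuilds an array like the one from In[211] by element-wise assignment (an assumption about construction, chosen to avoid any broadcasting) and checks that A + A matches the hand-written version.

```python
import numpy as np

# Build the In[211]-style object array by element-wise assignment.
A = np.empty(2, dtype=object)
A[0] = np.ones((3, 2))
A[1] = np.array([1, 2])

# A + A adds the elements pairwise, much like a list comprehension would.
doubled = A + A
by_hand = [x + y for x, y in zip(A, A)]

# A.sum() reduces with +, i.e. A[0] + A[1]; the (2,) array broadcasts to (3, 2).
total = A.sum()
```

Here doubled is again a (2,) object array, while total is an ordinary (3,2) float array.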
