Is numpy.asarray() referencing the original list?

I have a very long list of lists and I am converting it to a NumPy array using numpy.asarray(). Is it safe to delete the original list after getting this matrix, or will the newly created NumPy array be affected by this action as well?



3 answers


I'm pretty sure the data isn't shared and you can safely delete the lists. Your original matrix is a nested structure of Python objects, with the numbers themselves also being Python objects that can be located all over the place in memory. The NumPy array is also an object, but it is more or less a header that contains the dimensions and data type, with a pointer to a contiguous block of data, where all the numbers are packed as closely as possible as "raw numbers". There is no way these two different representations could share data, so the data is probably copied when the NumPy array is created. Example:

In [1]: m = [[1,2,3],[4,5,6],[7,8,9]]
In [2]: import numpy as np
In [3]: M = np.array(m)
In [4]: M[1,1] = 55
In [5]: M
Out[5]: 
array([[ 1,  2,  3],
       [ 4, 55,  6],
       [ 7,  8,  9]])
In [6]: m
Out[6]: [[1, 2, 3], [4, 5, 6], [7, 8, 9]] # original is not modified!
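
The same check can be made with numpy.asarray() itself, followed by actually deleting the list. This is only a minimal sketch; the variable names first and total are illustrative and not part of the question:

```python
import numpy as np

m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
M = np.asarray(m)      # builds a new array; the numbers are copied into it

m[0][0] = 100          # changing the list afterwards does not touch the array
first = int(M[0, 0])   # still 1

del m                  # drop the only reference to the nested list
total = int(M.sum())   # the array is still fully usable
print(first, total)    # 1 45
```

After del m the name m no longer exists, but M keeps its own block of data.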

      



Note that NumPy arrays can share data among themselves, for example when you take a slice of an array. The slice is called a "view", so if you change the data in the slice, it also changes in the original array:

In [18]: P = M[1:, 1:]
In [19]: P[1,1] = 666
In [20]: P
Out[20]: 
array([[ 55,   6],
       [  8, 666]])
In [21]: M
Out[21]: 
array([[  1,   2,   3],
       [  4,  55,   6],
       [  7,   8, 666]])  # original is also modified!
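
If it is unclear whether two arrays share data, np.shares_memory() can be used to check, and .copy() gives an independent array when a view is not wanted. A small sketch, assuming a fresh 3x3 array rather than the session above; Q is a name introduced here for illustration:

```python
import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

P = M[1:, 1:]            # basic slicing returns a view
Q = M[1:, 1:].copy()     # .copy() makes an independent array

view_shares = np.shares_memory(M, P)   # True
copy_shares = np.shares_memory(M, Q)   # False

P[0, 0] = 55             # writes through to M[1, 1]
Q[1, 1] = 666            # changes only Q, M[2, 2] stays 9
print(view_shares, copy_shares, M[1, 1], M[2, 2])   # True False 55 9
```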

      



The data is copied, because the NumPy array stores its own copy of the data, as Bas Swinckels describes. You can check this for yourself. A trivially small list would show it too, but the huge list below makes the point more clearly ;)



import numpy as np

# Build a real list: in Python 3, range() on its own is lazy and allocates nothing.
# Raise the size to make the delays more obvious, memory permitting.
list_data = list(range(10000000))

# This takes a noticeable moment
# because it is copying the data in memory
array_data = np.asarray(list_data)

# Deleting a large list can take a moment too
del list_data

# But you still have the data even after deleting the list
print(array_data[1000])

      



Yes, it's safe to delete the list if your input is a list. From the documentation: no copy is performed (ONLY) if the input is already an ndarray with matching dtype and order.
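
The no-copy case can be seen by passing an existing ndarray to np.asarray(). This is a rough illustration; the names same, converted, copied and from_list are made up for the example:

```python
import numpy as np

a = np.array([1, 2, 3])

same = np.asarray(a)                    # already an ndarray: returned as-is
converted = np.asarray(a, dtype=float)  # dtype differs, so a new array is made
copied = np.array(a)                    # np.array() copies by default
from_list = np.asarray([1, 2, 3])       # a list is always turned into a new array

print(same is a)                        # True
print(np.shares_memory(a, converted))   # False
print(np.shares_memory(a, copied))      # False
```

So deleting or modifying the input only matters when the input was itself an ndarray that asarray() handed back unchanged.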
