Difference between list(numpy_array) and numpy_array.tolist()
What is the difference between applying list() to a numpy array and calling tolist() on it?
I have checked the types of both outputs, and both show that what I get back is a list; however, the outputs do not look exactly the same. Is it because list() is not numpy-specific (i.e. it can be applied to any sequence) while tolist() is numpy-specific, and in this case they happen to return the same thing?
Input:
import numpy
points = numpy.random.random((5, 2))
print "Points type: " + str(type(points))
Output:
Points type: <type 'numpy.ndarray'>
Input:
points_list = list(points)
print points_list
print "Points_list type: " + str(type(points_list))
Output:
[array([ 0.15920058, 0.60861985]), array([ 0.77414769, 0.15181626]), array([ 0.99826806, 0.96183059]), array([ 0.61830768, 0.20023207]), array([ 0.28422605, 0.94669097])]
Points_list type: <type 'list'>
Input:
points_list_alt = points.tolist()
print points_list_alt
print "Points_list_alt type: " + str(type(points_list_alt))
Output:
[[0.15920057939342847, 0.6086198537462152], [0.7741476852713319, 0.15181626186774055], [0.9982680580550761, 0.9618305944859845], [0.6183076760274226, 0.20023206937408744], [0.28422604852159594, 0.9466909685812506]]
Points_list_alt type: <type 'list'>
Your example already shows the difference; consider the following 2D array:
>>> import numpy as np
>>> a = np.arange(4).reshape(2, 2)
>>> a
array([[0, 1],
       [2, 3]])
>>> a.tolist()
[[0, 1], [2, 3]] # nested vanilla lists
>>> list(a)
[array([0, 1]), array([2, 3])] # list of arrays
tolist handles the full conversion to nested vanilla lists (i.e. list of list of int), whereas list simply iterates over the first dimension of the array, creating a list of arrays (list of np.array of np.int64). Although both are lists:
>>> type(list(a))
<type 'list'>
>>> type(a.tolist())
<type 'list'>
the elements of each list are of a different type:
>>> type(list(a)[0])
<type 'numpy.ndarray'>
>>> type(a.tolist()[0])
<type 'list'>
Another difference, as you noticed, is that list will work on any iterable, whereas tolist can only be called on objects that specifically implement this method.
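A short sketch of that last point, written for Python 3 rather than the Python 2 used in the question. The variable names are only illustrative; the idea is that list accepts any iterable but converts only the outermost level, while tolist exists only where it has been implemented.

```python
import numpy as np

a = np.arange(4).reshape(2, 2)

# list() accepts any iterable: arrays, tuples, generators, ...
from_array = list(a)                       # list of 1-D arrays
from_tuple = list((1, 2, 3))               # [1, 2, 3]
from_gen = list(x * x for x in range(3))   # [0, 1, 4]

# tolist() is a method, so it only exists where it is implemented.
print(hasattr(a, "tolist"))                # True
print(hasattr((1, 2, 3), "tolist"))        # False

# list() converts only the outermost level; tolist() converts every level.
print(type(from_array[0]).__name__)        # ndarray
print(type(a.tolist()[0]).__name__)        # list
```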
.tolist() appears to convert all values recursively to Python primitives (list), whereas list creates a Python list from an iterable. Since the array is an array of numpy arrays, list(...) creates a list of arrays.
You can think of list as a function that looks like this:
# Not the actual implementation, just for demo purposes
def list(iterable):
    newlist = []
    for obj in iter(iterable):
        newlist.append(obj)
    return newlist
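By the same token, the recursive behaviour of tolist can be pictured with a small pure-Python stand-in. This is only a sketch for illustration (Python 3): to_nested_list is a hypothetical helper, not how numpy implements the method internally.

```python
import numpy as np

def to_nested_list(arr):
    """Illustrative stand-in for ndarray.tolist() (hypothetical helper)."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        # 0-d array: return the single value as a plain Python scalar.
        return arr.item()
    # Recurse along the first axis; each sub-array has one dimension less.
    return [to_nested_list(sub) for sub in arr]

a = np.arange(6).reshape(2, 3)
print(to_nested_list(a))                 # [[0, 1, 2], [3, 4, 5]]
print(to_nested_list(a) == a.tolist())   # True
```

Compared with the one-level loop that models list, the only change is the recursive call on each element, which is what turns the inner arrays into plain lists and the numpy scalars into Python numbers.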
The main difference is that tolist recursively converts all data to Python standard library types.
For example:
>>> arr = numpy.arange(2)
>>> [type(item) for item in list(arr)]
[numpy.int64, numpy.int64]
>>> [type(item) for item in arr.tolist()]
[builtins.int, builtins.int]
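One place where the element type matters in practice is serialization. The following is a minimal Python 3 sketch, assuming the standard json module as the consumer; the serialized flag is just an illustrative variable.

```python
import json
import numpy as np

arr = np.arange(3)

# tolist() yields built-in ints, which json can serialize directly.
print(json.dumps(arr.tolist()))   # [0, 1, 2]

# list() keeps numpy integer scalars, which json does not know how to encode.
try:
    json.dumps(list(arr))
    serialized = True
except TypeError:
    serialized = False
print(serialized)                 # False
```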
Functional differences aside, tolist will generally be faster, as it knows it has a numpy array and has access to the backing array, whereas list will fall back to using an iterator to add all the elements.
In [2]: arr = numpy.arange(1000)
In [3]: %timeit arr.tolist()
10000 loops, best of 3: 33 µs per loop
In [4]: %timeit list(arr)
10000 loops, best of 3: 80.7 µs per loop
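Outside IPython, a comparable measurement can be sketched with the standard timeit module (Python 3). The run count is an arbitrary choice, and the absolute numbers will depend on the machine and the numpy version.

```python
import timeit
import numpy as np

arr = np.arange(1000)

# Time each conversion over the same number of runs.
runs = 1000
t_tolist = timeit.timeit(arr.tolist, number=runs)
t_list = timeit.timeit(lambda: list(arr), number=runs)

print("tolist: %.1f us per call" % (t_tolist / runs * 1e6))
print("list:   %.1f us per call" % (t_list / runs * 1e6))
```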
I expect tolist to be