Big-O of del my_list[:]?
What's the big O of del my_list[:]? This statement removes all the items from the list, so I understand it will be O(n), where n is the length of the list.
Hence the big O of this code would be O(n^2), right?
Note that this is not for school, but for my understanding when I practice for interviews.
from copy import deepcopy

class Solution:
    # @param A : list of integers
    # @return an integer
    def removeDuplicates(self, A):
        copy_array = deepcopy(A)
        del A[:]
        for j in copy_array:
            if j in A:
                pass
            else:
                A.append(j)
        return len(A)
del doesn't affect the big-O here: the loop runs order-n times, and the test j in A is itself order n, so the nested work is O(n**2). del A[:] is O(n), but it is not part of the loop, and since it is of lower order it is dominated and can be ignored.
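To make the cost of that membership test concrete, here is a minimal sketch (not part of the original code; remove_duplicates_linear is a made-up name) of the same dedup loop with an auxiliary set, which turns the per-item lookup into O(1) on average and the whole loop into O(n):

def remove_duplicates_linear(A):
    seen = set()
    result = []
    for j in A:
        if j not in seen:   # O(1) average lookup, instead of the O(n) test j in A
            seen.add(j)
            result.append(j)
    A[:] = result           # rewrite A in place, preserving order
    return len(A)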
Side note: an O(n) solution for this would be to use collections.OrderedDict for deduplication, which preserves order and makes the body of the method simple:
A[:] = collections.OrderedDict.fromkeys(A)
return len(A)
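For completeness, here is a sketch of the whole method with that body plus a tiny usage check (the sample list is made up); on Python 3.7+ a plain dict.fromkeys(A) would work too, since regular dicts preserve insertion order there:

import collections

class Solution:
    def removeDuplicates(self, A):
        # keep the first occurrence of each value, in order, in one O(n) pass
        A[:] = collections.OrderedDict.fromkeys(A)
        return len(A)

nums = [1, 1, 2, 3, 2]
print(Solution().removeDuplicates(nums))  # 3
print(nums)                               # [1, 2, 3]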
As pointed out in the comments and the other answers, it doesn't matter in your case whether the removal is O(n), because it runs only once and your loop is already O(n^2).
Nevertheless, your question about del A[:] is interesting and deserves attention:
According to the Python wiki page on time complexity, deleting a slice from a list is indeed O(n).
However, lists are internally represented as arrays, and del A[:] essentially discards the whole array; since the old array could be garbage collected at some later point, I think it is plausible that this case could be optimized in an implementation so that the del statement itself is O(1) and the actual clearing of the old array is deferred. The algorithmic cost would still be O(n), but deferring the work could still have the advantage of moving it out of a computational hotspot in your program.
As per the first comment below, however, the main Python implementation, CPython, uses reference counting; this means del actually has to decrement the reference count of every item contained in the list, which is O(n).
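An informal way to observe this (a measurement sketch, not from the answer; the absolute numbers are machine-dependent and only illustrative) is to time del A[:] on CPython for growing list sizes and watch the cost grow roughly linearly:

import timeit

for n in (100_000, 1_000_000, 10_000_000):
    # setup is re-run for each repeat, so every measurement clears a fresh list
    t = min(timeit.repeat("del A[:]",
                          setup=f"A = list(range({n}))",
                          number=1, repeat=5))
    print(f"n = {n:>10,}: {t:.4f} s")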
PyPy, however, has configurable garbage collection, and all of the collectors it supports seem to be based on a copying, generational scheme that does allow this optimization. Moreover, in a copying scheme the memory holding the old objects can usually just be ignored rather than explicitly deallocated, so the deferred cost may effectively be free (in the sense that del makes the next generational copy cheaper, since the original array no longer needs to be copied). The memory occupied by the "ignored" objects may still be cleared (and, indeed, PyPy's documentation indicates that its generational GC does this), but since the whole old region gets cleared either way, I'm not sure it matters how much of that space was actually in use.
NOTE, however, that objects requiring special cleanup, i.e. objects that implement __del__, are a special case and cannot simply be "ignored" during a generational collection, so deallocating an array of objects with __del__ must always be O(n). The linked page has some information on how these objects are handled.
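As a small illustration of that special case (the class name Noisy is made up): clearing a list of such objects has to run each finalizer, so the per-element work cannot be skipped. On CPython the messages print immediately because of reference counting; on PyPy they appear whenever the GC eventually collects the objects:

class Noisy:
    def __init__(self, i):
        self.i = i

    def __del__(self):
        # special cleanup: the collector cannot simply "ignore" this object
        print(f"finalizing {self.i}")

A = [Noisy(i) for i in range(3)]
del A[:]
print("list cleared")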