Len () cost and pep8 clause when checking a sequence
If python complexity len()
is O (1) why does pep8 suggest using
if seq:
insteadif len(seq) == 0:
https://wiki.python.org/moin/TimeComplexity
https://www.python.org/dev/peps/pep-0008/#programming-recommendations
Can't read len(seq) == 0
?
source to share
The former can handle both an empty string and None
. For example, consider these two variables.
>>> s1 = ''
>>> s2 = None
Using the first method
def test(s):
if s:
return True
else:
return False
>>> test(s1)
False
>>> test(s2)
False
Now using len
def test(s):
if len(s) == 0:
return True
else:
return False
>>> test(s1)
True
>>> test(s2)
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
test(s2)
File "<pyshell#11>", line 2, in test
if len(s) == 0:
TypeError: object of type 'NoneType' has no len()
So, from a performance standpoint, both will be O (1), but the truthfulness test (the first method) is more reliable since it handles None
in addition to empty strings.
source to share
The fact that len
is O (1) is just a convention. There may be sequences / collections where len
not O (1), at this point the check len(x) == 0
may take asymptotically longer.
The point is that the question "is the size of this collection equal k
?" and "is the collection empty?" are fundamentally different problems. The former answers len(x) == k
, and the latter answers to bool(x)
[ie is x
true]. Hence, if you want to check if the collection is empty, just use if collection
. They just match if k == 0
.
As a general rule, you should use the most specific term to describe exactly what you are doing. Spelling len(x) == 0
might just be a typo for len(x) == 10
with missing 1
, while getting it if x
wrong is harder.
source to share
For if len(seq) == 0
you need to analyze that
- you are calling a function,
- you are checking if its return value is equal and
- that "something" is 0.
For if seq
you only need to parse what seq
is being evaluated in a boolean context.
You might argue that you also need to know what type seq
is meaningful so that you know what it means to evaluate it in a boolean context, but then you also need to know the type for len(seq)
, so you need to compute its length. (For example, for a dict d
, len(d)
it doesn't say anything about the values in the dictionary, just how many keys it has.)
source to share