Len () cost and pep8 clause when checking a sequence

If python complexity len()

is O (1) why does pep8 suggest using if seq:

insteadif len(seq) == 0:

https://wiki.python.org/moin/TimeComplexity
https://www.python.org/dev/peps/pep-0008/#programming-recommendations

Can't read len(seq) == 0

?

+3


source to share


3 answers


The former can handle both an empty string and None

. For example, consider these two variables.

>>> s1 = ''
>>> s2 = None

      

Using the first method

def test(s):
    if s:
        return True
    else:
        return False

>>> test(s1)
False
>>> test(s2)
False

      



Now using len

def test(s):
    if len(s) == 0:
        return True
    else:
        return False

>>> test(s1)
True
>>> test(s2)
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    test(s2)
  File "<pyshell#11>", line 2, in test
    if len(s) == 0:
TypeError: object of type 'NoneType' has no len()

      

So, from a performance standpoint, both will be O (1), but the truthfulness test (the first method) is more reliable since it handles None

in addition to empty strings.

+8


source


The fact that len

is O (1) is just a convention. There may be sequences / collections where len

not O (1), at this point the check len(x) == 0

may take asymptotically longer.

The point is that the question "is the size of this collection equal k

?" and "is the collection empty?" are fundamentally different problems. The former answers len(x) == k

, and the latter answers to bool(x)

[ie is x

true]. Hence, if you want to check if the collection is empty, just use if collection

. They just match if k == 0

.



As a general rule, you should use the most specific term to describe exactly what you are doing. Spelling len(x) == 0

might just be a typo for len(x) == 10

with missing 1

, while getting it if x

wrong is harder.

+4


source


For if len(seq) == 0

you need to analyze that

  • you are calling a function,
  • you are checking if its return value is equal and
  • that "something" is 0.

For if seq

you only need to parse what seq

is being evaluated in a boolean context.

You might argue that you also need to know what type seq

is meaningful so that you know what it means to evaluate it in a boolean context, but then you also need to know the type for len(seq)

, so you need to compute its length. (For example, for a dict d

, len(d)

it doesn't say anything about the values ​​in the dictionary, just how many keys it has.)

+1


source







All Articles