Fully Python issubset ()

Question

Fully Python issubset ()

Considering two sets A and B and their length: a = len (A) and b = len (B), where a> = b. Which is complementary to Python 2.7's issubset () function, i.e. B.issubset (A)? There are two conflicting answers I can find on the internet:

1, O (a) or O (b)

found at: https://wiki.python.org/moin/TimeComplexity and bit .ly / 1AWB1QU

(Sorry I can't post more http links, so a shortened url has to be used instead.)

I downloaded the source code from the official Python site and found that:

def issubset(self, other):
    """Report whether another set contains this set."""
    self._binary_sanity_check(other)
    if len(self) > len(other):  # Fast check for obvious cases
        return False
    for elt in ifilterfalse(other._data.__contains__, self):
        return False
    return True

there is only a loop here.

2, O (a * b)

found from: bit.ly/1Ac7geK

I also found that some codes look like Python source codes: bit.ly/1CO9HXa like this:

def issubset(self, other):
    for e in self.dict.keys():
        if e not in other:
            return False
        return True

there are two loops here.

So which one is right? Can anyone give me a detailed answer on the difference between the above two explanations? Thank you very much in advance.

+3

performance python time-complexity loops

Jasonhz Dec 28. 14 at 6:51

source to share

1 answer

NPE · Accepted Answer · 2014-12-28T07:00:33+0000

The difficulty B.issubset(A)

isO(len(B))

, assuming that e in A

is constant time.

This is a reasonable assumption in general, but can easily be broken with a bad hash function. If, for example, all elements A

have the same hash code, the time complexity B.issubset(A)

will worsen to O(len(B) * len(A))

.

In your second code snippet, the complexity is the same as above. If you look closely, there is only one cycle; the other is operator if

( if e not in other:

).

Fully Python issubset ()

More articles: