Fully Python issubset ()
Considering two sets A and B and their length: a = len (A) and b = len (B), where a> = b. Which is complementary to Python 2.7's issubset () function, i.e. B.issubset (A)? There are two conflicting answers I can find on the internet:
1, O (a) or O (b)
found at: https://wiki.python.org/moin/TimeComplexity and bit .ly / 1AWB1QU
(Sorry I can't post more http links, so a shortened url has to be used instead.)
I downloaded the source code from the official Python site and found that:
def issubset(self, other):
"""Report whether another set contains this set."""
self._binary_sanity_check(other)
if len(self) > len(other): # Fast check for obvious cases
return False
for elt in ifilterfalse(other._data.__contains__, self):
return False
return True
there is only a loop here.
2, O (a * b)
found from: bit.ly/1Ac7geK
I also found that some codes look like Python source codes: bit.ly/1CO9HXa like this:
def issubset(self, other):
for e in self.dict.keys():
if e not in other:
return False
return True
there are two loops here.
So which one is right? Can anyone give me a detailed answer on the difference between the above two explanations? Thank you very much in advance.
source to share
The difficulty B.issubset(A)
isO(len(B))
, assuming that e in A
is constant time.
This is a reasonable assumption in general, but can easily be broken with a bad hash function. If, for example, all elements A
have the same hash code, the time complexity B.issubset(A)
will worsen to O(len(B) * len(A))
.
In your second code snippet, the complexity is the same as above. If you look closely, there is only one cycle; the other is operator if
( if e not in other:
).
source to share