Why is my second method slower than the first method?
I was doing the leetcode problem No. 387. First Unique Character in a String
. Given the string, find the first non-repeating character in it and return the index. If it doesn't exist, return -1. In the meantime, there is no need to know about it. ”
Examples:
s = "leetcode"
return 0.
s = "loveleetcode",
return 2.
I wrote 2 algorithms:
Method 1
def firstUniqChar(s):
d = {}
L = len(s)
for i in range(L):
if s[i] not in d:
d[s[i]] = [i]
else:
d[s[i]].append(i)
M = L
for k in d:
if len(d[k])==1:
if d[k][0]<M:
M = d[k][0]
if M<L:
return M
else:
return -1
This is very intuitive, i.e. first create a counter dictionary by flipping all char in s
(this can also be done using one string in collections.Counter
), then do a second loop, only checking those keys whose value is a list of length 1. I think like I did 2 loops, have there must be some redundant computation. So I wrote a second algorithm, which I think is better than the first, but on the leetcode platform, the second is much slower than the first and I couldn't figure out why.
Method 2
def firstUniqChar(s):
d = {}
L = len(s)
A = []
for i in range(L):
if s[i] not in d:
d[s[i]] = i
A.append(i)
else:
try:
A.remove(d[s[i]])
except:
pass
if len(A)==0:
return -1
else:
return A[0]
2nd loop just once for all char in s
Your first solution is O(n)
, but the second solution is O(n^2)
, since the method A.remove
iterates over the elements A
.
As others have said, the use is list.remove
quite expensive ... Your use collections.Counter
is a good idea.
You need to scan the line once to find uniques. Then it's probably best to iterate over it again and take the index of the first unique - which is what your potential code does:
from collections import Counter
s = "loveleetcode"
# Build a set of unique values
unique = {ch for ch, freq in Counter(s).items() if freq == 1}
# re-iterate over the string until we first find a unique value or
# not - default to -1 if none found
first_index = next((ix for ix, ch in enumerate(s) if ch in unique), -1)
# 2