Longest sequence of consecutive letters
Suppose I have a string of lowercase letters like
'ablccmdnneofffpg'
And my goal is to find the longest sequence of consecutive numbers within that string, which in this case:
'abcdefg'
An intuitive attempt is to find a loop around each letter and get the longest sequence starting with that letter. One possible solution is
longest_length = 0
start = None
current_start = 0
while current_start < len(word) - longest_length:
current_length = 1
last_in_sequence = ord(word[current_start])
for i in range(current_start + 1, len(word)):
if ord(word[i]) - last_in_sequence == 1:
current_length += 1
last_in_sequence = ord(word[i])
if current_length > longest_length:
longest_length = current_length
start = current_start
while (current_start < len(word) - 1 and
ord(word[current_start + 1]) - ord(word[current_start]) == 1):
current_start += 1
current_start += 1
Are there other ways to solve the problem with fewer lines or even using some pythonic methods?
source to share
You can keep track of all subsequences of consecutive characters as shown in a string using a dictionary and then take the one that is the longest.
Each subsequence is entered by the next candidate in the alphabet, so that once the expected candidate is reached in the string, it is used to update the value of the corresponding subsequence in the dictionary and add the following alphabet as the new dictionary value:
def longest_sequence(s):
d = {}
for x in s:
if x in d:
d[chr(ord(x)+1)] = d[x] + x
else:
d[chr(ord(x)+1)] = x
return max(d.values(), key=len)
print(longest_sequence('ablccmdnneofffpg'))
# abcdefg
print(longest_sequence('ba'))
# b
print(longest_sequence('sblccmtdnneofffpgtuyvgmmwwwtxjyuuz'))
# stuvwxyz
source to share
A solution that trades memory for (some) time:
It keeps track of all the sequences it sees, and then prints the longest ones found (although there may be several) at the end.
from contextlib import suppress
class Sequence:
def __init__(self, letters=''):
self.letters = letters
self.last = self._next_letter(letters[-1:])
def append(self, letter):
self.letters += letter
self.last = self._next_letter(letter)
def _next_letter(self, letter):
with suppress(TypeError):
return chr(ord(letter) + 1)
return 'a'
def __repr__(self):
return 'Sequence({}, {})'.format(repr(self.letters),
repr(self.last))
word = 'ablccmdnneofffpg'
sequences = []
for letter in word:
for s in sequences:
if s.last == letter:
s.append(letter)
break
else:
sequences.append(Sequence(letters=letter))
sequences = list(sorted(sequences, key=lambda s: len(s.letters), reverse=True))
print(sequences[0].letters)
source to share
Basically you are asking longest increasing subsequence
which is a well researched problem. Check out the pseudocode on Wikipedia.
source to share
Similar to Moses Koledoi's solution , but only stores longitudes for the ordinals of characters and only builds the solution string at the end. So this should be a little more economical:
def longest_seq(s):
d = {}
for c in s:
c, prev_c = ord(c), ord(c) - 1
d[c] = max(d.get(c, 0), d.pop(prev_c, 0) + 1)
c, l = max(d.items(), key=lambda i: i[1])
return ''.join(map(chr, range(c-l+1, c+1)))
source to share