Is there a better algorithm for finding the longest sequence of the same letter in a string?

I tried to look at the algorithms and try to change them in order to make them the fastest. I recently tried an algorithm that looks for the longest sequence of any letter in a string. The naive answer looks at all letters, and when the current sequence is larger than the largest sequence found, the new largest becomes the current one. Example:

With C for the current sequence and M for the maximum sequence, the order of letter checks and variable updates is AAAACCDDD-> A (C = 1, M = 1) → A (C = 2, M = 2) → A (C = 3 , M = 3) → A (C = 4, M = 4) → C (C = 1, M = 4) → C (C = 2, M = 4) → D (C = 1, M = 4) → D (C = 2, M = 4) → D (C = 3, M = 4) Answer: 4 It might be faster if you cannot get the new big sequence given in M, the place where you are in the string and line size.

I tried and came up with an algorithm that usually accesses the smaller elements of a string, I think it will be easier to explain like this:

Instead of jumping 1 on 1, you jump, which would be necessary to have a new large sequence if all the letters in the jump were the same. So, for example, after you read AAAB, you jump 3 places because you assume that all 3 are the following letters: B (AAABBBB). Of course, they may not be, which is why you are now going backwards, counting the serial B right behind your position. Your next "jump" will be lower, depending on how many Bs you find. For example,

AAABCBBBBD after the jump you are in third B. You go back and find one B, back again and find C you stop. Now you already know that you have a sequence of 2, so your next jump cannot be 3 - you can skip a sequence of 4 B. So you jump 2 and get to B. Go back and find B. Next reverse position is where you started, so you know you found the sequence from 4. It didn't make much difference in this example, but if you instead use a string like AAABBBBCDDECEE, you can see that after you jumped from the first C to the last C, you will only need to back off, because seeing the letter behind you E you don't care what it is about.

I have coded both methods and the second is 2 - 3 times faster. Now I'm really curious to know if there is a faster way to find it?

+3


source to share





All Articles