Faster distance editing algorithm

Problem: I know the trivial formulation of DP distance and computation in O (mn) for 2 rows of size n and m respectively. But recently I found out that if we only need to calculate the minimum value of the edit distance f and it is limited to | f | <= s, then we can compute it in O (min (m, n) + s ^ 2) or O (s * min (m, n)) [wikipedia] time.

Please explain the dp wording behind it if it is based on DP or explain the algorithm.

Take a look at the improved algorithm

link section : http://en.wikipedia.org/wiki/Edit_distance .

another link about improved UKKONEN algorithm http://www.berghel.net/publications/asm/asm.php

Thanks in advance.

+3


source to share


1 answer


You can calculate edit distance in O (min (n, m) * s) using the following simple idea:

Consider the i-th row in the DP table.

So, if we know that the answer is <= s, then we interstit in cells with coordinates (i, i - s), (i, i - s + 1), ..., (i, i + s). Because in other cells the answer is strictly more than s.

For example, suppose we know that the edit distance between "abacaba" and "baadba" is less than 3.



DP-table for this strings

So, we can skip the red cells because they have a value greater than s.

The asymptotics of the algorithm is O (min (n, m) * s), since we compute s-cells to the left and right of the main diagonal.

+10


source







All Articles