Calculating the cells of the Levenshtein matrix

Question

I don't understand how the values in the Levenshtein matrix are calculated according to this article. I know how we get to the edit distance of 3. Can anyone explain in simple human terms how we reach the value in each cell?

python matrix levenshtein-distance hamming-distance edit-distance

jxn 09 June 15 at 21:06

2 answers

Hover your mouse over each dotted value in the matrix in the Wikipedia article and it describes in layman's terms what each value means.

E.g., using the notation (x,y):

Element (0,0) compares None with None. (0,0) = 0 because they are equal.

Element (1,0) compares 'k' with None. (1,0) = 1 because:
- insert 'k' to convert None to 'k', so +1

Element (3,2) compares 'kit' with 'si'. (3,2) = 2 because:
- None == None, so +0 (Lev = 0, see element (0,0))
- swap 's' for 'k' (a substitution), therefore +1 (Lev = 1, see element (1,1))
- 'i' == 'i', therefore +0 (Lev = 1, see element (2,2))
- insert 't', therefore +1 (Lev = 2, see element (3,2))
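
To check individual elements like the ones above, here is a minimal sketch that computes the distance between two prefixes by memoized recursion. The function name `prefix_distance` and the convention that x counts letters of 'kitten' and y counts letters of 'sitting' are assumptions made for this illustration (they match element (3,2) comparing 'kit' with 'si').

```python
from functools import lru_cache


def prefix_distance(a, b, x, y):
    """Edit distance between the first x letters of a and the first y letters of b.

    Hypothetical helper for illustration only.
    """

    @lru_cache(maxsize=None)
    def lev(i, j):
        # An empty prefix on either side: the distance is the other length.
        if min(i, j) == 0:
            return max(i, j)
        # 0 if the current letters match, 1 if a substitution is needed.
        cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(
            lev(i - 1, j) + 1,         # drop a letter of a
            lev(i, j - 1) + 1,         # drop a letter of b
            lev(i - 1, j - 1) + cost,  # match or substitute
        )

    return lev(x, y)


print(prefix_distance("kitten", "sitting", 0, 0))  # empty vs empty -> 0
print(prefix_distance("kitten", "sitting", 1, 0))  # 'k' vs empty   -> 1
print(prefix_distance("kitten", "sitting", 3, 2))  # 'kit' vs 'si'  -> 2
print(prefix_distance("kitten", "sitting", 6, 7))  # full words     -> 3
```

Each call answers the question "what number belongs in this one cell?" without building the whole matrix.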

Alexander mcfarlane 09 June 15 at 21:46

Condla · Accepted Answer · 2015-06-09T21:54:47+0000

Hi, I just looked at the Wikipedia article you linked.

How to construct the matrix is described in the "Definition" section. I'll just translate what it means and what you need to do to build the matrix yourself.

Just to make sure no basic information is missing: i stands for the row number and j stands for the column number.

So let's start with the first line of the definition. It says lev(i, j) = max(i, j) if min(i, j) = 0. That condition only holds for the elements of the 0th row and the 0th column (min(0, j) is 0 and min(i, 0) is 0). So for the 0th row and 0th column you enter the value max(i, j), which is the row number in the 0th column and the column number in the 0th row. So far so good:

    k i t t e n
  0 1 2 3 4 5 6
s 1
i 2
t 3
t 4
i 5
n 6
g 7
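
As a rough sketch of this first step, the snippet below fills only the 0th row and 0th column. The function name `init_borders` and the list-of-lists layout are assumptions chosen for illustration, with 'sitting' on the rows (i) and 'kitten' on the columns (j) as in the table above.

```python
def init_borders(a, b):
    """Return a (len(a)+1) x (len(b)+1) grid with only the borders filled in.

    Hypothetical helper: a labels the rows (i), b labels the columns (j).
    """
    # One extra row and column stand for the empty prefix.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # max(i, 0) = i: the row number
    for j in range(len(b) + 1):
        d[0][j] = j  # max(0, j) = j: the column number
    return d


d = init_borders("sitting", "kitten")
print(d[0])                   # [0, 1, 2, 3, 4, 5, 6]
print([row[0] for row in d])  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The inner cells are still 0 at this point; they are placeholders that get filled in by the rule for the remaining entries.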

Every other value is the minimum of these three values:

lev(i-1, j) + 1
lev(i, j-1) + 1
lev(i-1, j-1) + 1_(a_i != b_j)

Here lev refers to Levenshtein matrix elements that have already been computed. lev(i, j-1) is the matrix entry to the left of the one we want to define, lev(i-1, j) is the entry above it, and lev(i-1, j-1) is the entry diagonally above and to the left. 1_(a_i != b_j) is an indicator: it is 1 if the two letters at this position are not equal, and 0 otherwise.

Let's jump straight to matrix element (1, 1), which corresponds to the letters (s, k), and work out the 3 candidates:

lev(i-1, j) + 1 = 2                  [1 + 1 = 2]
lev(i, j-1) + 1 = 2                  [1 + 1 = 2]
lev(i-1, j-1) + 1_(a_i != b_j) = 1   [0 + 1 = 1]  + 1 because k is clearly not s

Now we take the minimum of these three values, min(2, 2, 1) = 1, and that is the entry of the Levenshtein matrix at (1, 1).
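
The arithmetic for a single cell can be sketched as a small function. The name `cell` and its parameters are made up for this illustration; it takes the three neighbouring values and whether the two letters match.

```python
def cell(above, left, diagonal, same_letter):
    """Value of one inner cell from its three already computed neighbours.

    Hypothetical helper for illustration only.
    """
    # The indicator term: 0 if the letters are equal, 1 otherwise.
    cost = 0 if same_letter else 1
    return min(above + 1, left + 1, diagonal + cost)


# Element (1, 1): letters 's' and 'k', neighbours 1 (above), 1 (left), 0 (diagonal).
print(cell(above=1, left=1, diagonal=0, same_letter=False))  # 1

# Element (1, 2): letters 's' and 'i', neighbours 2 (above), 1 (left), 1 (diagonal).
print(cell(above=2, left=1, diagonal=1, same_letter=False))  # 2
```

Note that the second call uses the result of the first as its "left" neighbour, which is why the cells have to be filled in order.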

Do this evaluation for every remaining cell, going row by row or column by column, and the result is the full Levenshtein matrix.
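
Putting the steps together, here is a minimal sketch that fills the matrix row by row. The function name `levenshtein_matrix` is an assumption for this example, and this is just one straightforward way to write it, with 'sitting' on the rows and 'kitten' on the columns as above.

```python
def levenshtein_matrix(a, b):
    """Build the full matrix; a labels the rows (i), b labels the columns (j).

    Illustrative sketch, not taken from the article.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]

    # 0th column and 0th row: lev(i, j) = max(i, j) when min(i, j) = 0.
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j

    # Every other cell: minimum of above + 1, left + 1, diagonal + indicator.
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d


matrix = levenshtein_matrix("sitting", "kitten")
for row in matrix:
    print(row)
print(matrix[-1][-1])  # bottom-right cell, the edit distance: 3
```

The bottom-right cell is the edit distance between the two full words, which is the 3 mentioned in the question.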
