Calculate Levenshtein Difficulty Edit Distance
I've been looking at this simple python implementation "Levenshtein Change Distance" all day long .
def lev(a, b):
"""Recursively calculate the Levenshtein edit distance between two strings, a and b.
Returns the edit distance.
"""
if("" == a):
return len(b) # returns if a is an empty string
if("" == b):
return len(a) # returns if b is an empty string
return min(lev(a[:-1], b[:-1])+(a[-1] != b[-1]), lev(a[:-1], b)+1, lev(a, b[:-1])+1)
From: http://www.clear.rice.edu/comp130/12spring/editdist/
I know it has exponential complexity, but how would I start calculating this complexity from scratch?
I've searched all over the internet but couldn't find any explanation just claims that it is exponential.
Thank.
-
Draw a call tree (which you probably already did).
-
Call tree annotation. For arbitrary n, define the depth d of the tree as a function of n.
Also, determine how many branches / children there are per node, on average, as n approaches infinity; which is called the average branching factor b.
-
Understand that visiting every node in a tree of depth d with an average branching ratio b takes at least orders of operations b ^ d. Write this number in terms of n and you have a lower bound on the complexity of the input size.
More specifically: you keep recursing until you hit an empty line, dropping one character each time. If we call the lengths of the strings m and n, then the depth of the tree is min (m, n). For every node in the call tree other than leaves, you return exactly three times, so in the limit the average branching factor is 3. This gives us the node call tree Θ (3 ^ min (m, n)). The worst case is when m = n, so we can call it Θ (3 ^ n).
This is still only the lower limit of complexity. For a complete picture, you should also consider the amount of work done between recursive calls. In this naive code, it is actually linear time because it has a[:-1]
to copy (at Θ (n) cost) almost anything a
, which gives Θ (n 3 ^ n) total complexity. *
[* I once caught a CS professor using Python slicing in binary search, which ended up running time Θ (n lg n).]