Complexity of edit distance (Levenshtein distance)

Posted 2019-04-14 07:59

Question:

I have been working all day on a problem that I can't seem to get a handle on. The task is to show that a recursive implementation of edit distance has time complexity Ω(2^max(n,m)), where n and m are the lengths of the words being compared.

The implementation is comparable to this small Python example:

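def lev(a, b):
    if a == "":
        return len(b)   # a is empty: insert every character of b
    if b == "":
        return len(a)   # b is empty: delete every character of a
    return min(
        lev(a[:-1], b[:-1]) + (a[-1] != b[-1]),  # substitute (free if the last characters match)
        lev(a[:-1], b) + 1,                      # delete the last character of a
        lev(a, b[:-1]) + 1,                      # insert the last character of b
    )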

From: http://www.clear.rice.edu/comp130/12spring/editdist/
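One way to get concrete numbers for the recursion tree is to count the calls directly. The following is only a sketch: `lev_counted` and `count_calls` are hypothetical helper names introduced here, and the one-element list used as a counter is just one convenient way to share state between calls.

```python
def lev_counted(a, b, counter):
    """Same recursion as lev, but counter[0] is incremented on every call."""
    counter[0] += 1
    if a == "":
        return len(b)
    if b == "":
        return len(a)
    return min(
        lev_counted(a[:-1], b[:-1], counter) + (a[-1] != b[-1]),
        lev_counted(a[:-1], b, counter) + 1,
        lev_counted(a, b[:-1], counter) + 1,
    )


def count_calls(a, b):
    """Return (edit distance, number of calls made) for the words a and b."""
    counter = [0]
    distance = lev_counted(a, b, counter)
    return distance, counter[0]


if __name__ == "__main__":
    # Two words of equal length n with no characters in common.
    for n in range(1, 8):
        distance, calls = count_calls("a" * n, "b" * n)
        print(n, distance, calls)
```

The number of calls depends only on the two lengths, not on the characters, because every non-empty pair always triggers all three recursive calls.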

I have tried drawing the recursion trees for different short words, but I can't find the connection between the tree depth and the complexity.

Recurrence relation from my calculation:

m = length of word1
n = length of word2
T(m,n) = T(m-1,n-1) + 1 + T(m-1,n) + T(m,n-1)
With the base cases:
T(0,n) = n
T(m,0) = m
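To see how fast this recurrence grows, it can be evaluated numerically. This is a minimal sketch that takes the recurrence and base cases exactly as written above; the function name `T` and the use of `functools.lru_cache` to avoid recomputing values are illustrative choices.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(m, n):
    """Evaluate T(m,n) = T(m-1,n-1) + 1 + T(m-1,n) + T(m,n-1) with T(0,n) = n, T(m,0) = m."""
    if m == 0:
        return n
    if n == 0:
        return m
    return T(m - 1, n - 1) + 1 + T(m - 1, n) + T(m, n - 1)


if __name__ == "__main__":
    # Compare T(n, n) with 2^n for words of equal length.
    for n in range(1, 11):
        print(n, T(n, n), 2 ** n)
```

For equal lengths the values stay above 2^n in this range, which is consistent with the bound to be shown, though a table is of course not a proof.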

But I have no idea how to proceed, since each call leads to 3 new calls as long as neither length has reached 0.

I would be grateful for any tips on how to show that the lower bound on the complexity is Ω(2^max(n,m)).

Answer 1:

Your recursion formula:

T(m,n) = T(m-1,n-1) + T(m-1,n) + T(m,n-1) + 1
T(0,n) = n
T(m,0) = m

is right.

You can see that every T(m,n) splits off into three paths. Since every node does O(1) work, we only have to count the nodes.

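A shortest root-to-leaf path has length min(m,n), because each step shortens each string by at most one character. Every node above that depth therefore has three children, so the tree has at least 3^min(m,n) nodes. But some paths are longer. You get the longest path by shortening only one string per step (never both at once) while keeping both non-empty as long as possible. This path has length m+n-1, so the whole tree has at most 3^(m+n-1) leaves and therefore O(3^(m+n-1)) nodes.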
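The two bounds can be checked numerically for small lengths. In the sketch below, `nodes` and `bounds` are hypothetical names, and each call (including a base-case call) is counted as one node. A ternary tree of depth m+n-1 has at most 3^(m+n-1) leaves and at most (3^(m+n) - 1)/2 nodes in total, which is still O(3^(m+n-1)), so the total is used as the exact upper limit here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def nodes(m, n):
    """Number of nodes in the recursion tree for words of lengths m and n."""
    if m == 0 or n == 0:
        return 1  # base case: a single call with no further recursion
    return 1 + nodes(m - 1, n - 1) + nodes(m - 1, n) + nodes(m, n - 1)


def bounds(m, n):
    """Return (lower bound, actual node count, upper bound) for m, n >= 1."""
    lower = 3 ** min(m, n)            # full ternary levels down to depth min(m, n)
    upper = (3 ** (m + n) - 1) // 2   # complete ternary tree of depth m + n - 1
    return lower, nodes(m, n), upper


if __name__ == "__main__":
    for m, n in [(1, 1), (2, 2), (3, 3), (2, 5), (4, 6)]:
        print((m, n), bounds(m, n))
```

The upper bound is tight only for (1,1); for longer words the tree is far from complete, because most paths end well before depth m+n-1.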

Without loss of generality, let m = min(m,n). The tree also contains at least

different paths, one for each possible order of reducing n.

So Ω(2^max(m,n)) and Ω(3^min(m,n)) are lower bounds, and O(3^(m+n-1)) is an upper bound.
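As a sanity check on these bounds, the number of distinct root-to-leaf paths can be computed for words of equal length. This is a sketch under stated assumptions: `paths` is a hypothetical name, a path is counted once per leaf, and only the case m = n is compared against the bounds.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def paths(m, n):
    """Number of root-to-leaf paths (one per leaf) in the recursion tree."""
    if m == 0 or n == 0:
        return 1  # a base-case call is itself a leaf
    return paths(m - 1, n - 1) + paths(m - 1, n) + paths(m, n - 1)


if __name__ == "__main__":
    # Columns: n, 2^n, 3^n, number of paths, 3^(2n-1)
    for n in range(1, 11):
        print(n, 2 ** n, 3 ** n, paths(n, n), 3 ** (2 * n - 1))
```

For m = n the path count sits between 3^n and 3^(2n-1) in this range. For very unequal lengths the count is much smaller, for example paths(1, n) = 2n + 1, so the exponential lower bounds are best read for words of comparable length.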