I'm reading The Algorithm Design Manual by Steven Skiena, and I'm on the dynamic programming chapter. He has some example code for edit distance and uses some functions which are explained neither in the book nor on the internet. So I'm wondering
a) how does this algorithm work?
b) what do the functions indel and match do?
#define MATCH 0 /* enumerated type symbol for match */
#define INSERT 1 /* enumerated type symbol for insert */
#define DELETE 2 /* enumerated type symbol for delete */
int string_compare(char *s, char *t, int i, int j)
{
int k; /* counter */
int opt[3]; /* cost of the three options */
int lowest_cost; /* lowest cost */
if (i == 0) return(j * indel(' '));
if (j == 0) return(i * indel(' '));
opt[MATCH] = string_compare(s,t,i-1,j-1) + match(s[i],t[j]);
opt[INSERT] = string_compare(s,t,i,j-1) + indel(t[j]);
opt[DELETE] = string_compare(s,t,i-1,j) + indel(s[i]);
lowest_cost = opt[MATCH];
for (k=INSERT; k<=DELETE; k++)
if (opt[k] < lowest_cost) lowest_cost = opt[k];
return( lowest_cost );
}
Basically, it utilizes the dynamic programming method of solving problems where the solution to the problem is constructed to solutions to subproblems, to avoid recomputation, either bottom-up or top-down.
The recursive structure of the problem is as given here, where
i,j
are start (or end) indices in the two strings respectively.Here's an excerpt from this page that explains the algorithm well.
I recommend going through this lecture for a good explanation.
The function
match()
returns 1, if the two characters mismatch (so that one more move is added in the final answer) otherwise 0.On page 287 in the book:
They're explained in the book. Please read section 8.2.4 Varieties of Edit Distance
Please go through this link: https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/styles/pages/editdistance.html
the code implementing the above algorithm is :
This is a recursive algorithm not dynamic programming. Note that both i & j point to the last char of s & t respectively when the algorithm starts.
indel returns 1. match(a, b) returns 0 if a = b (match) else return 1 (substitution)
The algorithm is not hard to understand, you just need to read it couple of times. What's always amuse me is the person who invented it and the trust that recursion will do the right thing.