So I am trying to devise a spin off of the Levenshtein algorithm, where I keep track of what transformations I did in the string(inserted a, or substitute a for b).
Example:
Basically, say I am computing the edit distance of "bbd" and "bcd"
The edit distance will be 1 and the transformation will be "substitude b for c"
Question:
How would I approach this problem since the implementations i've seen do not concern themselves with knowing what kind of operation it was but only the total cost?
You can use this module - there is an editops
function there, which returns a list with the operations needed to transform one string to another.
Example:
Levenshtein.editops("FBBDE", "BCDASD")
[('delete', 0, 0), ('replace', 2, 1), ('insert', 4, 3), ('insert', 4, 4), ('replace', 4, 5)]
From the docs:
Find sequence of edit operations transforming one string to another.
editops(source_string, destination_string)
editops(edit_operations, source_length, destination_length)
The result is a list of triples (operation, spos, dpos), where
operation is one of equal',
replace', insert', or
delete'; spos
and dpos are position of characters in the first (source) and the
second (destination) strings. These are operations on signle
characters. In fact the returned list doesn't contain the equal',
but all the related functions accept both lists with and without
equal's.