I'm testing the new Python regex module, which supports fuzzy string matching, and have been impressed with its capabilities so far. However, I'm having trouble carving out exceptions to a fuzzy match. The following is a case in point. I want ST LOUIS, and all variations of ST LOUIS within an edit distance of 1, to match ref. I want to make one exception to this rule: the edit cannot be an insertion of the letter N, S, E, or W to the left of the leftmost character. In the example below, I want inputs 1 through 3 to match ref and input 4 to fail. However, the ref shown below matches all four inputs. Does anyone who is familiar with the new regex module know of a possible workaround?
import regex

input1 = 'ST LOUIS'
input2 = 'AST LOUIS'
input3 = 'ST LOUS'
input4 = 'NST LOUIS'
ref = '([^NSEW]|(?<=^))(ST LOUIS){e<=1}'
match = regex.fullmatch(ref,input1)
match
<_regex.Match object at 0x1006c6030>
match = regex.fullmatch(ref,input2)
match
<_regex.Match object at 0x1006c6120>
match = regex.fullmatch(ref,input3)
match
<_regex.Match object at 0x1006c6030>
match = regex.fullmatch(ref,input4)
match
<_regex.Match object at 0x1006c6120>
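
To pin down the behavior I'm after, here is a minimal plain-Python sketch of the rule itself. It does not use the regex module and is not meant as the workaround. The helper names within_one_edit and matches_rule are hypothetical, made up for illustration, and the sketch assumes the exception means "a single N, S, E, or W prepended to ST LOUIS".

```python
def within_one_edit(a: str, b: str) -> bool:
    """Return True if a and b differ by at most one insertion,
    deletion, or substitution (edit distance <= 1)."""
    if abs(len(a) - len(b)) > 1:
        return False
    # Skip the common prefix.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    # Skip the common suffix of what remains.
    ra, rb = a[i:], b[i:]
    j = 0
    while j < len(ra) and j < len(rb) and ra[-1 - j] == rb[-1 - j]:
        j += 1
    # After trimming, at most one character may be left on either side.
    return len(ra) - j <= 1 and len(rb) - j <= 1


def matches_rule(candidate: str, target: str = "ST LOUIS",
                 blocked: str = "NSEW") -> bool:
    """Hypothetical reference check for the desired behavior."""
    if not within_one_edit(candidate, target):
        return False
    # The exception: reject one blocked letter inserted in front of the target.
    if (len(candidate) == len(target) + 1
            and candidate[1:] == target
            and candidate[0] in blocked):
        return False
    return True


for s in ["ST LOUIS", "AST LOUIS", "ST LOUS", "NST LOUIS"]:
    print(s, matches_rule(s))
```

With this sketch, the first three inputs are accepted and NST LOUIS is rejected, which is the result I would like to get from a single regex pattern.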