Making exceptions to fuzzy matching with the new Python regex module
I am testing the new Python regex module, which allows fuzzy string matching, and so far I am impressed with its capabilities. However, I am having trouble making certain exceptions to a fuzzy match. Below is an example. I want ST LOUIS, and all variations of ST LOUIS within an edit distance of 1, to match ref. However, I want to make one exception to this rule: the edit may not be an insertion of the letter N, S, E, or W to the left of the leftmost character. In the following example, I want inputs 1 - 3 to match ref and input 4 to fail. However, the following ref matches all four inputs. Does anyone familiar with the new regex module know of a possible workaround?
>>> import regex
>>> input1 = 'ST LOUIS'
>>> input2 = 'AST LOUIS'
>>> input3 = 'ST LOUS'
>>> input4 = 'NST LOUIS'
>>> ref = '([^NSEW]|(?<=^))(ST LOUIS){e<=1}'
>>> match = regex.fullmatch(ref, input1)
>>> match
<_regex.Match object at 0x1006c6030>
>>> match = regex.fullmatch(ref, input2)
>>> match
<_regex.Match object at 0x1006c6120>
>>> match = regex.fullmatch(ref, input3)
>>> match
<_regex.Match object at 0x1006c6030>
>>> match = regex.fullmatch(ref, input4)
>>> match
<_regex.Match object at 0x1006c6120>
Try using a negative lookahead instead:
(?![NEW]|SS)(ST LOUIS){e<=1}
(ST LOUIS){e<=1} matches a string that satisfies the fuzzy conditions placed on it. You want that string not to start with [NSEW], and a negative lookahead, (?![NSEW]), does that for you. But the string you want already starts with S, so you only want to exclude strings where an S has been added to the beginning. Such a string starts with SS, which is why SS is added to the negative lookahead in place of a bare S.
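As a quick check, the pattern above can be run against the four inputs from the question. This is a minimal sketch, assuming the third-party regex module is installed (for example with pip install regex). The names inputs and results are illustrative and do not come from the original post.

```python
import regex  # third-party module, not the standard library's re

# Pattern from the answer: the negative lookahead rejects strings that
# start with N, E, W, or SS before the fuzzy group is tried.
ref = r'(?![NEW]|SS)(ST LOUIS){e<=1}'

# The four inputs from the question.
inputs = ['ST LOUIS', 'AST LOUIS', 'ST LOUS', 'NST LOUIS']

# regex.fullmatch returns a match object on success and None otherwise.
results = {text: regex.fullmatch(ref, text) is not None for text in inputs}

for text, matched in results.items():
    print(f'{text!r}: {"match" if matched else "no match"}')
```

If the pattern behaves as described, the first three inputs should match and 'NST LOUIS' should not, because the lookahead fails on the leading N before any fuzzy matching takes place.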
Note that if you allow more than one error (e > 1), this probably won't work as desired.