Make sure the regex subpattern doesn't contain the previous subpattern?
I am wondering if there is a way to check if the subpattern matches a given sequence so that I can block it.
For example, let's say that I wanted to capture everything except a repeat of an earlier capture. So if I had the sentence [word plus word], the next capture would have to match everything ( plus ) up to the second occurrence of word.
(\w+)[^\1]+
The first capture, (\w+), matches word. The second part, [^...], tries to exclude it (it was previously captured as \1), but a character class only works on individual characters, not on subpattern captures.
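To illustrate the problem, here is a minimal sketch in Python. The choice of Python's re module is an assumption, since no regex flavor is named here. In that module, \1 inside a character class is read as the character with code 1 rather than as a backreference, so the class excludes nothing useful and the match runs through the second word.

```python
import re

text = "word plus word"

# Inside [...], Python's re treats \1 as the character with code 1,
# not as a reference to capture group 1.
attempt = re.compile(r"(\w+)[^\1]+")

m = attempt.match(text)
whole = m.group(0)   # the entire match
first = m.group(1)   # what (\w+) captured

print(repr(whole))
print(repr(first))
```

The whole match still includes the second word, which is exactly what the pattern was meant to prevent.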
Is there any way to do this?
You can use a pattern like this:
(\w+)(?:(?!\1).)*
This uses a negative lookahead to assert, at each character, that the previously matched word does not start there, so the repeated word is never included in the match.
You could use a lazy quantifier with a lookahead, for example:
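As a rough sketch of how this behaves, assuming Python's re module (the flavor is not stated in the question), the example below wraps the repeated part in an extra capture group so the middle text can be inspected. That second group is an addition for illustration only and is not part of the pattern above.

```python
import re

text = "word plus word"

# Group 1: the first word.
# Group 2 (added here for illustration): every character at which
# the text of group 1 does not begin.
pattern = re.compile(r"(\w+)((?:(?!\1).)*)")

m = pattern.match(text)
first = m.group(1)
between = m.group(2)

print(repr(first))
print(repr(between))
```

The match stops right before the second occurrence of the captured word. If the word never repeats, the pattern simply runs to the end of the line.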
(\w+).*?(?=\1)
You can also surround the \w+ with word boundaries, like this:
\b(\w+)\b.*?(?=\1)
so that you don't match inside a single word such as hello, where one l would be captured and the l after it would count as its repeat.
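A small comparison of the two variants, again assuming Python's re module, and using hypothetical variable names (loose, strict) chosen only for this sketch:

```python
import re

# Without word boundaries: a capture may start and end inside a word.
loose = re.compile(r"(\w+).*?(?=\1)")
# With word boundaries: the capture must be a whole word.
strict = re.compile(r"\b(\w+)\b.*?(?=\1)")

loose_hello = loose.search("hello")
strict_hello = strict.search("hello")
strict_sentence = strict.search("word plus word")

# The loose pattern finds a match inside "hello".
print(loose_hello.group(0), loose_hello.span())
# The strict pattern finds nothing in "hello".
print(strict_hello)
# On the sentence, the strict pattern stops before the repeated word.
print(repr(strict_sentence.group(0)))
```

Note that the lookahead itself is not wrapped in word boundaries, so the repeat may still be found at the start of a longer word. Whether that matters depends on the input.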