Exclude words that contain a match for my regex but are not an exact match
I am trying to find a way to exclude words that contain a match for my regex but are not exactly that match, using the search method of the tkinter Text widget. For example, suppose I have the regex "(if)|(def)": words like define, definition or elif are all found by re.search, but I want the regex to find exactly if and def.
This is the code I am using:
import keyword
PY_KEYS = keyword.kwlist
PY_PATTERN = "^(" + ")|(".join(PY_KEYS) + ")$"
But it still accepts words like define, whereas I only want it to accept words like def, even though define contains def.
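One likely reason the anchored pattern still accepts define is operator precedence: alternation binds more loosely than the anchors, so ^ applies only to the first keyword and $ only to the last. The sketch below uses Python's re module (not the Tcl engine behind Text.search) to illustrate this; the names loose and strict are made up for the example.

```python
import keyword
import re

PY_KEYS = keyword.kwlist

# Pattern from the question: "^" binds only to the first alternative and
# "$" only to the last, so "(def)" in the middle is not anchored at all.
loose = "^(" + ")|(".join(PY_KEYS) + ")$"

# Grouping the whole alternation makes both anchors apply to every keyword.
strict = "^(?:" + "|".join(PY_KEYS) + ")$"

for word in ["def", "define", "definition"]:
    # Prints the word, then whether each pattern finds a match in it.
    print(word, bool(re.search(loose, word)), bool(re.search(strict, word)))
```

With the grouped form, only a string that is exactly one keyword should match. This checks one whole string at a time, so it does not by itself find keywords inside a longer text.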
I need this to highlight words in a tkinter.Text widget. The function responsible for highlighting the code is:
def highlight(self, event, pattern='', tag=KW, start=1.0, end="end", regexp=True):
    """Apply the given tag to all text that matches the given pattern

    If 'regexp' is set to True, pattern will be treated as a regular
    expression.
    """
    if not isinstance(pattern, str) or pattern == '':
        pattern = self.syntax_pattern  # PY_PATTERN
    # print(pattern)
    start = self.index(start)
    end = self.index(end)
    self.mark_set("matchStart", start)
    self.mark_set("matchEnd", start)
    self.mark_set("searchLimit", end)
    count = tkinter.IntVar()
    while pattern != '':
        index = self.search(pattern, "matchEnd", "searchLimit",
                            count=count, regexp=regexp)
        # prints nothing
        print(self.search(pattern, "matchEnd", "searchLimit",
                          count=count, regexp=regexp))
        if index == "":
            break
        self.mark_set("matchStart", index)
        self.mark_set("matchEnd", "%s+%sc" % (index, count.get()))
        self.tag_add(tag, "matchStart", "matchEnd")
On the other hand, if PY_PATTERN = "\\b(" + "|".join(PY_KEYS) + ")\\b", nothing is highlighted at all, and the print inside the function shows that search returns an empty string.
The other answers are written for Python regexes, but I found that the search method of the tkinter Text widget actually uses Tcl regex syntax.
In this case, instead of wrapping a word or regex with \b (or \\b if we are not using a raw string), we can simply use the corresponding Tcl word-boundary escape, i.e. \y (or \\y), which did the job in my case.
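As a minimal sketch of that idea, the keyword pattern can be built with \y on both sides and passed to Text.search with regexp=True. The helper names build_tcl_keyword_pattern and highlight_keywords, and the default tag name "keyword", are hypothetical and not taken from the question's code.

```python
import keyword


def build_tcl_keyword_pattern(words=keyword.kwlist):
    """Return a Tcl-style regex that matches any of `words` as a whole word."""
    # \y is the Tcl word-boundary escape; (?:...) groups without capturing.
    return r"\y(?:" + "|".join(words) + r")\y"


def highlight_keywords(text_widget, pattern, tag="keyword"):
    """Tag every match of `pattern` in a tkinter.Text widget (hypothetical helper)."""
    import tkinter  # imported here so the pattern builder works without a GUI

    count = tkinter.IntVar(master=text_widget)
    index = "1.0"
    while True:
        # With a stop index given, search does not wrap around the text.
        index = text_widget.search(pattern, index, "end",
                                   count=count, regexp=True)
        if index == "" or count.get() == 0:
            break
        match_end = "%s+%sc" % (index, count.get())
        text_widget.tag_add(tag, index, match_end)
        index = match_end  # continue searching after this match
```

A possible call would be highlight_keywords(text, build_tcl_keyword_pattern()), assuming text is an existing Text widget whose tag has already been configured with tag_configure.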
See my other question for more information.
You can use anchors:
"^(?:if|def)$"
^ asserts the position at the start of the line and $ asserts the position at the end of the line, so nothing matches unless the whole line is exactly if or def.
>>> import re
>>> for foo in ["if", "elif", "define", "def", "in"]:
...     bar = re.search("^(?:if|def)$", foo)
...     print(foo, ' ', bar)
...
if   <_sre.SRE_Match object at 0x934daa0>
elif   None
define   None
def   <_sre.SRE_Match object at 0x934daa0>
in   None
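The anchored pattern tests one whole string at a time. To pick whole words out of a longer string with Python's re module (again, not the Tcl engine used by Text.search), word boundaries are the usual tool. A small sketch, with a made-up sample string:

```python
import re

# Hypothetical sample text containing exact keywords and look-alikes.
sample = "if define def elif"

# \b matches between a word character and a non-word character (or the
# string edge), so "def" inside "define" and "if" inside "elif" are skipped.
whole_words = re.findall(r"\b(?:if|def)\b", sample)
print(whole_words)
```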