Ignore optional suffix without leading delimiter

I would like to capture the first part of a word while ignoring an optional suffix. Both the suffix and the preceding text consist of the same character class (i.e., there is no delimiter before the suffix).

My first attempt only captures the first letter:

>>> import re
>>> m = re.search(r'([A-Za-z]+?)(?:Suff)?', 'textSuff')
>>> m.groups()
('t',)

      

I only want to grab "text", but when I make the quantifier in the first group greedy, it grabs the whole string, suffix included.

>>> m = re.search(r'([A-Za-z]+)(?:Suff)?', 'textSuff')
>>> m.groups()
('textSuff',)

      

Is it possible to exclude the suffix when no distinct character separates it from the rest of the word?



2 answers


If your pattern ends in optional subpatterns, a lazy quantifier will match as few characters as possible, because nothing after it is required. So the pattern needs at least some boundary at the end. I am assuming a word boundary \b is a valid choice here (you are matching whole words):

([A-Za-z]+?)(?:Suff)?\b

      

IDEONE demo:



import re

# Lazy stem, optional non-captured suffix, then a word boundary.
p = re.compile(r'([A-Za-z]+?)(?:Suff)?\b')
test_str = "textSuff more words tSuff"
print(p.findall(test_str))

      

Outputs:

['text', 'more', 'words', 't']
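
One caveat that may be worth checking, shown here as a small sketch: \b treats digits and underscores as word characters, so a word followed directly by a digit has no boundary at that point. The input strings below are made up for illustration and are not from the question.

```python
import re

p = re.compile(r'([A-Za-z]+?)(?:Suff)?\b')

# Punctuation, whitespace, and the end of the string all form a word boundary.
with_punct = p.findall('textSuff, textSuff.')
print(with_punct)  # ['text', 'text']

# A digit is a word character, so there is no \b between 'f' and '1'
# and the pattern finds nothing in this string.
with_digit = p.findall('textSuff1')
print(with_digit)  # []
```

If the words can be followed by digits or underscores, an explicit check for "not a letter" may fit better than \b.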

      



You need to specify that the match must be followed either by the end of the string or by a character that is not allowed in the word:



m = re.search(r'([A-Za-z]+?)(?:Suff)?(?:[^A-Za-z]|$)', 'textSuff')
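
As a rough sketch of how this pattern behaves, the snippet below wraps it in a small helper. The name `stem` and the test strings other than 'textSuff' are assumptions made for illustration, not part of the answer.

```python
import re

# Lazy stem, optional suffix, then either a non-letter or the end of the string.
pattern = re.compile(r'([A-Za-z]+?)(?:Suff)?(?:[^A-Za-z]|$)')

def stem(text):
    """Return the captured stem of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(stem('textSuff'))    # text
print(stem('text'))        # text
print(stem('tSuff more'))  # t
print(stem('textSuff1'))   # text
```

Note that the final group consumes the character after the word. If that matters, for example when collecting every word with findall, a lookahead such as (?=[^A-Za-z]|$) should check the same condition without consuming anything.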

      
