Incorrect content replacement

Question

I am trying to replace the term brunch with BRUNCH, but only in sentences that contain any of the following words: Saturday, Sunday, and/or weekend. However, my substitution replaces the whole sentence and not just the term brunch.

>>> reg = re.compile(r'(?:(?:^|\.)[^.]*(?=saturday|sunday|weekend)[^.]*(brunch)[^.]*(?:\$|\.)|(?:^|\.)[^.]*(brunch)[^.]*(?=saturday|sunday|weekend)[^.]*(?:\$|\.))',re.I)
>>> str = 'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday brunch. Not valid on federal holidays. Reservation required.'
>>> reg.findall(str)
[('brunch', '')]
>>> reg.sub(r'BRUNCH',str)
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash backBRUNCH Not valid on federal holidays. Reservation required.'

I want it to create the following:

Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other
offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH. 
Not valid on federal holidays. Reservation required.

Answer:

To solve this problem, I was able to use the following:

>>> reg = re.compile(r'(?:((?:^|\.)[^.]*(?=saturday|sunday|weekend)[^.]*)(brunch)([^.]*(?:\$|\.))|((?:^|\.)[^.]*)(brunch)([^.]*(?=saturday|sunday|weekend)[^.]*(?:\$|\.)))',re.I)
>>> reg.sub(r'\g<1>BRUNCH\g<3>',str)
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH. Not valid on federal holidays. Reservation required.'

+3

python regex

user2694306 Dec 21 '14 at 9:46

4 answers

Instead of using a regular expression, it's easier to break it down into steps:

s = "Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday brunch. Not valid on federal holidays. Reservation required."
results = []
for line in s.split("."):
    if any(text in line.lower() for text in ("saturday", "sunday", "weekend")):
        results.append(line.replace("brunch", "BRUNCH"))
    else:
        results.append(line)
result = ".".join(results)
print(result)

+3

twasbrillig Dec 21 '14 at 9:57

Keep the regex simple and use a backreference instead:

reg = re.compile(r'((?:saturday|sunday|weekend)\s+)brunch', re.I)
reg.sub(r'\1BRUNCH',str)
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other
 offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH.
 Not valid on federal holidays. Reservation required.'
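
One thing to keep in mind with this pattern is that it only matches when the day word sits directly before brunch, separated by whitespace. A minimal sketch of that behaviour (the sample sentences and variable names are made up for illustration and are not from the question):

```python
import re

# Same pattern as above: a day word, whitespace, then "brunch".
adjacent = re.compile(r'((?:saturday|sunday|weekend)\s+)brunch', re.I)

# Hypothetical sample sentences, not from the original question.
next_to_day = 'Valid for Saturday-Sunday brunch.'
far_from_day = 'Brunch is served on Sunday.'

# The day word directly precedes "brunch", so the pattern matches.
replaced = adjacent.sub(r'\1BRUNCH', next_to_day)
# The day word is elsewhere in the sentence, so nothing is replaced.
untouched = adjacent.sub(r'\1BRUNCH', far_from_day)

print(replaced)
print(untouched)
```

So this works for the sentence in the question, but it would probably miss a sentence where brunch and the day word are further apart.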

+1

anubhava Dec 21 '14 at 9:57

You don't have to use a regex for everything; you can split the text into sentences, process each one separately, and use a list comprehension instead:

>>> import re
>>> l = s.split('.')
>>> '.'.join([re.sub('brunch', 'BRUNCH', sent) if any(w in sent.lower() for w in ('saturday', 'sunday', 'weekend')) else sent for sent in l])
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH. Not valid on federal holidays. Reservation required.'

0

Kasramvd Dec 21 '14 at 10:03

Aran-Fey · Accepted Answer · 2014-12-21T10:09:06+0000

Since you are forced to use a regex:

Search

((?:^|\.)(?=[^.]*(?:saturday|sunday|weekend))[^.]*)brunch

Replace

\1BRUNCH

Make sure you compile it as case-insensitive (re.I).
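
In Python, the search and replace above could be applied roughly like this. It is only a sketch that reuses the sentence from the question; the names pattern, text and result are illustrative:

```python
import re

# Sentence start (^ or a period), a lookahead that requires a day word
# somewhere before the next period, then everything up to "brunch".
pattern = re.compile(
    r'((?:^|\.)(?=[^.]*(?:saturday|sunday|weekend))[^.]*)brunch',
    re.I,
)

text = ('Limit 1 per person. Limit 1 per table. Not valid for carryout. '
        'Not valid with any other offers, no cash back. '
        'Valid only for Wednesday-Friday dinner and Saturday-Sunday brunch. '
        'Not valid on federal holidays. Reservation required.')

# \1 puts back the part of the sentence captured before "brunch".
result = pattern.sub(r'\1BRUNCH', text)
print(result)
```

Because the captured prefix is written back with \1, only the word itself changes and the rest of the sentence is left alone.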

Please note that this only replaces one occurrence of brunch per sentence.
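
If several occurrences per sentence need replacing, one possible workaround is to let re.sub call a function for each sentence rather than rely on a single pattern. This is a sketch of a different technique from the regex above; the names capitalize_brunch and fix_sentence are hypothetical, and a sentence is assumed to be any run of characters that contains no period:

```python
import re

DAY_WORDS = re.compile(r'saturday|sunday|weekend', re.I)
BRUNCH = re.compile(r'brunch', re.I)
# Assumption: a "sentence" is any run of characters without a period.
SENTENCE = re.compile(r'[^.]+')


def capitalize_brunch(text):
    """Upper-case every 'brunch' in sentences that mention a day word."""
    def fix_sentence(match):
        sentence = match.group(0)
        if DAY_WORDS.search(sentence):
            # Replace all occurrences in this sentence, not just one.
            return BRUNCH.sub('BRUNCH', sentence)
        return sentence

    # The periods are not part of any match, so they stay in place.
    return SENTENCE.sub(fix_sentence, text)


# Hypothetical sample input with two occurrences in one sentence.
sample = 'Sunday brunch and more brunch. Monday brunch.'
print(capitalize_brunch(sample))
```

Since the word pattern is also compiled with re.I, a capitalized Brunch in a matching sentence would become BRUNCH as well.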
