Search in string and get 2 words before and after match in Python
I am using Python to search for some words (also multi-token phrases) in a description (string).
For this I am using a regex like this:
import re
result = re.search(word, description, re.IGNORECASE)
if result:
    print("Found: " + result.group())
But I need to get the first 2 words before and after the match. For example, if I have something like this:
Parking here is horrible, this shop sucks.
"here is" is the phrase I'm looking for. So after matching it with my regex, I need the 2 words (if they exist) before and after the match.
In the example: Parking here is horrible, this
"Parking" and "horrible, this" are the words I need.
ATTENTION: The description could be very long, and the pattern "here is" may appear multiple times.
I would do it like this (edit: added anchors for most cases):
(\S+\s+|^)(\S+\s+|)here is(\s+\S+|)(\s+\S+|$)
This way, you will always have 4 groups (which may need to be trimmed), with the following behavior:
- If group 1 is empty, there was no word before (group 2 is also empty)
- If group 2 is empty, there was only one word before (group 1)
- If groups 1 and 2 are not empty, they are the words before, in order
- If group 3 is empty, there were no words after
- If group 4 is empty, there was only one word after (group 3)
- If groups 3 and 4 are not empty, they are the words after, in order
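To show how the four groups can be read back in Python, here is a minimal sketch. The helper name words_around is made up for illustration, and the search phrase "here is" is hard-coded as in the pattern above. It only looks at the first match.

```python
import re

# The pattern from this answer, with "here is" as the hard-coded phrase.
CONTEXT_RE = re.compile(
    r"(\S+\s+|^)(\S+\s+|)here is(\s+\S+|)(\s+\S+|$)", re.IGNORECASE
)

def words_around(description):
    """Return (before, after) word lists for the first match, or None.

    words_around is a hypothetical helper name, not part of any library.
    """
    m = CONTEXT_RE.search(description)
    if m is None:
        return None
    # Every group always participates in the match, so none of them is None;
    # strip() removes the whitespace captured next to each word.
    g1, g2, g3, g4 = (g.strip() for g in m.groups())
    before = [w for w in (g1, g2) if w]  # drop empty groups
    after = [w for w in (g3, g4) if w]
    return before, after

first = words_around("Parking here is horrible, this shop sucks.")
print(first)
```

For the example sentence, group 1 holds "Parking", group 2 is empty, and groups 3 and 4 hold "horrible," and "this".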
Based on your explanation, it gets a little tricky. The solution below addresses scenarios where the pattern you are looking for might actually also be in the previous two or the next two words.
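import re

line = "Parking here is horrible, here is great here is mediocre here is here is "
print(line)
pattern = "here is"
r = re.search(pattern, line, re.IGNORECASE)
output = []
if r:
    while line:
        # Split at the next occurrence; the remainder becomes the new line.
        # Note: str.partition is case-sensitive, unlike the re.search above.
        before, match, line = line.partition(pattern)
        if match:
            if not output:
                before = before.split()[-2:]
            else:
                # The previous match was consumed, so put the pattern back
                # in front: it may be part of the two preceding words.
                before = ' '.join([pattern, before]).split()[-2:]
            after = line.split()[:2]
            output.append((before, after))
print(output)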
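The result from my example would be:
[(['Parking'], ['horrible,', 'here']), (['is', 'horrible,'], ['great', 'here']), (['is', 'great'], ['mediocre', 'here']), (['is', 'mediocre'], ['here', 'is']), (['here', 'is'], [])]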
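Since the question mentions that the phrase may appear several times and in any letter case, another way to sketch this is with re.finditer, slicing the text around each match. This is only an illustration, not the method from either answer; the function name context_words and the parameter n are assumptions made up for the example.

```python
import re

def context_words(description, phrase, n=2):
    """Yield (before, match, after) for every occurrence of phrase.

    before and after hold up to n whitespace-separated words.
    context_words is a hypothetical helper, not a library function.
    """
    # re.escape keeps characters such as "." or "?" in the phrase literal.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    for m in pattern.finditer(description):
        before = description[:m.start()].split()[-n:]
        after = description[m.end():].split()[:n]
        yield before, m.group(), after

text = "Parking here is horrible, this shop sucks. Here is great too."
results = list(context_words(text, "here is"))
for before, match, after in results:
    print(before, match, after)
```

Two caveats with this sketch: it also matches inside longer words (for example "where is"), and it re-splits the text before and after every match, which may be slow on a very long description with many occurrences.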