Regex: Extract and Match Specific words Between Two Characters

Question

I need to extract, from a string, the words that match (Way, Road, Str and Street), together with the words before and after each one, up to a comma, a digit, or the word "Off" in front.

Line examples:
1. Yeet Road, Off Mandy Plant Way, Mando GRA.
2. 3A, Sleek Drive, Off Tremble Rake Street.
3. 57 Radish Slist Road Ikoyi

The result should be as close to:

Yeet road
Mandy Plant Way
Tremble Rake Street
Radish Slist Road Ikoyi

Based on some stack answers, this is what I have:
(?<=\,)(.*Way|Road|Str|Street?)(?=\,)

Any help would be appreciated.

+3

php regex

Jonathan Itakpe 18 jul. 17 at 9:48 am

2 answers

You can use

^\d+\s*(*SKIP)(*F)|\b[^,]*\b(?:way|r(?:oa)?d|str(?:eet)?)\b[^,]*\b

See regex demo

Details:

- ^\d+\s*(*SKIP)(*F) - matches 1 or more leading digits followed by 0+ whitespace characters at the start of the line, then discards that match and resumes the search after it
- | - or
- \b[^,]*\b(?:way|r(?:oa)?d|str(?:eet)?)\b[^,]*\b - matches any 0+ non-comma characters, then one of the alternatives in the non-capturing group as a whole word, then any 0+ non-comma characters again; the whole subpattern is enclosed in word boundaries so that leading/trailing punctuation and spaces are not matched.
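As a rough illustration, the pattern above could be applied from PHP like this. It is only a sketch: the helper name extractStreetsSkipDigits, the case-insensitive flag, and the sample lines (the question's examples without their list numbers) are assumptions. Each line is matched on its own because [^,]* also matches line breaks.

```php
<?php
// Pattern from this answer; the trailing "i" (case-insensitive) flag is an assumption.
const SKIP_DIGITS_PATTERN = '/^\d+\s*(*SKIP)(*F)|\b[^,]*\b(?:way|r(?:oa)?d|str(?:eet)?)\b[^,]*\b/i';

/**
 * Hypothetical helper: returns every match found in one address line.
 */
function extractStreetsSkipDigits(string $line): array
{
    preg_match_all(SKIP_DIGITS_PATTERN, $line, $matches);
    return $matches[0]; // full-match strings only
}

// Sample lines from the question, list numbers dropped (assumption).
$lines = [
    'Yeet Road, Off Mandy Plant Way, Mando GRA.',
    '3A, Sleek Drive, Off Tremble Rake Street.',
    '57 Radish Slist Road Ikoyi',
];

foreach ($lines as $line) {
    echo implode(' | ', extractStreetsSkipDigits($line)), "\n";
}
```

Note that this pattern has nothing that excludes "Off", so for these lines it should return "Off Mandy Plant Way" and "Off Tremble Rake Street" rather than the shorter forms the question asks for.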

+1

Wiktor Stribiżew 18 jul. 17 at 10:26

Casimir et Hippolyte · Accepted Answer · 2017-07-18T10:42:57+0000

You can try something like this (with the ignore_case flag):

\b(?:(?!off\b)[a-z]+[^\w,\n]+)*?\b(?:way|road|str(?:eet)?)\b(?:[^\w,\n]+[a-z]+)*

demo
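A minimal PHP sketch of how this pattern might be used, assuming the question's sample lines without their list numbers; the helper name extractStreets is hypothetical. Because the pattern excludes \n from the separators, it can be run over a multi-line string directly.

```php
<?php
// Pattern from this answer; the trailing "i" is the ignore_case flag it asks for.
const STREET_PATTERN = '/\b(?:(?!off\b)[a-z]+[^\w,\n]+)*?\b(?:way|road|str(?:eet)?)\b(?:[^\w,\n]+[a-z]+)*/i';

/**
 * Hypothetical helper: returns all street-like matches in a block of text.
 */
function extractStreets(string $text): array
{
    preg_match_all(STREET_PATTERN, $text, $matches);
    return $matches[0]; // full-match strings only
}

// Sample input from the question, list numbers dropped (assumption).
$text = "Yeet Road, Off Mandy Plant Way, Mando GRA.\n"
      . "3A, Sleek Drive, Off Tremble Rake Street.\n"
      . "57 Radish Slist Road Ikoyi";

echo implode("\n", extractStreets($text)), "\n";
// Prints:
// Yeet Road
// Mandy Plant Way
// Tremble Rake Street
// Radish Slist Road Ikoyi
```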

However, patterns of this kind, which begin by describing an unknown substring of unknown length before the literal parts of the pattern (the keywords), are inefficient. This does not matter for short strings, but you cannot use them on a large string.

To exclude certain words, you can change (?!off\b) to (?!off\b|word1\b|word2\b|...)
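One possible way to keep the excluded words configurable is to build the lookahead from an array. This is only a sketch: buildStreetPattern is a hypothetical helper, and "mandy" is an arbitrary extra word chosen for illustration. It assumes the list contains at least one word.

```php
<?php
/**
 * Hypothetical helper: builds the pattern above with a configurable list
 * of words that must not appear among the words before the keyword.
 * Assumes $excludedWords is not empty.
 */
function buildStreetPattern(array $excludedWords): string
{
    // Escape each word so it is treated literally inside the regex.
    $alternatives = implode('|', array_map(
        fn (string $word): string => preg_quote($word, '/'),
        $excludedWords
    ));

    // (?!(?:off|word1)\b) is equivalent to (?!off\b|word1\b).
    return '/\b(?:(?!(?:' . $alternatives . ')\b)[a-z]+[^\w,\n]+)*?'
         . '\b(?:way|road|str(?:eet)?)\b(?:[^\w,\n]+[a-z]+)*/i';
}

// "mandy" is an illustrative extra exclusion, not part of the original answer.
$pattern = buildStreetPattern(['off', 'mandy']);
preg_match_all($pattern, 'Yeet Road, Off Mandy Plant Way, Mando GRA.', $matches);

echo implode("\n", $matches[0]), "\n";
// Prints:
// Yeet Road
// Plant Way
```

The \b after the group keeps the exclusion to whole words, so "Mando" is not affected by excluding "mandy".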

Also, you need to clarify what characters are allowed or not between words.
