Regex matches multiple numbers after keywords
I have a question about writing Regex in Python.
Line:
abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11.
My goal is to use a single regex to capture all the numbers.
I also want to put the numbers into different groups: 2.3, 4.5, 6.7, 8.9 should be under the group abc rules, and 3.6, 6.7, 8.9 and 10.11 should be under def rules.
I tried the regex
(?<=abc rules) \d{1,2}.\d{1,2}
to capture all the numbers after abc rules, but it only matches the first number.
How can I achieve my goal?
Thanks everyone!
You can use:
import re
rx = r"\b(?:abc|def)\s+rules\s+(\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*)"
s = "abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11."
print([re.split(r'\s*(?:,|\band\b)\s*', x) for x in re.findall(rx, s)])
# => [['2.3', '4.5', '6.7', '8.9'], ['3.6', '6.7', '8.9', '10.11']]
See Python Demo
The idea is to match each substring that contains the numbers, capture only the list of numbers, and then split that captured list with the \s*(?:,|\band\b)\s* regex.
This matches all substrings:
\b(?:abc|def)\s+rules\s+(\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*)
See regex demo
More details
- \b - a word boundary
- (?:abc|def) - either abc or def
- \s+ - 1 or more whitespace characters
- rules - the substring rules
- \s+ - 1 or more whitespace characters
- (\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*) - capturing group 1:
  - \d*\.*?\d+ - an int or float number
  - (?:(?:,|\s*and)\s*\d*\.*?\d+)* - zero or more sequences of:
    - (?:,|\s*and) - a comma, or 0+ whitespace characters followed by and
    - \s* - 0+ whitespace characters
    - \d*\.*?\d+ - an int or float number
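As a rough sketch, the same breakdown can be written into the pattern itself with re.VERBOSE, so that each part carries its own comment. The pattern is unchanged from the one above; only the layout, the comments, and the variable name lists are my own additions.

```python
import re

# Same pattern as above, laid out with re.VERBOSE so every part is commented.
# In verbose mode, whitespace inside the pattern is ignored and "#" starts a comment.
rx = re.compile(r"""
    \b(?:abc|def)        # either keyword, starting at a word boundary
    \s+rules\s+          # the word "rules" surrounded by whitespace
    (                    # group 1: the whole list of numbers
      \d*\.*?\d+         #   first int or float number
      (?:                #   zero or more further items:
        (?:,|\s*and)     #     a comma, or optional whitespace then "and"
        \s*              #     optional whitespace
        \d*\.*?\d+       #     another int or float number
      )*
    )
""", re.VERBOSE)

s = "abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11."

# With a single capturing group, findall returns the group 1 text of each match.
lists = rx.findall(s)
print(lists)
# => ['2.3, 4.5, 6.7, 8.9', '3.6, 6.7, 8.9 and 10.11']
```

Note that the first list stops at 8.9: the following " and " is tried, but since no number comes after it, the regex engine backtracks and leaves it out of group 1.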
The regular expression \s*(?:,|\band\b)\s* matches a comma or the whole word and, together with any surrounding whitespace (0 or more characters on each side).
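If the numbers should also stay attached to their keyword (abc rules vs. def rules), one possible variation is to capture the keyword phrase as well and build a dict. This is only a sketch: the extra capturing group around the keyword and the names splitter and groups are my own additions, not part of the pattern above.

```python
import re

# Group 1 (added for this sketch): the keyword phrase, e.g. "abc rules".
# Group 2: the list of numbers, exactly as in the original pattern.
rx = re.compile(
    r"\b((?:abc|def)\s+rules)\s+(\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*)"
)
# Splits a captured list on commas or the whole word "and".
splitter = re.compile(r"\s*(?:,|\band\b)\s*")

s = "abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11."

# Map each keyword phrase to its own list of number strings.
groups = {m.group(1): splitter.split(m.group(2)) for m in rx.finditer(s)}
print(groups)
# => {'abc rules': ['2.3', '4.5', '6.7', '8.9'], 'def rules': ['3.6', '6.7', '8.9', '10.11']}
```

A dict keeps only the last list if the same keyword phrase occurs more than once in the input; collecting into a list of (keyword, numbers) pairs would avoid that.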