Regex matches multiple numbers after keywords

I have a question about writing Regex in Python.

Line:

abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11.

      

My goal is to use a single-line regex to capture all the numbers.

Also, I want to put the numbers into different groups: 2.3, 4.5, 6.7, 8.9 should be under the group abc rules, while 3.6, 6.7, 8.9 and 10.11 should be under def rules.

I am trying to use the regex (?<=abc rules) \d{1,2}.\d{1,2} to capture all the numbers after abc rules, but I only get the first number.

How can I achieve my goal?

Thanks everyone!



1 answer


You can use:

import re
rx = r"\b(?:abc|def)\s+rules\s+(\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*)"
s = "abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11."
print([re.split(r'\s*(?:,|\band\b)\s*', x) for x in re.findall(rx, s)])
# => [['2.3', '4.5', '6.7', '8.9'], ['3.6', '6.7', '8.9', '10.11']]


The idea is to match the substrings that contain the numbers, capture only the number-list part of each match, and then split each captured list with the \s*(?:,|\band\b)\s* regex.
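If the numbers also need to be stored under their keyword, as the question asks, one possible variation is to capture the keyword in a group of its own. The following is only a sketch: the extra capturing group, the `splitter` name, and the `groups` dictionary are assumptions added for illustration and are not part of the answer's code.

```python
import re

# Assumption: the keyword is captured in group 1 so it can be used as a key;
# group 2 is the same number-list pattern as in the answer.
rx = re.compile(r"\b(abc|def)\s+rules\s+(\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*)")
splitter = re.compile(r"\s*(?:,|\band\b)\s*")

s = "abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11."

groups = {}
# With two capturing groups, findall returns (keyword, numbers) tuples.
for keyword, numbers in rx.findall(s):
    groups.setdefault(keyword + " rules", []).extend(splitter.split(numbers))

print(groups)
# => {'abc rules': ['2.3', '4.5', '6.7', '8.9'], 'def rules': ['3.6', '6.7', '8.9', '10.11']}
```

Using `setdefault(...).extend(...)` means a keyword that appears more than once in the input accumulates all of its numbers in one list.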

This matches all substrings:



\b(?:abc|def)\s+rules\s+(\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*)


More details

  • \b - word boundary
  • (?:abc|def) - either abc or def
  • \s+ - 1 or more whitespace characters
  • rules - the literal substring rules
  • \s+ - 1 or more whitespace characters
  • (\d*\.*?\d+(?:(?:,|\s*and)\s*\d*\.*?\d+)*) - capturing group 1:
    • \d*\.*?\d+ - an int or float number
    • (?:(?:,|\s*and)\s*\d*\.*?\d+)* - zero or more sequences of:
      • (?:,|\s*and) - a comma, or 0+ whitespace characters followed by and
      • \s* - 0+ whitespace characters
      • \d*\.*?\d+ - an int or float number
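The same breakdown can be written directly into the pattern with the re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments. This is a sketch of the answer's regex laid out that way. The layout and comments are an illustration, and the pattern itself is unchanged.

```python
import re

rx = re.compile(r"""
    \b(?:abc|def)          # word boundary, then either keyword
    \s+rules\s+            # the word "rules" with whitespace on both sides
    (                      # group 1: the whole list of numbers
        \d*\.*?\d+         #   first int or float number
        (?:                #   zero or more further items:
            (?:,|\s*and)   #     a comma, or optional whitespace then "and"
            \s*            #     optional whitespace
            \d*\.*?\d+     #     next int or float number
        )*
    )
""", re.VERBOSE)

s = "abc rules 2.3, 4.5, 6.7, 8.9 and def rules 3.6, 6.7, 8.9 and 10.11."

# With a single capturing group, findall returns the text of group 1 per match.
matches = rx.findall(s)
print(matches)
# => ['2.3, 4.5, 6.7, 8.9', '3.6, 6.7, 8.9 and 10.11']
```

Note that the first match stops before " and def": after " and " the pattern requires a number, so the group ends at 8.9.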

The regular expression \s*(?:,|\band\b)\s* used for splitting matches a comma or the whole word and, together with any whitespace (0 or more characters) on either side.
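To see the splitting step in isolation, here is a small sketch. The second input string ("sand, 4 and 5") is a made-up example, chosen only to show why the word boundaries around and matter.

```python
import re

splitter = re.compile(r"\s*(?:,|\band\b)\s*")

# Splitting one captured number list from the answer's example.
parts = splitter.split("3.6, 6.7, 8.9 and 10.11")
print(parts)
# => ['3.6', '6.7', '8.9', '10.11']

# Hypothetical input: with \b, the "and" inside "sand" is left alone.
whole_word = splitter.split("sand, 4 and 5")
print(whole_word)
# => ['sand', '4', '5']

# Without the word boundaries, "sand" is cut apart.
no_boundary = re.split(r"\s*(?:,|and)\s*", "sand, 4 and 5")
print(no_boundary)
# => ['s', '', '4', '5']
```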
