Python 3 regex - find the start and end indices of all overlapping matches in a string

This was my original approach:

import re

string = '1' * 15
result = re.finditer(r'(?=11111)', string)   # overlapped=True doesn't work for me
for i in result:                             # Python 3.5
    print(i.start(), i.end())


It finds every match position, but it does not report the right end index. Output:

1 <_sre.SRE_Match object; span=(0, 0), match=''>
2 <_sre.SRE_Match object; span=(1, 1), match=''>
3 <_sre.SRE_Match object; span=(2, 2), match=''>
4 <_sre.SRE_Match object; span=(3, 3), match=''>
(and so on..)


My question is: how do I find all the matches and also get the start and end index of each one?



2 answers

The problem is that a lookahead is a zero-width assertion: it consumes no text, so nothing is added to the match result. It only marks a position in the string. As a result, each of your matches starts and ends at the same index.
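To see the zero-width behaviour directly, here is a minimal sketch that reuses the string from the question. The variable name spans is only illustrative.

```python
import re

s = '1' * 15

# A bare lookahead succeeds at a position without consuming characters,
# so every match is the empty string and start() equals end().
spans = [m.span() for m in re.finditer(r'(?=11111)', s)]

print(spans[:3])   # [(0, 0), (1, 1), (2, 2)]
print(len(spans))  # 11 positions where five 1s follow
```

The positions are found correctly; only the end index is lost, because the match itself is empty.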

You need to wrap the lookahead pattern in a capturing group, i.e. (?=(11111)), and read the start and end of group 1 with i.start(1) and i.end(1):


import re
s = '1'*15     
result = re.finditer(r'(?=(11111))', s)

for i in result:
    print((i.start(1), i.end(1)))  # print the pair as a tuple


See the Python demo. Its output:

(0, 5)
(1, 6)
(2, 7)
(3, 8)
(4, 9)
(5, 10)
(6, 11)
(7, 12)
(8, 13)
(9, 14)
(10, 15)
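If you need this for more than one pattern, the same idea can be wrapped in a small helper. This is only a sketch: overlapping_spans is a hypothetical name, not part of the re module, and it assumes the pattern is a valid regex fragment that can be placed inside a group.

```python
import re

def overlapping_spans(pattern, text):
    """Return (start, end) for every match of pattern, overlaps included.

    Hypothetical helper, not part of the standard library.
    """
    # The outer lookahead makes the engine step one character at a time;
    # the capturing group (group 1) records what would have been matched.
    wrapped = re.compile('(?=(' + pattern + '))')
    return [m.span(1) for m in wrapped.finditer(text)]

print(overlapping_spans('11111', '1' * 15)[:3])  # [(0, 5), (1, 6), (2, 7)]
```

Note that this yields at most one match per start position, namely the one the regex engine would pick there.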




Compare your code with this implementation to see where it differs:

import re

match = re.finditer(r'111', 'test111 end111 and another 111')
for i in match:
    print(i.start(), i.end())  # assumed loop body; the original was cut off here


If that doesn't work, kindly share some sample data.
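For comparison, here is a short sketch of what a plain pattern without a lookahead reports on the string from the question. The variable name plain is only illustrative.

```python
import re

s = '1' * 15

# Without a lookahead, each match consumes its five characters,
# so the next search resumes where the previous match ended.
plain = [m.span() for m in re.finditer(r'11111', s)]

print(plain)  # [(0, 5), (5, 10), (10, 15)]
```

The end indices are correct here, but overlapping matches such as (1, 6) are skipped.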


