Regex match found past the end of a string

The code is below:

import java.util.regex.*;

public class RegEx {

    public static void main(String[] args) {

        Pattern p = Pattern.compile("\\d*");
        Matcher m = p.matcher("ab56ef");
        System.out.println("Pattern is " + m.pattern());
        while (m.find()) {
            System.out.print("index: " + m.start() + " " + m.group());
        }
    }
}

      

Result:

index: 0 index: 1 index: 2 56 index: 4 index: 5 index: 6

      

Since the length of "ab56ef" is 6, the highest index in the string is 5.
Why is there a match at index 6? Thank you in advance!



1 answer


You get 6 indices because there are 6 matches here: \d* can match an empty string. There is an empty string before each character of the input, and one more after the last character, because the regex engine tries to match at every position, including the very end of the input. That final position is index 6, which is why an empty match is reported there.




At the beginning of the string, the engine says, "I can't see any digits, but I can still return a match, since the number of digits can be 0." It returns an empty string as a match and moves on to b. And so on until the end of the string.
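To make the empty matches visible, here is a minimal sketch that records the start and end of every match. The class name EmptyMatchDemo and the helper describeMatches are made up for this illustration and are not part of the question's code. An empty match has start equal to end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named only for this illustration.
class EmptyMatchDemo {

    // Collects every match as "start-end:[text]" so that empty matches
    // (where start == end) are easy to spot.
    static List<String> describeMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":[" + m.group() + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints one line per match for the question's pattern and input.
        for (String s : describeMatches("\\d*", "ab56ef")) {
            System.out.println(s);
        }
    }
}
```

For "ab56ef" this lists six matches. The only non-empty one is 2-4:[56], and the last one is 6-6:[], the empty string sitting just past the final character.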

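If you need to find all numbers, just use the + quantifier with the \d shorthand class: \d+ (written "\\d+" in a Java string literal). It requires at least one digit, so it never matches an empty string.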
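As a rough sketch of the \d+ suggestion, the snippet below reruns the question's loop with the pattern changed. The class name DigitRunDemo and the helper findNumbers are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named only for this illustration.
class DigitRunDemo {

    // Returns one entry per run of one or more digits, in the same
    // "index: <start> <text>" shape the question prints.
    static List<String> findNumbers(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+").matcher(input);
        while (m.find()) {
            out.add("index: " + m.start() + " " + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("ab56ef")); // prints [index: 2 56]
    }
}
```

With + instead of *, the engine can no longer succeed at positions that have no digit, so the only match is 56 at index 2 and nothing is reported at index 6.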

See IDEONE demo
