How to get all matched groups using Pattern and Matcher

I have the following regular expression pattern: ^(\d+)(;(\d+))*$ and I would like to get the number of groups in this regex and the value of each one.

I tried using groupCount and group, but I get the following results:

Input: "1"
Groups: 3
"1", "1", null, null 
Input: "1;2"
Groups: 3
"1;2", "1", ";2", "2"
Input: "1;2;3"
Groups: 3
"1;2;3", "1", ";3", "3"
Input: "1;2;3;4"
Groups: 3
"1;2;3;4", "1", ";4", "4"

      

For the first input, "1", I was expecting to get 1 from groupCount. And for the last one, "1;2;3;4", I expected to get 7 from groupCount.

Is there any method in Matcher that returns what I was expecting?

EDIT: Added code that generated the above output

String input = "1";
Pattern pattern = Pattern.compile("^(\\d+)(;(\\d+))*$");
for (int i = 2; i < 6; ++i) {
    Matcher matcher = pattern.matcher(input);
    matcher.matches();
    System.out.println("Input: \"" + input + "\"\nGroups: " + matcher.groupCount());
    for (int group = 0; group <= matcher.groupCount(); ++group) {
        System.out.print("\"" + matcher.group(group) + "\", ");
    }
    System.out.println();
    input += ";" + i;
}

      



1 answer


Sorry, but you have misunderstood how groups work.

The number of groups is defined by the regex itself. It does not depend on the input string. And your regex defines 3 groups:

 ^(\\d+)(;(\\d+))*$
  1     2 3

      

Groups are numbered by their opening parentheses, from left to right. Thus, your regular expression will always have exactly 3 groups. Whether each of them actually matched something is a completely different question.
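A minimal sketch to illustrate this, using the regex from the question. The class name GroupCountDemo and the helper groupCountFor are made up for this example. groupCount() reports the number of capturing groups in the pattern, so it can be called before any match attempt and gives the same value for every input:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupCountDemo {
    // Hypothetical helper: returns the group count the matcher reports for this input.
    // Note that no match is attempted; the count comes from the pattern alone.
    static int groupCountFor(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.groupCount();
    }

    public static void main(String[] args) {
        String regex = "^(\\d+)(;(\\d+))*$";
        // Prints 3 for each input, including the one that does not match at all.
        for (String input : new String[] {"1", "1;2;3;4", "not a match"}) {
            System.out.println("\"" + input + "\" -> " + groupCountFor(regex, input));
        }
    }
}
```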

So the first group will always hold the first number found. With the other two groups you are doing something special: you are repeating a capturing group.

Each repetition overwrites what the group captured before. Since all the following numbers are stored in group 3, you will only find the last one in the final result. In .NET you should be able to read all the captures, but I don't think that is possible in Java.

Solution:

Validate the string with the regex

^\\d+(;\\d+)*$

      

And if the format is OK, get the numbers by splitting on ";".
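The two steps above could look roughly like this. It is only a sketch: the class name NumberListParser and the method parse are invented for the example, and throwing IllegalArgumentException on bad input is just one possible choice. The numbers are kept as strings because \d+ can match more digits than fit in an int.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class NumberListParser {
    // Validation-only regex from the answer: no capturing of individual numbers needed.
    private static final Pattern FORMAT = Pattern.compile("^\\d+(;\\d+)*$");

    // Hypothetical helper: validates the whole string, then splits it on ';'.
    static List<String> parse(String input) {
        if (!FORMAT.matcher(input).matches()) {
            throw new IllegalArgumentException("Not a ';'-separated list of numbers: " + input);
        }
        return Arrays.asList(input.split(";"));
    }

    public static void main(String[] args) {
        List<String> numbers = parse("1;2;3;4");
        System.out.println("Count: " + numbers.size());
        System.out.println("Values: " + numbers);
    }
}
```

The size of the returned list then gives the count that groupCount cannot provide.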
