Group matching with Asterisk?

How do I get the content captured by a group that is quantified with an asterisk?

For example, I would like to match a comma-separated list, e.g. 1,2,3,4,5.

private static final String LIST_REGEX = "^(\\d+)(,\\d+)*$";
private static final Pattern LIST_PATTERN = Pattern.compile(LIST_REGEX);

public static void main(String[] args) {
    final String list = "1,2,3,4,5";
    final Matcher matcher = LIST_PATTERN.matcher(list);
    System.out.println(matcher.matches());
    for (int i = 0, n = matcher.groupCount(); i < n; i++) {
        System.out.println(i + "\t" + matcher.group(i));
    }
}

      

And the output:

true
0   1,2,3,4,5
1   1

      

How can I get each entry, i.e. 1, 2, 3, ...?

I am looking for a general solution. This is just a demo example.
Imagine a more complex regex, for example ^\\[(\\d+)(,\\d+)*\\]$ to match a list like [1,2,3,4,5].

+3




2 answers


You can use String.split().

for (String segment : "1,2,3,4,5".split(","))
    System.out.println(segment);

      

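Or you can capture repeatedly by calling Matcher.find() in a loop:

Pattern pattern = Pattern.compile("(\\d+),?");
Matcher m = pattern.matcher("1,2,3,4,5");
// Each find() call advances to the next entry; group(1) holds that entry.
while (m.find())
    System.out.println(m.group(1));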
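
The find() idea can be taken one step further by validating the whole string first and then collecting the entries into a list. The following is a minimal sketch under assumptions: the class name FindAllEntries and the helper entries are hypothetical names chosen for illustration, and the entries are assumed to be unsigned integers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllEntries {
    // Validates the whole input, e.g. "1,2,3,4,5".
    private static final Pattern LIST_PATTERN = Pattern.compile("^\\d+(?:,\\d+)*$");
    // Matches a single entry; used with find() to walk through the list.
    private static final Pattern ENTRY_PATTERN = Pattern.compile("\\d+");

    // Hypothetical helper: returns every entry, or an empty list if the
    // input is not a well-formed comma-separated list of integers.
    static List<String> entries(String input) {
        List<String> result = new ArrayList<>();
        if (!LIST_PATTERN.matcher(input).matches()) {
            return result;
        }
        Matcher m = ENTRY_PATTERN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(entries("1,2,3,4,5")); // [1, 2, 3, 4, 5]
        System.out.println(entries("10,20,x"));   // []
    }
}
```

Splitting the work into one pattern for validation and one for extraction keeps each regex simple, at the cost of scanning the input twice.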

      

For your second example, which you added, you can make a similar match.

for (String segment : "!!!!![1,2,3,4,5] //"
                          .replaceFirst("^\\D*(\\d+(?:,\\d+)*)\\D*$", "$1")
                          .split(","))
    System.out.println(segment);

      



I made an online demo of the code. Hope this is what you wanted.


How can I get all matches (zero, one or more) for an arbitrary asterisked group (xyz)*? [The group repeats, and I would like to get every capture.]

No, you cannot. Regex Capture Groups and Back-References explains why:

The Returned Value for a Given Group is the Last One Captured

Since a capture group with a quantifier holds on to its number, what value does the engine return when you inspect the group? All engines return the last value captured. For example, if you match the string A_B_C_D_ with ([A-Z]_)+, then when you inspect the match, group 1 will be D_. With the exception of the .NET engine, all intermediate values are lost. In essence, group 1 is overwritten every time its pattern is matched.
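
The overwriting behaviour is easy to check in Java. The sketch below uses hypothetical names (LastCaptureDemo, lastCapture, allCaptures). It first shows that group 1 of ([A-Z]_)+ holds only the final repetition, and then shows one possible workaround: match a single repetition per find() call, using \G so that each match must start where the previous one ended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastCaptureDemo {
    // Returns what group 1 holds after ([A-Z]_)+ has matched the whole input,
    // or null if the input does not match.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("([A-Z]_)+").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Workaround: one repetition per find() call. \G anchors each match at the
    // end of the previous one, so only a contiguous run from the start is collected.
    static List<String> allCaptures(String input) {
        List<String> captures = new ArrayList<>();
        Matcher m = Pattern.compile("\\G([A-Z]_)").matcher(input);
        while (m.find()) {
            captures.add(m.group(1));
        }
        return captures;
    }

    public static void main(String[] args) {
        System.out.println(lastCapture("A_B_C_D_")); // D_
        System.out.println(allCaptures("A_B_C_D_")); // [A_, B_, C_, D_]
    }
}
```

Note that the \G loop does not by itself verify that the whole string is well formed; if that matters, a separate matches() check against the full pattern is still needed.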

+4




I assume you are looking for something like the following; it handles both of your examples.

private static final String LIST_REGEX = "^\\[?(\\d+(?:,\\d+)*)\\]?$";
private static final Pattern LIST_PATTERN = Pattern.compile(LIST_REGEX);

public static void main(String[] args) {
    final String list = "[1,2,3,4,5]";
    final Matcher matcher = LIST_PATTERN.matcher(list);

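    // matches() requires the whole input to match; check it before reading groups.
    if (!matcher.matches()) {
        System.out.println(false);
        return;
    }
    System.out.println(true);

    // Group 1 holds the whole comma-separated body; split it into entries.
    int i = 0;
    System.out.println(i + "\t" + matcher.group(1));
    for (String x : matcher.group(1).split(",")) {
        i++;
        System.out.println(i + "\t" + x);
    }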
}

      



Output

true
0   1,2,3,4,5
1   1
2   2
3   3
4   4
5   5
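
The capture-then-split approach can also be wrapped in a reusable method. This is only a sketch: BracketedListParser and parse are hypothetical names, and the pattern is the one from the answer above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketedListParser {
    // Optional brackets around a comma-separated list of integers;
    // group 1 captures the list body without the brackets.
    private static final Pattern LIST_PATTERN =
            Pattern.compile("^\\[?(\\d+(?:,\\d+)*)\\]?$");

    // Hypothetical helper: returns the entries, or an empty list when the
    // input does not match the pattern.
    static List<String> parse(String input) {
        Matcher matcher = LIST_PATTERN.matcher(input);
        if (!matcher.matches()) {
            return Collections.emptyList();
        }
        return Arrays.asList(matcher.group(1).split(","));
    }

    public static void main(String[] args) {
        System.out.println(parse("[1,2,3,4,5]")); // [1, 2, 3, 4, 5]
        System.out.println(parse("1,2,3"));       // [1, 2, 3]
        System.out.println(parse("[a,b]"));       // []
    }
}
```

Because both brackets are optional independently, this pattern also accepts unbalanced input such as [1,2. If that is not acceptable, the two forms can be written as an alternation instead.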

      

+2

