Java regex to match a word with optional comments between any two letters (reusing a regex subexpression the way a backreference reuses a match)

I need a Java regex that matches a word while allowing for one or more comments between any two consecutive letters, e.g. "W/*comment1*/OR/*comment2*/D". I tried using a named group and a backreference:

(?<comment>\s*/\*.*\*/\s*)W\k<comment>*O\k<comment>*R\k<comment>*D

      

But that doesn't work, because a backreference matches the same text that the named group matched, not the group's subexpression. So I had to repeat the comment subexpression (\s*/\*.*\*/\s*) in every place where a comment may appear:

W(\s*/\*.*\*/\s*)*O(\s*/\*.*\*/\s*)*R(\s*/\*.*\*/\s*)*D

      

This works, but is there a more elegant solution that does not repeat the "comment" subpattern so many times?



2 answers


You can do this by capturing one letter (or several) at a time and discarding any comments that follow, for example:

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        // Inside a method such as main:
        String toBeParsed = "W/* this is comment 1 */OR/*this is comment 2*/D";
        // One or more word characters, then any number of comments.
        // Group 2 keeps only the last comment of a run.
        String regexp = "(\\w+)(/\\*.*?\\*/)*";
        Pattern pattern = Pattern.compile(regexp);
        Matcher matcher = pattern.matcher(toBeParsed);
        StringBuilder word = new StringBuilder();
        while (matcher.find()) {
            String letters = matcher.group(1);
            String comment = matcher.group(2); // null when no comment follows
            System.out.println("found letter(s) " + letters);
            word.append(letters);
            if (comment != null) System.out.println("discarding comment " + comment);
        }
        System.out.println(word);

      



Output:

found letter(s) W
discarding comment /* this is comment 1 */
found letter(s) OR
discarding comment /*this is comment 2*/
found letter(s) D
WORD
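
If a single pattern is still wanted, one option is to write the comment fragment once as a Java string and assemble the regex from it, since java.util.regex has no syntax for re-running a group's subexpression. The following is only a sketch: the class name CommentTolerantWord, the constant COMMENT and the helper forWord are made up for illustration, and the comment body uses a lookahead instead of the greedy .* from the question so that one comment cannot run into the next.

```java
import java.util.regex.Pattern;

public class CommentTolerantWord {
    // The comment subpattern, written exactly once (hypothetical name).
    // (?:(?!\*/).)* consumes characters up to, but not including, the first "*/".
    static final String COMMENT = "(?:\\s*/\\*(?:(?!\\*/).)*\\*/\\s*)*";

    // Builds a pattern that allows zero or more comments between
    // every two consecutive letters of the given word.
    static Pattern forWord(String word) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            if (i > 0) {
                regex.append(COMMENT);
            }
            // Quote each letter so regex metacharacters are taken literally.
            regex.append(Pattern.quote(String.valueOf(word.charAt(i))));
        }
        // DOTALL lets a comment span several lines.
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    public static void main(String[] args) {
        Pattern p = forWord("WORD");
        System.out.println(p.matcher("W/*comment1*/OR/*comment2*/D").matches()); // true
        System.out.println(p.matcher("WORD").matches());                         // true
        System.out.println(p.matcher("WO-RD").matches());                        // false
    }
}
```

The repetition is still present in the compiled regex, but it is generated rather than typed, so the comment syntax only has to be maintained in one place.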

      



"how to return a reference to a regular expression subexpression"

Do you mean this?



"(.*)\\1"

      

This matches any text that is immediately repeated, such as a duplicated word. \1 refers to the text matched by the first capturing group, i.e. the first pair of parentheses.
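
A small check of that behaviour may help, because it is the reason the attempt in the question fails: the backreference demands the same text again, not another match of the same subpattern. This is a minimal sketch; the class and method names are invented for the example.

```java
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // Any text followed immediately by an identical copy of itself.
    static final Pattern DOUBLED = Pattern.compile("(.*)\\1");
    // A digit followed by the same digit, not just by any digit.
    static final Pattern SAME_DIGIT_TWICE = Pattern.compile("(\\d)\\1");

    static boolean isDoubled(String s) {
        return DOUBLED.matcher(s).matches();
    }

    static boolean isSameDigitTwice(String s) {
        return SAME_DIGIT_TWICE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDoubled("abcabc"));    // true:  "abc" + "abc"
        System.out.println(isDoubled("abcabd"));    // false: second half differs
        System.out.println(isSameDigitTwice("77")); // true
        System.out.println(isSameDigitTwice("78")); // false: \1 must repeat "7"
    }
}
```

So \k<comment> in the question's first pattern would only accept a second comment that is character-for-character identical to the first one.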







