Java regex to match words with optional comments (possibly several) between every two letters (reusing a regex subexpression like a backreference)
I need a Java regex that matches a word while allowing for comments between any two consecutive letters, e.g. "W/*comment1*/OR/*comment2*/D". I tried using a named group and a backreference:
(?<comment>\s*/\*.*\*/\s*)W\k<comment>*O\k<comment>*R\k<comment>*D
But that doesn't work, because a backreference matches the text that the named group captured, not the group's subexpression. So I had to repeat the comment subexpression (\s*/\*.*\*/\s*) everywhere a comment may appear:
W(\s*/\*.*\*/\s*)*O(\s*/\*.*\*/\s*)*R(\s*/\*.*\*/\s*)*D
This works, but is there a more elegant solution that avoids repeating the "comment" subpattern so many times?
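The backreference behavior described above can be reproduced with a small sketch. The class name BackreferenceDemo and the test strings are made up for illustration; the pattern is the one from the first attempt. Note that the named group is mandatory there, so the input must begin with a comment, and every later comment must be character-for-character identical to it:

```java
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // The first attempt from the question: \k<comment> re-matches the exact
    // text captured by the group, it does not re-apply the subpattern.
    static final Pattern BACKREF = Pattern.compile(
        "(?<comment>\\s*/\\*.*\\*/\\s*)W\\k<comment>*O\\k<comment>*R\\k<comment>*D");

    static boolean matches(String input) {
        return BACKREF.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Every comment is the same text as the captured one.
        System.out.println(matches("/*c*/W/*c*/O/*c*/R/*c*/D")); // true
        // The second comment differs from the captured text.
        System.out.println(matches("/*c*/W/*x*/ORD"));           // false
        // The string from the question: no leading comment to capture.
        System.out.println(matches("W/*comment1*/OR/*comment2*/D")); // false
    }
}
```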
You can do this by capturing a letter (or several) at a time and discarding any comments that follow, for example:
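import java.util.regex.Matcher;
import java.util.regex.Pattern;

String toBeParsed = "W/* this is comment 1 */OR/*this is comment 2*/D";
// One or more word characters, followed by zero or more /* ... */ comments.
// The reluctant .*? stops each comment at the first closing */.
String regexp = "(\\w+)(/\\*.*?\\*/)*";
Pattern pattern = Pattern.compile(regexp);
Matcher matcher = pattern.matcher(toBeParsed);
StringBuilder word = new StringBuilder();
while (matcher.find()) {
    String letters = matcher.group(1);
    // null if no comment followed; the last one if several followed
    String comment = matcher.group(2);
    System.out.println("found letter(s) " + letters);
    word.append(letters);
    if (comment != null) System.out.println("discarding comment " + comment);
}
System.out.println(word);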
Output:
found letter(s) W
discarding comment /* this is comment 1 */
found letter(s) OR
discarding comment /*this is comment 2*/
found letter(s) D
WORD
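If a single pattern is still wanted, one way to avoid typing the comment subpattern repeatedly is to keep it in a string constant and assemble the regex in code. This is a minimal sketch, not part of the answer above: the class CommentTolerantWord, the method forWord, and the constant COMMENTS are hypothetical names, and the reluctant .*? is used in place of the question's greedy .* so that one comment does not swallow the next.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class CommentTolerantWord {
    // Zero or more /* ... */ comments, each optionally surrounded by whitespace.
    // DOTALL is not set, so a comment cannot span several lines.
    static final String COMMENTS = "(?:\\s*/\\*.*?\\*/\\s*)*";

    // Builds a pattern that allows comments between every two letters of the word.
    static Pattern forWord(String word) {
        String regex = word.chars()
            .mapToObj(c -> Pattern.quote(String.valueOf((char) c)))
            .collect(Collectors.joining(COMMENTS));
        return Pattern.compile(regex);
    }

    public static void main(String[] args) {
        Pattern p = forWord("WORD");
        System.out.println(p.matcher("W/*comment1*/OR/*comment2*/D").matches()); // true
        System.out.println(p.matcher("WORD").matches());                         // true
        System.out.println(p.matcher("W/*a*//*b*/ORD").matches());               // true
        System.out.println(p.matcher("WOR/*x*/K").matches());                    // false
    }
}
```

The subpattern is written once, and the same helper works for any word. Each letter is passed through Pattern.quote so that regex metacharacters in the word are treated literally.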