Remove pattern from string in Java

Currently I am working on a tool that helps me analyze a constantly growing String, which can look like the following: String s = "AAAAAAABBCCCDDABQ". I want to be able to find a sequence of A's followed by B's, do something, and then remove that sequence from the original String.

My code looks like this:

while (someBoolean){

    if(Pattern.matches("A+B+", s)) {
        //Do stuff
        //Remove the found pattern
    }

    if(Pattern.matches("C+D+", s)) {
        //Do other stuff
        //Remove the found pattern
    }

}
return s;

      

Also, how could I remove all three sequences so that s just contains "Q" at the end of the calculation, without ending up in an infinite loop?

+3




3 answers


You have to use a regex replacement loop, i.e. the Matcher methods appendReplacement(StringBuffer sb, String replacement) and appendTail(StringBuffer sb).

To find one of several patterns, use the | regex alternation operator and capture each pattern in its own group.

You can then use group(int group) to get the matched string for each capturing group (the first group is group 1), which returns null if that group did not match. For best performance, just check whether the group matched using start(int group), which returns -1 if that group didn't match.

Example:



String s = "AAAAAAABBCCCDDABQ";
StringBuffer buf = new StringBuffer();
Pattern p = Pattern.compile("(A+B+)|(C+D+)");
Matcher m = p.matcher(s);
while (m.find()) {
    if (m.start(1) != -1) { // Group 1 found
        System.out.println("Found AB: " + m.group(1));
        m.appendReplacement(buf, ""); // Replace matched substring with ""
    } else if (m.start(2) != -1) { // Group 2 found
        System.out.println("Found CD: " + m.group(2));
        m.appendReplacement(buf, ""); // Replace matched substring with ""
    }
}
m.appendTail(buf);
String remain = buf.toString();
System.out.println("Remain: " + remain);

      

Output

Found AB: AAAAAAABB
Found CD: CCCDD
Found AB: AB
Remain: Q
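
The loop above relies on find(), which scans for the next occurrence of the pattern, whereas Pattern.matches(...) from the question only succeeds when the entire string matches the regex. A minimal sketch of that difference follows; the class name MatchesVsFind and its two helper methods are made up for illustration and are not part of the answer's code.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // True only if the ENTIRE input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the regex matches somewhere inside the input.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String s = "AAAAAAABBCCCDDABQ";
        System.out.println(wholeMatch("A+B+", s));     // false: s is more than one A+B+ run
        System.out.println(containsMatch("A+B+", s));  // true: "AAAAAAABB" is found inside s
        System.out.println(wholeMatch("A+B+", "AAB")); // true: the whole input is an A+B+ run
    }
}
```

This is why the if blocks in the question never fire for the sample string: the regex is fine, but the string as a whole is not a single A+B+ or C+D+ run.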

      

+4




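This solution assumes that the string always ends with Q and that everything before it can be removed. If more than one character is left that neither pattern matches, the loop never terminates.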



String s="AAAAAAABBCCCDDABQ";

Pattern abPattern = Pattern.compile("A+B+");
Pattern cdPattern = Pattern.compile("C+D+");


while (s.length() > 1){

    Matcher abMatcher = abPattern.matcher(s);
    if (abMatcher.find()) {
        s = abMatcher.replaceFirst("");
        //Do stuff
    }

    Matcher cdMatcher = cdPattern.matcher(s);
    if (cdMatcher.find()) {
        s = cdMatcher.replaceFirst("");
        //Do other stuff
    }

}
System.out.println(s);
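
If the input is not guaranteed to shrink to a single character, one possible variation is to stop as soon as a full pass removes nothing. The sketch below is an assumption-laden rewrite of the loop above, not the answer's own code: the class name RemoveUntilStable, the method strip, and the changed flag are all invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RemoveUntilStable {
    private static final Pattern AB = Pattern.compile("A+B+");
    private static final Pattern CD = Pattern.compile("C+D+");

    // Removes the first A+B+ run and the first C+D+ run on each pass,
    // and stops once a pass leaves the string unchanged.
    static String strip(String s) {
        boolean changed = true;
        while (changed) {
            changed = false;

            Matcher ab = AB.matcher(s);
            if (ab.find()) {
                // "Do stuff" would go here, e.g. using ab.group()
                s = ab.replaceFirst("");
                changed = true;
            }

            Matcher cd = CD.matcher(s);
            if (cd.find()) {
                // "Do other stuff" would go here, e.g. using cd.group()
                s = cd.replaceFirst("");
                changed = true;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(strip("AAAAAAABBCCCDDABQ")); // Q
        System.out.println(strip("XYZ"));               // XYZ (nothing to remove, loop exits)
    }
}
```

Because the loop condition depends on whether anything was removed rather than on the string length, it terminates for any input. Note that a removal can bring two fragments together and create a new match, which the next pass then picks up.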

      

+1




You are probably looking for something like this:

String input = "AAAAAAABBCCCDDABQ";
String result = input;
String[] chars = {"A", "B", "C", "D"}; // chars to replace

for (String ch : chars) {
    if (result.contains(ch)) {
        String pattern = "[" + ch + "]+";
        result = result.replaceAll(pattern, ch);
    }
}

System.out.println(input); //"AAAAAAABBCCCDDABQ"
System.out.println(result); //"ABCDABQ"

      

This basically collapses each run of a repeated character into a single occurrence of that character.

If you want to remove the runs completely, just replace ch with "" in the replaceAll call inside the loop body.
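
As a rough sketch of that removal variant, the per-character loop can be folded into a single character class. This is a simplification of the code above rather than the answer's exact approach, and the class name RemoveRuns and the method removeRuns are hypothetical.

```java
public class RemoveRuns {
    // Deletes every run made up of the characters A, B, C or D, in any order.
    static String removeRuns(String input) {
        return input.replaceAll("[ABCD]+", "");
    }

    public static void main(String[] args) {
        System.out.println(removeRuns("AAAAAAABBCCCDDABQ")); // Q
    }
}
```

Unlike the patterns A+B+ and C+D+ from the question, this ignores ordering: a run such as "BA" or a lone "D" is removed as well, so it only fits if the order of the characters does not matter.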

0

