Splitting on double dots while excluding single dots in Java
I have a string str1 below written in Java that I would like to split.
String str1 = "S1..R1..M1..D2..N3..S1.R1.M1.D2.N3.S1R1M1D2N3";
I would like to split the string into the following elements in an array:
S1.., R1.., M1.., D2.., N3.., S1., R1., M1., D2., N3., S1, R1, M1, D2, N3
I think I need to split in three passes: first on .., next on ., and finally on the letters.
At first I tried to split on .. but I did not get the result I expected:
System.out.println("\n Original String = "+str1+"\nSplit Based on .. = "+Arrays.toString(str1.split("(?<=[..])")));
Result of the above split:
Original String = S1..R1..M1..D2..N3..S1.R1.M1.D2.N3.S1R1M1D2N3
Split Based on .. = [S1., ., R1., ., M1., ., D2., ., N3., ., S1., R1., M1., D2., N3., S1R1M1D2N3]
I also tried:
("(?<=[.+])")
I'm not sure whether I need to use Pattern / Matcher instead.
I need your help.
Instead of a positive lookbehind, use a positive lookahead.
String s = "S1..R1..M1..D2..N3..S1.R1.M1.D2.N3.S1R1M1D2N3";
String[] parts = s.split("(?<!\\A)(?=[A-Z]\\d)");
System.out.println("Original = " + s + "\nSplitted = " + Arrays.toString(parts));
Note: I used a negative lookbehind before the lookahead to assert that the match position is not at the beginning of the string. This prevents an empty item from being the first element of the resulting array.
Output
Original = S1..R1..M1..D2..N3..S1.R1.M1.D2.N3.S1R1M1D2N3
Splitted = [S1.., R1.., M1.., D2.., N3.., S1., R1., M1., D2., N3., S1, R1, M1, D2, N3]
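For reference, the split above can be wrapped in a small self-contained class. This is only a sketch: the class name LookaheadSplitDemo and both helper methods are made up for illustration. It also runs the pattern from the question for comparison. [..] is a character class that contains just a literal dot, so (?<=[..]) splits after every single dot, which is why each .. came out as two pieces.

```java
import java.util.Arrays;

public class LookaheadSplitDemo {

    // Hypothetical helper: split before every "uppercase letter + digit" pair,
    // except at the very start of the input.
    public static String[] splitTokens(String input) {
        return input.split("(?<!\\A)(?=[A-Z]\\d)");
    }

    // The attempt from the question: splits after each individual dot.
    public static String[] splitAfterEachDot(String input) {
        return input.split("(?<=[..])");
    }

    public static void main(String[] args) {
        String s = "S1..R1.M1";
        System.out.println(Arrays.toString(splitAfterEachDot(s))); // [S1., ., R1., M1]
        System.out.println(Arrays.toString(splitTokens(s)));       // [S1.., R1., M1]
    }
}
```

On Java 8 and later, String.split no longer produces an empty leading element for a zero-width match at the start of the input, so the (?<!\\A) guard mainly matters on older runtimes. Keeping it is harmless.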
Another way is matching instead of splitting.
String s = "S1..R1..M1..D2..N3..S1.R1.M1.D2.N3.S1R1M1D2N3";
Pattern p = Pattern.compile("[A-Z]\\d+\\.*");
Matcher m = p.matcher(s);
List<String> matches = new ArrayList<String>();
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);
Output
[S1.., R1.., M1.., D2.., N3.., S1., R1., M1., D2., N3., S1, R1, M1, D2, N3]
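The matching snippet above relies on imports that are not shown. A self-contained version might look like the following sketch; the class name TokenMatchDemo and the method extractTokens are hypothetical names chosen for this example, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenMatchDemo {

    // One uppercase letter, one or more digits, then any number of trailing dots.
    private static final Pattern TOKEN = Pattern.compile("[A-Z]\\d+\\.*");

    // Hypothetical helper: collect every token the pattern finds, in order.
    public static List<String> extractTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(extractTokens("S1..R1.M1D2")); // [S1.., R1., M1, D2]
    }
}
```

One difference from splitting is worth keeping in mind: find() skips any characters that do not fit the pattern, so an input such as "x-S1.." yields only [S1..], whereas split() would keep the stray characters inside one of the pieces.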
Use a smart regex as the argument to .split(). I'm going to enlighten you and provide you with this smart regex. ;)
str1.split("(?<=[.\\d])(?=[A-Z]\\d)")
takes:
"S1..R1..M1..D2..N3..S1.R1.M1.D2.N3.S1R1M1D2N3"
gives:
["S1..", "R1..", "M1..", "D2..", "N3..", "S1.", "R1.", "M1.", "D2.", "N3.", "S1", "R1", "M1", "D2", "N3"]
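In this pattern the lookbehind takes over the job of a start-of-string guard: nothing precedes index 0, so (?<=[.\\d]) cannot match there and no empty first element appears. A minimal runnable sketch follows; the class name LookaroundSplitDemo and the method splitTokens are assumptions made for illustration.

```java
import java.util.Arrays;

public class LookaroundSplitDemo {

    // Split at positions that are preceded by a dot or a digit and
    // followed by an uppercase letter plus a digit.
    public static String[] splitTokens(String input) {
        return input.split("(?<=[.\\d])(?=[A-Z]\\d)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitTokens("S1..R1.M1D2"))); // [S1.., R1., M1, D2]
    }
}
```

The lookbehind also makes the split stricter. A token that follows some other character, as in "A1-B2", is not split off, because '-' is neither a dot nor a digit.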