RegExp to match strings formed with a limited set of characters without reusing any character
I have a bunch of characters like this: ABBCD
And I have several empty slots like this: _ _ _
Is there a way to use a regular expression to match any string that can be formed by "dragging and dropping" the available characters into empty spaces?
So, in the example, these are several valid matches:
A B C
A B B
B C B
D A B
But these are not valid:
A A B // Only one 'A' is available in the set
B B B // Only two 'B's are available in the set
Sorry if it has already been asked before.
vks's solution works correctly; here it is optimized and extended to enforce the "_ _ _" spacing rule:
^(?!(?:[^A]*A){2})(?!(?:[^B]*B){3})(?!(?:[^C]*C){2})(?!(?:[^D]*D){2})(?:[ABCD](?:\s|$)){3}
Changes from the original regex:
- The capturing groups are replaced with non-capturing groups, since we are in Java and its regex implementation spends time recording captured groups during matching.
- The anchor ^ is moved to the front for readability.
Regex explanation:
- ^
  Asserts the position at the start of the string.
- (?!
  Negative lookahead: asserts that what follows does not match from the current position, without moving the pointer:
- (?:[^A]*A){2}
  Two "A"s, with the non-"A" characters before each one consumed efficiently by [^A]*.
- )
  Closes the lookahead.
- (?!(?:[^B]*B){3})
  Same as the group above: asserts that there are not three "B"s in the string.
- (?!(?:[^C]*C){2})
  Asserts that there are not two "C"s in the string.
- (?!(?:[^D]*D){2})
  Asserts that there are not two "D"s in the string.
- (?:
  Non-capturing group that matches the following:
- [ABCD]
  Any one character from the list "A", "B", "C" or "D".
- (?:\s|$)
  A whitespace character or the end of the string.
- ){3}
  Three times. To satisfy the "_ _ _" rule, the sequence must occur exactly three times.
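To use the regular expression:
import java.util.regex.Pattern;

boolean fulfillsRule(String str) {
    // Backslashes must be doubled inside a Java string literal: \\s, not \s.
    Pattern tripleRule = Pattern.compile("^(?!(?:[^A]*A){2})(?!(?:[^B]*B){3})(?!(?:[^C]*C){2})(?!(?:[^D]*D){2})(?:[ABCD](?:\\s|$)){3}");
    // find() accepts extra text after the third slot; use matches() to require that the whole string fits.
    return tripleRule.matcher(str).find();
}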
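As a quick sanity check, the pattern can be run against the examples from the question. This is only a sketch: the class name TripleRuleDemo and the sample list are made up for illustration, and the pattern is the one given above with the backslash doubled for the Java string literal.

```java
import java.util.regex.Pattern;

class TripleRuleDemo {
    // Pattern from the answer above; compiled once and reused.
    static final Pattern TRIPLE_RULE = Pattern.compile(
        "^(?!(?:[^A]*A){2})(?!(?:[^B]*B){3})(?!(?:[^C]*C){2})(?!(?:[^D]*D){2})"
        + "(?:[ABCD](?:\\s|$)){3}");

    static boolean fulfillsRule(String str) {
        return TRIPLE_RULE.matcher(str).find();
    }

    public static void main(String[] args) {
        // Valid and invalid samples taken from the question.
        String[] samples = {"A B C", "A B B", "B C B", "D A B", "A A B", "B B B"};
        for (String s : samples) {
            System.out.println(s + " -> " + fulfillsRule(s));
        }
    }
}
```

The first four samples should print true and the last two false, since "A A B" trips the two-"A" lookahead and "B B B" trips the three-"B" lookahead.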
Interesting problem, this is my idea:
(?m)^(?!.*([ACD]).*\1)(?!(?>.*?B){3})(?>[A-D] ){2}[A-D]$
The (?m) MULTILINE modifier is used, so ^ matches at the start of each line and $ at the end of each line.
You can test it at regexplanet (click on Java) or regex101 (not Java).
If I understood correctly, the available character bank is A, B, B, C, D. In your example, a string is valid if it contains each of [ACD] zero or one time and B zero to two times. My pattern has three parts:
- (?!.*([ACD]).*\1)
  A negative lookahead used at the line start ^ to ensure that none of [ACD] occurs more than once: it captures an [ACD] into \1 and checks that the same character does not appear again later in the line.
- (?!(?>.*?B){3})
  A negative lookahead to ensure that B occurs no more than twice.
- (?>[A-D] ){2}[A-D]$
  Finally, this fixes the total number of letters at three and enforces the format, where each letter must be preceded by a space or the line start.
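To illustrate how the MULTILINE version behaves in Java, here is a minimal sketch that collects every valid line from a multi-line input. The class name MultilineBankDemo and the helper validLines are hypothetical names chosen for this example. It relies on the default behaviour where . does not match line terminators, so each lookahead stays within its own line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineBankDemo {
    // Pattern from the answer above, with the backreference escaped for a Java string literal.
    static final Pattern LINE_RULE = Pattern.compile(
        "(?m)^(?!.*([ACD]).*\\1)(?!(?>.*?B){3})(?>[A-D] ){2}[A-D]$");

    // Returns every line of the input that can be built from the bank A, B, B, C, D.
    static List<String> validLines(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = LINE_RULE.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "A B C\nA A B\nB C B\nB B B\nD A B";
        System.out.println(validLines(input));
    }
}
```

For the sample input in main, only the lines "A B C", "B C B" and "D A B" should be returned.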
This can easily be changed for other needs. Also see the SO Regex FAQ.
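One way to adapt the idea to other character banks is to generate the lookaheads from the bank itself. The following is a sketch under stated assumptions, not part of either answer: the class BankPatternBuilder and its methods build and fits are hypothetical, the bank is assumed to contain only ASCII letters and digits (so no regex escaping is needed), and slots are assumed to be separated by a single space. It uses the per-character lookahead form from the first answer and matches() so that the whole string must fit.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class BankPatternBuilder {
    // Builds a pattern for `slots` space-separated characters drawn from `bank` without reuse.
    static Pattern build(String bank, int slots) {
        if (slots < 1 || !bank.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("need slots >= 1 and an alphanumeric bank");
        }
        // Count how many copies of each character are available.
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : bank.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        StringBuilder lookaheads = new StringBuilder();
        StringBuilder allowed = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            char c = e.getKey();
            int tooMany = e.getValue() + 1;
            // Reject input that contains one more copy of c than the bank holds.
            lookaheads.append("(?!(?:[^").append(c).append("]*").append(c)
                      .append("){").append(tooMany).append("})");
            allowed.append(c);
        }
        String slot = "[" + allowed + "]";
        String regex = "^" + lookaheads + "(?:" + slot + " ){" + (slots - 1) + "}" + slot + "$";
        return Pattern.compile(regex);
    }

    // Convenience check: does `candidate` fill exactly `slots` slots from `bank`?
    static boolean fits(String bank, int slots, String candidate) {
        return build(bank, slots).matcher(candidate).matches();
    }
}
```

For example, BankPatternBuilder.fits("ABBCD", 3, "A B B") should be true, while "B B B" and "A B C D" should be rejected for the same bank and slot count. Compiling the pattern on every call keeps the sketch short; in real use the result of build would be cached.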