RegExp to match strings formed with a limited set of characters without reusing any character

I have a bunch of characters like this: ABBCD

And I have several such spaces: _ _ _

Is there a way to use a regular expression to match any string that can be formed by "dragging and dropping" the available characters into empty spaces?

So, in the example, these are several valid matches:

A B C
A B B
B C B
D A B

      

But they are not valid:

A A B    // Only one 'A' is available in the set
B B B    // Only two 'B's are available in the set

      

Sorry if it has already been asked before.

+3




3 answers


vks's solution works correctly; here it is optimized and extended to enforce the "_ _ _" (three slots) rule:

^(?!(?:[^A]*A){2})(?!(?:[^B]*B){3})(?!(?:[^C]*C){2})(?!(?:[^D]*D){2})(?:[ABCD](?:\s|$)){3}

      

The regex above is one example of this approach.

Changes from original regex:

  • The capturing groups were replaced with non-capturing groups, since we are in Java (the Java regex implementation spends time recording captured groups during matching).
  • The ^ anchor was moved to the front of the regex for readability.


Regex explanation:

  • ^

    Asserts the position at the start of the input.
  • (?!

    Negative lookahead: asserts that what follows does not match from the current position, without advancing the pointer:
  •   (?:[^A]*A){2}

    Two "A"s, each preceded by any run of non-"A" characters. Using [^A]* rather than .* avoids needless backtracking.
  • )

    Closes the lookahead.
  • (?!(?:[^B]*B){3})

    Same as the group above: asserts that there are not three "B"s in the input.
  • (?!(?:[^C]*C){2})

    Asserts that there are not two "C"s in the input.
  • (?!(?:[^D]*D){2})

    Asserts that there are not two "D"s in the input.
  • (?:

    Non-capturing group that matches the following:
  •   [ABCD]

    Any one character from the list "A", "B", "C" or "D".
  •   (?:\s|$)

    A whitespace character or the end of the input.
  • ){3}

    Exactly three times, to satisfy the "_ _ _" rule.

To use the regular expression:

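import java.util.regex.Pattern;

boolean fulfillsRule(String str) {
    // "\\s" is doubled because the backslash must be escaped inside a Java string literal.
    Pattern tripleRule = Pattern.compile("^(?!(?:[^A]*A){2})(?!(?:[^B]*B){3})(?!(?:[^C]*C){2})(?!(?:[^D]*D){2})(?:[ABCD](?:\\s|$)){3}");
    // find() succeeds once the first three slots match, so longer input such as "A B C D" is also accepted.
    return tripleRule.matcher(str).find();
}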
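As a quick check, the pattern can be run against the examples from the question. This is only a sketch: the class name SlotRegexDemo and the method fulfillsRuleStrict are made up for illustration and are not part of the answer. The strict variant assumes that input longer than three slots should be rejected, which the question implies but the original find() call does not enforce.

```java
import java.util.regex.Pattern;

class SlotRegexDemo {
    // Same pattern as in the answer; the backslash in \s is doubled for the Java string literal.
    static final Pattern TRIPLE_RULE = Pattern.compile(
        "^(?!(?:[^A]*A){2})(?!(?:[^B]*B){3})(?!(?:[^C]*C){2})(?!(?:[^D]*D){2})(?:[ABCD](?:\\s|$)){3}");

    // Behaves like the answer's method: a match starting at the beginning is enough.
    static boolean fulfillsRule(String str) {
        return TRIPLE_RULE.matcher(str).find();
    }

    // Hypothetical stricter variant: the whole input must be consumed by the pattern.
    static boolean fulfillsRuleStrict(String str) {
        return TRIPLE_RULE.matcher(str).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"A B C", "A B B", "B C B", "D A B", "A A B", "B B B", "A B C D"};
        for (String s : samples) {
            System.out.println(s + " -> find: " + fulfillsRule(s)
                + ", matches: " + fulfillsRuleStrict(s));
        }
    }
}
```

The four valid examples pass both checks and "A A B" and "B B B" fail both. "A B C D" is accepted by find() but rejected by matches(), so which call to use depends on whether extra slots should be tolerated.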

      

+8




 (?!(.*?A){2,})(?!(.*?B){3,})(?!((.*?C){2,}))(?!((.*?D){2,}))^[ABCD]*$

      

You can use something like this. See demo.



http://regex101.com/r/uH3fV3/1
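For comparison, here is a rough Java sketch of how this second pattern behaves. The class and method names (UnspacedBankDemo, usesBankOnly) are invented for the example. Note that the final ^[ABCD]*$ part allows only the letters themselves, so this pattern seems intended for unspaced input such as "ABC", and it does not limit the string to three characters.

```java
import java.util.regex.Pattern;

class UnspacedBankDemo {
    // Pattern copied from the answer; it contains no backslashes, so no extra escaping is needed.
    static final Pattern BANK_RULE = Pattern.compile(
        "(?!(.*?A){2,})(?!(.*?B){3,})(?!((.*?C){2,}))(?!((.*?D){2,}))^[ABCD]*$");

    // True when the input uses only letters from the bank A,B,B,C,D without overusing any.
    static boolean usesBankOnly(String str) {
        return BANK_RULE.matcher(str).find();
    }

    public static void main(String[] args) {
        String[] samples = {"ABC", "ABB", "ABBCD", "AAB", "BBB", "A B C"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + usesBankOnly(s));
        }
    }
}
```

With these inputs, "ABC", "ABB" and "ABBCD" are accepted, while "AAB", "BBB" and the spaced "A B C" are rejected. To enforce exactly three slots, the quantifier * would have to become {3}.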

+5




Interesting problem, this is my idea:

(?m)^(?!.*([ACD]).*\1)(?!(?>.*?B){3})(?>[A-D] ){2}[A-D]$

      

This uses the (?m) MULTILINE modifier, which makes ^ match at the start of each line and $ at the end of each line.

Test it on regexplanet (click on Java) or regex101 (not Java).


If I understood correctly, the available character bank is A,B,B,C,D. In your example, a string is valid if it contains each of [ACD] 0 or 1 times and B 0 to 2 times. My pattern has three parts:

  • (?!.*([ACD]).*\1)

    A negative lookahead, applied at the line start ^, that ensures none of [ACD] occurs more than once: it captures one of [ACD] into \1 and checks that the same character does not occur again anywhere later in the line.

  • (?!(?>.*?B){3})

    A negative lookahead that ensures B occurs no more than twice.

  • Finally, (?>[A-D] ){2}[A-D]$

    fixes the total number of characters used, enforces the format in which each letter is preceded by a space or the line start, and checks the length.
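The three parts can be tried together in Java roughly as follows. This is a sketch, not the answerer's code: MultilineSlotsDemo and validLines are hypothetical names, and the sample text simply joins the question's examples with newlines so that the (?m) modifier has several lines to work on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineSlotsDemo {
    // Pattern from the answer; the backreference \1 is written \\1 in a Java string literal.
    static final Pattern LINE_RULE = Pattern.compile(
        "(?m)^(?!.*([ACD]).*\\1)(?!(?>.*?B){3})(?>[A-D] ){2}[A-D]$");

    // Collects every line of the text that satisfies the rule.
    static List<String> validLines(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = LINE_RULE.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "A B C", "A B B", "B C B", "D A B", "A A B", "B B B");
        System.out.println(validLines(text));
    }
}
```

Because . does not match line terminators by default, both lookaheads stay within the current line, so each line is judged on its own. Only the first four lines of the sample text are returned.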

This can easily be changed for other needs. Also see SO Regex FAQ
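One way to adapt the idea to other character banks is to generate the pattern instead of writing it by hand. The helper below is a hypothetical sketch (BankRegexBuilder and build are not from any answer). It combines the per-character negative lookaheads of the first answer with the space-separated tail of this one, and it assumes the bank contains only letters and digits so that no regex escaping is needed.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class BankRegexBuilder {
    // Builds a pattern accepting exactly `slots` space-separated characters drawn from `bank`,
    // where no character is used more often than it occurs in the bank.
    static Pattern build(String bank, int slots) {
        if (bank.isEmpty() || slots < 1) {
            throw new IllegalArgumentException("need a non-empty bank and at least one slot");
        }
        // Count how many copies of each character are available (sorted for a stable pattern).
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : bank.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                // Assumption of this sketch: no metacharacters, so nothing has to be escaped.
                throw new IllegalArgumentException("unsupported character: " + c);
            }
            counts.merge(c, 1, Integer::sum);
        }
        StringBuilder lookaheads = new StringBuilder();
        StringBuilder charClass = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            char c = e.getKey();
            int tooMany = e.getValue() + 1; // one more occurrence than the bank provides
            lookaheads.append("(?!(?:[^").append(c).append("]*").append(c)
                      .append("){").append(tooMany).append("})");
            charClass.append(c);
        }
        String cls = "[" + charClass + "]";
        return Pattern.compile("^" + lookaheads + "(?:" + cls + " ){" + (slots - 1) + "}" + cls + "$");
    }

    public static void main(String[] args) {
        Pattern p = build("ABBCD", 3);
        System.out.println(p.pattern());
        System.out.println(p.matcher("A B B").matches());
        System.out.println(p.matcher("A A B").matches());
    }
}
```

For the bank "ABBCD" with three slots, "A B B" is accepted and "A A B" is rejected. Other banks and slot counts work the same way, for example build("AAB", 2).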

+2

