Regex multiple match substring

I have an application that decides, given a Perl regex, whether it should display a dropdown menu or a simple input field. To do that, I have to check the overall form of the regex pattern and extract substrings from it. I have come up with several partial solutions for this.

Given the input pattern "^(100|500|1000)$", the result should be a dropdown menu with three entries: 100, 500, and 1000. At the moment I need one regex that parses the entire pattern to determine whether it is a valid list, and a second regex that does the actual substring matching, because I don't know how to capture the same group multiple times. This is my regex for the first step:

^\^\((?:((?:[^\|]|\\\|)+)(?:\||(?:\)\$$)))+

      

A simplified version, since that regex is a little hard to read:

^\^\((?:([\w\d]+)(?:\||(?:\)\$$)))+

      

This works, but it only stores the last substring (1000 in this case) and discards the rest; I tested this with PCRE tools and online regex testers. To get the actual substrings, i.e. the dropdown entries, I have:

(?:\^\()?((?:[^\|]|\\\|)+)(?:\||(?:\)\$$))

      

Simplifying again:

(?:\^\()?([\w\d]+)(?:\||(?:\)\$$))

      

This extracts the substrings, but unlike the other regex it does not validate the syntax of a dropdown pattern (for example, it also matches "^(100|" with substring "100"). My question is: is there a way to combine these regexes into a single pattern that matches 1) the entire pattern syntax and 2) the actual substrings?
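Both observations can be reproduced in a few lines. The sketch below uses Python's built-in re module purely for illustration, since the question does not name a host language; the two simplified patterns are copied unchanged, and the variable names are made up for the example.

```python
import re

# Simplified patterns from the question, copied unchanged.
VALIDATE = re.compile(r'^\^\((?:([\w\d]+)(?:\||(?:\)\$$)))+')
EXTRACT = re.compile(r'(?:\^\()?([\w\d]+)(?:\||(?:\)\$$))')

pattern = '^(100|500|1000)$'

# A repeated capturing group only keeps its final iteration.
last_only = VALIDATE.match(pattern).group(1)
print(last_only)  # 1000

# Scanning with the second regex yields every item ...
items = EXTRACT.findall(pattern)
print(items)  # ['100', '500', '1000']

# ... but it also accepts an incomplete pattern.
truncated = EXTRACT.findall('^(100|')
print(truncated)  # ['100']
```

So the first pattern validates but loses items, while the second collects items but does not validate.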

Thanks in advance,

Jeremiah

PS: Sorry if this is obvious, but I am very confused today with all these regexes.

Sample data:

Input regular expression: ^(100|500|1000)$
Syntax OK!
Matching substrings: 100, 500, 1000
=> show dropdown

Input regular expression: ^[0-9a-fA-F]+$
Syntax is invalid!
=> show simple input field

Input regular expression: ^(foo|bar)$
Syntax OK!
Matching substrings: "foo", "bar"
=> show dropdown

Input regular expression: ^(foo|bar)[0-9]+$
Syntax is invalid!
=> show simple input field

2 answers


You can achieve what you need in two steps.

You can use this regex to check the format:

\^\(\w+(?:\|\w+)*\)\$
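As a rough check of this format regex against the sample data from the question, here is a minimal Python sketch. The use of fullmatch() is an assumption on my part: the regex as written has no anchors of its own, so fullmatch() is one way to make sure nothing may precede or follow the list. The names FORMAT, samples and results are illustrative only.

```python
import re

# The answer's format check, applied to the whole string via fullmatch().
FORMAT = re.compile(r'\^\(\w+(?:\|\w+)*\)\$')

samples = [
    '^(100|500|1000)$',
    '^[0-9a-fA-F]+$',
    '^(foo|bar)$',
    '^(foo|bar)[0-9]+$',
]

# Map each sample pattern to True (valid list) or False (anything else).
results = {s: FORMAT.fullmatch(s) is not None for s in samples}
for s, ok in results.items():
    print(s, '=> dropdown' if ok else '=> input field')
```

Only the two plain alternation lists are accepted; the character-class pattern and the list followed by [0-9]+ fall through to the input field.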

      


Once the format has been validated, you can extract the values with a function like this:

// Split on runs of non-word characters; PREG_SPLIT_NO_EMPTY drops the empty pieces at both ends.
$str = '^(100|500|1000|2000|3000)$';
$arr = preg_split('/\W+/', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($arr);

      

Output:

Array
(
    [0] => 100
    [1] => 500
    [2] => 1000
    [3] => 2000
    [4] => 3000
)
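The two steps can also be folded into one helper. The following Python sketch is a variation, not the code from this answer: it assumes the format regex is extended with one capturing group around the whole alternation, which is then split on "|". The names LIST_PATTERN and dropdown_entries are hypothetical.

```python
import re

# Hypothetical variant of the format check: one group captures the whole alternation.
LIST_PATTERN = re.compile(r'\^\((\w+(?:\|\w+)*)\)\$')

def dropdown_entries(pattern):
    """Return the list of entries, or None if the pattern is not a plain list."""
    m = LIST_PATTERN.fullmatch(pattern)
    if m is None:
        return None
    # The captured text looks like "100|500|1000"; split it into its items.
    return m.group(1).split('|')

print(dropdown_entries('^(100|500|1000)$'))   # ['100', '500', '1000']
print(dropdown_entries('^(foo|bar)[0-9]+$'))  # None
```

A caller could show a dropdown when a list comes back and a simple input field when the result is None.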

      



It looks like you are using PCRE.

You can use the PCRE_DUPNAMES option, or alternatively put the (?J) option setting at the beginning of the pattern.

This option makes PCRE remember every capture group value that matches, rather than throwing away everything but the last one. (This is wrong, see comments: PCRE_DUPNAMES only allows several named groups to share one name; a repeated group still keeps just its last match.)



Unfortunately, it is not supported by the online testing tools, AFAIK. I don't know what language you are using, but its PCRE binding has to expose this option before you can use the feature.

From the PCRE docs:

If you want to get complete information about all captured substrings for a given name, you should use the pcre_get_stringtable_entries() function.
