Regular expression to split delimited data enclosed in double curly braces
I am trying to match a string like this:
{{name|arg1|arg2|...|argX}}
with a regular expression. I am using preg_match with the pattern
/{{(\w+)\|(\w+)(?:\|(.+))*}}/
but I get something like this whenever I use more than two arguments:
Array
(
    [0] => {{name|arg1|arg2|arg3|arg4}}
    [1] => name
    [2] => arg1
    [3] => arg2|arg3|arg4
)
The first two elements cannot contain spaces, the rest can. Maybe I've been working on this for too long, but I can't seem to find the error - any help would be greatly appreciated.
Thanks Jan
source to share
Don't use regular expressions for such a simple task. All you really need is:
$inner = substr($string, 2, -2);
$parts = explode('|', $inner);
# And if you want to make sure the string has opening/closing braces,
# check the original $string (not $inner, which has them stripped already):
$length = strlen($string);
assert($string[0] === '{');
assert($string[1] === '{');
assert($string[$length - 1] === '}');
assert($string[$length - 2] === '}');
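As a rough sketch, the two steps above could be combined into one helper that checks the braces before splitting. The function name splitTemplate is made up for illustration, and explicit checks are used instead of assert() on the assumption that assertions may be disabled in production:

```php
<?php
// Hypothetical helper: returns the fields between {{ and }},
// or null if the string is not wrapped in double curly braces.
function splitTemplate(string $string): ?array
{
    if (strlen($string) < 4
        || substr($string, 0, 2) !== '{{'
        || substr($string, -2) !== '}}') {
        return null;
    }
    // Drop the two braces on each side, then split on the delimiter.
    return explode('|', substr($string, 2, -2));
}

$parts = splitTemplate('{{name|arg1|arg 2|arg 3}}');
// $parts[0] is the name, the remaining elements are the arguments.
print_r($parts);
```

Note that this does not enforce the "no spaces in the first two elements" rule from the question; that would need a separate check on $parts[0] and $parts[1].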
source to share
The problem is here: \|(.+)
Regular expressions match as many characters as possible by default. Since . matches any character, further instances of | are happily matched too, and that is not what you want.
To prevent this, exclude | from the expression by saying "match anything other than |". The result is \|([^\|]+).
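A small sketch of the difference, run against just the tail of the example string (the variable names are only for illustration):

```php
<?php
$tail = '|arg2|arg3|arg4';

// Greedy dot: . also matches |, so one group swallows everything.
preg_match('/\|(.+)/', $tail, $greedy);

// Negated class: the group stops at the next |.
preg_match('/\|([^|]+)/', $tail, $first);

// Applied repeatedly, the negated class yields one argument per match.
preg_match_all('/\|([^|]+)/', $tail, $all);

echo $greedy[1], "\n";              // arg2|arg3|arg4
echo $first[1], "\n";               // arg2
echo implode(', ', $all[1]), "\n";  // arg2, arg3, arg4
```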
source to share
This should work for anywhere from 1 to N arguments:
<?php
$pattern = "/^\{\{([a-z]+)(?:\}\}$|(?:\|([a-z]+))(?:\|([a-z ]+))*\}\}$)/i";
$tests = array(
"{{name}}" // should pass
, "{{name|argOne}}" // should pass
, "{{name|argOne|arg Two}}" // should pass
, "{{name|argOne|arg Two|arg Three}}" // should pass
, "{{na me}}" // should fail
, "{{name|arg One}}" // should fail
, "{{name|arg One|arg Two}}" // should fail
, "{{name|argOne|arg Two|arg3}}" // should fail
);
foreach ( $tests as $test )
{
if ( preg_match( $pattern, $test, $matches ) )
{
echo $test, ': Matched!<pre>', print_r( $matches, 1 ), '</pre>';
} else {
echo $test, ': Did not match =(<br>';
}
}
source to share
Of course you end up with something like this :) A regex cannot return a dynamic number of captured groups, which in your case would be the arguments.
Given what you want to do, you should keep your current regex and just explode the remaining arguments on '|', then add them to an args array.
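Roughly, that could look like the sketch below. It uses the pattern from the question unchanged; the variable names $name and $args are assumptions made for the example:

```php
<?php
$string = '{{name|arg1|arg2|arg3|arg4}}';

// Pattern from the question: group 3 ends up holding all remaining
// arguments as one |-separated string.
if (preg_match('/{{(\w+)\|(\w+)(?:\|(.+))*}}/', $string, $m)) {
    $name = $m[1];
    $args = [$m[2]];
    // Group 3 is absent when there is only one argument.
    if (isset($m[3]) && $m[3] !== '') {
        $args = array_merge($args, explode('|', $m[3]));
    }
    echo $name, "\n";
    print_r($args);
}
```

The regex still enforces that the name and the first argument contain only word characters, while the exploded arguments may contain spaces.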
source to share
Indeed, this is from the PCRE manual:
When a capturing subpattern is repeated, the value captured is the substring that matched the final iteration. For example, after (tweedle[dume]{3}\s*)+ has matched "tweedledum tweedledee" the value of the captured substring is "tweedledee". However, if there are nested capturing subpatterns, the corresponding captured values may have been set in previous iterations. For example, after /(a|(b))+/ matches "aba" the value of the second captured substring is "b".
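The behaviour described in the manual can be checked with preg_match. The sketch below uses the manual's two examples, plus the question's input with . replaced by [^|] as suggested in another answer:

```php
<?php
// A repeated group keeps only the text from its final iteration.
preg_match('/(tweedle[dume]{3}\s*)+/', 'tweedledum tweedledee', $m);
echo $m[1], "\n"; // tweedledee

// The nested group (b) was set in the second iteration and is not reset.
preg_match('/(a|(b))+/', 'aba', $n);
echo $n[1], ' ', $n[2], "\n"; // a b

// Same effect on the question's input: only the last argument survives.
preg_match('/{{(\w+)\|(\w+)(?:\|([^|]+))*}}/', '{{name|arg1|arg2|arg3|arg4}}', $q);
echo $q[3], "\n"; // arg4
```

So even with the greedy dot fixed, a single repeated group cannot hand back arg2, arg3 and arg4 separately.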