Regex pattern to get string between curly braces
I have this line: The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.
I want to get the whole string inside each pair of curly braces. Curly braces nested within curly braces should be ignored. The expected result in a PHP array would be:
[0] => fox, dragon, dinosaur
[1] => dog, cat, bear, {lion, tiger}
I tried this pattern: \{([\s\S]*)\}
It comes from the question about getting the string between curly braces and excluding the curly braces, which Mar answered. However, this pattern seems to capture everything from the first opening brace to the last closing brace, without separating the unrelated text in between (not sure if that is the correct word to use). Here is the result of the pattern above:
fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}
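For reference, the over-match can be reproduced with a short script. This is only a minimal sketch: the variable names are illustrative, and preg_match_all is assumed as the matching function since the question does not say which one was used.

```php
<?php
// Sample sentence copied from the question.
$str = 'The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.';

// The greedy [\s\S]* first consumes the rest of the string, then backtracks
// only as far as the LAST "}", so both brace groups land in a single match.
preg_match_all('/\{([\s\S]*)\}/', $str, $matches);

// Capture group 1 holds one element that spans both groups.
print_r($matches[1]);
```

A greedy "match anything" pattern has no notion of balanced braces, which is why it cannot tell where the first group ends.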
What is the best regex pattern to output the expected result from the sentence above?
You can use this recursive regex pattern in PHP:
$re = '/( { ( (?: [^{}]* | (?1) )* ) } )/x';
$str = "The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.";
preg_match_all($re, $str, $matches);
print_r($matches[2]);
As anubhava said, you can use a recursive pattern for this.
However, his version is rather "slow" and does not cover all cases.
I would use this regex:
#({(?>[^{}]|(?0))*?})#
As you can see here: http://lumadis.be/regex/test_regex.php?id=2516 , it is faster and it matches more results.
So how does it work?
/
(              # capturing group
  {            # matches the char '{'
  (?>          # atomic group: the engine never backtracks into its choice
    [^{}]      # matches any single char other than '{' and '}'
    |          # or
    (?0)       # re-runs the whole regex as a subroutine to match a nested group
  )*?          # repeats as few times as needed (lazy quantifier)
  }            # matches the char '}'
)              # ends the capture
/x
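The annotated pattern can be dropped into PHP more or less as is. The sketch below is one possible way to do that, not necessarily how the answer's author ran it. It assumes the sample sentence from the question, uses ~ as the delimiter, escapes the braces to be safe in extended mode, and strips the outer braces in a post-processing step that is not part of the original answer.

```php
<?php
// Sample sentence copied from the question.
$str = 'The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.';

// Same structure as the annotated pattern; the x flag allows whitespace and comments.
$re = '~
    (              # capturing group 1: one balanced {...} block
      \{           # opening brace
      (?>          # atomic group
          [^{}]    # any single char that is not a brace
        |          # or
          (?0)     # recurse into the whole pattern for a nested block
      )*?          # repeat lazily
      \}           # closing brace
    )
~x';

preg_match_all($re, $str, $matches);

// Group 1 still includes the outer braces, so trim one char from each end.
// (Hypothetical post-processing, added here only to get the expected array.)
$inner = array_map(function ($m) {
    return substr($m, 1, -1);
}, $matches[1]);

print_r($inner);
```

Unlike the first answer, this pattern captures the braces themselves, which is why the extra substr step is needed to get the array asked for in the question.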
Why did I use "*?"?
Adding '?' to '*' makes it lazy (non-greedy). If you use a greedy quantifier, the engine will start many more subroutine calls than with a lazy one. (If you need more explanation, let me know.)
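Whichever quantifier is used, the list of matches should come out the same on the sample sentence, since a closing brace can only be matched by the final '}' of the pattern. The following sketch checks that. It does not measure speed, and the variable names are only illustrative.

```php
<?php
// Sample sentence copied from the question.
$str = 'The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.';

// The pattern from this answer (lazy) and the same pattern with a greedy "*".
// Braces are escaped here as a precaution; the structure is otherwise unchanged.
$lazy   = '#(\{(?>[^{}]|(?0))*?\})#';
$greedy = '#(\{(?>[^{}]|(?0))*\})#';

preg_match_all($lazy, $str, $lazyMatches);
preg_match_all($greedy, $str, $greedyMatches);

// Compare the full matches (index 0) of both variants.
var_dump($lazyMatches[0] === $greedyMatches[0]);
print_r($lazyMatches[0]);
```

So the choice between "*" and "*?" here is about how much work the engine does, not about which substrings are returned.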