How to explode a string on commas, except when a comma is inside apostrophes?

I have the following text:

$string='
            blah<br>
            @include (\'file_to_load\')
            <br>
            @include (\'file_to_load\',\'param1\',\'param2\',\'param3\')
    ';

      

I would like to catch (and then replace using preg_replace_callback) all occurrences of "@include" together with their parameters (e.g. @include ('file_to_load', 'param1', 'param2', 'param3')).

So, I do this:

$string='
 blah<br>
 @include (\'file_to_load\')
 <br>
 @include (\'file_to_load\',\'param1\',\'param2\')
';
$result = preg_replace_callback(
    '~@include \((,?.*?)\)~', // capture everything between the parentheses of @include (...)
    function ($matches) {
        echo '---iteration---';
        $params = explode(',', $matches[1]); // naive split on every comma
        echo '<pre>';
        var_dump($params);
        echo '</pre>';
        return $matches[1];
    },
    $string
);

      

Everything works until a comma appears inside one of the parameters, for example:

$string='
    blah<br>
    @include (\'file_to_load\')
    <br>
    @include (\'file_to_load\',\'param1,something\',[\'elem\'=>\'also, a comma\'])
';

      

Here there is a comma inside the second parameter ('param1,something'), so splitting with explode() no longer gives the result I want.

Is there a way to split on commas (perhaps using a regex), but not when the comma is inside apostrophes?

+3




3 answers


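Split on the following pattern, which matches a comma only when it is followed by an even number of apostrophes (that is, when the comma is outside a quoted string):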

,(?=([^']*'[^']*')*[^']*$)

      

Use preg_split, since explode doesn't support regular expressions:



Code:

$params = preg_split("~,(?=([^']*'[^']*')*[^']*$)~", $matches[1]);
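As a minimal sketch of that split in isolation, the snippet below runs the pattern on a hypothetical parameter list (`$inner` is an assumed name standing in for `$matches[1]`). Note that preg_split needs pattern delimiters; `~` is used here.

```php
<?php
// Hypothetical parameter list, as captured between the parentheses of @include (...)
$inner = "'file_to_load','param1,something','param2'";

// Split on a comma only when an even number of apostrophes follows it,
// i.e. when the comma is not inside a quoted string.
$params = preg_split("~,(?=([^']*'[^']*')*[^']*$)~", $inner);

print_r($params);
```

Because the pattern only counts apostrophes, a comma that separates two items of an array parameter would still be treated as a separator.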

      

+2




What you are looking for is tokenization. Don't try to split on the comma. Instead, define each building block of your expression. In other words, you need to match, not split.

For example, this is a simple regex:

'[^']+'

      

Matches these elements:

@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
          \____________/ \________________/  \____/  \_____________/

      

But this may not be enough for your case, since you have an array and I assume you want to parse it as well.

So identify each parameter separately:

'[^']+'|\[.+?\]

      

@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
          \____________/ \________________/ \_______________________/
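A rough sketch of how this pattern could be applied in PHP with preg_match_all; the variable names `$inner` and `$tokens` are assumptions for illustration, with `$inner` standing in for the text captured between the parentheses.

```php
<?php
// Hypothetical parameter list taken from between the parentheses of @include (...)
$inner = "'file_to_load','param1,something',['elem'=>'also, a comma']";

// Match each parameter: either a quoted string or a flat (non-nested) array literal.
preg_match_all("~'[^']+'|\[.+?\]~", $inner, $m);
$tokens = $m[0];

print_r($tokens);
```

Each element of `$tokens` is one complete parameter, so no comma handling is needed at all.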

      



The problem with this approach is that it can't match nested arrays. If you need to parse those, the pattern becomes more complex:

(?(DEFINE)
  (?<string>'[^']+')
  (?<array> \[ (?: (?&arrayitem) (?> , \s* (?&arrayitem) )* )? \] )
  (?<arrayitem> \s* (?&string) \s* => \s* (?&value) \s* )
  (?<value> (?&string) | (?&array) )
)
(?&value)

      

Yes, it's a recursive regex, but it can actually identify the parameters:

@include ('file_to_load','param1,something',['elem'=>'also, a comma','other'=>['nested' => 'array']])
          \___________/  \________________/ \______________________________________________________/
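One way to try the recursive pattern from PHP is sketched below. It assumes the `x` (extended) flag so the pattern may contain whitespace, and it stores the pattern in a nowdoc to avoid quote escaping; `$pattern`, `$inner` and `$values` are illustrative names.

```php
<?php
// The recursive pattern, wrapped in ~ delimiters with the x flag.
$pattern = <<<'REGEX'
~
(?(DEFINE)
  (?<string> '[^']+' )
  (?<array> \[ (?: (?&arrayitem) (?> , \s* (?&arrayitem) )* )? \] )
  (?<arrayitem> \s* (?&string) \s* => \s* (?&value) \s* )
  (?<value> (?&string) | (?&array) )
)
(?&value)
~x
REGEX;

// Hypothetical parameter list containing a nested array
$inner = "'file_to_load','param1,something',['elem'=>'also, a comma','other'=>['nested' => 'array']]";

preg_match_all($pattern, $inner, $m);
$values = $m[0]; // full matches: one entry per top-level parameter

print_r($values);
```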

      


Since I don't know what you intend to do with the parameters afterwards, you may need to write a parser instead of using regular expressions, but that depends on how you plan to use them.

Note: You may need to replace the string pattern '[^']+' with something more complex if you want to allow escaped quotes within a string.

There are two common ways to escape a quote:

  • Use a backslash: 'abc\'def'

    '(?:[^\\']++|\\.)*'
    
          

  • Double the quote: 'abc''def'

    '(?:[^']++|'')*'
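Both string patterns can be checked quickly with preg_match. This is only a sketch: the sample subjects and the variable names (`$backslash`, `$doubled`, `$a`, `$b`) are made up for illustration.

```php
<?php
// Backslash-escaped quotes: 'abc\'def'
// A nowdoc keeps the regex backslashes exactly as written.
$backslash = <<<'REGEX'
~'(?:[^\\']++|\\.)*'~
REGEX;

// Doubled quotes: 'abc''def'
$doubled = "~'(?:[^']++|'')*'~";

// Subjects: the first contains a backslash-escaped quote, the second a doubled quote.
preg_match($backslash, "x 'abc\\'def', y", $a);
preg_match($doubled, "x 'abc''def', y", $b);

echo $a[0], "\n";
echo $b[0], "\n";
```

In both cases the whole quoted string is matched as a single token, including the escaped quote in the middle.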
    
          

+2




Try using this:

"\@include[\s]*\([^\)]*\)"

      

This will match

@include (\'file_to_load\')

      

and

@include (\'file_to_load\',\'param1,something\',[\'elem\'=>\'also, a comma\'])
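For illustration, here is a sketch of that pattern used with preg_match_all. PHP requires pattern delimiters, so `~` is added here, and the character class is simplified to `[^)]`; the sample `$string` mirrors the one in the question.

```php
<?php
$string = "
    blah<br>
    @include ('file_to_load')
    <br>
    @include ('file_to_load','param1,something',['elem'=>'also, a comma'])
";

// Match @include, optional whitespace, then everything up to the first closing parenthesis.
preg_match_all('~@include\s*\([^)]*\)~', $string, $m);

print_r($m[0]);
```

This finds each whole @include (...) call but does not separate the parameters, and since `[^)]*` stops at the first `)`, a closing parenthesis inside a quoted parameter would cut the match short.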

      

Hope this helps.

0

