Get the string passed as the first argument of a function call

I want to search PHP files for a particular function call. The reason is that I want to generate .MO files for the gettext extension, so I first need to create a .PO file that contains all the required text strings.

I already find a lot of the texts, but there are some problems.

Here is my Regex to find the first argument of a function:



I need to find function calls with the following patterns:

_("text %s", 3);


The text can contain escaped quotes. It is essential that I know whether the call used an apostrophe (single quote) or a regular double quote.

If I have a call like



then the problem is that I get the text



without the closing quote.

Do any of you have an idea how I can get my Regex to work?



2 answers

I would use a PHP tokenizer for this kind of thing, not regular expressions:

$funcName = '_';
$tokens   = token_get_all(file_get_contents('path/to/your/script.php'));
$strings  = array();

foreach($tokens as $index => $token){

  // must be a string literal
  if($token[0] === T_CONSTANT_ENCAPSED_STRING){

    // the previous tokens should be "(" and the function name
    if(!isset($tokens[$index - 2]) || ($tokens[$index - 1] !== "("))
      continue;

    list($id, $text, $line) = $tokens[$index - 2];

    // this is your string (substr drops quotes around it)
    if(($id === T_STRING) && ($text === $funcName))
      $strings[] = substr($token[1], 1, -1);
  }
}

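Since the question also asks which quote character was used, here is a minimal sketch of the same tokenizer idea that keeps that information. The helper name findCallStrings and the returned array keys are my own assumptions, not part of the answer above; it relies only on token_get_all and the standard T_* constants, and it ignores whitespace between the function name, the parenthesis and the string.

```php
<?php
// Hypothetical helper: collects the first string-literal argument of every
// call to $funcName, together with the quote character that delimited it.
function findCallStrings(string $source, string $funcName = '_'): array
{
    // Drop whitespace tokens so that "_ ( 'x' )" is handled like "_('x')".
    $tokens = array_values(array_filter(
        token_get_all($source),
        function ($t) {
            return !is_array($t) || $t[0] !== T_WHITESPACE;
        }
    ));

    $found = [];
    foreach ($tokens as $i => $token) {
        // Only plain string literals (no variable interpolation) qualify.
        if (!is_array($token) || $token[0] !== T_CONSTANT_ENCAPSED_STRING) {
            continue;
        }
        // The literal must directly follow "(" ...
        if ($i < 2 || $tokens[$i - 1] !== '(') {
            continue;
        }
        // ... which must directly follow the function name.
        $name = $tokens[$i - 2];
        if (!is_array($name) || $name[0] !== T_STRING || $name[1] !== $funcName) {
            continue;
        }
        $found[] = [
            'quote' => $token[1][0],             // ' or "
            'text'  => substr($token[1], 1, -1), // literal without its quotes
            'line'  => $token[2],                // line number in the source
        ];
    }
    return $found;
}
```

The text is returned exactly as written in the source, so escape sequences such as \' are still present; knowing the quote character tells you how to unescape them for the .PO file.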

Raw regex:



Restricted regex:



The result is in capturing group 1. I used the branch reset group (?|pattern) so that the capture group number is reset for each alternative separated by |.

In (?|'((?:[^'\\]|\\.)*)'|"((?:[^"\\]|\\.)*)") there are 2 patterns inside the branch reset group:

  • '((?:[^'\\]|\\.)*)': Matches and captures the content of a single-quoted string, which consists of characters that are neither a quote nor a backslash, or of escape sequences (a backslash followed by any character). Actually, I'm a bit sloppy here, since a (raw) newline character is considered part of the string. I don't think the spec will allow this, but if the input contains valid code then there should be no problem.

  • "((?:[^"\\]|\\.)*)": Same as above, but for a double-quoted string.

Note that I do not match the rest of the function arguments.
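To see the branch reset group at work, here is a small sketch in PHP. The complete pattern is my own assumption: I put \b_\(\s* in front of the branch reset group quoted above, and I add the s modifier so that an escape sequence may also span a newline. The helper name extractFirstArguments is hypothetical. Because the match always ends at the closing quote, the last character of the whole match tells you which quote was used.

```php
<?php
// Hypothetical helper: finds _('...') and _("...") calls with a regex and
// returns the quote character and the raw text of the first argument.
function extractFirstArguments(string $source): array
{
    // Nowdoc keeps the pattern free of PHP string escaping.
    // Assumed prefix: \b_\(\s*  followed by the branch reset group.
    $pattern = <<<'REGEX'
/\b_\(\s*(?|'((?:[^'\\]|\\.)*)'|"((?:[^"\\]|\\.)*)")/s
REGEX;

    $result = [];
    if (preg_match_all($pattern, $source, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $result[] = [
                // The whole match ends with the closing quote character.
                'quote' => substr($match[0], -1),
                // Thanks to the branch reset, the content is always group 1.
                'text'  => $match[1],
            ];
        }
    }
    return $result;
}
```

Unlike the tokenizer, a regex cannot tell code from comments or from text inside other strings, so this will also report calls that appear in commented-out code.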


