Get line number from preg_match_all()

I am using PHP's preg_match_all() to search a string loaded with file_get_contents(). The regular expression returns matches, but I would like to know at what line number each match was found. What's the best technique for this?

I could read the file as an array and run the regex on each line, but the problem is that my regex matches results that span line breaks (carriage returns/newlines).

+11




8 answers


OK, it's a little late and you may have already solved this, but I had to do the same thing and it's pretty simple. Using the PREG_OFFSET_CAPTURE flag in preg_match will return the character position of the match. Let's say it is $charpos, so:

$before = substr($content, 0, $charpos); // fetches all the text before the match

$line_number = strlen($before) - strlen(str_replace("\n", "", $before)) + 1;

      



voila!

+8




You cannot do this with regular expressions alone, at least not cleanly. What you can do is use the PREG_OFFSET_CAPTURE flag of preg_match_all and run it over the contents of the entire file.

I mean, once you have an array of matched strings and the starting offset of each match, just count how many \r\n, \n, or \r sequences lie between the start of the file and the offset of each match. The line number of a match is the number of individual EOL terminators (\r\n | \n | \r) before it, plus 1.
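The counting described above can be sketched roughly as follows. This is only an illustration: the helper name lineNumberAt and the sample text are made up for the example, and the line reported is the line on which a match starts, which matters for matches that span several lines.

```php
<?php
// Hypothetical helper (not from the answer): returns the 1-based line number
// of a byte offset, treating \r\n, \n and \r as one terminator each.
function lineNumberAt(string $content, int $offset): int
{
    $before = substr($content, 0, $offset);
    // \r\n is listed first so a Windows line ending counts once, not twice.
    return preg_match_all('/\r\n|\n|\r/', $before) + 1;
}

// Sample text with mixed line endings, standing in for the file contents.
$content = "first\r\nsecond foo\nthird\rfourth foo";

preg_match_all('/foo/', $content, $matches, PREG_OFFSET_CAPTURE);

$lineNumbers = [];
foreach ($matches[0] as [$text, $offset]) {
    $lineNumbers[] = lineNumberAt($content, $offset);
}

print_r($lineNumbers); // the two matches are on lines 2 and 4
```

Each call rescans the text before the match, so for files with many matches an incremental count may be preferable.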

+10




You have a couple of options, but none of them are "simple":

a) exec() and use the system command grep, which can report line numbers:

exec("grep -n 'your pattern here' file.txt", $output);

      

b) Slurp in the file with file_get_contents(), split it into an array of lines, then use preg_grep() to find the matching lines.

$dat = file_get_contents('file.txt');
$lines = explode("\n", $dat); // the delimiter is the first argument
$matches = preg_grep('/your pattern here/', $lines);

      

c) Read the file one line at a time, keep a running line count, and match your pattern against each line.

$fh = fopen('file.txt', 'rb');
$lineNo = 1; // separate counter, so fgets() does not overwrite it
while (($line = fgets($fh)) !== false) {
     if (preg_match('/your pattern here/', $line)) {
         // ... whatever you need to do with matching lines ...
     }
     $lineNo++;
}
fclose($fh);

      

Each has its ups and downs

a) You call an external program, and if your pattern contains any user-supplied data, you potentially open yourself to the shell equivalent of an SQL injection attack. On the plus side, you don't have to slurp in the entire file, which saves a little memory overhead.

b) You are protected from shell attacks, but you need to slurp in the entire file. If your file is large, you are likely to run out of available memory.

c) You are calling a regex on each line, which will have significant overhead if you are dealing with a lot of lines.
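For option b, the line numbers can be recovered from the array keys, because preg_grep() keeps the keys of the input array. A minimal sketch, using an inline sample string in place of the file and a made-up pattern; note that it only helps for patterns that match within a single line:

```php
<?php
// Sample text standing in for file_get_contents('file.txt').
$dat = "alpha\nbeta 42\ngamma\ndelta 7";

$lines = explode("\n", $dat);

// preg_grep() preserves the original keys, so each key is a 0-based line index.
$matches = preg_grep('/\d+/', $lines);

$found = [];
foreach ($matches as $index => $text) {
    $found[$index + 1] = $text; // convert to a 1-based line number
}

print_r($found); // lines 2 and 4 contain digits
```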

+2




I think, first of all, you need to read the file named by $String into an array, where each item is one line, something like this:

$List = file($String);
for ($i = 0; $i < count($List); $i++) {
    // $pattern is a placeholder for your regular expression
    if (preg_match_all($pattern, $List[$i], $matches)) { // your work here
        echo $i + 1; // echo the line number where preg_match_all() matched (indexes are 0-based)
    }
}

      

+1




$data = "Abba
Beegees
Beatles";

preg_match_all('/Abba|Beegees|Beatles/', $data, $matches, PREG_OFFSET_CAPTURE);
foreach (current($matches) as $match) {
    $matchValue = $match[0];
    // PREG_OFFSET_CAPTURE offsets are in bytes, so use substr() rather than mb_substr()
    $lineNumber = substr_count(substr($data, 0, $match[1]), "\n") + 1;

    echo "`{$matchValue}` at line {$lineNumber}\n";
}

      

Output

`Abba` at line 1
`Beegees` at line 2
`Beatles` at line 3

      

(Check your performance requirements: the text before each match is rescanned for every match.)
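If the rescanning becomes a concern, one possible variation is to count only the newlines between consecutive matches, since preg_match_all returns matches in ascending offset order. This is a sketch with made-up sample data, not part of the answer above:

```php
<?php
$data = "Abba\nBeegees\nBeatles and Abba";

preg_match_all('/Abba|Beegees|Beatles/', $data, $matches, PREG_OFFSET_CAPTURE);

$lineNumber = 1;
$lastOffset = 0;
$results = [];
foreach ($matches[0] as [$value, $offset]) {
    // Count only the newlines between the previous match and this one.
    $chunk = substr($data, $lastOffset, $offset - $lastOffset);
    $lineNumber += substr_count($chunk, "\n");
    $lastOffset = $offset;
    $results[] = "{$value}:{$lineNumber}";
}

print_r($results);
```

Every byte of the input is then scanned for newlines at most once, no matter how many matches there are.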

+1




You can use preg_match_all to find the offset of each newline, and then compare those offsets to the match offsets you already have.

// read file to buffer
$data = file_get_contents($datafile);

// find all linefeeds in buffer
preg_match_all("/\n/", $data, $lfall, PREG_OFFSET_CAPTURE);
$lfs = $lfall[0];

// build a map of every byte offset to its line number
$offsets = array();
$linenum = 1;
$offset = 0;
foreach ($lfs as $lfrow)
{
    $lfoffset = intval($lfrow[1]);
    for ( ; $offset <= $lfoffset; $offset++)
        $offsets[$offset] = $linenum;   // offset => linenum
    $linenum++;
}
// map the bytes after the last linefeed as well
for ($len = strlen($data); $offset < $len; $offset++)
    $offsets[$offset] = $linenum;
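A per-byte map can get large for big files. As a possible alternative, the list of newline offsets can be searched directly with a binary search. The helper names newlineOffsets and lineForOffset below are hypothetical, introduced only for this sketch:

```php
<?php
// Hypothetical helper: byte offsets of every "\n" in $data, in ascending order.
function newlineOffsets(string $data): array
{
    preg_match_all("/\n/", $data, $m, PREG_OFFSET_CAPTURE);
    return array_column($m[0], 1);
}

// Hypothetical helper: 1-based line number of $offset, found by a binary
// search for the number of newlines that occur strictly before it.
function lineForOffset(array $newlines, int $offset): int
{
    $lo = 0;
    $hi = count($newlines);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($newlines[$mid] < $offset) {
            $lo = $mid + 1;
        } else {
            $hi = $mid;
        }
    }
    return $lo + 1;
}

$data = "one\ntwo\nthree";
$newlines = newlineOffsets($data);            // newlines at offsets 3 and 7

$lineOfTwo = lineForOffset($newlines, 4);     // the "t" of "two"
$lineOfNewline = lineForOffset($newlines, 3); // the first "\n" counts as line 1
$lineOfThree = lineForOffset($newlines, 8);   // the "t" of "three"

print_r([$lineOfTwo, $lineOfNewline, $lineOfThree]);
```

This stores one integer per line instead of one per byte, at the cost of a logarithmic lookup per match.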

      

0




This works, but runs a new preg_match_all on every line, which can be quite costly.

$file = 'file.txt';

$log = array();

$line = 0;

$pattern = '/\x20{2,}/';

if(is_readable($file)){

    $handle = fopen($file, 'rb');

    if ($handle) {

        while (($subject = fgets($handle)) !== false) {

            $line++;

            if(preg_match_all ( $pattern,  $subject, $matches)){

                $log[] = array(
                    'str' => $subject, 
                    'file' =>  realpath($file),
                    'line' => $line,
                    'matches' => $matches,
                );
            } 
        }
        if (!feof($handle)) {
            echo "Error: unexpected fgets() fail\n";
        }
        fclose($handle);
    }
}

      

Alternatively, you can read the file once to record the offset at which each line ends, then run preg_match_all over the entire file and map the match offsets to line numbers.

$file = 'file.txt';
$length = 0;
$pattern = '/\x20{2,}/';
$lines = array(0);

if(is_readable($file)){

    $handle = fopen($file, 'rb');

    if ($handle) {

        $subject = "";

        while (($line = fgets($handle)) !== false) {

            $subject .= $line;
            $lines[] = strlen($subject);
        }
        if (!feof($handle)) {
            echo "Error: unexpected fgets() fail\n";
        }
        fclose($handle);

        if($subject && preg_match_all ( $pattern, $subject, $matches, PREG_OFFSET_CAPTURE)){

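            // $lines[$lineNo] is the offset just past the end of line $lineNo
            $lineNo = 1;

            foreach ($matches[0] as $value) {

                // continues where we left off; matches arrive in ascending offset order
                // (each() was removed in PHP 8, so a plain index is used instead)
                while (isset($lines[$lineNo]) && $value[1] >= $lines[$lineNo]) {
                    $lineNo++;
                }

                echo "match is on line: " . $lineNo . "\n";
            }
        }
    }
}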

      

0




//Keep it simple, stupid

$allcodeline = explode(PHP_EOL, $content);

foreach ( $allcodeline as $line => $val ) :
    if ( preg_match("#SOMEREGEX#i",$val,$res) ) {
        echo $res[0] . '!' . ($line + 1) . "\n"; // keys are 0-based, so add 1 for the line number
    }
endforeach;

      

0

