PHP: pulling one line from a HUGE text file

I'm fine with PHP, but probably not as good as some of you guys here.

I'm trying to find a way to grab a single line from a huge, and I mean huge, text file. It's basically a list of keywords that I want to fetch by line number, without having to go through all of them before I reach that line... otherwise I could kill my server.

Currently I'm using this:

$lines = file('http://www.mysite.com/keywords.txt');
foreach ($lines as $line_num => $line) {
   echo "$line_num";
}

      

It works, but I'm sure it's not the best way to do it in terms of memory usage, because it loads the whole file into memory. I'd like to be able to just tell PHP "give me line number 97".

Hope you guys can come up with a solution, since you're smarter than me :P ty

+3




3 answers


Use SplFileObject



$file = "test.txt";
$line_number = 1000;
$file_obj = new SplFileObject($file);

/*** seek to the line number (seek() is zero-based: 0 is the first line, ***/
/*** so use $line_number - 1 if you count lines from 1) ***/
$file_obj->seek($line_number);

/*** return the current line ***/
echo $file_obj->current();

      

+2




If the lines are plain text of variable length, you cannot know where line 97 starts; the only thing that makes it the 97th line is the 96 lines before it.

So, you need to read the entire file up to that point (this is what SplFileObject does):

$line = 97; // line number wanted, counted from 1
$fp = fopen("keywords.txt", "r");
while ($line-- > 0)
{
    $text = fgets($fp, 1024); // 1024 = max length of one line
    if ($text === false)
    {
        // ERROR: line does not exist
        break;
    }
}
fclose($fp);

      

But if you can store the line number at the start of each line, so that the file looks like this

...
95 abbagnale
96 abbatangelo
97 abbatantuono
98 ...

      

then you can implement binary search:

- start with s1 = 0 and s2 = file length
- read a line number and keyword at seek position s3 = (s1+s2)/2 (*)
- if the line number is the one desired, strip the number from the text and you have the keyword.
- if the line number is less than desired, s1 = s3; else s2 = s3; then repeat from the second step.

      

(*) Since a line will most likely not start exactly at s3, you need two fgets calls: one to discard the partial line you landed in, and another to read the next full line with its line number. When you get "close", it is faster to read a larger chunk and split it into lines. For example, if you are looking for line 170135 and land on line 170180, you are better off moving the seek position back one kilobyte, reading a kilobyte of data, and searching there for 170135.
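The binary search described above could be sketched roughly like this. The function name findNumberedKeyword is made up for illustration, and the sketch assumes every line has the form "number, one space, keyword" with the numbers in ascending order. It skips the "read a larger chunk when close" optimization and simply keeps halving the byte range.

```php
<?php
// Hypothetical helper (not from the answer): looks up line $wanted in a file
// whose lines look like "97 abbatantuono\n", sorted by line number.
function findNumberedKeyword(string $path, int $wanted): ?string
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        return null;
    }
    $s1 = 0;                   // the wanted line starts at or after this offset
    $s2 = fstat($fp)['size'];  // ...and before this offset
    $found = null;
    while ($s1 < $s2) {
        $s3 = intdiv($s1 + $s2, 2);
        if ($s3 > 0) {
            // First fgets: step back one byte and discard everything up to
            // the next newline, so the next read starts on a line boundary.
            fseek($fp, $s3 - 1);
            fgets($fp);
        } else {
            fseek($fp, 0);
        }
        $start = ftell($fp);
        $line = fgets($fp);    // second fgets: a complete "number keyword" line
        if ($line === false || $start >= $s2) {
            $s2 = $s3;         // no line starts between s3 and s2
            continue;
        }
        $parts = explode(' ', rtrim($line, "\r\n"), 2);
        $number = (int) $parts[0];
        if ($number === $wanted) {
            $found = $parts[1] ?? '';
            break;
        }
        if ($number < $wanted) {
            $s1 = ftell($fp);  // keep searching after the line just read
        } else {
            $s2 = $s3;         // the wanted line starts before s3
        }
    }
    fclose($fp);
    return $found;
}
```

Calling findNumberedKeyword("keywords.txt", 97) on the sample file above should return "abbatantuono", and null if no line carries that number. Each iteration costs one seek and two short reads, so even a file with millions of lines needs only a few dozen iterations.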



Or, if the keyword lengths do not vary too much, it may be worth padding every line to a fixed size (here "#" stands for spaces, and the line length must include the line terminator, \n or \r\n):

abbagnale#########
abbatangelo#######
abbatantuono######

      

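and then, let's say each line is 32 bytes,

$fp = fopen("keywords.txt", "r");
// lines are numbered from 1, so line 97 starts after 96 full records
fseek($fp, (97 - 1) * 32, SEEK_SET);
$text = trim(fgets($fp, 32)); // reads up to 31 bytes, i.e. the record minus its \n
fclose($fp);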

      

will be more or less instantaneous.
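To get such a file in the first place, the existing keyword list has to be rewritten once with padding. A minimal sketch follows, assuming 32-byte records terminated by "\n"; the names RECORD_SIZE, buildFixedWidthFile and readFixedWidthLine are invented for this example.

```php
<?php
const RECORD_SIZE = 32; // bytes per line, including the trailing "\n" (assumed)

// Hypothetical helper: rewrite a one-keyword-per-line file as fixed-width records.
function buildFixedWidthFile(string $source, string $target): void
{
    $in = fopen($source, 'r');
    $out = fopen($target, 'w');
    while (($line = fgets($in)) !== false) {
        $keyword = rtrim($line, "\r\n");
        if (strlen($keyword) > RECORD_SIZE - 1) {
            throw new LengthException("Keyword too long: $keyword");
        }
        // Pad with spaces so keyword plus "\n" is exactly RECORD_SIZE bytes.
        fwrite($out, str_pad($keyword, RECORD_SIZE - 1) . "\n");
    }
    fclose($in);
    fclose($out);
}

// Hypothetical helper: fetch line $lineNumber (counted from 1) with a single seek.
function readFixedWidthLine(string $path, int $lineNumber): ?string
{
    if ($lineNumber < 1) {
        return null;
    }
    $fp = fopen($path, 'r');
    fseek($fp, ($lineNumber - 1) * RECORD_SIZE, SEEK_SET);
    $record = fread($fp, RECORD_SIZE);
    fclose($fp);
    if ($record === false || strlen($record) < RECORD_SIZE) {
        return null; // past the end of the file
    }
    return rtrim($record); // drop the padding and the newline
}
```

The trade-off is disk space: every line costs RECORD_SIZE bytes no matter how short the keyword is, and a keyword longer than RECORD_SIZE - 1 bytes cannot be stored at all.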

If the file is on a remote server, you still need to download the entire file (up to the desired line), so you'd be better served by placing a "scanner" script on the remote server that performs the lookup there. Then you can run

$text = file_get_contents("http://www.mysite.com/keywords.php?line=97");

      

and get your string in milliseconds.
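What keywords.php contains is not spelled out here, so the following is only a guess at a minimal version. It assumes keywords.txt sits next to the script, that lines are counted from 1, and that a missing line should produce a 404; the function name lookupLine is made up.

```php
<?php
// keywords.php: hypothetical sketch of the remote "scanner" script.

// Returns line $n (counted from 1) of $path, or null if it does not exist.
function lookupLine(string $path, int $n): ?string
{
    if ($n < 1) {
        return null;
    }
    $fp = fopen($path, 'r');
    if ($fp === false) {
        return null;
    }
    $text = false;
    // Read lines one at a time; only the current line is held in memory.
    while ($n-- > 0 && ($text = fgets($fp)) !== false) {
        // nothing to do: the loop condition does the reading
    }
    fclose($fp);
    return $text === false ? null : rtrim($text, "\r\n");
}

// Only answer when called through the web server, so the function above
// can also be loaded from the command line.
if (PHP_SAPI !== 'cli') {
    $n = filter_input(INPUT_GET, 'line', FILTER_VALIDATE_INT);
    $text = is_int($n) ? lookupLine(__DIR__ . '/keywords.txt', $n) : null;
    if ($text === null) {
        http_response_code(404);
    } else {
        header('Content-Type: text/plain');
        echo $text;
    }
}
```

The file is still scanned line by line, but the scan happens on the machine that holds it, and only the requested line travels over the network. Any of the faster lookups above could replace the loop inside lookupLine.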

+2




There is no way to get "line number x" from a file, in almost any language, without reading the file in one way or another. After all, a line is just the stuff between two end-of-line characters. While fetching "character number x" from a file can be done without loading the entire file (with some difficulty), it is not possible to fetch "line number x" without reading all the lines up to x (and most methods end up loading all the lines).

A method that reads only the lines up to line x follows (using fgets):

$f = fopen('http://www.mysite.com/keywords.txt', 'r');
$i = 97;
$text = "";
// keep reading until $text holds line number 97 (or the file ends)
while (($text = fgets($f, 2048)) !== false && $i > 1) {
    $i--;
}
fclose($f);
echo $text;

      

0

