file_get_contents gets a different page from Google than the one shown in the browser

Question

I am using file_get_contents to fetch a Google search URL and see what it returns. If I open this URL in my browser, it displays a different page from the one I see when I echo the result of file_get_contents:

$url = "http://www.google.com/search?q=*a*+site:www.reddit.com/r/+-inurl:(/shirt/|/related/|/domain/|/new/|/top/|/controversial/|/widget/|/buttons/|/about/|/duplicates/|dest=|/i18n)&num=1&sort=date-sdate";
$google_search = file_get_contents($url);

What's wrong with my code?

+3

url php file-get-contents

Simon h Jul 12 15 at 16:51


2 answers

My guess is that Google checks the user agent to block automated searches.

So you should at least use cURL and set a realistic user agent string (that is, the same one a regular browser sends) to "trick" Google.

I suspect it won't be that easy to fool Google, but maybe I'm just being paranoid, and at the very least you will learn something about cURL.
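A minimal sketch of that suggestion, assuming the PHP cURL extension is installed. The helper names and the user agent string are made up for illustration (copy the string your own browser sends), and there is no guarantee that Google will accept the request:

```php
<?php
// Example user agent string; an assumption for illustration only.
const BROWSER_UA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

// Hypothetical helper: the cURL options for a browser-like GET request.
function browser_like_curl_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow HTTP redirects
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent header
        CURLOPT_TIMEOUT        => 10,         // give up after 10 seconds
    ];
}

// Hypothetical helper: returns the response body as a string, or false on failure.
function fetch_as_browser(string $url, string $userAgent = BROWSER_UA)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, browser_like_curl_options($userAgent));
    return curl_exec($ch);
}
```

With the $url from the question, echo fetch_as_browser($url); would print whatever Google returns for that user agent, which may still differ from what the browser shows.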

0

Francesco abeni Jul 12 15 at 17:03


Tivie · Accepted Answer · 2015-07-12T17:03:29+0000

Nothing. The problem is that the page uses JavaScript and Ajax to load its content. So, to get a snapshot of the page, you need to "render" it. That is, you need to execute the JavaScript code, which PHP does not do.

Your best bet is to use a headless browser like PhantomJS. If you search, you will find several guides explaining how to do this.

Note

If all you're looking for is a way to extract raw data from a search, you can try using the Google Search API.
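The answer does not say which API it means. As one possibility, here is a rough sketch that assumes Google's Custom Search JSON API, which takes an API key (key) and a search engine ID (cx) that you have to obtain from Google. The function names are hypothetical:

```php
<?php
// Build a request URL for the Custom Search JSON API (assumed to be the API meant above).
// $apiKey and $engineId are placeholders for your own credentials.
function build_search_api_url(string $apiKey, string $engineId, string $query, int $num = 1): string
{
    $params = [
        'key' => $apiKey,
        'cx'  => $engineId,
        'q'   => $query,
        'num' => $num,
    ];
    // http_build_query URL-encodes every value, so special characters in the query are safe.
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query($params);
}

// Decode the JSON response and return the result links (an empty array if there are none).
function extract_result_links(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items']) || !is_array($data['items'])) {
        return [];
    }
    return array_column($data['items'], 'link');
}
```

Because the response is plain JSON, something like extract_result_links(file_get_contents(build_search_api_url($key, $cx, $query))) should work without running any JavaScript, provided the credentials are valid.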
