Google Search returns 503 error for complex searches

When I load a Google search results page using HttpWebRequest in C#, everything works fine as long as I use simple search terms like

http://www.google.com/search?q=stackoverflow

But when I make the query more complex, like

http://www.google.com/search?q=inurl%3A%22goethe%22%20filetype%3Apdf

which means

inurl:"goethe" filetype:pdf

I get a 503 error because Google thinks I'm a bot. Is there a workaround?

Edit: UserAgent is set to "Mozilla/5.0".
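
For reference, a minimal C# sketch of the kind of request described above. The original code was not posted, so the class and method names here (SearchRequestSketch, BuildSearchUrl, CreateRequest) are made up for illustration. It only assumes that the query is percent-encoded with Uri.EscapeDataString and that the User-Agent is set as stated in the edit.

```csharp
using System;
using System.Net;

// Hypothetical helper, not the asker's actual code.
public static class SearchRequestSketch
{
    // Percent-encode the raw query and append it to the search URL.
    public static string BuildSearchUrl(string query)
    {
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }

    // Create (but do not send) a request with the User-Agent from the question.
    public static HttpWebRequest CreateRequest(string query)
    {
        var request = (HttpWebRequest)WebRequest.Create(BuildSearchUrl(query));
        request.UserAgent = "Mozilla/5.0";
        return request;
    }
}
```

Calling SearchRequestSketch.BuildSearchUrl("inurl:\"goethe\" filetype:pdf") yields the second URL shown above, so the encoding itself is probably not the problem.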

+3




3 answers


OK... if your search is done programmatically, then Google is quite right. You are a bot :-)



Hooray!

+3




I don't believe this has much to do with how complex your query is. The only thing that really matters is whether Google thinks you are a bot. If you send requests at a very high rate, Google will think you are a bot, so there are several possible solutions:



It is also worth noting that making web requests without storing cookies can be another signal to Google that you are a bot. You also have to be very careful not to get proxies banned by Google while you are scraping the big G. Free proxies are hard to find, and if you abuse them they will be shut down, so be a good citizen!
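
To make the points about request rate and cookies concrete, here is a rough C# sketch. PoliteFetcher and its members are hypothetical names, and the delay you pass in is your own guess; nothing here is a threshold published by Google. It reuses one CookieContainer for every request and enforces a minimum pause between requests.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

// Hypothetical helper: keeps cookies across requests and spaces requests out.
public class PoliteFetcher
{
    // One container shared by every request, so cookies set by a response
    // are sent back on later requests.
    private readonly CookieContainer cookies = new CookieContainer();
    private readonly TimeSpan minDelay;
    private DateTime lastRequestUtc = DateTime.MinValue;

    public PoliteFetcher(TimeSpan minDelay)
    {
        this.minDelay = minDelay;
    }

    // How long to wait before the next request may be sent.
    public TimeSpan RemainingDelay(DateTime nowUtc)
    {
        TimeSpan wait = (lastRequestUtc + minDelay) - nowUtc;
        return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
    }

    // Build a request that uses the shared cookie container (nothing is sent yet).
    public HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;
        request.UserAgent = "Mozilla/5.0";
        return request;
    }

    // Wait out the remaining delay, then download the page body.
    public string Fetch(string url)
    {
        Thread.Sleep(RemainingDelay(DateTime.UtcNow));
        lastRequestUtc = DateTime.UtcNow;
        using (var response = (HttpWebResponse)CreateRequest(url).GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would look like new PoliteFetcher(TimeSpan.FromSeconds(5)).Fetch(url). Note that GetResponse still throws a WebException on a 503, so this only lowers the odds of being flagged; it does not guarantee anything.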

Good luck!

+1




Try the Google Custom Search APIs and Tools. They let you retrieve search results without the fear of being denied access (up to the usage limit).
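
As a rough sketch of what that could look like in C#, assuming the Custom Search JSON API endpoint with its key, cx and q query parameters: CustomSearchSketch and BuildUrl are names made up for this example, and apiKey and engineId stand in for your own credentials.

```csharp
using System;

// Hypothetical helper that only builds the request URL.
public static class CustomSearchSketch
{
    // apiKey and engineId are placeholders for your own API key and
    // search engine ID; query is the raw, unencoded search string.
    public static string BuildUrl(string apiKey, string engineId, string query)
    {
        return "https://www.googleapis.com/customsearch/v1"
             + "?key=" + Uri.EscapeDataString(apiKey)
             + "&cx=" + Uri.EscapeDataString(engineId)
             + "&q=" + Uri.EscapeDataString(query);
    }
}
```

The resulting URL can be fetched with the same HttpWebRequest code you already have; the response is JSON rather than an HTML results page, so the parsing step changes.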

Alternatively, simulate all the nuances of a typical search request. For example, in my browser a search for inurl:"goethe" filetype:pdf results in this URL.
Then there are cookies and other HTTP headers. Make your request look more like one coming from a browser.
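
A minimal C# sketch of the "look more like a browser" idea follows. BrowserLikeRequest is a made-up name, and the header values are only illustrative assumptions; the safest source for real values is the request your own browser sends (visible in its developer tools).

```csharp
using System.Net;

// Hypothetical helper: sets a few headers a browser would normally send.
public static class BrowserLikeRequest
{
    // userAgent should be copied from a real browser; cookies should be a
    // container you keep and reuse across requests.
    public static HttpWebRequest Create(string url, string userAgent, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = userAgent;
        // Accept is a restricted header, so it is set through its property.
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers.Add("Accept-Language", "en-US,en;q=0.5");
        // Advertises gzip/deflate and decompresses the response transparently.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        request.CookieContainer = cookies;
        return request;
    }
}
```

Even with matching headers there is no guarantee the request will be accepted; this only removes some of the more obvious differences from a browser.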

+1

