Google Search returns 503 error for complex searches
When I try to load a Google search results page using HttpWebRequest in C#, everything works fine if I use simple search terms like
http://www.google.com/search?q=stackoverflow
But when I try to make it more complex like
http://www.google.com/search?q=inurl%3A%22goethe%22%20filetype%3Apdf
which means
inurl:"goethe" filetype:pdf
I'll get a 503 error because Google thinks I'm a bot. Is there a workaround?
Edit: UserAgent is set to "Mozilla/5.0".
I don't believe this has much to do with how complex your query is. The only thing that really matters is whether Google thinks you are a bot. If you send requests at a very high rate, Google will assume you are one, so there are two possible solutions:
- Reduce the rate at which you send requests.
- Use a proxy to make multiple requests.
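The first option can be sketched as a small throttle that enforces a minimum gap between requests. This is only an illustration: `RequestThrottle` and its members are hypothetical names, and the interval is something you would have to pick yourself, since the rate Google tolerates is not stated anywhere in this thread.

```csharp
using System;
using System.Threading;

// Hypothetical helper: enforces a minimum delay between consecutive requests.
public sealed class RequestThrottle
{
    private readonly TimeSpan _minInterval;
    private readonly object _gate = new object();
    private DateTime _nextAllowedUtc = DateTime.MinValue;

    public RequestThrottle(TimeSpan minInterval)
    {
        _minInterval = minInterval;
    }

    // Reserves the next free slot and returns how long a caller
    // arriving at 'nowUtc' has to wait before sending its request.
    public TimeSpan Reserve(DateTime nowUtc)
    {
        lock (_gate)
        {
            DateTime start = nowUtc > _nextAllowedUtc ? nowUtc : _nextAllowedUtc;
            _nextAllowedUtc = start + _minInterval;
            return start - nowUtc;
        }
    }

    // Blocks the calling thread until the next request may be sent.
    public void Wait()
    {
        TimeSpan delay = Reserve(DateTime.UtcNow);
        if (delay > TimeSpan.Zero)
        {
            Thread.Sleep(delay);
        }
    }
}
```

You would create one `RequestThrottle` for the whole application and call `Wait()` immediately before each `GetResponse()`.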
It is also important to note that making web requests without storing cookies can be another signal to Google that you are a bot. If you go the proxy route, be very careful not to get proxies shut down because you are scraping the big G. Free proxies are hard enough to find, and if you abuse them they will be shut down, so be a good citizen!
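As a rough sketch of the cookie point: `HttpWebRequest` does not track cookies unless its `CookieContainer` property is set, so sharing one container across requests makes cookies from earlier responses go out with later requests. `CookieAwareRequests` is a hypothetical helper name, not something from the question.

```csharp
using System;
using System.Net;

// Hypothetical helper: every request it creates shares one cookie store.
public static class CookieAwareRequests
{
    public static HttpWebRequest Create(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Cookies set by the response are stored in 'cookies' and
        // sent again by later requests that use the same container.
        request.CookieContainer = cookies;
        return request;
    }
}
```

Keep a single `CookieContainer` instance alive for the whole session and pass it to every call.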
Good luck!
Try the Google Custom Search API and its tools. This lets you retrieve search results without the fear of being denied access (up to the usage limit).
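A minimal sketch of building a request URL for the Custom Search JSON API, assuming you have already created an API key and a search engine ID (the `cx` parameter). `CustomSearch.BuildUrl` is a hypothetical helper, and whether the API honors operators such as `inurl:` and `filetype:` the same way the web search does is something to check against its documentation.

```csharp
using System;

// Hypothetical helper that assembles a Custom Search JSON API request URL.
public static class CustomSearch
{
    public static string BuildUrl(string apiKey, string engineId, string query)
    {
        // Each value is percent-encoded so quotes, colons and spaces survive.
        return "https://www.googleapis.com/customsearch/v1"
            + "?key=" + Uri.EscapeDataString(apiKey)
            + "&cx=" + Uri.EscapeDataString(engineId)
            + "&q=" + Uri.EscapeDataString(query);
    }
}
```

The resulting URL can be fetched with the same `HttpWebRequest` code as before; the response body is JSON rather than HTML.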
Alternatively, simulate all the nuances of a typical search query. For example, in my browser a search for inurl:"goethe" filetype:pdf results in this URL.
Then there are cookies and other HTTP headers. Make the request look more like one a browser would send.
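One way this might look with `HttpWebRequest`, as a sketch only: the header values below are illustrative examples of what a desktop browser typically sends, not values taken from the question, and `BrowserLikeRequests` is a hypothetical name. There is no guarantee that any particular set of headers avoids the 503.

```csharp
using System;
using System.Net;

// Hypothetical helper: builds a search request with browser-style headers.
public static class BrowserLikeRequests
{
    public static string BuildSearchUrl(string query)
    {
        // Percent-encodes the query, e.g. ':' becomes %3A and '"' becomes %22.
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }

    public static HttpWebRequest Create(string query, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(BuildSearchUrl(query));

        // Example values only (assumption): a full browser User-Agent
        // instead of the bare "Mozilla/5.0" from the question.
        request.UserAgent =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers.Add("Accept-Language", "en-US,en;q=0.5");

        // Sends Accept-Encoding and transparently decompresses the response.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;

        // Keeps cookies between requests.
        request.CookieContainer = cookies;
        return request;
    }
}
```

Note that `UserAgent` and `Accept` have to be set through their properties; `HttpWebRequest` rejects attempts to set these restricted headers through the `Headers` collection.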