The web crawler gets slower over time

Question

I am working on a data extraction project for which I need to build a web scraper in Python, using Selenium with the PhantomJS headless WebKit browser to crawl public pages such as Facebook friend lists. The program starts out fairly fast, but after about a day of running it gets slower and slower, and I can't figure out why. Can anyone suggest why it is slowing down? I am running it on a local machine with a reasonably good configuration: 4 GB of RAM and a quad-core processor. Also, does Facebook provide any API for finding friends of friends?

+3

python-2.7 phantomjs selenium-webdriver facebook-graph-api facebook-sdk-3.0

Soumya 12 Aug 14 at 2:49 am

1 answer

QAMate.com · Answer 1 · 2014-08-14T07:54:29+0000

We faced the same problem. We solved it by automatically closing the browser after a certain period of time, clearing the temporary cache, and then opening a new browser instance to continue the process.
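
A minimal sketch of that idea, not our exact code: a small wrapper that quits the browser and starts a fresh one after a time limit or a page count, whichever comes first. The class name `RecyclingBrowser`, the `make_driver` callable, and the limits of 1800 seconds and 500 pages are all assumptions chosen for illustration. It relies only on the standard WebDriver calls `get()`, `quit()` and the `page_source` attribute. Deleting temporary files on disk is not shown here.

```python
import time


class RecyclingBrowser(object):
    """Hands out a browser and replaces it once it is too old or too busy."""

    def __init__(self, make_driver, max_age_seconds=1800, max_pages=500,
                 clock=time.time):
        # make_driver is any zero-argument callable that returns a new
        # WebDriver-like object (hypothetical name, supplied by the caller).
        self._make_driver = make_driver
        self._max_age = max_age_seconds
        self._max_pages = max_pages
        self._clock = clock
        self._driver = None
        self._started_at = None
        self._pages = 0
        self.restarts = 0

    def _expired(self):
        # True once the current browser hit the age limit or the page limit.
        age = self._clock() - self._started_at
        return age >= self._max_age or self._pages >= self._max_pages

    def _close(self):
        # quit() ends the browser process, which releases its memory.
        if self._driver is not None:
            try:
                self._driver.quit()
            finally:
                self._driver = None

    def get_driver(self):
        # Replace an expired browser before handing it out.
        if self._driver is not None and self._expired():
            self._close()
            self.restarts += 1
        if self._driver is None:
            self._driver = self._make_driver()
            self._started_at = self._clock()
            self._pages = 0
        return self._driver

    def fetch(self, url):
        # Load one page and return its HTML, counting it toward the limit.
        driver = self.get_driver()
        driver.get(url)
        self._pages += 1
        return driver.page_source

    def close(self):
        self._close()
```

With Selenium and PhantomJS as in the question, this could be used as `browser = RecyclingBrowser(lambda: webdriver.PhantomJS())`, followed by `html = browser.fetch(url)` inside the crawl loop and `browser.close()` at the end. Any login state or cookies held by the old browser are lost on restart, so the crawler would need to log in again after each new instance starts.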
