The web crawler gets slower over time

I am working on a data extraction project for which I need a web scraping program written in Python, using Selenium with the PhantomJS headless WebKit browser, to crawl public information such as Facebook friend lists. The program starts out fairly fast, but after about a day of running it gets slower and slower, and I can't figure out why. Can anyone suggest why it slows down? I am running it on a local machine with reasonably good specs: 4 GB of RAM and a quad-core processor. Also, does Facebook provide an API for finding friends of friends?





1 answer


We faced the same problem. We solved it by automatically closing the browser after a set period of time, clearing the temporary cache, and then opening a new browser instance to continue the process.
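A minimal sketch of that restart approach, not our exact code. The names `crawl_with_restarts`, `make_driver`, `handle_page`, `max_pages` and `max_age_s` are hypothetical; the only driver calls assumed are Selenium's `get()` and `quit()`. It also assumes that a freshly created browser instance starts with an empty temporary cache, which you should verify for your own setup.

```python
import time


def crawl_with_restarts(urls, make_driver, handle_page,
                        max_pages=200, max_age_s=3600.0,
                        clock=time.monotonic):
    """Visit each URL, replacing the browser once it is too old or too used.

    make_driver: zero-argument callable returning a WebDriver-like object
                 (hypothetical hook; e.g. a function that builds your
                 Selenium driver).
    handle_page: callable that extracts data from the loaded page.
    Returns the number of restarts performed.
    """
    driver = make_driver()
    started = clock()
    pages = 0
    restarts = 0
    try:
        for url in urls:
            too_many = pages >= max_pages
            too_old = clock() - started >= max_age_s
            if too_many or too_old:
                # Close the old browser so its memory and cache are released,
                # then continue with a brand-new instance.
                driver.quit()
                driver = None
                driver = make_driver()
                started = clock()
                pages = 0
                restarts += 1
            driver.get(url)
            handle_page(driver)
            pages += 1
    finally:
        # Always shut down the last browser, even if a page raised an error.
        if driver is not None:
            driver.quit()
    return restarts
```

In use, `make_driver` would be whatever builds your browser (for the setup in the question, a function that returns a PhantomJS driver on a Selenium version that still ships it). The two limits are guesses to tune: lower them if the slowdown shows up sooner.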


