Starting a new scrapyd job for each start URL

I have two separate spiders:

  • Spider 1 gets a list of URLs from HTML pages

  • Spider 2 uses each URL scraped by the previous spider as its start URL and starts scraping pages

What I am trying to do is schedule them so that, every hour or so, all of the Spider 2 jobs (one per URL) run in parallel, at the same time.

I deployed it to scrapyd and passed a start URL from a Python script to each deployed spider as an argument, like this:

import requests

# Schedule one scrapyd job per start URL; start_urls is the list from Spider 1.
for url in start_urls:
    r = requests.post("http://localhost:6800/schedule.json",
                      params={
                          'project': 'project',
                          'spider': 'spider',
                          'start_urls': url
                      })
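To see how many of the scheduled jobs are actually running versus pending, one option is to poll scrapyd's listjobs.json endpoint. The following is only a minimal sketch: it assumes the same localhost:6800 address as above, uses the standard library instead of requests to stay dependency-free, and the helper names summarize_jobs and job_counts are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SCRAPYD = "http://localhost:6800"  # assumed: same host/port as in the question


def summarize_jobs(payload):
    """Count jobs per state in a decoded listjobs.json response."""
    return {state: len(payload.get(state, []))
            for state in ("pending", "running", "finished")}


def job_counts(project, base_url=SCRAPYD):
    """Fetch listjobs.json for one project and return per-state counts."""
    url = f"{base_url}/listjobs.json?{urlencode({'project': project})}"
    with urlopen(url, timeout=10) as resp:
        return summarize_jobs(json.load(resp))
```

Calling job_counts('project') right after the loop, and again a few seconds later, should show whether jobs move from pending to running one at a time or in batches.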

      

Inside the spider, I read this argument, start_urls, from kwargs and assign it to start_urls.
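The argument handling described above might look roughly like the sketch below. It is not the actual spider from this question: the mixin name StartUrlsFromArgs and the helper parse_start_urls are hypothetical, and splitting on commas is an assumption of the sketch (scrapyd hands spider arguments over as strings, so a single URL arrives as a plain string).

```python
def parse_start_urls(value):
    """Turn a spider argument into a list of URLs.

    Comma splitting is an assumption of this sketch, not scrapyd behaviour;
    it would break URLs that themselves contain commas.
    """
    if not value:
        return []
    if isinstance(value, str):
        return [u.strip() for u in value.split(",") if u.strip()]
    return list(value)


class StartUrlsFromArgs:
    """Hypothetical mixin; list it before scrapy.Spider in the bases."""

    def __init__(self, start_urls=None, *args, **kwargs):
        # Let the next class in the MRO (e.g. scrapy.Spider) initialise first.
        super().__init__(*args, **kwargs)
        # Then override start_urls with the value passed at schedule time.
        self.start_urls = parse_start_urls(start_urls)
```

A spider would then be declared as class MySpider(StartUrlsFromArgs, scrapy.Spider), so each scheduled job starts from the one URL it was given.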

But what I noticed is that when I pass multiple URLs to the same deployed spider using a for loop, the jobs never run in parallel.

Only one job is running at any given time; the other jobs are in the pending state (not running).

The scrapyd service settings are the defaults, except for these two settings:

max_proc    = 100
max_proc_per_cpu = 25
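As a sanity check on those two values, the sketch below computes the process limit they imply, following the scrapyd documentation's description: a non-zero max_proc is used directly, otherwise the limit is the CPU count times max_proc_per_cpu. The function name effective_max_proc is made up for illustration, and the fallbacks of 0 and 4 are the documented defaults as far as I know.

```python
import configparser
import os


def effective_max_proc(conf_text, cpus=None):
    """Return the concurrent-process limit implied by a scrapyd.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["scrapyd"]
    max_proc = section.getint("max_proc", fallback=0)
    per_cpu = section.getint("max_proc_per_cpu", fallback=4)
    if max_proc > 0:
        # An explicit max_proc takes precedence over the per-CPU value.
        return max_proc
    cpus = cpus or os.cpu_count() or 1
    return cpus * per_cpu


conf = "[scrapyd]\nmax_proc    = 100\nmax_proc_per_cpu = 25\n"
print(effective_max_proc(conf))
```

With the settings above the limit works out to 100 processes, so the process cap on its own does not explain why only one job runs at a time.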

      

How can I achieve something close to real parallelism using Python, Scrapy and scrapyd?

Or will I need to use Python's multiprocessing Pool.apply_async or some other solution?

python multiprocessing scrapy scrapyd




No one has answered this question yet
