Need help with this regex
I am new to Scrapy and am trying to crawl a site with CrawlSpider. I want it to crawl recursively by following the Next button, but it doesn't work: it only scrapes the landing page and never goes to the next page. I think the problem is in the regex, but I've checked it many times and can't find the error.
# -*- coding: utf-8 -*-
start_urls = ['https://shopping.yahoo.com/merchantrating/?mid=13652']

rules = (
    Rule(LinkExtractor(allow="/merchantrating/;_ylt=Anf3hF19R8MGFPwuYuJUny4cEb0F\?mid=13652&sort=1&start=\d+"), callback='parse_start_url', follow=True),
)

def parse_start_url(self, response):
    sel = Selector(response)
    contents = sel.xpath('//p')
    for content in contents:
        item = BedbugsItem()
        item['pageContent'] = content.xpath('text()').extract()
        self.items.append(item)
    return self.items
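
One way to check the regex in isolation is to run it against a few candidate URLs with Python's re module. The sketch below is not part of the spider. It assumes that LinkExtractor applies allow patterns with search semantics against absolute URLs, and that the ;_ylt=... token in the Next link may differ from the one hard-coded in the rule. The sample URLs, the HYPOTHETICALTOKEN value, the LOOSE pattern, and the matching_links helper are all made up for illustration; the real hrefs would have to be copied from the page source.

```python
import re

# Pattern copied verbatim from the rule in the spider.
STRICT = r"/merchantrating/;_ylt=Anf3hF19R8MGFPwuYuJUny4cEb0F\?mid=13652&sort=1&start=\d+"

# Hypothetical looser pattern: accepts any (or no) path parameter before the "?".
LOOSE = r"/merchantrating/[^?]*\?mid=13652&sort=1&start=\d+"

# Hypothetical hrefs; replace them with links copied from the real page.
SAMPLE_LINKS = [
    "https://shopping.yahoo.com/merchantrating/;_ylt=Anf3hF19R8MGFPwuYuJUny4cEb0F?mid=13652&sort=1&start=10",
    "https://shopping.yahoo.com/merchantrating/;_ylt=HYPOTHETICALTOKEN?mid=13652&sort=1&start=10",
    "https://shopping.yahoo.com/merchantrating/?mid=13652&sort=1&start=10",
]


def matching_links(pattern, links):
    """Return the links in which the pattern is found anywhere (re.search)."""
    return [link for link in links if re.search(pattern, link)]


strict_hits = matching_links(STRICT, SAMPLE_LINKS)
loose_hits = matching_links(LOOSE, SAMPLE_LINKS)

print("strict pattern matched:", len(strict_hits), "of", len(SAMPLE_LINKS))
print("loose pattern matched:", len(loose_hits), "of", len(SAMPLE_LINKS))
```

If the strict pattern matches none of the real Next links while a looser one does, the hard-coded token is a likely cause. If neither matches, the links may have a different shape than assumed here.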