Scrapy doesn't include my FilePipeline

This is my settings.py:

from scrapy.log import INFO


BOT_NAME = 'images'

SPIDER_MODULES = ['images.spiders']
NEWSPIDER_MODULE = 'images.spiders'
LOG_LEVEL = INFO

ITEM_PIPELINES = {
    "images.pipelines.WritePipeline": 800
}

DOWNLOAD_DELAY = 0.5


This is my pipelines.py:

from scrapy import Request
from scrapy.pipelines.files import FilesPipeline


class WritePipeline(FilesPipeline):

    def get_media_requests(self, item, info):
        for url in item["file_urls"]:
            yield Request(url)

    def item_completed(self, results, item, info):
        return item


This is all very standard, and yet this is my log line:

2015-06-25 18:16:41 [scrapy] INFO: Enabled item pipelines: 


So the pipeline is not enabled. What am I doing wrong here? I've used Scrapy several times and I'm pretty sure the spider is fine; the item is a regular item with both file_urls and files fields.



2 answers


Oops, I forgot to add FILES_STORE to my settings. See the Scrapy documentation on downloading and processing files for an explanation.

Relevant quote:

Then, configure the target storage setting to a valid value that will be used for storing the downloaded files. Otherwise the pipeline will remain disabled, even if you include it in the ITEM_PIPELINES setting.
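In other words, one extra line in settings.py activates the pipeline. A minimal sketch (the path below is just an illustration; any writable directory, or an s3:// URI, works):

```python
# settings.py -- FilesPipeline subclasses stay disabled until a
# storage target is configured. The path is an example value.
FILES_STORE = '/path/to/downloaded/files'
```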



I don't know about FilesPipeline specifically, but every pipeline component needs to implement a process_item(self, item, spider) method.
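For a standalone pipeline (one that does not subclass FilesPipeline), that looks roughly like the sketch below; the class name and the "processed" flag are illustrative, not from the question:

```python
# Minimal sketch of a custom item pipeline component.
class ExamplePipeline:
    def process_item(self, item, spider):
        # Scrapy calls process_item for every scraped item; the method
        # must return the item (or raise DropItem) so the next component
        # listed in ITEM_PIPELINES can receive it.
        item["processed"] = True
        return item
```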



