How can I keep the field names of an Item in a single file using Scrapy?

My scraping project repeats the Item field names in many places.

1. Item class (items.py)

import scrapy

class HelloItem(scrapy.Item):
    Name = scrapy.Field()
    Address = scrapy.Field()
    ...

2. Spider class (spider.py)

# HelloItem comes from the project's items.py
class HelloSpider(scrapy.Spider):

    def parse(self, response):
        item = HelloItem()
        item["Name"] = ...
        item["Address"] = ...
        ...

3. settings.py

EXPORT_FIELDS = ["Name", "Address", ...]

      

I have defined a parameter EXPORT_FIELDS in settings.py that is used to define the field ordering for a custom CSV item pipeline. The CSV code is similar to this one, except that self.exporter.fields_to_export is loaded from settings.getlist("EXPORT_FIELDS").

As you can see, I have to define the field names (Name, Address, etc.) in three places. If I ever need to rename some of them, I have to change all three files.

So is there a way to keep the Item field name definitions in just one file? (Two files would be fine too; the fewer the better.)

1 answer


You could skip items altogether and yield dictionaries instead. That way you would not need items.py.
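
As a rough sketch of the dictionary approach, using only the standard library: `parse_row` and `export_csv` are hypothetical names standing in for a spider callback and the custom CSV pipeline, and the sample rows are made up. The field names then appear only in the callback and in EXPORT_FIELDS.

```python
import csv
import io

# Same role as EXPORT_FIELDS in settings.py: field names and CSV column order.
EXPORT_FIELDS = ["Name", "Address"]


def parse_row(row):
    """Hypothetical stand-in for a spider callback: return a plain dict, not an Item."""
    return {"Name": row[0], "Address": row[1]}


def export_csv(records):
    """Hypothetical stand-in for the CSV pipeline: columns follow EXPORT_FIELDS."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=EXPORT_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample data for illustration only.
records = [parse_row(("Alice", "1 Main St")), parse_row(("Bob", "2 Side St"))]
csv_text = export_csv(records)
print(csv_text)
```

With its default settings, `csv.DictWriter` raises `ValueError` when a record contains a key that is not in `fieldnames`, so a misspelled key in a dictionary would still surface at export time in this sketch.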

However, as the project grows, it is a good idea to define an Item subclass, and the repetition you are talking about is the lesser evil.



Thanks to the item definition, you get an error when one of your spiders tries to set an item field whose name contains a typo.

Item classes also allow you to work with item loaders.
