What does the -t switch do in Scrapy?
The Scrapy tutorials say that to save the output as CSV or any other format, we have to use the following command:
scrapy crawl spider -o result.csv -t csv
In general, the command takes this form:
scrapy crawl my_spider -o file_name.extension -t extension
But I ran the command without -t and it worked without any problem:
scrapy crawl spider -o result.csv
My question is: what is the role of -t?
Whenever you are unsure, check the source code.
According to the crawl.py source code, if you don't specify the format explicitly, Scrapy detects it from the output filename: the extension is used as the format.
if not opts.output_format:
    opts.output_format = os.path.splitext(opts.output)[1].replace(".", "")
In your case, csv will be used.
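As a rough illustration of that fallback, here is a minimal standalone sketch. The function name infer_format and its parameters are hypothetical and not part of Scrapy; only the os.path.splitext expression is taken from the snippet quoted above.

```python
import os


def infer_format(output, output_format=None):
    """Return the explicit format if given, else the extension of `output`."""
    if not output_format:
        # Same expression as in the quoted crawl.py snippet:
        # take the file extension and drop the leading dot.
        output_format = os.path.splitext(output)[1].replace(".", "")
    return output_format


print(infer_format("result.csv"))          # no -t given: the extension is used
print(infer_format("result.out", "json"))  # -t json given: the explicit value wins
```

Under this logic, -t only matters when the filename extension does not match the format you want.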
You can usually get an explanation of a command-line tool's options by invoking the command with the --help option:
C:\>scrapy crawl --help
Usage
=====
  scrapy crawl [options] <spider>

Run a spider

Options
=======
  --help, -h              show this help message and exit
  -a NAME=VALUE           set spider argument (may be repeated)
  --output=FILE, -o FILE  dump scraped items into FILE (use - for stdout)
  --output-format=FORMAT, -t FORMAT
                          format to use for dumping items with -o
...
So -t is used to specify the format used when dumping items to a file with -o.