Using Scrapy, getting "Error: ImportError: No module named testspiders.spiders.followall"

I am trying to run Scrapy from a script, following the tutorial here. I am running into this error: ImportError: No module named testspiders.spiders.followall. I've searched for a solution but haven't found a match yet.

I am actually running this Python script through Node.js, using a module named python-shell, which simply lets you run a Python script with the following code:

var PythonShell = require('python-shell');

PythonShell.run('', function (err) {
  if (err) throw err;
});


My code is copied verbatim from the Scrapy site:

from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from testspiders.spiders.followall import FollowAllSpider
from scrapy.utils.project import get_project_settings

spider = FollowAllSpider(domain='')
settings = get_project_settings()
crawler = Crawler(settings)
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)


My directory structure is unchanged from the Express framework's default, apart from an added directory containing the Python file and a few lines of code that use python-shell.



NOTE: It also doesn't work if I go into the python directory and run the script with python directly; I get the same error message: ImportError: No module named testspiders.spiders.followall



1 answer

When you run the crawler with the scrapy command, the project root directory (the parent directory of testspiders/) is automatically added to the path. When you run a script with python, it is not: Python searches only the script's own directory (the working directory, if you run it from there) plus whatever is defined in PYTHONPATH.

You can check the current search path in Python with sys.path.
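As a quick check, a minimal sketch like the one below prints every directory Python will search when resolving an import. If the directory that contains testspiders/ is not listed, the import fails. The helper name is_on_sys_path is made up for this illustration and is not part of Scrapy or the standard library.

```python
import os
import sys


def is_on_sys_path(directory):
    """Return True if `directory` is one of the entries in sys.path.

    Hypothetical helper, written only for this example.
    """
    target = os.path.abspath(directory)
    # An empty entry in sys.path stands for the current working directory.
    return any(
        os.path.abspath(entry or os.getcwd()) == target for entry in sys.path
    )


# Print every directory Python searches when resolving an import.
for entry in sys.path:
    print(entry)
```

Calling is_on_sys_path() with the directory that holds testspiders/ tells you whether the import has a chance of succeeding.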

So, to make the import statements work when running with python, you can:

  • add the parent directory of testspiders/ to the path using sys.path.append() (you need to do this before import testspiders ...)
  • add that parent directory to the PYTHONPATH environment variable
  • run the python command from the parent directory of testspiders/
  • edit the import statements (so they resolve against your sys.path)
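The first option can be sketched as follows. To stay self-contained, the example builds a throwaway package in a temporary directory. The package name demo_testspiders and the stub FollowAllSpider class are hypothetical stand-ins for the real project. In the real script you would instead append the directory that contains testspiders/, before the line that imports from it.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in layout: <root>/demo_testspiders/spiders/followall.py
root = tempfile.mkdtemp()
spiders_dir = os.path.join(root, "demo_testspiders", "spiders")
os.makedirs(spiders_dir)
for package_dir in (os.path.dirname(spiders_dir), spiders_dir):
    # Empty __init__.py files mark the directories as packages.
    open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(spiders_dir, "followall.py"), "w") as handle:
    handle.write("class FollowAllSpider(object):\n    pass\n")

# Before the parent directory is on sys.path, the import fails.
try:
    importlib.import_module("demo_testspiders.spiders.followall")
    imported_before = True
except ImportError:
    imported_before = False

# Append the parent directory *before* importing the package.
sys.path.append(root)
importlib.invalidate_caches()
module = importlib.import_module("demo_testspiders.spiders.followall")

print(imported_before, module.FollowAllSpider.__name__)
```

The same ordering matters in the real script: the sys.path.append() call has to run above the from testspiders... import line, otherwise the ImportError is raised before the path is extended.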

