Scrapy + Tor always returns 403, but I can curl and browse

I am trying to set up Scrapy to work through Tor. I am using Scrapy 0.24.6.

  • At first I tried to use polipo as the HTTP proxy to access Tor (on Ubuntu). I can set my web browser to use polipo and browse through Tor, and I can curl through it as well. I tried HttpProxyMiddleware with the environment variable and also wrote my own middleware, with the same result: Scrapy always returns 403.

  • Then I tried to use Tor directly. Again, I can configure my web browser to use the SOCKS proxy and I can curl with torsocks, but Scrapy always returns 403.
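
For illustration, a custom proxy middleware of the kind described in the first attempt might look roughly like the sketch below. It is not the exact code used: the class name `TorProxyMiddleware` is hypothetical, and the proxy address assumes polipo is listening on its default port 8123 on localhost.

```python
# Minimal sketch of a custom downloader middleware (hypothetical name).
# Assumption: polipo runs locally on its default port, 8123, and forwards to Tor.
POLIPO_PROXY = "http://127.0.0.1:8123"


class TorProxyMiddleware(object):
    """Send every outgoing request through the local polipo HTTP proxy."""

    def process_request(self, request, spider):
        # Scrapy's HTTP download handler reads the proxy URL from request.meta.
        request.meta["proxy"] = POLIPO_PROXY
        # Returning None tells Scrapy to keep processing the request normally.
        return None
```

Such a class would be enabled through the `DOWNLOADER_MIDDLEWARES` setting, using whatever module path the project actually has. With HttpProxyMiddleware instead, the same proxy URL would go in the `http_proxy` environment variable.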

Does anyone have any idea what might be wrong?

It looks like the error is coming from Scrapy, because I send the same headers and user agent with and without Tor, but through Tor I always get 403.

