Scrapy + Tor always returns 403, but curl and the browser work

I am trying to set up scrapy + tor.

I am using scrapy 0.24.6.

  • At first I tried to use polipo as an HTTP proxy in front of tor ( https://pkmishra.github.io/blog/2013/04/16/scrapy-run-using-tor-and-multiple-agents-part-2-ubuntu/ ). I can set my web browser to use polipo and browse through tor, and I can curl through it. With scrapy I tried HttpProxyMiddleware, first with the proxy environment variable and then with my own middleware. Same result either way: scrapy always returns 403.

  • Then I tried to use tor directly. Again, I can configure my web browser to use the SOCKS proxy and I can curl with torsocks, but scrapy always returns 403.
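
For reference, here is a minimal sketch of the kind of custom proxy middleware the first bullet describes. It is not the asker's actual code: the class name TorProxyMiddleware and the constant POLIPO_PROXY are hypothetical, and the address assumes polipo is listening on its default port 8123 on localhost. It relies only on the fact that scrapy's downloader reads the proxy to use from request.meta['proxy'].

```python
# Sketch of a scrapy downloader middleware that sends every request
# through a local HTTP proxy (polipo), which forwards the traffic to tor.
# Assumption: polipo listens on its default address 127.0.0.1:8123.

POLIPO_PROXY = "http://127.0.0.1:8123"  # assumed polipo default, adjust as needed


class TorProxyMiddleware(object):  # hypothetical name, for illustration only
    """Attach the proxy URL to each outgoing request."""

    def __init__(self, proxy_url=POLIPO_PROXY):
        self.proxy_url = proxy_url

    def process_request(self, request, spider):
        # The downloader uses request.meta['proxy'] as the proxy for this request.
        request.meta["proxy"] = self.proxy_url
        # Returning None tells scrapy to keep processing the request normally.
        return None
```

A class like this would be enabled through the DOWNLOADER_MIDDLEWARES setting in settings.py, keyed by its import path in the project. Note that it only changes the route the request takes; the headers scrapy sends are unaffected.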

Does anyone have any idea what might be wrong?

It looks like the error is coming from scrapy, because I have the same headers / user agent with and without tor, but through tor I always get 403.
