Scrapy + tor always returns 403 but I can curl and browse

I am trying to set up scrapy + tor.

I am using scrapy 0.24.6.

  • At first I tried to use polipo to access tor as the HTTP proxy ( https://pkmishra.github.io/blog/2013/04/16/scrapy-run-using-tor-and-multiple-agents-part-2-ubuntu/ ). I can set my web browser to use polipo and browse through tor, and I can curl. I tried HttpProxyMiddleware, both with the env var and with my own middleware; same result: scrapy always returns 403.

  • Then I tried to use tor directly. I can configure my web browser to use the socks proxy, and I can curl with torsocks, but scrapy always returns 403.
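For readers trying to reproduce this, here is a minimal sketch of the "own middleware" variant from the first bullet. It is not the original code. It assumes polipo listens on its default address 127.0.0.1:8123 and forwards to tor; the class name `PolipoProxyMiddleware` and the module path `myproject.middlewares` are hypothetical.

```python
# Sketch of a downloader middleware that routes every request through polipo.
# Assumption: polipo listens on 127.0.0.1:8123 (its default) and forwards to tor.
POLIPO_PROXY = "http://127.0.0.1:8123"


class PolipoProxyMiddleware(object):  # hypothetical name
    """Set the proxy on each outgoing request via request.meta."""

    def process_request(self, request, spider):
        # Scrapy's HTTP download handler reads the proxy URL from request.meta["proxy"].
        request.meta["proxy"] = POLIPO_PROXY
        # Returning None tells Scrapy to continue processing the request normally.
        return None


# settings.py fragment; "myproject.middlewares" is a hypothetical module path.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.PolipoProxyMiddleware": 100,
}
```

With a setup along these lines the request does reach the site through the proxy, so a 403 at this point comes from the remote server rather than from a failed proxy connection.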

Does anyone have any idea what might be wrong?

It looks like the error is coming from scrapy, because I have the same headers / user agent with and without tor, but through tor I always get 403.

scrapy tor torsocks


No one has answered this question yet



