Using curl vs the Python requests library

When scraping the site, which would be preferable: using curl or using the Python requests library?

I originally planned to use requests and explicitly set the user agent. However, when I do this, I often get an "HTTP 429 Too Many Requests" response, whereas with curl it doesn't seem to happen.
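Roughly what I had in mind (the endpoint and user-agent string here are placeholders):

```python
import requests

# Placeholder endpoint and agent string -- just to show the shape of it.
resp = requests.get(
    "https://example.com/titles/42",
    headers={"User-Agent": "my-metadata-bot/1.0"},
)
resp.raise_for_status()  # this is where the HTTP 429 surfaces
```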

I need to update the metadata for 10,000 titles, and I need a way to fetch and push the information for each title in parallel.

What are the advantages and disadvantages of each for pushing this information?

+3




3 answers


Since you want to parallelize the requests, you should use requests with grequests (if you are using gevent) or erequests (if you are using eventlet). You may have to watch how fast you hit the website, though, as it may do some rate limiting and bounce you for making too many requests in too short a period of time.
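A minimal sketch of the grequests approach, assuming the 10,000 title URLs are already known (the endpoint pattern, user-agent string, and pool size below are placeholders):

```python
import grequests  # pip install grequests; uses gevent under the hood

# Hypothetical endpoint pattern -- substitute your real title URLs.
urls = ["https://example.com/titles/%d" % i for i in range(10000)]

# Build lazy request objects; nothing is sent yet.
reqs = (grequests.get(u, headers={"User-Agent": "my-metadata-bot/1.0"})
        for u in urls)

# Send them concurrently; size caps the number of in-flight requests,
# which is also your main lever against HTTP 429 rate limiting.
for resp in grequests.imap(reqs, size=10):
    if resp.ok:
        print(resp.url, resp.status_code)
```

Lowering size (or sleeping between batches) is the usual way to stay under a server's rate limit.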



+3




Using requests will let you do this programmatically, which should result in cleaner code.

If you use curl from Python, you are making os.system calls to an external program, which is slower.
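To make the difference concrete, a sketch of both approaches side by side (the URL is a placeholder, and subprocess.run stands in for the os.system-style shell-out):

```python
import subprocess
import requests

url = "https://example.com/titles/42"  # placeholder endpoint

# In-process: one function call, a parsed response object,
# and connection pooling if you reuse a requests.Session.
resp = requests.get(url, headers={"User-Agent": "my-metadata-bot/1.0"})
print(resp.status_code, len(resp.text))

# Shelling out: spawns a fresh curl process for every request
# and hands you back raw text to parse yourself.
proc = subprocess.run(
    ["curl", "-s", "-A", "my-metadata-bot/1.0", url],
    capture_output=True, text=True,
)
print(proc.returncode, len(proc.stdout))
```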

+2




Any day, I will take the in-language library over calling an external program, because it is less hassle.

Only if that turns out to be unworkable would I fall back to the external tool. Always consider that human time is far more valuable than machine time; any "performance gain" in an application like this is likely to be swamped by network latency anyway.

0








