Limiting the number of HTTP requests per second in Python

Question

I wrote a script that fetches URLs from a file and sends HTTP requests to all of them at the same time. Now I want to limit the number of HTTP requests per second and the bandwidth for each interface (eth0, eth1, etc.) in the session. Is there a way to achieve this in Python?

python python-multithreading throttling bandwidth-throttling

Naveen 29 Sep 14 at 11:21

2 answers

You can use the worker pattern described in the documentation: https://docs.python.org/3.4/library/queue.html

Add a wait (for example, time.sleep()) inside your workers to make them pause between requests (in the documentation's example: inside the "while True" loop, after task_done()).

Example: 5 worker threads, each waiting 1 s between requests, will send at most 5 requests per second.
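A minimal sketch of that worker idea, assuming Python 3. The names run_workers, fetch, num_workers and delay are made up for illustration; fetch stands in for whatever function performs the HTTP request (for instance a wrapper around requests.get).

```python
import queue
import threading
import time

def run_workers(urls, fetch, num_workers=5, delay=1.0):
    """Fetch every URL with num_workers threads; each thread pauses
    `delay` seconds after a request, so the overall rate stays at or
    below num_workers / delay requests per second."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            url = tasks.get()
            if url is None:              # sentinel: no more work
                tasks.task_done()
                break
            try:
                outcome = fetch(url)
            except Exception as exc:     # keep the worker alive on errors
                outcome = exc
            with results_lock:
                results.append((url, outcome))
            tasks.task_done()
            time.sleep(delay)            # throttle this worker

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)                  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    return results
```

With num_workers=5 and delay=1.0 this matches the example above: no more than 5 requests start in any one-second window, and usually fewer, because the time spent on each request adds to the pause.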

Kuishi 29 Sep 14 at 12:35

Georgi · Accepted Answer · 2014-09-29T11:41:29+0000

You can use the Semaphore object, which is part of the Python standard library: python doc

Or, if you want to work with threads directly, you can use wait([timeout]), which is available on objects such as threading.Event.

Python's standard library has no module that controls Ethernet or any other network interface directly, for example to cap bandwidth per interface. The lowest level you can work at is socket.
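As a rough illustration of what the socket level does offer: a socket can be bound to a local IP address before connecting, and the operating system will normally send its traffic from the interface that owns that address. This is only a sketch. The helper name bound_socket is made up, and it selects a source address only; it does not limit bandwidth.

```python
import socket

def bound_socket(local_ip):
    """Return a TCP socket bound to local_ip (port 0 = any free port).

    Assumption: local_ip is an address assigned to the interface you
    want to use, e.g. the address of eth0.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((local_ip, 0))
    return s
```

Bandwidth shaping per interface is usually done outside Python, with operating-system tools.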

Based on your reply, here is my suggestion. Note the call to active_count: use it only to verify that your script runs no more than two request threads at a time. It will report three threads, because the first is the main thread of your script and the other two are the URL requests.

import threading

import requests

# Limit the number of concurrent request threads.
pool = threading.BoundedSemaphore(2)

def worker(u):
    try:
        # Request the passed URL.
        r = requests.get(u)
        print(r.status_code)
        # Show the number of active threads (main thread plus workers).
        print(threading.active_count())
    finally:
        # Release the semaphore slot for other threads, even if the request fails.
        pool.release()

def req():
    # Read URLs from a text file, stripping whitespace and skipping blank lines.
    with open('urllist.txt') as f:
        urls = [line.strip() for line in f if line.strip()]
    for u in urls:
        # Block here until fewer than two worker threads are running.
        pool.acquire(blocking=True)
        # Create a new thread and pass the URL (the u parameter) to worker.
        t = threading.Thread(target=worker, args=(u,))
        # Start the newly created thread.
        t.start()

req()
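Note that the semaphore caps how many requests run at the same time, not how many start per second. If a per-second limit is what you need, one option is to space out the start of each request. The following is a minimal sketch, assuming Python 3; the RateLimiter class and its wait method are made up for illustration and are not part of any library.

```python
import threading
import time

class RateLimiter:
    """Allow at most `rate` calls per second by spacing them evenly."""

    def __init__(self, rate):
        self.interval = 1.0 / rate
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep
        # outside the lock so other threads can reserve theirs.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_slot, now)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)
```

To use it, create one shared instance, for example limiter = RateLimiter(5) for five requests per second, and call limiter.wait() in worker just before requests.get(u).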
