requests vs. requests-futures - Response Time Inaccurate?
I have Python code that fires 10 GET requests and measures the response time:
from datetime import datetime
from requests_futures.sessions import FuturesSession
import requests

class CustomSession(FuturesSession):
    def __init__(self, *args, **kwargs):
        super(CustomSession, self).__init__(*args, **kwargs)
        self.timing = {}

    def request(self, method, url, *args, **kwargs):
        background_callback = kwargs.pop('background_callback', None)
        test_id = kwargs.pop('test_id', None)

        # start counting
        self.timing[test_id] = {}
        self.timing[test_id]['cS'] = datetime.now()

        def time_it(sess, resp):
            # here if you want to time the server stuff only
            self.timing[test_id]['cE'] = datetime.now()
            if background_callback:
                background_callback(sess, resp)
            # here if you want to include any time spent in the callback

        return super(CustomSession, self).request(method, url, *args,
                                                  background_callback=time_it,
                                                  **kwargs)
# using requests-futures
print('requests-futures:')
session = CustomSession()
futures = []
for i in range(10):
    futures.append(session.get('http://google.com/', test_id=i))
for future in futures:
    try:
        r = future.result()
        # print((session.timing[i]['cE'] - session.timing[i]['cS']))
    except Exception as e:
        print(e)
for i in range(10):
    print((session.timing[i]['cE'] - session.timing[i]['cS']).total_seconds() * 1000)

# using requests
print('requests:')
for i in range(10):
    check_start_timestamp = datetime.utcnow()
    r = requests.get('http://google.com')
    check_end_timestamp = datetime.utcnow()
    cE = int((check_end_timestamp - check_start_timestamp).total_seconds() * 1000)
    print(cE)
requests-futures:
112.959
118.627
160.139
174.32
214.399
224.295
267.557
276.582
316.824
327.00800000000004
requests:
99 104 92 110 100 126 140 112 102 107
It seems that:
- Response times with requests-futures appear to be additive (each request takes longer than the previous one).
- Using plain requests is much faster.

Is this normal? Am I missing something that might cause this discrepancy?
Question 1

Response times with requests-futures appear to be incremental (time increases)

The reason is that requests-futures uses a thread pool behind the scenes. You can see this because the times come in batches (separators added for clarity; the number of threads can be changed with the max_workers argument):
- The default pool size is 2:
  161.226 172.41600000000003 --- 250.141 253.18600000000004 --- 329.32800000000003 342.71000000000004 --- 408.21200000000005 420.614 --- 487.356 499.311
- Pool size 4:
  149.781 154.761 151.971 155.385 --- 225.458 230.596 239.784 240.386 --- 313.801 314.056
- Graph (blue: pool size 2, red: pool size 4) [image not shown]
As you can see, the batches occur at roughly the same interval, which should be the response time of a single request.
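For reference, the pool size in the runs above can be set like this (a minimal sketch, assuming the CustomSession class from the question; max_workers is forwarded to FuturesSession's underlying thread pool):

# pool of 4 worker threads instead of the default 2
session = CustomSession(max_workers=4)
futures = [session.get('http://google.com/', test_id=i) for i in range(10)]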
In theory, increasing the pool size to 10 should give optimal results for this test, producing numbers like:
252.977
168.379
161.689
165.44
169.238
157.929
171.77
154.089
168.283
159.23999999999998
However, with a larger pool the effects described below have a greater impact.
Question 2

Using plain requests is much faster.

I can't be sure, but looking at the times for the first batch of requests, they are only about 15 ms slower. This could be due to:
- Thread switching. Since a plain request runs on the same thread as the caller, the work starts immediately. With a thread pool, the request only starts once the OS schedules the worker thread, which adds overhead (see the sketch after this list).
- Polling. The futures machinery may poll to check whether the result is ready, so there can be some extra delay.
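To make the thread-switching point concrete, here is a small, hypothetical sketch (plain concurrent.futures, no HTTP) that measures how long each submitted task waits before a pool thread actually starts running it:

from concurrent.futures import ThreadPoolExecutor
from datetime import datetime

def queued_delay(pool_size, n_tasks):
    submitted = {}
    started = {}

    def work(i):
        # recorded when a worker thread actually picks the task up
        started[i] = datetime.now()

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = []
        for i in range(n_tasks):
            submitted[i] = datetime.now()  # when the main thread hands the task to the pool
            futures.append(pool.submit(work, i))
        for f in futures:
            f.result()

    for i in range(n_tasks):
        delay_ms = (started[i] - submitted[i]).total_seconds() * 1000
        print('task %d waited %.3f ms before starting' % (i, delay_ms))

queued_delay(pool_size=2, n_tasks=10)

For the first pool_size tasks the delay is just the thread hand-off; later tasks also wait for a worker to become free.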
The advantage of futures is that the total time for all 10 requests is lower, not the individual times, so this small difference is not really an issue.
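To illustrate that, here is a minimal sketch (URL and request count mirror the question) comparing the total wall-clock time of the two approaches rather than the per-request times:

from datetime import datetime
import requests
from requests_futures.sessions import FuturesSession

URL = 'http://google.com/'

# 10 concurrent requests through the thread pool
start = datetime.now()
session = FuturesSession(max_workers=10)
futures = [session.get(URL) for _ in range(10)]
for f in futures:
    f.result()
print('requests-futures total: %.1f ms' % ((datetime.now() - start).total_seconds() * 1000))

# 10 sequential requests
start = datetime.now()
for _ in range(10):
    requests.get(URL)
print('requests total: %.1f ms' % ((datetime.now() - start).total_seconds() * 1000))

The concurrent run should finish in roughly the time of the slowest single request, while the sequential run takes roughly the sum of all ten.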