Process hangs on urllib2 reset socket
We have a server program that sometimes gets stuck in a read call on a urllib2 response when the connection is reset, for example:
Traceback (most recent call last):
  File "run.py", line 112, in fetch_stuff
    raw = response.read()
  File "/usr/lib/python2.7/socket.py", line 351, in read
    data = self._sock.recv(rbufsize)
  File "/usr/lib/python2.7/httplib.py", line 573, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
error: [Errno 104] Connection reset by peer
Edit: By "hanging" I mean that the program does not crash and is still alive after a couple of hours, but it appears to be stuck after printing this one error message.
However, AFAIK the code outside the library handles the exception correctly:
for i in range(retries):
    try:
        response = urllib2.urlopen(url)
        raw = response.read()  # fails here
        ...
    except urllib2.HTTPError as e:
        logging.error("HTTP Error for url=%s (code=%s, message=%s, headers=%s)" % (url, e.code, e.msg, e.hdrs))
    except Exception as e:
        logging.exception(e)
else:
    logging.error(('Connection failed after {} tries').format(retries))
    sys.exit(0)
I do not understand why this hangs the whole process with no further progress. We are now trying to set the timeout parameter of urlopen, but I doubt this will solve the problem.
So, since I haven't found any useful links yet (other than perhaps this answer), is there an (obvious) fix for this, or should I use another library?
Also, what's actually going on? I understand the connection is reset, but what happens next?
The read call blocks unless you are working with a non-blocking socket. Therefore your process blocks when you call read().
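To make the blocking behaviour concrete, here is a minimal sketch that uses a plain socket pair rather than urllib2. The helper name recv_with_timeout is hypothetical, and socket.socketpair() is assumed to be available (it is on Unix). It only shows what a socket timeout does to a read that has nothing to receive:

```python
import socket

# socketpair() returns two already-connected sockets in one process.
a, b = socket.socketpair()

def recv_with_timeout(sock, seconds):
    """Return the received bytes, or None if nothing arrives in time."""
    sock.settimeout(seconds)  # settimeout(None) would mean "block forever"
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None

# Nothing has been sent yet: a fully blocking recv() would wait
# indefinitely here, while the timed version gives up after 0.2 seconds.
first = recv_with_timeout(a, 0.2)

# Once the peer sends data, the same call returns it immediately.
b.sendall(b"hello")
second = recv_with_timeout(a, 0.2)

a.close()
b.close()
```

A timeout therefore protects against a peer that goes silent, which is a different situation from a peer that resets the connection.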
For some reason, the other side of the connection sends a packet with the RST flag set, which closes the connection. When the OS detects this event, the recv system call fails with ECONNRESET, which is defined in linux/include/errno.h and corresponds to error code 104.
Python translates the error code with the errno module ( https://docs.python.org/2/library/errno.html#module-errno ) and raises an exception. Error code 104 is, as expected, errno.ECONNRESET:
>>> import errno
>>> print errno.ECONNRESET
104
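The path from an RST to ECONNRESET can also be observed inside a single process. The sketch below is not what the remote server in the question does. It assumes a Unix-like system where struct linger is two C ints, and it uses SO_LINGER with a zero linger time so that close() aborts the connection with an RST. The function name provoke_reset is hypothetical:

```python
import errno
import socket
import struct

def provoke_reset():
    """Return the errno a client sees when its peer aborts the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)  # safety net so the sketch can never hang
    server_side, _ = listener.accept()

    # l_onoff=1, l_linger=0: close() discards the connection and sends
    # an RST instead of the usual FIN (assumes the Unix struct layout).
    server_side.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                           struct.pack("ii", 1, 0))
    server_side.close()

    try:
        client.recv(1024)  # blocks until the RST arrives, then fails
        return None
    except socket.error as e:
        return e.errno
    finally:
        client.close()
        listener.close()

code = provoke_reset()
print(errno.errorcode.get(code))
```

Note that recv() does not hang here: the reset wakes the blocked call up and turns into an exception right away.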
Then you catch this exception and call logging.exception(e), which prints the stack trace. After that, either you continue the loop or you take the else branch. From your output, it is not clear to me which of the two happens.
This can be easily reproduced. Very simple client code:
import urllib2
import logging

r = urllib2.urlopen("http://localhost:8080")
try:
    print "Reading!"
    r.read()
except Exception as e:
    logging.exception(e)
On the server side, directly from the command line:
$ nc -l -p 8080
Once the connection is established, the client blocks in the read call. tcpkill can be used to kill the connection with the RST flag once some traffic is detected:
$ sudo tcpkill -i lo port 8080
And as expected, the client-side result:
$ python m.py
Reading!
ERROR:root:[Errno 104] Connection reset by peer
Traceback (most recent call last):
  File "m.py", line 7, in <module>
    r.read()
  File "/usr/lib/python2.7/socket.py", line 351, in read
    data = self._sock.recv(rbufsize)
  File "/usr/lib/python2.7/httplib.py", line 561, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/httplib.py", line 1302, in read
    return s + self._file.read(amt - len(s))
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
error: [Errno 104] Connection reset by peer
Adding a timeout is unlikely to solve this. If the connection is reset while your process is blocked in a read call (even one with a timeout), the result is exactly the same. I think you should first try to understand why the connection is being reset. But a read on a socket whose connection was closed with the RST flag is an event you cannot avoid and must handle.
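One way to handle the reset explicitly is a retry loop that returns as soon as a read succeeds and gives up loudly once the attempts are exhausted. This is only a sketch under assumptions: read_with_retries and flaky are hypothetical names, and flaky stands in for something like urlopen(url).read(), failing twice with ECONNRESET before succeeding:

```python
import errno
import logging
import socket
import time

def read_with_retries(fetch, retries=3, delay=0.0):
    """Call fetch() until it succeeds or `retries` attempts are used up."""
    for attempt in range(1, retries + 1):
        try:
            return fetch()  # success leaves the loop immediately
        except socket.error as e:
            # A reset connection (ECONNRESET) ends up here.
            logging.warning("attempt %d/%d failed: %s", attempt, retries, e)
            time.sleep(delay)
    raise IOError("giving up after %d attempts" % retries)

# Hypothetical stand-in for the real network read.
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise socket.error(errno.ECONNRESET, "Connection reset by peer")
    return b"payload"

result = read_with_retries(flaky, retries=5)
```

Returning from inside the try block makes the success path unambiguous, so the log shows whether the loop retried or finished.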