Apache lags behind when responding to gzipped requests
For an application I am developing, the client sends a gzipped HTTP POST request (Content-Encoding: gzip) with multipart data (Content-Type: multipart/form-data). I am using mod_deflate as an input filter to decompress the body, and the request is processed in Django using mod_wsgi.
As a rule, everything works fine. But for certain requests (the behaviour is deterministic), there is a lag of almost a minute between request and response. Debugging shows that the processing in Django finishes immediately, but the response from the server stalls. If the request is not gzipped, everything works well.
Note that, to work around a bug in mod_wsgi, I set the content length to the size of the uncompressed message.
Has anyone faced this problem? Is there an easy way to debug Apache while it is handling responses?
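For reference, a request body of the kind described above could be built roughly like this. It is only a minimal Python 3 sketch for reproducing the setup: the helper name, the boundary string and the field names are made up for illustration, and here the Content-Length is the compressed size, i.e. the bytes actually sent.

```python
import gzip


def build_gzipped_multipart(fields, boundary="example-boundary"):
    """Return (body, headers) for a gzip-compressed multipart/form-data POST.

    `fields` maps form field names to string values. The helper and the
    boundary value are illustrative assumptions, not part of any library.
    """
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")
        lines.append(value)
    lines.append("--" + boundary + "--")
    lines.append("")
    raw = "\r\n".join(lines).encode("utf-8")

    body = gzip.compress(raw)
    headers = {
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "Content-Encoding": "gzip",
        # Length of the bytes on the wire, i.e. the compressed size.
        "Content-Length": str(len(body)),
    }
    return body, headers
```

The returned body and headers can then be passed to whatever HTTP client is in use.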
What bug do you think exists in mod_wsgi?
The simple fact is that WSGI 1.0 does not support mutating input filters that change the content length of the request body. So technically you cannot use mod_deflate in Apache on request content when using WSGI 1.0. Setting the content length to a value other than the actual size of what is sent is most likely what is interfering with mod_deflate.
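To illustrate the mismatch, here is a small standalone Python 3 sketch. No Apache or mod_wsgi is involved and the payload is invented; it only simulates an input filter that decompresses the body while CONTENT_LENGTH still describes the compressed bytes. An application that follows WSGI 1.0 and reads no more than CONTENT_LENGTH bytes then sees truncated data.

```python
import gzip
import io

raw = b"field=value&" * 200          # what the application should receive
compressed = gzip.compress(raw)      # what actually travels over the wire

# CONTENT_LENGTH as sent by the client describes the compressed body.
content_length = len(compressed)

# After a decompressing input filter, the stream carries the raw bytes.
wsgi_input = io.BytesIO(raw)

# A WSGI 1.0 application reads at most CONTENT_LENGTH bytes.
seen = wsgi_input.read(content_length)

# True here: the application only saw the start of the decompressed body.
truncated = len(seen) < len(raw)
```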
If you want to handle the compressed content of a request, you need to go outside the WSGI 1.0 specification and use non-standard code.
I suggest you read:
http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html
It explains the problem and suggests ways of dealing with it.
I would strongly suggest that you take this question to the official mod_wsgi mailing list for a discussion of how you need to write your code. If you are using one of the Python frameworks, though, you will probably be limited in what you can do, as they implement WSGI 1.0, under which this is not possible.
UPDATE 1
From the discussion on the mod_wsgi list, the original WSGI application should be wrapped in the following WSGI middleware. This will only work with WSGI adapters that actually return an empty string as the end sentinel for input, something which WSGI 1.0 does not require. It should only be used for small uploads, as everything is read into memory. If you need to handle large compressed uploads, the accumulated data should be written to a file instead.
import cStringIO

class Wrapper(object):

    def __init__(self, application):
        self.__application = application

    def __call__(self, environ, start_response):
        if environ.get('HTTP_CONTENT_ENCODING', '') == 'gzip':
            # Read the whole (already decompressed) body. This relies on the
            # adapter returning an empty string at the end of the input.
            buffer = cStringIO.StringIO()
            input = environ['wsgi.input']
            blksize = 8192
            length = 0

            data = input.read(blksize)
            buffer.write(data)
            length += len(data)

            while data:
                data = input.read(blksize)
                buffer.write(data)
                length += len(data)

            # Replace the input stream and report the real body size.
            buffer = cStringIO.StringIO(buffer.getvalue())
            environ['wsgi.input'] = buffer
            # WSGI requires CGI-style variables to be strings.
            environ['CONTENT_LENGTH'] = str(length)

        return self.__application(environ, start_response)

application = Wrapper(original_wsgi_application_callable)
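For larger uploads, the same idea can spill to disk, as mentioned above. The following is a Python 3 sketch and not code from the mailing list discussion: the class name SpoolingWrapper and the max_memory parameter are made up for illustration. It uses tempfile.SpooledTemporaryFile, which keeps data in memory until it grows past max_size and then moves it to a temporary file. Like the wrapper above, it relies on the adapter returning an empty string at the end of the input.

```python
import tempfile


class SpoolingWrapper(object):
    """Hypothetical variant of the wrapper that buffers large bodies on disk."""

    def __init__(self, application, max_memory=1024 * 1024):
        self._application = application
        self._max_memory = max_memory  # bytes held in memory before spilling

    def __call__(self, environ, start_response):
        if environ.get('HTTP_CONTENT_ENCODING', '') == 'gzip':
            spool = tempfile.SpooledTemporaryFile(max_size=self._max_memory)
            source = environ['wsgi.input']
            length = 0

            # Copy the decompressed body block by block until end of input.
            while True:
                data = source.read(8192)
                if not data:
                    break
                spool.write(data)
                length += len(data)

            # Rewind so the application reads from the start of the body.
            spool.seek(0)
            environ['wsgi.input'] = spool
            environ['CONTENT_LENGTH'] = str(length)

        return self._application(environ, start_response)
```

The spooled file is never closed explicitly here; it is cleaned up when the object is garbage collected, which may or may not be acceptable depending on the request volume.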