Ticket #594 (defect)
Opened 2 years ago
Last modified 2 years ago
Trouble with gzip and staticdir
Status: closed (fixed)
| Reported by: | Sheco | Assigned to: | rdelon |
|---|---|---|---|
| Priority: | normal | Milestone: | 3.0 |
| Component: | CherryPy code | Keywords: | staticdir gzip |
| Cc: |
When both the gzip and the staticdir tools are enabled, static files are served unreliably. There are two behaviours; each time I reload the page they occur in turn, and the cycle repeats indefinitely.
The first one is returning the file successfully, but without the last 10 bytes.
The other one is raising this exception:
Traceback (most recent call last):
  File "/home2/mariasys/maria2b/cherrypy/_cprequest.py", line 544, in respond
    self.hooks.run('before_finalize')
  File "/home2/mariasys/maria2b/cherrypy/_cprequest.py", line 79, in run
    hook()
  File "/home2/mariasys/maria2b/cherrypy/_cprequest.py", line 44, in __call__
    return self.callback(**self.kwargs)
  File "/home2/mariasys/maria2b/cherrypy/lib/encoding.py", line 208, in gzip
    ct = response.headers.get('Content-Type').split(';')[0]
AttributeError: 'NoneType' object has no attribute 'split'
Attachments
Change History
10/25/06 12:42:02: Modified by Sheco
10/25/06 18:37:09: Modified by fumanchu
- description changed.
I'd bet that FF and IE are sending different Accept-Encoding headers. Can you find out what those are and map them to the differing output?
As always, a test case would be best. :)
10/26/06 01:59:35: Modified by michele
I was going to open a new ticket (or ask fumanchu on IRC) regarding gzipped content; anyway, while there is one...
We have noticed that not all versions of IE support them properly.
You can find relevant resources here:
http://www.thinkvitamin.com/features/webapps/serving-javascript-fast
and here:
http://support.microsoft.com/default.aspx?scid=kb;en-us;823386&Product=ie600
It would be nice if the gzip tools could check the user agent version and avoid sending gzipped content to these particular versions of IE (or just to IE :-D).
10/26/06 11:56:55: Modified by fumanchu
Regarding http://support.microsoft.com/kb/823386: it would be nice but not worth the code overhead, IMO. The issue has a hotfix already, and users who experience the problem should fix their browser or they won't be able to get anywhere meaningful on the 'Net. If you wish to do user-agent sniffing on your own, feel free:
def mygzip(*args, **kwargs):
    ua = cherrypy.request.headers.get("User-Agent", "").lower()
    if 'msie' in ua:
        ... # Additional checks
        return
    return encoding.gzip(*args, **kwargs)
cherrypy.tools.gzip = Tool('before_finalize', mygzip, priority=80)
10/27/06 15:28:19: Modified by michele
Thanks fumanchu, I think you're right and the solution you proposed will "work for me". ;-)
10/27/06 16:52:12: Modified by Sheco
I am testing with this short script, under Fedora Enterprise Linux release 3 and Ubuntu Dapper Drake.
It is decompressed(?) incorrectly by Firefox and Internet Explorer (for some reason I'm not getting the tracebacks with Firefox now), but Lynx does get it correctly.
test.py:
import cherrypy
import os

class root:
    index = cherrypy.tools.staticfile.handler('test.txt',
        os.path.dirname(os.path.abspath(__file__)))

cherrypy.config.update({ 'global': { 'tools.gzip.on': True } } )
cherrypy.quickstart(root(), '/')
test.txt:
123456789012345
10/27/06 17:09:09: Modified by Sheco
The server output...
189.150.3.65 - - [27/Oct/2006:17:03:34] "GET / HTTP/1.1" 200 16 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
189.150.3.65 - - [27/Oct/2006:17:03:36] "GET / HTTP/1.1" 304 - "" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0"
127.0.0.1 - - [27/Oct/2006:17:03:49] "GET / HTTP/1.0" 200 16 "" "Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7a"
10/28/06 16:11:49: Modified by fumanchu
Bah. I can't reproduce it. I tried (FF 1.5.0.7, IE 6, and Opera 8.02 on Win2k) against (Py 2.4 on Win2k, Py 2.3 on Debian sarge). Neither the initial 200 OK nor the following 304 Not Modified showed any problems on any combination of the above. :/
10/31/06 09:02:22: Modified by Andrew <stromnov@gmail.com>
It's probably a bug in the ETag and GZip chain.
If the page is not modified, ETag raises HTTPRedirect(304) and deletes cherrypy.response.headers['Content-Type']. From this point GZip fails at:
    ct = response.headers.get('Content-Type').split(';')[0]
The other part that does not work is:
    if not response.body:
        # Response body is empty (might be a 304 for instance)
        return
because the response.body value is not None, but a <generator> with
    >>> [chunk for chunk in response.body]
    []
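The two failure points described above can be reproduced in plain Python, without CherryPy. This is only a minimal sketch: the `headers` dict stands in for `response.headers`, `empty_body` stands in for a wrapped response body, and `media_type` is a hypothetical defensive helper, not the code that was committed.

```python
def empty_body():
    """A generator that yields no chunks, like a wrapped empty body."""
    return
    yield  # unreachable; its presence makes this a generator function


body = empty_body()

# A generator object is always truthy, so "if not response.body" never
# fires for it, even though iterating it produces nothing.
generator_is_truthy = bool(body)
chunks = [chunk for chunk in body]

# Stand-in for response.headers after Content-Type has been deleted.
headers = {}
content_type = headers.get('Content-Type')  # None, so .split() would fail


def media_type(headers):
    """Hypothetical helper: tolerate a missing Content-Type header."""
    value = headers.get('Content-Type') or ''
    return value.split(';')[0].strip()
```

With a missing header, `media_type` returns an empty string instead of raising AttributeError, so a caller could simply skip compression in that case.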
10/31/06 11:30:03: Modified by Andrew <stromnov@gmail.com>
Look at:
Body().__set__(...) in _cprequest.py
save() hook in lib/sessions.py
The save() hook wraps every sequence into a generator
New page:
Enter hook init
Enter hook decode
Enter hook trailing_slash
Enter hook save
request.body before: [u"<html>...</html"]
request.body after: <generator object>
Enter hook validate_etags
Enter hook encode
Exit hook encode
Enter hook gzip
Enter hook close
Matched (by ETag) page:
Enter hook init
Enter hook decode
Enter hook trailing_slash
Enter hook save
debug: request.body before: [u"<html>...</html"]
debug: request.body after: <generator object>
Enter hook validate_etags
debug: raise cherrypy.HTTPRedirect([], 304)
Enter hook save
debug: request.body before: []
debug: request.body after: <generator object>
Enter hook validate_etags
Enter hook encode
Enter hook gzip
debug: AttributeError: 'NoneType' object has no attribute 'split'
BTW: 'before_finalize' save() and validate_etags() hooks entered twice
10/31/06 12:14:24: Modified by fumanchu
> It's probably a bug in the ETag and GZip chain.
If so, it's not present in current trunk.
> If the page is not modified, ETag raises HTTPRedirect(304) and deletes cherrypy.response.headers['Content-Type'].
304 also sets response.body to None.
> From this point GZip fails at:
>     ct = response.headers.get('Content-Type').split(';')[0]
It won't reach that point, because response.body is None.
> The other part that does not work is:
>     if not response.body:
>         # Response body is empty (might be a 304 for instance)
>         return
> because the response.body value is not None, but a <generator> with
>     >>> [chunk for chunk in response.body]
>     []
Not if 304 was raised. HTTPRedirect(304) sets response.body to None.
> The save() hook wraps every sequence into a generator
Now this could be a problem. save() probably shouldn't do that in CP 3; it should set hooks instead. So it's a bug in the session Tool implementation, not ETag or GZip. What we really need is to establish a checklist of "gotchas" for Tools, and make sure all the builtins behave.
However, given all that, the script that Sheco posted doesn't mention ETags or sessions...
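To illustrate the difference between wrapping the body in a generator and setting a hook, here is a rough sketch in plain Python. Everything in it is hypothetical: `FakeResponse`, its `on_end` list, `save_session`, and the two `save_by_*` functions are illustration-only names and do not reflect the actual session Tool or the attached patch.

```python
class FakeResponse:
    """Hypothetical stand-in for a response object (not CherryPy's class)."""

    def __init__(self, body):
        self.body = body
        self.on_end = []  # callbacks to run once the response is finished


saved = []


def save_session():
    """Placeholder for persisting session data."""
    saved.append('session saved')


def save_by_wrapping(response):
    """Wrap the body in a generator that saves after the last chunk."""
    original = response.body

    def wrapper():
        for chunk in original:
            yield chunk
        save_session()

    response.body = wrapper()


def save_by_hook(response):
    """Leave the body untouched and register a callback instead."""
    response.on_end.append(save_session)


# Wrapping hides an empty body from later "if not response.body" checks.
wrapped = FakeResponse([])
save_by_wrapping(wrapped)
wrapped_looks_empty = not wrapped.body

# With a hook, the empty list is still visible as empty.
hooked = FakeResponse([])
save_by_hook(hooked)
hooked_looks_empty = not hooked.body
for callback in hooked.on_end:
    callback()
```

In the wrapped case a later tool can no longer tell that the body is empty without consuming it, which is the kind of "gotcha" mentioned above; in the hook case the body keeps its original type.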
10/31/06 12:46:33: Modified by fumanchu
- attachment nogensess.patch added.
Session save using hooks not generators
11/04/06 14:40:15: Modified by fumanchu
- status changed from new to closed.
- resolution set to fixed.
Fixed in [1426].


OK, something I didn't mention: the exception happens in Firefox (currently using 2.0). It doesn't happen with Internet Explorer, but case 1 still applies: I get the file without the last 10 bytes. I'm testing with small text files.