
Ticket #594 (defect)

Opened 2 years ago

Last modified 2 years ago

Trouble with gzip and staticdir

Status: closed (fixed)

Reported by: Sheco
Assigned to: rdelon
Priority: normal
Milestone: 3.0
Component: CherryPy code
Keywords: staticdir gzip
Cc:

When both the gzip and staticdir tools are enabled, static files are served unreliably. There are two behaviours, which alternate in order every time I reload the page, and the cycle repeats forever.

The first one returns the file successfully, but without the last 10 bytes.

The other one is raising this exception:

Traceback (most recent call last):
  File "/home2/mariasys/maria2b/cherrypy/_cprequest.py", line 544, in respond
    self.hooks.run('before_finalize')
  File "/home2/mariasys/maria2b/cherrypy/_cprequest.py", line 79, in run
    hook()
  File "/home2/mariasys/maria2b/cherrypy/_cprequest.py", line 44, in __call__
    return self.callback(**self.kwargs)
  File "/home2/mariasys/maria2b/cherrypy/lib/encoding.py", line 208, in gzip
    ct = response.headers.get('Content-Type').split(';')[0]
AttributeError: 'NoneType' object has no attribute 'split'

Attachments

nogensess.patch (1.6 kB) - added by fumanchu on 10/31/06 12:46:33.
Session save using hooks not generators

Change History

10/25/06 12:42:02: Modified by Sheco

OK, something I didn't mention: the exception happens on Firefox (currently using 2.0). It doesn't happen with Internet Explorer, but case 1 still applies there: I get the file without the last 10 bytes. I'm testing with small text files.

10/25/06 18:37:09: Modified by fumanchu

  • description changed.

I'd bet that FF and IE are sending different Accept-Encoding headers. Can you find out what those are and map them to the differing output?

As always, a test case would be best. :)
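For comparing what the two browsers send, a minimal sketch of how an Accept-Encoding value could be checked for gzip support. This is not CherryPy's own parsing; the function name `accepts_gzip` and the sample header values are made up for illustration.

```python
def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value allows gzip (q > 0)."""
    qvalues = {}
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0  # a coding without an explicit q-value defaults to 1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        qvalues[coding] = q
    # An explicit gzip entry wins; "*" only applies when gzip is not listed.
    for coding in ("gzip", "x-gzip", "*"):
        if coding in qvalues:
            return qvalues[coding] > 0
    return False


# Illustrative header values, not captured from the browsers in this ticket.
samples = ["gzip, deflate", "identity", "gzip;q=0, *", ""]
results = [accepts_gzip(value) for value in samples]
```

Logging the raw header next to each response would show whether the differing output lines up with a differing header.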

10/26/06 01:59:35: Modified by michele

I was going to open a new ticket (or ask fumanchu on IRC) about gzipped content, but since there is already one...

We have noticed that not all versions of IE support gzipped content properly.

You can find relevant resources here:

http://www.thinkvitamin.com/features/webapps/serving-javascript-fast

and here:

http://support.microsoft.com/default.aspx?scid=kb;en-us;823386&Product=ie600

It would be nice if the gzip tool could check the user agent version and avoid sending gzipped content to these particular versions of IE (or just to IE :-D).

10/26/06 11:56:55: Modified by fumanchu

Regarding http://support.microsoft.com/kb/823386: it would be nice but not worth the code overhead, IMO. The issue has a hotfix already, and users who experience the problem should fix their browser or they won't be able to get anywhere meaningful on the 'Net. If you wish to do user-agent sniffing on your own, feel free:

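import cherrypy
from cherrypy import Tool
from cherrypy.lib import encoding

def mygzip(*args, **kwargs):
    # Skip compression for IE; every other client gets the stock gzip tool.
    ua = cherrypy.request.headers.get("User-Agent", "").lower()
    if 'msie' in ua:
        ... # Additional checks
        return
    return encoding.gzip(*args, **kwargs)

# Replace the built-in gzip tool with the sniffing wrapper.
cherrypy.tools.gzip = Tool('before_finalize', mygzip, priority=80)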

10/27/06 15:28:19: Modified by michele

Thanks fumanchu, I think you're right and the solution you proposed will "work for me". ;-)

10/27/06 16:52:12: Modified by Sheco

I am testing with this short script under Fedora Enterprise Linux release 3 and Ubuntu Dapper Drake.

The response is decompressed(?) incorrectly by Firefox and Internet Explorer (for some reason I'm no longer getting the tracebacks with Firefox), but Lynx gets it correctly.

test.py:

import cherrypy
import os

class root:
  index = cherrypy.tools.staticfile.handler('test.txt', 
    os.path.dirname(os.path.abspath(__file__)))

cherrypy.config.update({ 'global': { 'tools.gzip.on': True } } )
cherrypy.quickstart(root(), '/')

test.txt:

123456789012345

10/27/06 17:09:09: Modified by Sheco

The server output...

189.150.3.65 - - [27/Oct/2006:17:03:34] "GET / HTTP/1.1" 200 16 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
189.150.3.65 - - [27/Oct/2006:17:03:36] "GET / HTTP/1.1" 304 - "" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0"
127.0.0.1 - - [27/Oct/2006:17:03:49] "GET / HTTP/1.0" 200 16 "" "Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7a"

10/28/06 16:11:49: Modified by fumanchu

Bah. I can't reproduce it. I tried (FF 1.5.0.7, IE 6, and Opera 8.02 on Win2k) against (Py 2.4 on Win2k, Py 2.3 on Debian sarge). Neither the initial 200 OK nor the following 304 Not Modified showed any problems on any combination of the above. :/

10/31/06 09:02:22: Modified by Andrew <stromnov@gmail.com>

This is probably a bug in the ETag and GZip chain.

If the page is not modified, then ETag raises HTTPRedirect(304) and deletes cherrypy.response.headers['Content-Type']. From this point, GZip fails at:

    ct = response.headers.get('Content-Type').split(';')[0]

The following check also does not work:

    if not response.body:
        # Response body is empty (might be a 304 for instance)
        return

because the value of response.body is not None but a <generator>, with

    >>> [chunk for chunk in response.body]
    []
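The two failure points described above can be reproduced outside CherryPy. The following is a minimal sketch using a plain generator and a plain dict as stand-ins for response.body and response.headers; the names `empty_body`, `looks_non_empty`, and `ct` are illustrative only, and the guarded Content-Type lookup is one possible defensive form, not necessarily what the fix uses.

```python
def empty_body():
    # A generator that yields nothing, like a wrapped empty response body.
    for chunk in []:
        yield chunk


body = empty_body()

# A generator object is always truthy, even when it will yield no chunks,
# so a check like "if not response.body" does not catch this case.
looks_non_empty = bool(body)

# Only consuming the generator reveals that it is empty.
chunks = list(body)

# Stand-in for the headers after Content-Type has been deleted.
headers = {}

# Guarded lookup: falls back to an empty string instead of calling
# .split() on None.
ct = (headers.get("Content-Type") or "").split(";")[0]
```

With this guard, a missing Content-Type yields an empty string and the caller can decide to skip compression instead of raising AttributeError.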

10/31/06 11:30:03: Modified by Andrew <stromnov@gmail.com>

Look at:

Body().__set__(...) in _cprequest.py

save() hook in lib/sessions.py

The save() hook wraps every sequence in a generator.

New page:

  Enter hook init
  Enter hook decode
  Enter hook trailing_slash
  Enter hook save
    request.body before:  [u"<html>...</html"]
    request.body after:  <generator object>
  Enter hook validate_etags
  Enter hook encode
  Exit hook encode
  Enter hook gzip
  Enter hook close

Matched (by ETag) page:

  Enter hook init
  Enter hook decode
  Enter hook trailing_slash
  Enter hook save
    debug: request.body before:  [u"<html>...</html"]
    debug: request.body after:  <generator object>
  Enter hook validate_etags
    debug: raise cherrypy.HTTPRedirect([], 304)
  Enter hook save
    debug: request.body before:  []
    debug: request.body after:  <generator object>
  Enter hook validate_etags
  Enter hook encode
  Enter hook gzip
    debug: AttributeError: 'NoneType' object has no attribute 'split'

BTW: the 'before_finalize' save() and validate_etags() hooks are entered twice.

10/31/06 12:14:24: Modified by fumanchu

> This is probably a bug in the ETag and GZip chain.

If so, it's not present in current trunk.

> If the page is not modified, then ETag raises HTTPRedirect(304)
> and deletes cherrypy.response.headers['Content-Type'].

304 also sets response.body to None.

> From this point, GZip fails at:
>     ct = response.headers.get('Content-Type').split(';')[0]

It won't reach that point because response.body is None.

> The following check also does not work:
>     if not response.body:
>         # Response body is empty (might be a 304 for instance)
>         return
>
> because the value of response.body is not None but a <generator>, with
>
>     >>> [chunk for chunk in response.body]
>     []

Not if 304 was raised. HTTPRedirect(304) sets response.body to None.

> The save() hook wraps every sequence in a generator.

Now this could be a problem. save() probably shouldn't do that in CP 3; it should set hooks instead. So it's a bug in the session Tool implementation, not ETag or GZip. What we really need is to establish a checklist of "gotchas" for Tools, and make sure all the builtins behave.
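To illustrate the difference, here is a minimal sketch contrasting the two approaches on a stand-in response object. Everything in it is hypothetical: `FakeResponse`, the `on_end` callback list, `save_session`, and `send` are invented for the example and are not CherryPy's API or the contents of the attached patch.

```python
class FakeResponse(object):
    """Stand-in for a response: a body plus callbacks run after sending."""

    def __init__(self, body):
        self.body = body
        self.on_end = []  # hypothetical hook point, not a CherryPy attribute


saved = []


def save_session():
    saved.append("session saved")


def save_with_generator(response):
    # Wraps the body so the session is saved once the body is exhausted.
    # Side effect: the body becomes a generator, which is always truthy.
    original = response.body

    def wrapper():
        for chunk in original:
            yield chunk
        save_session()

    response.body = wrapper()


def save_with_hook(response):
    # Registers a callback instead and leaves the body untouched.
    response.on_end.append(save_session)


def send(response):
    sent = list(response.body or [])
    for callback in response.on_end:
        callback()
    return sent


wrapped = FakeResponse([])
save_with_generator(wrapped)

hooked = FakeResponse([])
save_with_hook(hooked)
hooked_sent = send(hooked)
```

In the hook version the empty list stays falsy, so a later tool that tests `if not response.body` still sees an empty body, while the session is saved all the same.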

However, given all that, the script that Sheco posted doesn't mention ETags or sessions...

10/31/06 12:46:33: Modified by fumanchu

  • attachment nogensess.patch added.

Session save using hooks not generators

11/04/06 14:40:15: Modified by fumanchu

  • status changed from new to closed.
  • resolution set to fixed.

Fixed in [1426].
