
Creating Custom Tools for CherryPy 3

Abstract

CherryPy is an extremely capable platform for web application and framework development. One of the strengths of CherryPy is its modular design. Beginning in the 2.x series, CherryPy separated key-but-not-core functionality out into "filters". This provided two benefits: a slimmer, faster core system and a supported means of tying additional functionality into the framework. Today with CherryPy 3, that modularization continues with "tools", the successors to "filters".

Tools in CherryPy 3 allow you to write code that acts as an accessory to application code. Take some of the built-in tools, for example. The GZip tool compresses your output on the fly. An application should work without it, but it adds value. Some accessories, such as the session tool, can become required by an application, but it is still useful to have such code abstracted from the application. Those are some of the purposes of tools.

Why replace Filters with Tools?

Filters provided a good way to add functionality to the CherryPy framework, but they were flawed in a couple of ways.

First, they could only be configured at the path or class level, through the config system and the _cp_filters attribute, respectively. Enabling a filter for a specific path through the configuration system was possible, but doing so for a specific method or other callable attached to the CherryPy object tree was not.

Second, the core itself had no logic to know which filters were configured for a certain path and activate them. Instead, every filter was called at every stage of the CherryPy request process to see if it was "on". All those filter methods being called just to check whether they were supposed to run added up to a significant performance hit.

Enter CherryPy 3 and the cherrypy.Tool. Tools solve both of the problems with filters that were mentioned above, and probably others as well.

First, they can be enabled for any point of your CherryPy application: a certain path, a certain class, or a certain method or other callable. The old _cp_filters list has been replaced by the _cp_config dictionary, which allows you to configure tools (and many other things) for a specific class, method or other callable. Tools can also be used as decorators, which provide syntactic sugar for configuring a tool for a specific callable.
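
As a minimal sketch of the _cp_config approach, the class below leaves the built-in GZip tool off by default and turns it on for a single handler. The class and method names are hypothetical, and the handlers are marked with the exposed attribute (rather than the expose decorator) only so that the snippet needs no imports.

```python
class Root(object):
    # Class-level configuration: applies to every handler on this class.
    _cp_config = {'tools.gzip.on': False}

    def index(self):
        return "Hello, world!"
    index.exposed = True

    def report(self):
        return "A large report that is worth compressing."
    report.exposed = True
    # Callable-level configuration: applies to this handler only.
    report._cp_config = {'tools.gzip.on': True}
```

Mounting this class is left out here; the point is only where the two dictionaries live.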

Second, tools that are applicable to the current request are now determined at the start of the request by the core system. Unlike CherryPy 2.x, the core is not ignorant of the subsystems that will interact with the request. This substantially reduces the overhead of providing for a modular way to extend and enhance the CherryPy framework.

So CherryPy 3.x tools solve some specific problems that 2.x filters had while still providing a powerful way to code custom functionality into the CherryPy request handling process.

Why Not Just Use WSGI Middleware?

WSGI middleware provides a standard, framework-independent way to write code that can interact with a WSGI-compliant HTTP request/response handling system. While CherryPy 3.x fully supports the use of WSGI middleware, CherryPy Tools provide more flexibility for configuring and using code that interacts with the request/response. One of the outstanding benefits of tools over WSGI middleware is their granularity.

Being able to turn on a certain feature for a single method of a class that requires special treatment is a big bonus. With WSGI middleware, the entire application (and thus the class) would need to be wrapped with the middleware, invoking its functionality, for better or worse.

That said, WSGI middleware is still useful and a great way to write cross-framework Python code. You are encouraged to make the functionality of the tools you create for CherryPy available as WSGI middleware so that other frameworks may benefit as well.
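
For comparison, here is a minimal sketch of a WSGI middleware that prints the request path. The names PrintPathMiddleware and hello_app are made up for illustration. Note that the middleware wraps the whole application, so it sees every request rather than a single handler.

```python
class PrintPathMiddleware(object):
    """Hypothetical middleware: print PATH_INFO before calling the app."""

    def __init__(self, app, multiplier=1):
        self.app = app
        self.multiplier = multiplier

    def __call__(self, environ, start_response):
        # Runs for every request that reaches the wrapped application.
        for _ in range(self.multiplier):
            print(environ.get('PATH_INFO', ''))
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """A stand-in WSGI application used only for this example."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, world!']


wrapped = PrintPathMiddleware(hello_app, multiplier=2)
```

Any WSGI server can serve the wrapped object in place of the original application.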

Your First Custom Tool

Ok, example time. For those of you coming from the world of CherryPy 2.x filters, here is a simple filter that prints the current request path to the screen a configurable number of times:

import cherrypy
import cherrypy.filters

class PrintPathFilter(object):
    def on_start_resource(self):
        if not cherrypy.config.get('print_path_filter.on', False):
            return
        multiplier = cherrypy.config.get('print_path_filter.multiplier', 1)
        for i in range(multiplier):
            print(cherrypy.request.path)

cherrypy.filters.input_filters.append(PrintPathFilter)

So there is the filter: created and hooked up into CherryPy, ready for use. Ok, it's a cheesy example, but it's good to start small. Let's look at what a tool with the same functionality is like.

import cherrypy

def print_path(multiplier=1):
    for i in range(multiplier):
        print(cherrypy.request.path_info)

cherrypy.tools.print_path = cherrypy.Tool('on_start_resource', print_path)

And there you have it: a tool with the same functionality. We can now enable it in the standard ways: through a config file or dict passed to an application, through a _cp_config dict on a particular class or callable, or by using the tool as a decorator. Now let's look at the example tool a bit closer.

Working from the "bottom-up", the cherrypy.Tool constructor takes 2 required and 2 optional arguments.

point

First, we need to tell it at what point in the CherryPy request/response handling process we want our tool to be triggered. Different request attributes are obtained and set at different points in the request process. Since we just care about the request's path_info in this example, we'll grab it as soon as it is available, which means using the first "hook point", called "on_start_resource". See the bottom of this document for a quick list of available hook points with short descriptions.

callable

Second, we need to provide the function that will be called back at that hook point. Here, we provide our print_path callable. The Tool class will find all config entries related to our tool and pass them as keyword arguments to our callback. Thus, if

tools.print_path.on = True
tools.print_path.multiplier = 5

is set in the config, the multiplier of 5 will get passed to the Tool's callable. [The 'on' config entry is special; it's never passed as a keyword argument.]
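
The mapping from config entries to keyword arguments can be pictured with the simplified model below. This is not CherryPy's own code; the helper name tool_kwargs is hypothetical and only illustrates the rule described above, including the special handling of 'on'.

```python
def tool_kwargs(config, toolname):
    """Collect the 'tools.<toolname>.*' entries as keyword arguments."""
    prefix = 'tools.%s.' % toolname
    kwargs = {}
    for key, value in config.items():
        if not key.startswith(prefix):
            continue  # entry belongs to another tool or namespace
        name = key[len(prefix):]
        if name == 'on':
            continue  # 'on' only enables the tool; it is never passed along
        kwargs[name] = value
    return kwargs


config = {
    'tools.print_path.on': True,
    'tools.print_path.multiplier': 5,
    'tools.gzip.on': False,
}
kwargs = tool_kwargs(config, 'print_path')
```

With the config above, kwargs holds only the multiplier entry, which is what the callable would receive.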

The tool can also be invoked as a decorator like this:

    @cherrypy.expose
    @cherrypy.tools.print_path(multiplier=5)
    def index(self):
        return "Hello, world!"

name

This argument is optional as long as you set the Tool onto a Toolbox. That is:

def foo():
    cherrypy.request.foo = True
cherrypy.tools.TOOLNAME = cherrypy.Tool('on_start_resource', foo)

The above will set the 'name' arg for you (to 'TOOLNAME'). The only time you would need to provide this argument is if you're bypassing the toolbox in some way.

priority

This specifies a priority (from 0 to 100) that determines the order in which callbacks attached to the same hook point are called. The lower the priority number, the sooner the callback runs (that is, the list of callbacks is sorted by priority). The default priority for a tool is 50, and most built-in tools use that default value.
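
The effect of the priority value can be sketched with a toy hook list. The functions attach and run_hooks are hypothetical stand-ins, not part of CherryPy; they only show that callbacks on one hook point run in ascending priority order.

```python
hooks = []  # (priority, callback) pairs for a single hook point


def attach(callback, priority=50):
    """Register a callback; 50 mirrors the default tool priority."""
    hooks.append((priority, callback))


def run_hooks():
    """Call every callback, lowest priority number first."""
    results = []
    for priority, callback in sorted(hooks, key=lambda pair: pair[0]):
        results.append(callback())
    return results


attach(lambda: 'default')            # priority 50
attach(lambda: 'late', priority=90)
attach(lambda: 'early', priority=10)
order = run_hooks()
```

In this sketch, callbacks with equal priority keep their attachment order, because Python's sort is stable.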

Custom Toolboxes

Explain Custom Toolboxes

Just the Beginning

Hopefully that information is enough to get you up and running and creating some simple but useful CherryPy 3 tools. Much more than what you have seen in this tutorial is possible. Also, remember to take advantage of the fact that CherryPy is open source! Check out the built-in tools and the libraries that they are built upon.

In closing, here is a slightly more complicated tool that acts as a "traffic meter" and triggers a callback if a certain traffic threshold is exceeded within a certain time frame. It should probably launch its own watchdog thread that actually checks the log and triggers the alerts rather than waiting on a request to do so, but I wanted to keep it simple for the purpose of example.

# traffictool.py
import time

import cherrypy


class TrafficAlert(cherrypy.Tool):
    
    def __init__(self, listclass=list):
        """Initialize the TrafficAlert Tool with the given listclass."""

        # A ring buffer subclass of list would probably be a more robust
        # choice than a standard Python list.
        
        self._point = "on_start_resource"
        self._name = None
        self._priority = 50
        # set the args of self.callable as attributes on self
        self._setargs()
        # a log for storing our per-path traffic data
        self._log = {}
        # a history of the last alert for a given path
        self._history = {}
        self.__doc__ = self.callable.__doc__
        self._struct = listclass
        
    def log_hit(self, path):
        """Log the time of a hit to a unique sublog for the path."""
        log = self._log.setdefault(path, self._struct())
        log.append(time.time())

    def last_alert(self, path):
        """Returns the time of the last alert for path."""
        return self._history.get(path, 0)
    
    def check_alert(self, path, window, threshhold, delay, callback=None):
        # set the bar
        now = time.time()
        bar = now - window
        hits = [t for t in self._log[path] if t > bar]
        num_hits = len(hits)
        if num_hits > threshhold:
            if self.last_alert(path) + delay < now:
                self._history[path] = now
                if callback:
                    callback(path, window, threshhold, num_hits)
                else:
                    msg = '%s - %s hits within the last %s seconds.'
                    msg = msg % (path, num_hits, window)
                    cherrypy.log.error(msg, 'TRAFFIC')

    def callable(self, window=60, threshhold=100, delay=30, callback=None):
        """Alert when traffic thresholds are exceeded.

        window: the time frame within which the threshhold applies
        threshhold: the number of hits within the window that will trigger
                    an alert
        delay: the delay between alerts
        callback: a callback that accepts(path, window, threshhold, num_hits)
        """
        
        path = cherrypy.request.path_info
        self.log_hit(path)
        self.check_alert(path, window, threshhold, delay, callback)


cherrypy.tools.traffic_alert = TrafficAlert()

if __name__ == '__main__':
    class Root(object):
        @cherrypy.expose
        def index(self):
            return "Hi!!"

        @cherrypy.expose
        @cherrypy.tools.traffic_alert(threshhold=5)
        def popular(self):
            return "A popular page."

    cherrypy.quickstart(Root())
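
The comment in TrafficAlert.__init__ suggests that a ring buffer subclass of list would be more robust than a plain list, since the per-path log otherwise grows without bound. Below is one possible sketch of such a class. The name RingList and its maxlen parameter are assumptions made for this example; the default matters because the tool creates each sublog by calling the list class with no arguments.

```python
class RingList(list):
    """A list that keeps only its most recent maxlen items (hypothetical)."""

    def __init__(self, maxlen=1000):
        list.__init__(self)
        self.maxlen = maxlen

    def append(self, item):
        list.append(self, item)
        # Drop the oldest entries once the buffer is over capacity.
        overflow = len(self) - self.maxlen
        if overflow > 0:
            del self[:overflow]


recent = RingList(maxlen=3)
for stamp in range(5):
    recent.append(stamp)
```

It could then be passed in as TrafficAlert(listclass=RingList). Keep maxlen larger than the alert threshold, or the number of recorded hits can never exceed it.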
    

Appendix

Here is a quick rundown of the "hook points" that you can hang your tools on:

  • on_start_resource - The earliest hook; the Request-Line and request headers have been processed and a dispatcher has set request.handler and request.config.
  • before_request_body - Tools that are hooked up here run right before the request body would be processed.
  • before_handler - Right before the request.handler (the "exposed" callable that was found by the dispatcher) is called.
  • before_finalize - This hook is called right after the page handler has been processed and before CherryPy formats the final response object. It lets you, for example, inspect what the page handler returned and change some headers if needed.
  • on_end_resource - Processing is complete - the response is ready to be returned. This doesn't always mean that the request.handler (the exposed page handler) has executed! It may be a generator. If your tool absolutely needs to run after the page handler has produced the response body, you need to either use on_end_request instead, or wrap the response.body in a generator which applies your tool as the response body is being generated (what a mouthful--see caching.tee_output for an example).
  • before_error_response - Called right before an error response (status code, body) is set.
  • after_error_response - Called right after the error response (status code, body) is set and just before the error response is finalized.
  • on_end_request - The request/response conversation is over, all data has been written to the client, nothing more to see here, move along.

Older versions

Version   Replace this     With this
2.2       before_handler   before_main
