Nevow Object Traversal
======================

*Object traversal* is the process Nevow uses to determine which object should
render HTML for a particular URL. When an HTTP request reaches the web server,
the object publisher splits the URL into segments and repeatedly calls methods
which consume path segments and return objects representing the consumed path,
until all segments have been consumed. At its core, the Nevow traversal API is
very simple. However, it layers some higher-level functionality on top of that
core to satisfy common use cases.

* `Object Traversal Basics`_
* `locateChild in depth`_
* `childFactory method`_
* `child_* methods and attributes`_
* `Dots in child names`_
* `children dictionary`_
* `The default trailing slash handler`_
* `ICurrentSegments and IRemainingSegments`_

Object Traversal Basics
-----------------------

The *root resource* is the top-level object in the URL space; it conceptually
represents the URI "/". The Nevow *object traversal* and *object publishing*
machinery uses only two methods to locate an object suitable for publishing and
to generate the HTML from it; these methods are described in the interface
``nevow.inevow.IResource``::

    class IResource(compy.Interface):
        def locateChild(self, ctx, segments):
            """Locate another object which can be adapted to IResource
            Return a tuple of resource, path segments
            """

        def renderHTTP(self, ctx):
            """Render a request
            """

``renderHTTP`` can be as simple as a method which returns a string of HTML.
Let's examine what happens when object traversal occurs over a very simple root
resource::

    from zope.interface import implements

    from nevow import inevow

    class SimpleRoot(object):
        implements(inevow.IResource)

        def locateChild(self, ctx, segments):
            # Return ourselves and an empty tuple: every segment is consumed.
            return self, ()

        def renderHTTP(self, ctx):
            return "Hello, world!"

This resource, when passed as the root resource to ``appserver.NevowSite`` or
``wsgi.createWSGIApplication``, will immediately return itself, consuming all path
segments. This means that for every URI a user visits on a web server which is
serving this root resource, the text "Hello, world!" will be rendered. Let's
examine the value of ``segments`` for various URIs:

``/foo/bar``
  ``('foo', 'bar')``

``/``
  ``('',)``

``/foo/bar/baz.html``
  ``('foo', 'bar', 'baz.html')``

``/foo/bar/directory/``
  ``('foo', 'bar', 'directory', '')``

So we see that Nevow does nothing more than split the URI on the string '/' and
pass these path segments to our application for consumption. Armed with these
two methods alone, we already have enough information to write applications
which service any form of URL imaginable in any way we wish. However, Nevow
provides higher-level support for some common URL handling patterns.

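As a rough illustration of the segment values listed above, the split can be
sketched in plain Python. The helper name ``split_segments`` is hypothetical and
is not part of Nevow; real requests also involve URL decoding and query
strings, which this sketch ignores.

```python
def split_segments(path):
    """Split an absolute URI path into a tuple of path segments.

    Hypothetical helper for illustration only; not Nevow's own code.
    """
    if not path.startswith('/'):
        raise ValueError("expected an absolute path, got %r" % (path,))
    # Drop the leading '/' and split on the remaining separators. A trailing
    # slash leaves an empty string as the final segment.
    return tuple(path[1:].split('/'))


for uri in ('/foo/bar', '/', '/foo/bar/baz.html', '/foo/bar/directory/'):
    print(uri, split_segments(uri))
```

The empty string at the end of the last tuple is what later sections call the
trailing slash segment.
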
``locateChild`` in depth
------------------------

One common URL handling pattern involves parents which only know about their
direct children. For example, a ``Directory`` object may only know about the
contents of a single directory; if it contains other directories, it does
not know about their contents. Let's examine a simple ``Directory`` object
which can provide directory listings and serves up objects for child directories
and files::

    import os

    from zope.interface import implements

    from nevow import inevow, static

    class Directory(object):
        implements(inevow.IResource)

        def __init__(self, directory):
            self.directory = directory

        def renderHTTP(self, ctx):
            # Render an HTML list linking to every entry in this directory.
            html = ['<ul>']
            for child in os.listdir(self.directory):
                fullpath = os.path.join(self.directory, child)
                if os.path.isdir(fullpath):
                    # Link to directories with a trailing slash.
                    child += '/'
                html.extend(['<li><a href="', child, '">', child, '</a></li>'])
            html.append('</ul>')
            return ''.join(html)

        def locateChild(self, ctx, segments):
            # Consume only the first segment; pass the rest back to Nevow.
            name = segments[0]
            fullpath = os.path.join(self.directory, name)
            if os.path.isdir(fullpath):
                return Directory(fullpath), segments[1:]
            if os.path.isfile(fullpath):
                return static.File(fullpath), segments[1:]
            # Missing, or neither a directory nor a regular file.
            return None, ()  # 404

Because this implementation of ``locateChild`` consumes only one segment and
returns the rest (``segments[1:]``), the object traversal process continues by
calling ``locateChild`` on the returned resource and passing the partially
consumed segments. In this way, a directory structure of any depth can
be traversed, and directory listings or file contents can be rendered for any
existing directories and files.

So, let us examine what happens when the URI "/foo/bar/baz.html" is traversed,
where "foo" and "bar" are directories, and "baz.html" is a file.

``Directory('/').locateChild(ctx, ('foo', 'bar', 'baz.html'))``
  Returns ``Directory('/foo'), ('bar', 'baz.html')``

``Directory('/foo').locateChild(ctx, ('bar', 'baz.html'))``
  Returns ``Directory('/foo/bar'), ('baz.html',)``

``Directory('/foo/bar').locateChild(ctx, ('baz.html',))``
  Returns ``File('/foo/bar/baz.html'), ()``

No more segments remain to be consumed; ``File('/foo/bar/baz.html').renderHTTP(ctx)`` is
called, and the result is sent to the browser.

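The walk-through above can be sketched as a small loop in plain Python. This is
only an approximation of the idea, not Nevow's publisher: the ``traverse``
function and the ``Node`` class are hypothetical stand-ins, the context is a
placeholder, and a missing child is signalled by returning ``None`` as the
resource, mirroring the ``None, ()`` return used for a 404 above.

```python
class Node(object):
    """Hypothetical in-memory stand-in for the Directory resource."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or {}

    def locateChild(self, ctx, segments):
        # Consume exactly one segment and hand back the remainder.
        child = self.children.get(segments[0])
        if child is None:
            return None, ()  # treated as a 404 by traverse()
        return child, segments[1:]

    def renderHTTP(self, ctx):
        return "rendered %s" % (self.name,)


def traverse(root, segments, ctx=None):
    """Call locateChild repeatedly until no segments remain, then render."""
    resource = root
    while segments:
        resource, segments = resource.locateChild(ctx, segments)
        if resource is None:
            return "404 Not Found"
    return resource.renderHTTP(ctx)


baz = Node('baz.html')
root = Node('/', {'foo': Node('foo', {'bar': Node('bar', {'baz.html': baz})})})
print(traverse(root, ('foo', 'bar', 'baz.html')))
print(traverse(root, ('foo', 'missing')))
```

Each resource only ever inspects ``segments[0]``; the loop, not the resource,
is responsible for continuing down the tree.
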
``childFactory`` method
-----------------------

Consuming one URI segment at a time by checking whether a requested resource
exists and returning a new object is a very common pattern. Nevow's default
implementation of ``IResource``, ``nevow.rend.Page``, contains an implementation of
``locateChild`` which provides more convenient hooks for implementing object
traversal. One of these hooks is ``childFactory``. Let us imagine, for the sake of
example, that we wish to render a tree of dictionaries. Our data structure
might look something like this::

    tree = dict(
        one=dict(
            foo=None,
            bar=None),
        two=dict(
            baz=dict(
                quux=None)))

Given this data structure, the valid URIs would be:

* /
* /one
* /one/foo
* /one/bar
* /two
* /two/baz
* /two/baz/quux

Let us construct a ``rend.Page`` subclass which uses the default ``locateChild``
implementation and overrides the ``childFactory`` hook instead::

    from nevow import rend

    class DictTree(rend.Page):
        def __init__(self, dataDict):
            rend.Page.__init__(self)
            self.dataDict = dataDict

        def renderHTTP(self, ctx):
            if self.dataDict is None:
                return "Leaf"
            html = ['<ul>']
            for key in self.dataDict.keys():
                html.extend(['<li><a href="', key, '">', key, '</a></li>'])
            html.append('</ul>')
            return ''.join(html)

        def childFactory(self, ctx, name):
            # A leaf (None) has no children; guard before the membership test.
            if self.dataDict is None or name not in self.dataDict:
                return rend.NotFound # 404
            return DictTree(self.dataDict[name])

As you can see, the ``childFactory`` implementation is considerably shorter than the
equivalent ``locateChild`` implementation would have been.

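For comparison, here is a rough sketch of what a hand-written ``locateChild``
for the same tree might look like, using a plain class instead of ``rend.Page``
so that it runs without Nevow. The class name ``DictTreeResource`` and the
``NOT_FOUND`` constant are hypothetical; ``NOT_FOUND`` merely mirrors the
``None, ()`` return used for a 404 earlier.

```python
NOT_FOUND = (None, ())  # hypothetical stand-in for a 404 result


class DictTreeResource(object):
    """Plain-Python sketch of a hand-written locateChild for the tree."""

    def __init__(self, dataDict):
        self.dataDict = dataDict

    def locateChild(self, ctx, segments):
        name = segments[0]
        # Leaves (None) have no children, so any further segment is a 404.
        if self.dataDict is None or name not in self.dataDict:
            return NOT_FOUND
        # Consume one segment and return the remainder ourselves; this
        # bookkeeping is what the childFactory hook takes care of.
        return DictTreeResource(self.dataDict[name]), segments[1:]


tree = dict(one=dict(foo=None, bar=None), two=dict(baz=dict(quux=None)))
child, remaining = DictTreeResource(tree).locateChild(None, ('two', 'baz', 'quux'))
print(sorted(child.dataDict), remaining)
```

The difference from ``childFactory`` is small here, but the segment slicing has
to be repeated in every resource that implements ``locateChild`` directly.
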
``child_*`` methods and attributes
----------------------------------

Often we may wish to have some hardcoded URLs which are not dynamically
generated based on some data structure. For example, we might have an
application which uses an external CSS stylesheet, an external JavaScript file,
and a folder full of images. The ``rend.Page`` ``locateChild`` implementation provides a
convenient way for us to express these relationships by using ``child``-prefixed
methods::

    from nevow import rend, static

    class Linker(rend.Page):
        def renderHTTP(self, ctx):
            return """<html>
      <head>
        <link href="css" rel="stylesheet" />
        <script type="text/javascript" src="scripts"></script>
      </head>
      <body>
        <img src="images/logo.png" />
      </body>
    </html>"""

        def child_css(self, ctx):
            return static.File('/Users/dp/styles.css')

        def child_scripts(self, ctx):
            return static.File('/Users/dp/scripts.js')

        def child_images(self, ctx):
            return static.File('/Users/dp/images/')

One thing you may have noticed is that all of the examples so far have returned
new object instances whenever they were implementing a traversal API. However,
there is no reason these instances cannot be shared. One could, for example,
return a global resource instance, an instance which was previously inserted in
a dict, or lazily create and cache dynamic resource instances on the fly. The
``rend.Page`` ``locateChild`` implementation also provides a convenient way to express
that one global resource instance should always be used for a particular URL:
the ``child``-prefixed attribute::

    class FasterLinker(Linker):
        child_css = static.File('/Users/dp/styles.css')
        child_scripts = static.File('/Users/dp/scripts.js')
        child_images = static.File('/Users/dp/images/')

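The lazy create-and-cache option mentioned above might look roughly like the
following. To keep the sketch runnable without Nevow it uses a plain class;
``CachingParent``, ``Child`` and the ``_cache`` attribute are hypothetical names,
and in a real application the ``childFactory`` method would live on a
``rend.Page`` subclass.

```python
class Child(object):
    """Hypothetical resource standing in for something costly to build."""

    def __init__(self, name):
        self.name = name


class CachingParent(object):
    """Creates each child on first request and reuses it afterwards."""

    def __init__(self):
        self._cache = {}

    def childFactory(self, ctx, name):
        # Build the child only if this name has not been requested before.
        if name not in self._cache:
            self._cache[name] = Child(name)
        return self._cache[name]


parent = CachingParent()
first = parent.childFactory(None, 'reports')
second = parent.childFactory(None, 'reports')
print(first is second)
```

A cache keyed on arbitrary URL segments grows with every distinct name
requested, so a real implementation would probably validate names before
storing anything.
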
Dots in child names
-------------------

When a URL contains dots, which is quite common in normal URLs, it is simple
enough to handle these URL segments in ``locateChild`` or ``childFactory``: one of the
passed segments will simply be a string containing a dot. However, it is not
immediately obvious how one would express a URL segment with a dot in it when
using ``child``-prefixed methods, since a Python identifier cannot contain a dot.
The solution is to set the attribute with ``setattr``::

    class DotChildren(rend.Page):
        def renderHTTP(self, ctx):
            return '<html><head><script type="text/javascript" src="scripts.js"></script></head></html>'

    setattr(DotChildren, 'child_scripts.js', static.File('/Users/dp/scripts.js'))

The same technique could be used to install a child method with a dot in the
name.

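The reason this works is that ``setattr`` and ``getattr`` accept any string as an
attribute name, even one that is not a valid Python identifier. A minimal
plain-Python sketch, using a hypothetical ``Page`` class and strings in place of
real resources, covers both the attribute and the method case:

```python
class Page(object):
    """Hypothetical stand-in for a rend.Page subclass."""


def scripts_min_js(self, ctx):
    # A child *method*, to be installed under a dotted name below.
    return 'resource for scripts.min.js'


# Install a child attribute and a child method whose names contain dots.
setattr(Page, 'child_scripts.js', 'resource for scripts.js')
setattr(Page, 'child_scripts.min.js', scripts_min_js)

page = Page()
# Look the children up the way a 'child_' + segment lookup would.
print(getattr(page, 'child_' + 'scripts.js'))
print(getattr(page, 'child_' + 'scripts.min.js')(None))
```
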
children dictionary
-------------------

The final hook supported by the default implementation of ``locateChild`` is the
``rend.Page.children`` dictionary::

    # People, Jobs and Events are other resources defined elsewhere.
    class Main(rend.Page):
        children = {
            'people': People(),
            'jobs': Jobs(),
            'events': Events()}

        def renderHTTP(self, ctx):
            return """\
    <html>
      <head>
        <title>Our Site</title>
      </head>
      <body>
        <p>bla bla bla</p>
      </body>
    </html>"""

Hooks are checked in the following order:

1. ``self.children``
2. ``self.child_*``
3. ``self.childFactory``

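The lookup order can be pictured with a small plain-Python sketch. This is an
approximation written for illustration, not the ``rend.Page`` source: the class
``HookedPage`` is hypothetical, resources are represented by strings, and
details such as adapting results to ``IResource`` are left out.

```python
class HookedPage(object):
    """Hypothetical sketch of the three hooks and the order they are tried."""

    children = None  # optional dict mapping segment -> resource

    def childFactory(self, ctx, name):
        return None  # default: no dynamically generated children

    def locateChild(self, ctx, segments):
        name, rest = segments[0], segments[1:]
        # 1. The children dictionary.
        if self.children is not None and name in self.children:
            return self.children[name], rest
        # 2. A child_-prefixed attribute or method.
        hook = getattr(self, 'child_' + name, None)
        if hook is not None:
            return (hook(ctx) if callable(hook) else hook), rest
        # 3. The childFactory hook.
        found = self.childFactory(ctx, name)
        if found is not None:
            return found, rest
        return None, ()  # nothing matched: 404


class Demo(HookedPage):
    children = {'a': 'from children'}
    child_a = 'from child_a (shadowed by children)'
    child_b = 'from child_b'

    def child_c(self, ctx):
        return 'from child_c method'

    def childFactory(self, ctx, name):
        return 'from childFactory' if name == 'd' else None


for name in ('a', 'b', 'c', 'd', 'e'):
    print(name, Demo().locateChild(None, (name, 'tail')))
```

Because the dictionary is consulted first, an entry in ``children`` shadows a
``child_``-prefixed attribute of the same name in this sketch.
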
The default trailing slash handler
----------------------------------

When a URI which is being handled ends in a slash, such as when the '/' URI is
being rendered or when a directory-like URI is being rendered, the string ''
appears in the path segments which will be traversed. Again, handling this case
is trivial inside either ``locateChild`` or ``childFactory``, but it may not be
immediately obvious what ``child``-prefixed method or attribute will be looked up.
The method or attribute name which will be used is simply ``child`` with a single
trailing underscore, that is, ``child_``.

The ``rend.Page`` class provides an implementation of this method which can work in
two different ways. If the attribute ``addSlash`` is ``True``, the default trailing
slash handler will return ``self``. Also when ``addSlash`` is ``True``, the default
``rend.Page.renderHTTP`` implementation will simply perform a redirect which adds
the missing slash to the URL.

The default trailing slash handler also returns ``self`` if ``addSlash`` is ``False``, but
emits a warning as it does so. This warning may become an exception at some
point in the future.

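To make the naming rule concrete: the empty final segment produced by a
trailing slash, combined with the ``child_`` prefix, yields the attribute name
``child_``. The sketch below is plain Python with a hypothetical ``Listing``
class; it only illustrates the name lookup and does not reproduce the
``addSlash`` redirect or the warning described above.

```python
class Listing(object):
    """Hypothetical directory-like page with a trailing slash handler."""

    def child_(self, ctx):
        # Looked up for the empty segment; the page serves itself.
        return self


# A directory-like URI ends in '/', so its last segment is ''.
segments = tuple('/listing/'[1:].split('/'))
print(segments)

page = Listing()
last = segments[-1]
handler = getattr(page, 'child_' + last)
print(handler(None) is page)
```
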
``ICurrentSegments`` and ``IRemainingSegments``
-----------------------------------------------

During the object traversal process, it may be useful to discover which segments
have already been handled and which segments remain to be handled. This
information may be obtained from the ``context`` object which is passed to all the
traversal APIs. The interfaces ``nevow.inevow.ICurrentSegments`` and
``nevow.inevow.IRemainingSegments`` are used to retrieve this information. To
retrieve a tuple of segments which have previously been consumed during object
traversal, use this syntax::

    from nevow.inevow import ICurrentSegments

    segs = ICurrentSegments(ctx)

The same syntax works for ``IRemainingSegments``. ``IRemainingSegments(ctx)`` yields
the same value which is passed as ``segments`` to ``locateChild``, but may also be
useful in the implementations of ``childFactory`` or a ``child``-prefixed method,
where this information would not otherwise be available.

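The relationship between the two tuples can be sketched in plain Python. The
``walk`` generator below is hypothetical and stands in for whatever bookkeeping
Nevow performs internally; it assumes one segment is consumed per step and
simply shows that, at every step, the consumed and remaining segments together
make up the original path.

```python
def walk(segments):
    """Yield (current, remaining) pairs as one segment is consumed per step."""
    for consumed in range(len(segments) + 1):
        yield segments[:consumed], segments[consumed:]


path = ('foo', 'bar', 'baz.html')
for current, remaining in walk(path):
    print(current, remaining)
```
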
Conclusion
==========

Nevow makes it easy to handle complex URL hierarchies. The most basic object
traversal interface, ``nevow.inevow.IResource.locateChild``, provides powerful and
flexible control over the entire object traversal process. Nevow's canonical
``IResource`` implementation, ``rend.Page``, also includes the convenience hooks
``childFactory`` and the ``children`` dictionary, along with ``child``-prefixed
method and attribute semantics, to simplify common use cases.