Nevow Object Traversal
======================

*Object traversal* is the process Nevow uses to determine which object should
render HTML for a particular URL. When an HTTP request arrives at the web
server, the object publisher splits the URL into segments and repeatedly calls
methods which consume path segments and return objects representing that
path, until all segments have been consumed. At its core, the Nevow traversal
API is very simple. However, it provides some higher-level functionality layered
on top of this to satisfy common use cases.

* `Object Traversal Basics`_
* `locateChild in depth`_
* `childFactory method`_
* `child_* methods and attributes`_
* `Dots in child names`_
* `children dictionary`_
* `The default trailing slash handler`_
* `ICurrentSegments and IRemainingSegments`_

Object Traversal Basics
-----------------------

The *root resource* is the top-level object in the URL space; it conceptually
represents the URI "/". The Nevow *object traversal* and *object publishing*
machinery uses only two methods to locate an object suitable for publishing and
to generate the HTML from it; these methods are described in the interface
``nevow.inevow.IResource``::


    class IResource(compy.Interface):
        def locateChild(self, ctx, segments):
            """Locate another object which can be adapted to IResource
            Return a tuple of resource, path segments
            """

        def renderHTTP(self, ctx):
            """Render a request
            """

``renderHTTP`` can be as simple as a method which returns a string of HTML.
Let's examine what happens when object traversal occurs over a very simple root
resource::

    from zope.interface import implements

    from nevow import inevow

    class SimpleRoot(object):
        implements(inevow.IResource)

        def locateChild(self, ctx, segments):
            # Return ourselves with an empty tuple: every segment is consumed.
            return self, ()

        def renderHTTP(self, ctx):
            return "Hello, world!"

This resource, when passed as the root resource to ``appserver.NevowSite`` or
``wsgi.createWSGIApplication``, will immediately return itself, consuming all path
segments. This means that for every URI a user visits on a web server which is
serving this root resource, the text "Hello, world!" will be rendered. Let's
examine the value of ``segments`` for various URIs:

/foo/bar
  ('foo', 'bar')

/
  ('',)

/foo/bar/baz.html
  ('foo', 'bar', 'baz.html')

/foo/bar/directory/
  ('foo', 'bar', 'directory', '')

So we see that Nevow does nothing more than split the URI on the string '/' and
pass these path segments to our application for consumption. Armed with these
two methods alone, we already have enough information to write applications
which service any form of URL imaginable in any way we wish. However, there are
some common URL handling patterns which Nevow provides higher-level support for.

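The splitting shown above can be approximated in a few lines of plain Python.
This is only a sketch: ``split_segments`` is a hypothetical helper written for
illustration, not a Nevow function, and it ignores details such as query strings
and URL quoting.

```python
def split_segments(path):
    """Split a URL path into the tuple of segments used for traversal.

    Hypothetical helper for illustration; not part of Nevow's API.
    """
    # Drop the empty string produced by the leading '/', but keep everything
    # else, including the trailing '' that marks a directory-like URI.
    return tuple(path.split('/')[1:])


for uri in ('/foo/bar', '/', '/foo/bar/baz.html', '/foo/bar/directory/'):
    print(uri, split_segments(uri))
```

Note that the trailing empty string is preserved, which is what lets a resource
tell ``/foo/bar/directory/`` apart from ``/foo/bar/directory``.
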
``locateChild`` in depth
------------------------

One common URL handling pattern involves parents which only know about their
direct children. For example, a ``Directory`` object may only know about the
contents of a single directory; if it contains other directories, it does
not know about their contents. Let's examine a simple ``Directory`` object
which can provide directory listings and serves up objects for child directories
and files::

    import os

    from zope.interface import implements

    from nevow import inevow, static

    class Directory(object):
        implements(inevow.IResource)

        def __init__(self, directory):
            self.directory = directory

        def renderHTTP(self, ctx):
            # Render a plain HTML list linking to each entry in the directory.
            html = ['<ul>']
            for child in os.listdir(self.directory):
                fullpath = os.path.join(self.directory, child)
                if os.path.isdir(fullpath):
                    child += '/'
                html.extend(['<li><a href="', child, '">', child, '</a></li>'])
            html.append('</ul>')
            return ''.join(html)

        def locateChild(self, ctx, segments):
            # Consume only the first segment; the rest is handed back.
            name = segments[0]
            fullpath = os.path.join(self.directory, name)
            if not os.path.exists(fullpath):
                return None, () # 404

            if os.path.isdir(fullpath):
                return Directory(fullpath), segments[1:]
            if os.path.isfile(fullpath):
                return static.File(fullpath), segments[1:]

Because this implementation of ``locateChild`` only consumed one segment and
returned the rest of them (``segments[1:]``), the object traversal process will
continue by calling ``locateChild`` on the returned resource and passing the
partially-consumed segments. In this way, a directory structure of any depth can
be traversed, and directory listings or file contents can be rendered for any
existing directories and files.

So, let us examine what happens when the URI "/foo/bar/baz.html" is traversed,
where "foo" and "bar" are directories, and "baz.html" is a file.

Directory('/').locateChild(ctx, ('foo', 'bar', 'baz.html'))
  Returns Directory('/foo'), ('bar', 'baz.html')

Directory('/foo').locateChild(ctx, ('bar', 'baz.html'))
  Returns Directory('/foo/bar'), ('baz.html',)

Directory('/foo/bar').locateChild(ctx, ('baz.html',))
  Returns File('/foo/bar/baz.html'), ()

No more segments to be consumed; ``File('/foo/bar/baz.html').renderHTTP(ctx)`` is
called, and the result is sent to the browser.

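The repeated ``locateChild`` calls in the walk-through above can be pictured as
a small loop. The following is a simplified sketch in plain Python rather than
Nevow's actual publisher: ``DictResource`` and ``traverse`` are hypothetical
names, the resources are backed by nested dictionaries instead of the file
system, and a missing child is reported as a plain string instead of a real
404 response.

```python
class DictResource(object):
    """Toy resource with named children; a stand-in for ``Directory``."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or {}

    def locateChild(self, ctx, segments):
        # Consume exactly one segment and hand back the rest.
        child = self.children.get(segments[0])
        if child is None:
            return None, ()  # nothing found: the caller treats this as a 404
        return child, segments[1:]

    def renderHTTP(self, ctx):
        return 'rendered %s' % self.name


def traverse(root, segments, ctx=None):
    """Call locateChild repeatedly until no segments remain, then render."""
    resource = root
    while segments:
        resource, segments = resource.locateChild(ctx, segments)
        if resource is None:
            return '404 Not Found'
    return resource.renderHTTP(ctx)


baz = DictResource('/foo/bar/baz.html')
bar = DictResource('/foo/bar', {'baz.html': baz})
foo = DictResource('/foo', {'bar': bar})
root = DictResource('/', {'foo': foo})

print(traverse(root, ('foo', 'bar', 'baz.html')))
print(traverse(root, ('foo', 'missing')))
```

Each pass through the loop corresponds to one step of the walk-through: one
segment is consumed, and the remaining tuple is passed to the next resource.
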
``childFactory`` method
-----------------------

Consuming one URI segment at a time by checking to see if a requested resource
exists and returning a new object is a very common pattern. Nevow's default
implementation of ``IResource``, ``nevow.rend.Page``, contains an implementation of
``locateChild`` which provides more convenient hooks for implementing object
traversal. One of these hooks is ``childFactory``. Let us imagine for the sake of
example that we wished to render a tree of dictionaries. Our data structure
might look something like this::

    tree = dict(
        one=dict(
            foo=None,
            bar=None),
        two=dict(
            baz=dict(
                quux=None)))

Given this data structure, the valid URIs would be:

* /
* /one
* /one/foo
* /one/bar
* /two
* /two/baz
* /two/baz/quux

Let us construct a ``rend.Page`` subclass which uses the default ``locateChild``
implementation and overrides the ``childFactory`` hook instead::

    from nevow import rend

    class DictTree(rend.Page):
        def __init__(self, dataDict):
            self.dataDict = dataDict

        def renderHTTP(self, ctx):
            if self.dataDict is None:
                return "Leaf"
            html = ['<ul>']
            for key in self.dataDict.keys():
                html.extend(['<li><a href="', key, '">', key, '</a></li>'])
            html.append('</ul>')
            return ''.join(html)

        def childFactory(self, ctx, name):
            # A leaf (None) has no children, so any further segment is a 404.
            if self.dataDict is None or name not in self.dataDict:
                return rend.NotFound # 404
            return DictTree(self.dataDict[name])

As you can see, the ``childFactory`` implementation is considerably shorter than the
equivalent ``locateChild`` implementation would have been.

``child_*`` methods and attributes
----------------------------------

Often we may wish to have some hardcoded URLs which are not dynamically
generated based on some data structure. For example, we might have an
application which uses an external CSS stylesheet, an external JavaScript file,
and a folder full of images. The ``rend.Page`` ``locateChild`` implementation provides a
convenient way for us to express these relationships by using ``child_``-prefixed
methods::

    from nevow import rend, static

    class Linker(rend.Page):
        def renderHTTP(self, ctx):
            return """<html>
      <head>
        <link href="css" rel="stylesheet" />
        <script type="text/javascript" src="scripts"></script>
      </head>
      <body>
        <img src="images/logo.png" />
      </body>
    </html>"""

        def child_css(self, ctx):
            return static.File('/Users/dp/styles.css')

        def child_scripts(self, ctx):
            return static.File('/Users/dp/scripts.js')

        def child_images(self, ctx):
            return static.File('/Users/dp/images/')

One thing you may have noticed is that all of the examples so far have returned
new object instances whenever they were implementing a traversal API. However,
there is no reason these instances cannot be shared. One could, for example,
return a global resource instance, an instance which was previously inserted in
a dict, or lazily create and cache dynamic resource instances on the fly. The
``rend.Page`` ``locateChild`` implementation also provides a convenient way to express
that one global resource instance should always be used for a particular URL:
the ``child_``-prefixed attribute::

    class FasterLinker(Linker):
        child_css = static.File('/Users/dp/styles.css')
        child_scripts = static.File('/Users/dp/scripts.js')
        child_images = static.File('/Users/dp/images/')

Dots in child names
-------------------

When a URL contains dots, which is quite common in normal URLs, it is simple
enough to handle these URL segments in ``locateChild`` or ``childFactory``: one of the
passed segments will simply be a string containing a dot. However, it is not
immediately obvious how one would express a URL segment with a dot in it when
using ``child_``-prefixed methods, because a dot cannot appear in a Python
identifier. The solution is to install the attribute with ``setattr``::

    class DotChildren(rend.Page):
        def renderHTTP(self, ctx):
            return '<html><head><script type="text/javascript" src="scripts.js" /></head></html>'

    setattr(DotChildren, 'child_scripts.js', static.File('/Users/dp/scripts.js'))

The same technique could be used to install a child method with a dot in the
name.

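The reason ``setattr`` works here is ordinary Python behaviour and can be checked
without Nevow. In the sketch below, ``Page`` is a plain stand-in class and
``lookup_child`` is a hypothetical helper that mimics looking up an attribute
named ``child_`` plus the segment; neither is part of Nevow.

```python
class Page(object):
    """Plain stand-in class; a real application would subclass rend.Page."""


# 'child_scripts.js' is not a valid Python identifier, so it cannot be written
# in a class body, but setattr accepts any string as an attribute name.
setattr(Page, 'child_scripts.js', 'the scripts.js resource')


def lookup_child(page, segment):
    """Hypothetical lookup: find the attribute named 'child_' + segment."""
    return getattr(page, 'child_' + segment, None)


page = Page()
print(lookup_child(page, 'scripts.js'))
print(lookup_child(page, 'missing.js'))
```

The first lookup finds the value installed with ``setattr``; the second returns
``None`` because no such attribute exists.
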
children dictionary
-------------------

The final hook supported by the default implementation of ``locateChild`` is the
``rend.Page.children`` dictionary::

    # People, Jobs and Events stand for rend.Page subclasses defined elsewhere.
    class Main(rend.Page):
        children = {
            'people': People(),
            'jobs': Jobs(),
            'events': Events()}

        def renderHTTP(self, ctx):
            return """\
    <html>
      <head>
        <title>Our Site</title>
      </head>
      <body>
        <p>bla bla bla</p>
      </body>
    </html>"""


Hooks are checked in the following order:

1. ``self.children``
2. ``self.child_*``
3. ``self.childFactory``

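The lookup order listed above can be sketched as a single ``locateChild`` method.
This is an illustrative approximation written in plain Python, not a copy of
Nevow's implementation: ``SketchPage`` and ``Demo`` are hypothetical classes,
strings stand in for real resources, and details of the real ``rend.Page``
(such as adapting results to ``IResource``) are left out.

```python
class SketchPage(object):
    """Illustrative stand-in for rend.Page; not Nevow's real implementation."""

    children = None  # optional dict mapping segment name -> resource

    def childFactory(self, ctx, name):
        return None  # subclasses override this to build children on demand

    def locateChild(self, ctx, segments):
        name, rest = segments[0], segments[1:]
        # 1. The children dictionary.
        if self.children is not None and name in self.children:
            return self.children[name], rest
        # 2. A child_-prefixed attribute or method.
        hook = getattr(self, 'child_' + name, None)
        if hook is not None:
            return (hook(ctx) if callable(hook) else hook), rest
        # 3. The childFactory hook.
        child = self.childFactory(ctx, name)
        if child is not None:
            return child, rest
        return None, ()  # nothing matched: 404


class Demo(SketchPage):
    children = {'people': 'people resource'}
    child_css = 'css resource'

    def child_people(self, ctx):
        return 'never reached; the dictionary wins'

    def childFactory(self, ctx, name):
        return 'factory made %s' % name


demo = Demo()
print(demo.locateChild(None, ('people',)))
print(demo.locateChild(None, ('css', 'x')))
print(demo.locateChild(None, ('other',)))
```

In ``Demo``, the ``people`` segment is answered by the dictionary even though a
``child_people`` method exists, and ``childFactory`` is only consulted for names
that neither of the other hooks recognises.
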
The default trailing slash handler
----------------------------------

When a URI which is being handled ends in a slash, such as when the '/' URI is
being rendered or when a directory-like URI is being rendered, the string ''
appears in the path segments which will be traversed. Again, handling this case
is trivial inside either ``locateChild`` or ``childFactory``, but it may not be
immediately obvious what ``child_``-prefixed method or attribute will be looked up.
Because the segment is the empty string, the name used is simply ``child_``:
the prefix with nothing after the trailing underscore.

The ``rend.Page`` class provides an implementation of this method which can work in
two different ways. If the attribute ``addSlash`` is True, the default trailing
slash handler will return ``self``. In the case when ``addSlash`` is True, the default
``rend.Page.renderHTTP`` implementation will simply perform a redirect which adds
the missing slash to the URL.

The default trailing slash handler also returns ``self`` if ``addSlash`` is False, but
emits a warning as it does so. This warning may become an exception at some
point in the future.

``ICurrentSegments`` and ``IRemainingSegments``
-----------------------------------------------

During the object traversal process, it may be useful to discover which segments
have already been handled and which segments remain to be handled. This
information may be obtained from the ``context`` object which is passed to all the
traversal APIs. The interfaces ``nevow.inevow.ICurrentSegments`` and
``nevow.inevow.IRemainingSegments`` are used to retrieve this information. To
retrieve a tuple of segments which have previously been consumed during object
traversal, use this syntax::

    from nevow.inevow import ICurrentSegments

    segs = ICurrentSegments(ctx)

The same is true of ``IRemainingSegments``. ``IRemainingSegments`` is the same value
which is passed as ``segments`` to ``locateChild``, but may also be useful in the
implementations of ``childFactory`` or a ``child_``-prefixed method, where this
information would not otherwise be available.

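The relationship between the two tuples can be illustrated without Nevow. The
sketch below walks a tree of nested dictionaries and reports, after each step,
which segments have been consumed and which remain. ``traverse_with_bookkeeping``
is a hypothetical function, and the exact moment at which Nevow updates the
values behind ``ICurrentSegments`` and ``IRemainingSegments`` is not reproduced
here.

```python
def traverse_with_bookkeeping(root, segments):
    """Walk nested dicts, yielding (current, remaining) after each step."""
    all_segments = tuple(segments)
    node, remaining = root, all_segments
    while remaining:
        node = node[remaining[0]]
        remaining = remaining[1:]
        # Everything that is not still remaining has already been consumed.
        current = all_segments[:len(all_segments) - len(remaining)]
        yield current, remaining


tree = {'two': {'baz': {'quux': None}}}
for current, remaining in traverse_with_bookkeeping(tree, ('two', 'baz', 'quux')):
    print(current, remaining)
```

At every step the consumed and remaining tuples, joined together, give back the
full list of segments for the request.
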
Conclusion
==========

Nevow makes it easy to handle complex URL hierarchies. The most basic object
traversal interface, ``nevow.inevow.IResource.locateChild``, provides powerful and
flexible control over the entire object traversal process. Nevow's canonical
``IResource`` implementation, ``rend.Page``, also includes the convenience hooks
``childFactory`` along with ``child_``-prefixed method and attribute semantics to
simplify common use cases.