E-factory

Latest recommended article published on 2025-11-25 11:44:22

License: CC 4.0 BY-SA

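This article shows how to use the builder module of Python's lxml library to build XML and HTML documents easily. The examples demonstrate element creation through attribute access and how to handle multiple namespaces with ElementMaker.

The E-factory provides a simple and compact syntax for generating XML and HTML.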

# coding:utf-8
from lxml import etree  # needed for etree.XML and etree.tostring below
from lxml.builder import E

def CLASS(*args): # class is a reserved word in Python
    return {"class":' '.join(args)}

html = page = (
    E.html(       # create an Element called "html"
        E.head(
            E.title("This is a sample document")
        ),
        E.body(
            E.h1("Hello!", CLASS("title")),
            E.p("This is a paragraph with ", E.b("bold"), " text in it!"),
            E.p("This is another paragraph, with a", "\n      ",
                E.a("link", href="http://www.python.org"), "."),
            E.p("Here are some reserved characters: <spam&egg>."),
            etree.XML("<p>And finally an embedded XHTML fragment.</p>"),
        )
    )
)

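# encoding="unicode" returns a str, so print() shows the markup rather than a bytes repr
print(etree.tostring(page, pretty_print=True, encoding="unicode"))

'''
Output: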
<html>
  <head>
    <title>This is a sample document</title>
  </head>
  <body>
    <h1 class="title">Hello!</h1>
    <p>This is a paragraph with <b>bold</b> text in it!</p>
    <p>This is another paragraph, with a
      <a href="http://www.python.org">link</a>.</p>
    <p>Here are some reserved characters: &lt;spam&amp;egg&gt;.</p>
    <p>And finally an embedded XHTML fragment.</p>
  </body>
</html>
'''

Element creation based on attribute access makes it easy to build a simple vocabulary for an XML language.

from lxml import etree  # needed for etree.tostring below
from lxml.builder import ElementMaker  # lxml only!

E = ElementMaker(namespace="http://my.de/fault/namespace",
                 nsmap={'p' : "http://my.de/fault/namespace"})

DOC = E.doc
TITLE = E.title
SECTION = E.section
PAR = E.par

my_doc = DOC(
    TITLE("The dog and the hog"),
    SECTION(
        TITLE("The dog"),
        PAR("Once upon a time, "),
        PAR("And then ")
    ),
    SECTION(
        TITLE("The hog"),
        PAR("Sooner or later ")
    )
)

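# encoding="unicode" returns a str, so print() shows the markup rather than a bytes repr
print(etree.tostring(my_doc, pretty_print=True, encoding="unicode"))

'''
Output: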
<p:doc xmlns:p="http://my.de/fault/namespace">
  <p:title>The dog and the hog</p:title>
  <p:section>
    <p:title>The dog</p:title>
    <p:par>Once upon a time, </p:par>
    <p:par>And then </p:par>
  </p:section>
  <p:section>
    <p:title>The hog</p:title>
    <p:par>Sooner or later </p:par>
  </p:section>
</p:doc>
'''

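One such example is the module lxml.html.builder, which provides a vocabulary for HTML.
When dealing with multiple namespaces, it is good practice to define one ElementMaker for each namespace URI.
Again, note how the example above predefines the tag builders in named constants.
That makes it easy to put all tag declarations of a namespace into one Python module and to import and use the tag name constants from there.
This avoids pitfalls such as typos or accidentally missing namespaces.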
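
The advice about multiple namespaces can be sketched as follows. This is only a minimal illustration, not part of the original examples: the two namespace URIs, the prefixes "d" and "m", and the tag names are hypothetical and chosen for demonstration. Each namespace gets its own ElementMaker, and the tag builders are predefined as named constants.

```python
from lxml import etree
from lxml.builder import ElementMaker

# Hypothetical namespace URIs and prefixes, chosen only for illustration.
DOC_NS = "http://example.org/ns/doc"
META_NS = "http://example.org/ns/meta"
NSMAP = {"d": DOC_NS, "m": META_NS}

# One ElementMaker per namespace URI; both share the same prefix map.
D = ElementMaker(namespace=DOC_NS, nsmap=NSMAP)
M = ElementMaker(namespace=META_NS, nsmap=NSMAP)

# Tag builders predefined as named constants (hypothetical tag names).
DOC = D.doc
PAR = D.par
AUTHOR = M.author

my_doc = DOC(
    AUTHOR("Jane Doe"),         # element in the "meta" namespace
    PAR("Once upon a time, "),  # element in the "doc" namespace
)

print(etree.tostring(my_doc, pretty_print=True, encoding="unicode"))
```

In a larger project, the constants for each namespace could live in their own Python module, so that a misspelled tag name fails as a NameError or ImportError instead of silently producing an element with the wrong name.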