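Reading specific XML nodes with xml.dom
Parsing XML with xml.dom
The Document Object Model (DOM) represents an XML document as a tree of nodes. In Python, the xml.dom.minidom module parses an XML file into such a tree.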
from xml.dom.minidom import parse
import xml.dom.minidom
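As a minimal sketch of what the module provides: parse reads from a file path or file object, while parseString (also part of xml.dom.minidom) accepts the XML as a string. The inline XML string below is a made-up stand-in so the snippet runs without a file on disk.

```python
from xml.dom.minidom import parseString

# Hypothetical inline document, used here instead of a real file.
xml_text = '<collection shelf="New Arrivals"><movie title="Enemy Behind"/></collection>'

dom = parseString(xml_text)      # returns a Document object
root = dom.documentElement       # the root element, <collection>

print(root.tagName)
print(root.getAttribute("shelf"))
```

With a real file, parse("movie.xml") would take the place of parseString(xml_text); the rest stays the same.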
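Key methods and attributes
getElementsByTagName
hasAttribute and getAttribute
childNodes
For example:
root.getElementsByTagName("A") returns all elements under root whose tag name is A. Index [0] selects the first of them.
hasAttribute checks whether an element has a given attribute, and getAttribute returns that attribute's value.
childNodes is an attribute rather than a method: it holds the list of an element's child nodes, and childNodes[0] is the first child. For an element that contains only text, that first child is the text node, and its data attribute holds the string.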
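The three calls above can be tried on a small document. This is an illustrative sketch only: the XML string is a hypothetical, cut-down version of the movie data, and parseString is used so that no file is needed.

```python
from xml.dom.minidom import parseString

# Hypothetical sample: one movie with a title attribute, one without.
doc = parseString(
    '<collection>'
    '<movie title="Enemy Behind"><year>2003</year></movie>'
    '<movie><year>1989</year></movie>'
    '</collection>'
)
root = doc.documentElement

# getElementsByTagName returns every matching descendant element.
movies = root.getElementsByTagName("movie")
first = movies[0]

# hasAttribute tests for an attribute; getAttribute returns its value.
has_title = first.hasAttribute("title")
title = first.getAttribute("title")

# For a missing attribute, minidom's getAttribute returns an empty string.
missing = movies[1].getAttribute("title")

# childNodes[0] of <year>2003</year> is the text node; .data is its string.
year = first.getElementsByTagName("year")[0]
year_text = year.childNodes[0].data

print(has_title, title, repr(missing), year_text)
```

Because getAttribute returns an empty string rather than raising an error, hasAttribute is the way to tell a missing attribute apart from an empty one.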
XML example
<collection shelf="New Arrivals">
<movie title="Enemy Behind">
<type>War, Thriller</type>
<format title="storage">
<material>optical disk</material>
<price>cheap</price>
<history>long</history>
</format>
<year>2003</year>
<rating>PG</rating>
<stars>10</stars>
<description>Talk about a US-Japan war</description>
</movie>
<movie title="Transformers">
<type>Anime, Science Fiction</type>
<format>DVD</format>
<year>1989</year>
<rating>R</rating>
<stars>8</stars>
<description>A scientific fiction</description>
Implementation: reading specific nodes from the XML
import xml.dom.minidom

# Parse the file and take the root element <collection>
DOMTree = xml.dom.minidom.parse(r"movie.xml")
collection = DOMTree.documentElement
if collection.hasAttribute("shelf"):
    print("Root element : %s" % collection.getAttribute("shelf"))

# Walk through every <movie> element
movies = collection.getElementsByTagName("movie")
for movie in movies:
    print("-----------Movie-----------")
    if movie.hasAttribute("title"):
        print("Title: %s" % movie.getAttribute("title"))

    formatVar = movie.getElementsByTagName("format")
    # Only a <format> with a title attribute contains the nested child elements
    if formatVar and formatVar[0].hasAttribute("title"):
        materialVar = formatVar[0].getElementsByTagName("material")[0]
        print('material: %s' % materialVar.childNodes[0].data)
        priceVar = formatVar[0].getElementsByTagName("price")[0]
        print('price: %s' % priceVar.childNodes[0].data)
        historyVar = formatVar[0].getElementsByTagName("history")[0]
        print('history: %s' % historyVar.childNodes[0].data)
Output:
Root element : New Arrivals
-----------Movie-----------
Title: Enemy Behind
material: optical disk
price: cheap
history: long
-----------Movie-----------
Title: Transformers
-----------Movie-----------
Title: Trigun
-----------Movie-----------
Title: Ishtar
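Only the first movie prints material, price, and history, because only its format element has nested children. Indexing [0] on an empty result or on an element with no children raises IndexError, so one possible way to make such lookups safer is a small helper. The function child_text below is a hypothetical name, not part of xml.dom.minidom, and the sample XML is a cut-down stand-in for movie.xml.

```python
from xml.dom.minidom import parseString


def child_text(element, tag, default=None):
    """Return the text of the first <tag> descendant, or default if unavailable."""
    nodes = element.getElementsByTagName(tag)
    if not nodes:
        return default                      # no such element
    first = nodes[0].firstChild
    if first is None or first.nodeType != first.TEXT_NODE:
        return default                      # element is empty or starts with a non-text node
    return first.data


# Hypothetical sample: only the first movie has a nested <material>.
sample = parseString(
    '<collection>'
    '<movie title="Enemy Behind"><format title="storage">'
    '<material>optical disk</material></format></movie>'
    '<movie title="Transformers"><format>DVD</format></movie>'
    '</collection>'
)

for movie in sample.documentElement.getElementsByTagName("movie"):
    print(movie.getAttribute("title"), "|", child_text(movie, "material", default="n/a"))
```

Returning a default instead of raising keeps the loop going when a movie lacks the element, which matches how the output above simply skips those fields.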
Reference
https://blog.youkuaiyun.com/sersin39/article/details/107433699