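What is XML
XML (Extensible Markup Language) is a markup language much like HTML or SGML. It is a World Wide Web Consortium (W3C) recommendation and is available as an open standard. XML is very useful for storing small to medium amounts of data without needing an SQL database.
Uses: data exchange; configuring applications and websites; freely extensible nodes
Characteristics: XML is independent of the operating system, programming language, and development platform; it enables data conversion between different systems
First, prepare a file in XML format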
<?xml version="1.0" encoding='utf-8'?>
<data name="XML">
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
The first step, of course, is to import ElementTree
import xml.etree.ElementTree as ET
Then load the document into memory, where it forms an inverted tree structure (root at the top)
tree = ET.parse('XML.xml')
Next, get the root node
root = tree.getroot()
Finally, you can walk the parsed tree and pull out whichever fields you need
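As a minimal, self-contained sketch of the steps above, the snippet below parses a shortened copy of the sample data from a string with ET.fromstring (instead of reading XML.xml from disk) and extracts a few fields. The names SAMPLE and summary are made up for illustration and are not part of the original walkthrough.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample data, inlined so the sketch runs without a file.
SAMPLE = """<data name="XML">
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

# fromstring() returns the root element directly (no ElementTree wrapper).
root = ET.fromstring(SAMPLE)

# Map each country name to its rank, year and neighbor names.
summary = {}
for country in root.findall('country'):
    summary[country.get('name')] = {
        'rank': int(country.findtext('rank')),
        'year': int(country.findtext('year')),
        'neighbors': [n.get('name') for n in country.findall('neighbor')],
    }

print(summary)
```

Using findtext() and get() keeps the lookups short; both return None when the child or attribute is missing, so real data may need an explicit check before calling int().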
Finding specific child nodes, and modifying and saving the XML file
XML = {}
print('tag:', root.tag, 'attrib:', root.attrib, 'text:', root.text)
for ele in root:
    print('tag:', ele.tag, 'attrib:', ele.attrib)
    value = []
    for e in ele:
        # print('tag:', e.tag, 'attrib:', e.attrib, 'text', e.text)
        if e.text is None:
            # Empty elements such as <neighbor/> carry their data in attributes
            value.append(e.attrib)
        else:
            value.append({e.tag: e.text})
    XML[ele.attrib['name']] = value
print(XML)

node = root.find('country')  # find the first node under root whose tag is country
print(node.attrib['name'])

nodes = root.findall('country')  # find all nodes under root whose tag is country
for node in nodes:
    if node.attrib['name'] == 'Liechtenstein':
        root.remove(node)
        break
tree.write('mingbai.xml')  # save the modified XML file
print('Deletion complete')

The parse() method
The following method creates a SAX parser and uses it to parse a document.
xml.sax.parse( xmlfile, contenthandler[, errorhandler])
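A small sketch of calling xml.sax.parse may help here. The first argument can be a filename or a file-like object, so an io.BytesIO stands in for a file below. The TagCounter handler and the inline XML are hypothetical, chosen only to keep the example short.

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts how often each tag opens."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once for every opening tag the parser encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


# BytesIO stands in for an XML file on disk.
source = io.BytesIO(b"<collection><movie/><movie/></collection>")
counter = TagCounter()
xml.sax.parse(source, counter)
print(counter.counts)
```

The optional third argument is an error handler; when it is omitted, a default xml.sax.ErrorHandler is used, which raises on errors.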
Prepare a file in XML format (saved here as movies.xml)
<?xml version="1.0" encoding='utf-8'?>
<collection shelf = "New Arrivals">
    <movie title = "Enemy Behind">
        <type>War, Thriller</type>
        <format>DVD</format>
        <year>2013</year>
        <rating>PG</rating>
        <stars>10</stars>
        <description>Talk about a US-Japan war</description>
    </movie>
    <movie title = "Transformers">
        <type>Anime, Science Fiction</type>
        <format>DVD</format>
        <year>1989</year>
        <rating>R</rating>
        <stars>8</stars>
        <description>A scientific fiction</description>
    </movie>
    <movie title = "Trigun">
        <type>Anime, Action</type>
        <format>DVD</format>
        <episodes>4</episodes>
        <rating>PG</rating>
        <stars>10</stars>
        <description>Vash the Stampede!</description>
    </movie>
    <movie title = "Ishtar">
        <type>Comedy</type>
        <format>VHS</format>
        <rating>PG</rating>
        <stars>2</stars>
        <description>Viewable boredom</description>
    </movie>
</collection>

Parsing the file
import xml.sax

class MovieHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.CurrentData = ""  # tag of the element currently being read
        self.type = ""
        self.format = ""
        self.year = ""
        self.rating = ""
        self.stars = ""
        self.description = ""

    # Called when an opening tag is encountered
    def startElement(self, tag, attributes):
        self.CurrentData = tag
        if tag == "movie":
            print("*****Movie*****")
            title = attributes["title"]
            print("Title:", title)

    # Called when a closing tag is encountered
    def endElement(self, tag):
        if self.CurrentData == "type":
            print("Type:", self.type)
        elif self.CurrentData == "format":
            print("Format:", self.format)
        elif self.CurrentData == "year":
            print("Year:", self.year)
        elif self.CurrentData == "rating":
            print("Rating:", self.rating)
        elif self.CurrentData == "stars":
            print("Stars:", self.stars)
        elif self.CurrentData == "description":
            print("Description:", self.description)
        self.CurrentData = ""  # reset the current tag

    # Called with the text found inside an element
    def characters(self, content):
        if self.CurrentData == "type":
            self.type = content
        elif self.CurrentData == "format":
            self.format = content
        elif self.CurrentData == "year":
            self.year = content
        elif self.CurrentData == "rating":
            self.rating = content
        elif self.CurrentData == "stars":
            self.stars = content
        elif self.CurrentData == "description":
            self.description = content

if __name__ == "__main__":
    parser = xml.sax.make_parser()  # 1. create an XMLReader
    parser.setFeature(xml.sax.handler.feature_namespaces, 0)  # 2. turn off namespace processing
    Handler = MovieHandler()
    parser.setContentHandler(Handler)  # replace the default ContentHandler
    parser.parse("movies.xml")
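One detail worth noting: a SAX parser is allowed to deliver the text of a single element in several characters() calls, so assigning content directly can keep only the last chunk. The sketch below is a variation, not the handler above: it buffers the chunks and builds one dict per movie. MovieCollector, SAMPLE, and the shortened data are made up for illustration.

```python
import xml.sax


class MovieCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers each movie into a dict."""

    def __init__(self):
        super().__init__()
        self.movies = []    # finished movie dicts
        self._movie = None  # movie currently being read
        self._buffer = []   # text chunks of the current element

    def startElement(self, tag, attributes):
        self._buffer = []
        if tag == "movie":
            self._movie = {"title": attributes["title"]}

    def characters(self, content):
        # May be called several times per element, so collect the chunks.
        self._buffer.append(content)

    def endElement(self, tag):
        if tag == "movie":
            self.movies.append(self._movie)
            self._movie = None
        elif self._movie is not None:
            # Join the buffered chunks into the element's full text.
            self._movie[tag] = "".join(self._buffer).strip()
        self._buffer = []


# Shortened copy of the movie data, inlined so the sketch runs without a file.
SAMPLE = b"""<collection shelf="New Arrivals">
<movie title="Enemy Behind"><type>War, Thriller</type><year>2013</year></movie>
<movie title="Ishtar"><type>Comedy</type><format>VHS</format></movie>
</collection>"""

collector = MovieCollector()
xml.sax.parseString(SAMPLE, collector)
for movie in collector.movies:
    print(movie)
```

Collecting results in a list instead of printing inside the handler also makes the parsed data available to the rest of the program.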
DVD management system
<?xml version="1.0" encoding='utf-8'?>
<dvds>
    <dvd>
        <name>不堪回首的往事</name>
        <price>300</price>
        <state>1</state>
    </dvd>
    <dvd>
        <name>北京一夜</name>
        <price>400</price>
        <state>0</state>
    </dvd>
    <dvd>
        <name>南山南</name>
        <price>500</price>
        <state>1</state>
    </dvd>
</dvds>
Read the initial data
import xml.etree.ElementTree as ET

tree = ET.parse('测试.xml')
root = tree.getroot()
dvds = {}

def getdvds():
    # DVD is a class with name, price and state attributes; it is not defined in this section.
    for dvd in root:
        n_dvd = DVD()  # one DVD object per <dvd> node
        for ele in dvd:
            if ele.tag == 'name':
                n_dvd.name = ele.text
            elif ele.tag == 'price':
                n_dvd.price = ele.text
            elif ele.tag == 'state':
                n_dvd.state = ele.text
        dvds[n_dvd.name] = n_dvd  # index the DVDs by name
    return dvds
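The loading step can also be sketched in a self-contained form. The DVD dataclass below is a hypothetical stand-in for the DVD class, and load_dvds and SAMPLE are likewise invented for the example. Converting price and state to int is an assumption; the original code keeps them as strings.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class DVD:
    """Hypothetical stand-in for the DVD class; fields mirror the XML tags."""
    name: str = ""
    price: int = 0
    state: int = 0


# Shortened copy of the DVD data, inlined so the sketch runs without a file.
SAMPLE = """<dvds>
    <dvd><name>北京一夜</name><price>400</price><state>0</state></dvd>
    <dvd><name>南山南</name><price>500</price><state>1</state></dvd>
</dvds>"""


def load_dvds(root):
    """Return a dict mapping each DVD name to a DVD object."""
    result = {}
    for node in root.findall('dvd'):
        dvd = DVD(
            name=node.findtext('name'),
            price=int(node.findtext('price')),  # assumption: prices are whole numbers
            state=int(node.findtext('state')),  # assumption: state is an integer flag
        )
        result[dvd.name] = dvd
    return result


dvds = load_dvds(ET.fromstring(SAMPLE))
print(dvds)
```

Passing the root element in as a parameter, rather than reading a module-level variable, makes the function easy to try against different documents.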