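I. Overview
XML: Extensible Markup Language.
Feature: tags can be defined freely by the user. In HTML, which tags exist and how attributes and attribute values are written are all fixed by a specification (set by the W3C). XML has no predefined tag vocabulary, so you can invent whatever tags you need.
Uses:
1. Configuration files for other technologies
2. Exchanging data between different language environments (XML acts as a bridge): for example, a result produced by a Java program is consumed by a C program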
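As a small illustration of the configuration-file use, here is a minimal sketch that relies on the JDK's own java.util.Properties XML format. The class name XmlConfigSketch and the db.url / db.user keys and values are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

class XmlConfigSketch {
    // Writes the given settings as an XML document and returns the raw bytes.
    static byte[] save(Properties settings) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        settings.storeToXML(out, "demo settings", "UTF-8");
        return out.toByteArray();
    }

    // Reads settings back from XML bytes produced by save().
    static Properties load(byte[] xml) throws IOException {
        Properties settings = new Properties();
        settings.loadFromXML(new ByteArrayInputStream(xml));
        return settings;
    }

    public static void main(String[] args) throws IOException {
        Properties settings = new Properties();
        settings.setProperty("db.url", "jdbc:example://localhost/demo"); // hypothetical value
        settings.setProperty("db.user", "reader");                       // hypothetical value
        byte[] xml = save(settings);
        System.out.println(new String(xml, "UTF-8")); // the generated XML configuration
        System.out.println(load(xml).getProperty("db.user"));
    }
}
```

Any program that can read XML, in any language, could consume the same bytes, which is the bridge role described in the second use.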
II. Parsing
For HTML, every browser parses the HTML document into a DOM (Document Object Model) object. document represents the whole tree, and document.getElementById() looks up a node on that tree by its id.
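The same tree idea can be tried in Java with the DOM parser that ships with the JDK. This is only a sketch: the class name DomTreeSketch and the inline XML string are made up for illustration, and getElementsByTagName is used in place of the browser's getElementById.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeSketch {
    // Parses XML text into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first element with the given tag name, or null if there is none.
    static String firstText(Document doc, String tagName) {
        NodeList nodes = doc.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<books><book><title>Demo</title><price unit=\"yuan\">100</price></book></books>");
        Element root = doc.getDocumentElement();   // the root of the tree
        System.out.println(root.getTagName());
        System.out.println(firstText(doc, "title"));
        Element price = (Element) doc.getElementsByTagName("price").item(0);
        System.out.println(price.getAttribute("unit"));
    }
}
```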
Sample document:
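<?xml version="1.0" encoding="UTF-8"?><!-- This line is the XML declaration. It is optional in XML 1.0 but should be written in every XML file, and when present it must be the very first thing in the file. version is the XML version and encoding is the character encoding. -->
<!DOCTYPE books SYSTEM "books.dtd"><!-- Uses an external DTD file to declare the rules this XML document must follow -->
<books><!-- These tags are all our own, so a document like this has to be parsed by our own code -->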
<book>
<title>神雕侠侣</title>
<author>金庸</author>
<price unit="元">100</price>
</book>
<book>
<title>红楼梦</title>
<author>红色</author>
<price unit="元">150</price>
</book>
<book>
<title>一念永恒</title>
<author>耳根</author>
<price unit="元">50</price>
</book>
<book>
<title>哈利巴特</title>
<author>红色</author>
<price unit="元">200</price>
</book>
</books>
Sample DTD:
<!ELEMENT books (book*) >
<!ELEMENT book (title,author,price) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT author (#PCDATA) >
<!ELEMENT price (#PCDATA) >
<!ATTLIST price unit CDATA #REQUIRED >
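To see the DTD actually being enforced, the following sketch turns on validation in the JDK's DOM parser. It is an illustration under assumptions: the rules are embedded as an internal subset (instead of the external books.dtd) so that no files are needed, and the class name DtdValidationSketch and the sample bodies are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationSketch {
    // The same rules as books.dtd, embedded as an internal subset for this sketch.
    static final String DTD =
            "<!DOCTYPE books [\n"
            + "<!ELEMENT books (book*)>\n"
            + "<!ELEMENT book (title,author,price)>\n"
            + "<!ELEMENT title (#PCDATA)>\n"
            + "<!ELEMENT author (#PCDATA)>\n"
            + "<!ELEMENT price (#PCDATA)>\n"
            + "<!ATTLIST price unit CDATA #REQUIRED>\n"
            + "]>\n";

    // Parses DTD + body with validation switched on and returns the validity error messages.
    static List<String> validate(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // enable DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage()); // validity errors are recoverable, so just collect them
            }
        });
        builder.parse(new InputSource(new StringReader(DTD + body)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<books><book><title>T</title><author>A</author>"
                + "<price unit=\"yuan\">1</price></book></books>";
        String missingUnit = "<books><book><title>T</title><author>A</author>"
                + "<price>1</price></book></books>";
        System.out.println(validate(valid));       // no messages expected for a valid document
        System.out.println(validate(missingUnit)); // reports the missing required attribute
    }
}
```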
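DTD rules:
<!DOCTYPE students SYSTEM "student.dtd">
    References an external DTD file.
<!ELEMENT students (student*) >
    Declares an element: the element name, followed by its child elements (* means zero or more).
<!ELEMENT student (id,name,age) >
    Parent element, then its children. The "," fixes the order: the children must appear in exactly the order listed in the parentheses. How many times each child may occur can be set with the occurrence indicators ?, * and +.
<!ELEMENT id (#PCDATA) >
    #PCDATA (parsed character data) is a fixed keyword. It means the text inside the id element is parsed by the XML parser (entity references are expanded and markup is recognized) rather than taken as literal text.
<!ATTLIST age unit CDATA #REQUIRED>
    Declares an attribute: element name, attribute name, type, default. CDATA (character data) means the attribute value is a plain string. The value must not contain a literal "<" (write &lt; instead), otherwise the parser reports an error. #REQUIRED means the attribute must be present.
<!ATTLIST age unit CDATA #IMPLIED>
    #IMPLIED means the attribute is optional.
<!ATTLIST age unit CDATA #FIXED "岁">
    #FIXED pins the attribute to a single value; "岁" is that value.
<!ATTLIST age unit (岁|万岁) "岁">
    Enumeration: the value must be one of the listed choices; "岁" is the default.
<age unit="岁"><![CDATA[<aa></aa>20]]></age>
    A CDATA section: <aa></aa>20 is passed through verbatim instead of being parsed as markup.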
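Two of these rules are easy to observe from code: an enumerated attribute with a default value, and a CDATA section. The sketch below uses the JDK DOM parser on a made-up document. The class name DtdDefaultsSketch, the note element, and the ASCII values years|months (standing in for the Chinese values above) are all assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DtdDefaultsSketch {
    // <age> carries no unit attribute in the document; the DTD supplies the default.
    static final String XML =
            "<!DOCTYPE student [\n"
            + "<!ELEMENT student (age,note)>\n"
            + "<!ELEMENT age (#PCDATA)>\n"
            + "<!ELEMENT note (#PCDATA)>\n"
            + "<!ATTLIST age unit (years|months) \"years\">\n"
            + "]>\n"
            + "<student><age>20</age><note><![CDATA[<aa></aa>20]]></note></student>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Value of the unit attribute on <age>, filled in from the DTD default.
    static String ageUnit(Document doc) {
        return ((Element) doc.getElementsByTagName("age").item(0)).getAttribute("unit");
    }

    // Text of <note>; the CDATA section keeps the angle brackets as plain characters.
    static String noteText(Document doc) {
        return doc.getElementsByTagName("note").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse();
        System.out.println(ageUnit(doc));  // prints "years"
        System.out.println(noteText(doc)); // prints "<aa></aa>20"
    }
}
```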
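There are many tools for parsing XML. They fall into two broad categories:
1. Parse the XML into a tree: JDOM, dom4j, …
Advantage: any node of the tree can be accessed at any time.
Drawback: the entire XML file has to be loaded into memory before the DOM tree is built, so a very large file can exhaust the heap and cause an OutOfMemoryError. With a 64 MB memory limit, for example, only about 58 to 59 MB is available as heap, because the JVM divides its memory into several regions (stack, heap, method area, constant pool, static storage). The actual limit depends on the JVM version and its -Xmx setting.
2. Do not build a tree; scan the XML from top to bottom instead: SAX
Advantage: the whole file is never loaded into memory. Only the part currently being scanned is loaded and parsed, so even a very large file does not cause a memory overflow.
Drawback: nodes cannot be accessed at random, and the XML file cannot be modified.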
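The two styles can be compared on the same input with the parsers built into the JDK. The following is a sketch, not a benchmark: the class name TreeVsStreamSketch and the inline XML are made up, and maxHeapMegabytes() is only a convenience for checking the heap limit discussed above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TreeVsStreamSketch {
    // Tree style: build the whole document in memory, then query it.
    static int countWithDom(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tagName).getLength();
    }

    // Streaming style: react to each start tag as it is scanned; only the counter is kept.
    static int countWithSax(String xml, final String tagName) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        if (tagName.equals(qName)) {
                            count[0]++;
                        }
                    }
                });
        return count[0];
    }

    // Maximum heap the JVM will try to use, in megabytes (controlled by -Xmx).
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title></book><book><title>B</title></book></books>";
        System.out.println(countWithDom(xml, "book")); // 2
        System.out.println(countWithSax(xml, "book")); // 2
        System.out.println(maxHeapMegabytes() + " MB max heap");
    }
}
```

Both methods give the same count; the difference is that the DOM version holds every node in memory at once, while the SAX version holds only the counter.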
(1) JDOM parsing
JDOM wraps Java's native XML parsing APIs and is simpler and more convenient to use.
Code:
import java.io.File;
import java.io.FileOutputStream;
import java.util.List;
import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;
import org.junit.Test;
public class Demo {
    @Test
    public void select1() throws Exception {
        // Create a parser
        SAXBuilder sb = new SAXBuilder();
        // Parse an XML file into a Document object; the tree has been built at this point
        Document doc = sb.build(new File("./src/books.xml"));
        // Get the root element from the document
        Element rootElem = doc.getRootElement();
        // Get all child elements of the root
        List<Element> childElem = rootElem.getChildren();
        for (Element elem : childElem) {
            System.out.println("==========================");
            System.out.println(elem.getName());
            Element title = elem.getChild("title");
            Element author = elem.getChild("author");
            Element price = elem.getChild("price");
            Attribute unit = price.getAttribute("unit");
            System.out.println(title.getName()+":"+title.getValue());
            System.out.println(author.getName()+":"+author.getValue());
            System.out.println(price.getName()+":"+price.getValue());
            System.out.println(unit.getName()+":"+unit.getValue());
        }
    }
    @Test
    public void add() throws Exception {
        SAXBuilder sb = new SAXBuilder();
        Document doc = sb.build(new File("./src/books.xml"));
        Element rootElem = doc.getRootElement();
        List<Element> childElem = rootElem.getChildren();
        // Build a new <book> element with its three children
        Element book = new Element("book");
        Element title = new Element("title");
        Element author = new Element("author");
        Element price = new Element("price");
        title.setText("哈利巴特");
        author.setText("哈礼物");
        price.setText("200");
        price.setAttribute("unit","元");
        book.addContent(title);
        book.addContent(author);
        book.addContent(price);
        // The child list is live, so adding to it attaches the book to the root
        childElem.add(book);
        // Write the modified document back to the file; the stream is closed automatically
        XMLOutputter xop = new XMLOutputter();
        try (FileOutputStream out = new FileOutputStream("./src/books.xml")) {
            xop.output(doc, out);
        }
    }
    @Test
    public void update() throws Exception {
        SAXBuilder sb = new SAXBuilder();
        Document doc = sb.build(new File("./src/books.xml"));
        Element rootElem = doc.getRootElement();
        List<Element> childElem = rootElem.getChildren();
        for (Element elem : childElem) {
            Element price = elem.getChild("price");
            if ("200".equals(price.getText())) {
                Element author = elem.getChild("author");
                Attribute unit = price.getAttribute("unit");
                // Rename the unit attribute to hello (after this the file no longer matches books.dtd)
                unit.setName("hello");
                author.setText("红色");
            }
        }
        // Write the modified document back to the file; the stream is closed automatically
        XMLOutputter xop = new XMLOutputter();
        try (FileOutputStream out = new FileOutputStream("./src/books.xml")) {
            xop.output(doc, out);
        }
    }
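    @Test
    public void delete() throws Exception {
        SAXBuilder sb = new SAXBuilder();
        Document doc = sb.build(new File("./src/books.xml"));
        Element rootElem = doc.getRootElement();
        List<Element> childElem = rootElem.getChildren();
        for (Element elem : childElem) {
            Element price = elem.getChild("price");
            if ("200".equals(price.getText())) {
                // Removes only the price element, which is not part of the list being iterated
                price.detach();
                // elem.detach(); is not safe here: the enhanced for loop uses an iterator underneath,
                // and elements must not be added to or removed from the list while it is being iterated
            }
        }
        // Write the modified document back to the file; the stream is closed automatically
        XMLOutputter xop = new XMLOutputter();
        try (FileOutputStream out = new FileOutputStream("./src/books.xml")) {
            xop.output(doc, out);
        }
    }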
}
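The restriction on removing a whole book inside the enhanced for loop is the general java.util iterator rule. The sketch below shows it on a plain ArrayList of price strings, under the assumption that JDOM's live child list follows the same iterator contract. The class name SafeRemovalSketch and the sample values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SafeRemovalSketch {
    // Removes matching items inside an enhanced for loop.
    // Returns true if the iterator detected the change and threw.
    static boolean removeInForEach(List<String> prices, String target) {
        try {
            for (String price : prices) {
                if (target.equals(price)) {
                    prices.remove(price); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes matching items through the iterator itself, which is the supported way.
    static void removeWithIterator(List<String> prices, String target) {
        for (Iterator<String> it = prices.iterator(); it.hasNext();) {
            if (target.equals(it.next())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("100", "200", "150", "200"));
        System.out.println(removeInForEach(first, "200")); // true: the loop was interrupted

        List<String> second = new ArrayList<>(Arrays.asList("100", "200", "150", "200"));
        removeWithIterator(second, "200");
        System.out.println(second); // [100, 150]
    }
}
```

Collecting the matches first and detaching them after the loop, as the dom4j delete example does, is another way around the same restriction.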
(2) dom4j
dom4j can do everything JDOM can, and it also supports the powerful XPath!
import java.io.File;
import java.io.FileOutputStream;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;
import org.dom4j.tree.DefaultDocument;
import org.dom4j.tree.DefaultElement;
import org.junit.Test;
public class Demo {
    @Test
    public void select() throws Exception {
        // Create the XML parser
        SAXReader reader = new SAXReader();
        // Load the XML file and get a Document object
        Document read = reader.read(new File("src/dbooks.xml"));
        // Get the root element, then reach the other child elements from it
        Element root = read.getRootElement();
        List<Element> list = root.elements();
        for (Element book : list) {
            System.out.println("=========================");
            Element title = book.element("title");
            Element price = book.element("price");
            Attribute unit = price.attribute("unit");
            Element desc = book.element("description");
            System.out.println(title.getName()+":"+title.getText());
            System.out.println(price.getName()+":"+price.getText()+unit.getText());
            System.out.println(desc.getName()+":"+desc.getText());
        }
    }
    @Test
    public void add() throws Exception {
        SAXReader reader = new SAXReader();
        Document read = reader.read(new File("src/dbooks.xml"));
        Element root = read.getRootElement();
        // Build a new <book> element with its children
        Element book = new DefaultElement("book");
        Element title = new DefaultElement("title");
        Element price = new DefaultElement("price");
        Element description = new DefaultElement("description");
        price.setAttributeValue("unit", "元");
        title.setText("水浒传");
        price.setText("500");
        description.setText("一群大老爷们打群架的故事");
        book.add(title);
        book.add(price);
        book.add(description);
        root.add(book);
        // Write the document with the added book back to the XML file
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("utf-8");
        XMLWriter writer = new XMLWriter(new FileOutputStream(new File("src/dbooks.xml")),format);
        writer.write(read);
        // Close the writer so the underlying file stream is released
        writer.close();
        System.out.println("over");
    }
    @Test
    public void update() throws Exception {
        SAXReader reader = new SAXReader();
        Document read = reader.read(new File("src/dbooks.xml"));
        Element root = read.getRootElement();
        List<Element> list = root.elements();
        for (Element book : list) {
            Element price = book.element("price");
            if ("200".equals(price.getText())) {
                Element title = book.element("title");
                title.setText("齐天大圣");
            }
        }
        // Write the modified document back to the XML file
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("utf-8");
        XMLWriter writer = new XMLWriter(new FileOutputStream(new File("src/dbooks.xml")),format);
        writer.write(read);
        // Close the writer so the underlying file stream is released
        writer.close();
        System.out.println("over");
    }
    @Test
    public void delete() throws Exception {
        SAXReader reader = new SAXReader();
        Document read = reader.read(new File("src/dbooks.xml"));
        Element root = read.getRootElement();
        List<Element> list = root.elements();
        // Remember the matching book and detach it after the loop, not during iteration
        Element em = null;
        for (Element book : list) {
            Element price = book.element("price");
            if ("200".equals(price.getText())) {
                em = book;
            }
        }
        // em stays null when no book costs 200, so guard against a NullPointerException
        if (em != null) {
            em.detach();
        }
        // Write the modified document back to the XML file
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("utf-8");
        XMLWriter writer = new XMLWriter(new FileOutputStream(new File("src/dbooks.xml")),format);
        writer.write(read);
        // Close the writer so the underlying file stream is released
        writer.close();
        System.out.println("over");
    }
}
XPath usage:
import java.io.File;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.junit.Test;
public class Demo {
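    @Test
    public void sele() throws Exception {
        // Create the XML parser
        SAXReader reader = new SAXReader();
        // Load the XML file and get a Document object
        Document read = reader.read(new File("src/test.xml"));
        // With XPath there is no need to fetch the root element first;
        // call the document's selectNodes API directly
        // (dom4j's XPath support needs the jaxen library on the classpath)
        /*List<Attribute> node = read.selectNodes("//@id");
        for (Attribute elem : node) {
            System.out.println(elem.getName());
        }*/
        // selectNodes returns a raw List in dom4j 1.x and List<Node> in 2.x,
        // so take List<?> and cast each item to stay compatible with both
        List<?> node = read.selectNodes("//BBB[@id]");
        for (Object item : node) {
            Element elem = (Element) item;
            System.out.println(elem.getName());
        }
    }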
}
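The two expressions used above, //BBB[@id] and //@id, can also be tried without dom4j, using the javax.xml.xpath API that ships with the JDK. This is a sketch under assumptions: the content of test.xml is not shown, so the inline XML below is a made-up stand-in, and the class name XPathSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical stand-in for test.xml: three BBB elements, two of them with an id.
    static final String XML =
            "<AAA><BBB id=\"b1\"/><BBB/><CCC id=\"c1\"><BBB id=\"b2\"/></CCC></AAA>";

    // Evaluates an XPath expression against XML text and returns the matching nodes.
    static NodeList select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Every BBB element, anywhere in the document, that has an id attribute
        NodeList withId = select(XML, "//BBB[@id]");
        System.out.println(withId.getLength()); // 2

        // Every id attribute, anywhere in the document
        NodeList ids = select(XML, "//@id");
        for (int i = 0; i < ids.getLength(); i++) {
            System.out.println(ids.item(i).getNodeValue());
        }
    }
}
```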
(3) SAX
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.junit.Test;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
public class Demo {
    @Test
    public void saxFind() throws Exception {
        // Create the parser. newInstance() picks the JAXP implementation available at runtime,
        // so the code does not depend on the Xerces classes directly
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        XMLReader reader = sp.getXMLReader();
        // Register the event handler
        reader.setContentHandler(new DefaultHandler() {
            // Goal: print the content of every title element.
            // SAX releases each tag as soon as it has been reported, so characters() cannot tell
            // which element it is in. Keep the current start tag in a field and test that instead.
            private String tag;
            // Called when a start tag is encountered
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) throws SAXException {
                //System.out.println("<"+qName+">");
                tag = qName;
            }
            // Called when text content is encountered
            @Override
            public void characters(char[] ch, int start, int length)
                    throws SAXException {
                //System.out.println(new String(ch,start,length));
                if ("title".equals(tag)) {
                    System.out.println(new String(ch, start, length));
                }
            }
            // Called when an end tag is encountered
            @Override
            public void endElement(String uri, String localName, String qName)
                    throws SAXException {
                //System.out.println("</"+qName+">");
                // Clear the tag so that whitespace after </title> is not printed as title text
                tag = null;
            }
        });
        // Use the parser to parse an XML file
        reader.parse("src/dbooks.xml");
    }
}
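One detail of SAX worth knowing: a parser is allowed to deliver the text of a single element in several characters() calls, so printing inside characters() can split a title into pieces. A common remedy, sketched below with the JDK parser, is to buffer the text and use it in endElement(). The class name SaxTitlesSketch and the inline XML are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTitlesSketch {
    // Collects the full text of every <title> element in document order.
    static List<String> titles(String xml) throws Exception {
        final List<String> titles = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    private StringBuilder text; // non-null only while inside a title element

                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        if ("title".equals(qName)) {
                            text = new StringBuilder();
                        }
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        if (text != null) {
                            text.append(ch, start, length); // may be called more than once
                        }
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        if ("title".equals(qName)) {
                            titles.add(text.toString());
                            text = null; // ignore text outside title elements
                        }
                    }
                });
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A &amp; B</title></book>\n"
                + "<book><title>C</title></book></books>";
        System.out.println(titles(xml)); // [A & B, C]
    }
}
```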