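A few days ago I worked on an assignment that used dom4j to parse XML files. I already knew a little about dom4j, but I was not very familiar with how its interfaces map onto the structure of a document. I am summarizing it here so it is easy to look up later.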
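1. The node types defined in the jar:
|- Element: defines an XML element;
|- Document: defines an XML document;
- DocumentType: defines the XML DOCTYPE declaration;
- Entity: defines an XML entity;
- Attribute: defines an XML attribute;
- ProcessingInstruction: defines an XML processing instruction;
- CharacterData: a marker interface that identifies character-based nodes such as CDATA, Comment, and Text;
|- CDATA: defines an XML CDATA section;
|- Text: defines an XML text node;
|- Comment: defines the behavior of an XML comment;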
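To make the list above concrete, here is a minimal sketch that builds a tiny document and reports which of these interfaces each child node implements. The class name `NodeKinds` and its helper methods are made up for illustration, and the sketch assumes dom4j is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Branch;
import org.dom4j.CDATA;
import org.dom4j.Comment;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.ProcessingInstruction;
import org.dom4j.Text;

public class NodeKinds {

    // Map a node to the name of the dom4j interface it implements.
    public static String kindOf(Node node) {
        if (node instanceof Element) return "Element";
        if (node instanceof Comment) return "Comment";
        if (node instanceof CDATA) return "CDATA";
        if (node instanceof Text) return "Text";
        if (node instanceof ProcessingInstruction) return "ProcessingInstruction";
        return "Other";
    }

    // Document and Element are both Branch nodes, so one helper covers both.
    public static List<String> childKinds(Branch branch) {
        List<String> kinds = new ArrayList<String>();
        for (int i = 0; i < branch.nodeCount(); i++) {
            kinds.add(kindOf(branch.node(i)));
        }
        return kinds;
    }

    // A small hypothetical document containing one node of each kind.
    public static Document sample() {
        Document doc = DocumentHelper.createDocument();
        doc.addProcessingInstruction("xml-stylesheet",
                "type=\"text/xsl\" href=\"students.xsl\"");
        Element root = doc.addElement("students");
        root.addComment("A Student Catalog");
        Element student = root.addElement("student");
        student.addElement("name").addText("sam");
        student.addElement("note").addCDATA("a < b");
        return doc;
    }

    public static void main(String[] args) {
        Document doc = sample();
        Element root = doc.getRootElement();
        Element student = root.element("student");
        System.out.println("document: " + childKinds(doc));
        System.out.println("students: " + childKinds(root));
        System.out.println("name: " + childKinds(student.element("name")));
        System.out.println("note: " + childKinds(student.element("note")));
    }
}
```

The `instanceof` checks keep CDATA and Text apart because in dom4j both extend CharacterData independently.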
2. Using the jar to create the following XML file:
Sample XML: students.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="students.xsl"?>
<students>
    <!--A Student Catalog-->
    <student sn="01">
        <name>sam</name>
        <age>18</age>
    </student>
    <student sn="02">
        <name>lin</name>
        <age>20</age>
    </student>
</students>
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class XmlGen {

    // Build the document node by node through the dom4j API.
    public Document generateDocumentByMethod() {
        Document document = DocumentHelper.createDocument();
        // processing instruction
        Map<String, String> inMap = new HashMap<String, String>();
        inMap.put("type", "text/xsl");
        inMap.put("href", "students.xsl");
        document.addProcessingInstruction("xml-stylesheet", inMap);
        // root element
        Element studentsElement = document.addElement("students");
        studentsElement.addComment("A Student Catalog");
        // first child element
        Element stuElement = studentsElement.addElement("student");
        stuElement.addAttribute("sn", "01");
        Element nameElement = stuElement.addElement("name");
        nameElement.setText("sam");
        Element ageElement = stuElement.addElement("age");
        ageElement.setText("18");
        // second child element
        Element anotherStuElement = studentsElement.addElement("student");
        anotherStuElement.addAttribute("sn", "02");
        Element anotherNameElement = anotherStuElement.addElement("name");
        anotherNameElement.setText("lin");
        Element anotherAgeElement = anotherStuElement.addElement("age");
        anotherAgeElement.setText("20");
        return document;
    }

    // Build the same document by parsing an XML string.
    public Document generateDocumentByString() {
        String text = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
                "<?xml-stylesheet type=\"text/xsl\" href=\"students.xsl\"?>" +
                "<students><!--A Student Catalog--> <student sn=\"01\">" +
                "<name>sam</name><age>18</age></student><student sn=\"02\">" +
                "<name>lin</name><age>20</age></student></students>";
        Document document = null;
        try {
            document = DocumentHelper.parseText(text);
        } catch (DocumentException e) {
            e.printStackTrace();
        }
        return document;
    }

    public void saveDocument(Document document, File outputXml) {
        try {
            // pretty-printed format
            OutputFormat format = OutputFormat.createPrettyPrint();
            /* // compact format
            OutputFormat format = OutputFormat.createCompactFormat(); */
            /* // set the XML encoding
            format.setEncoding("GBK"); */
            // Note: FileWriter writes in the platform default charset, which may
            // differ from the encoding declared by the format.
            XMLWriter output = new XMLWriter(new FileWriter(outputXml), format);
            output.write(document);
            output.close();
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] argv) {
        XmlGen dom4j = new XmlGen();
        Document document = null;
        // document = dom4j.generateDocumentByMethod();
        document = dom4j.generateDocumentByString();
        dom4j.saveDocument(document, new File("output.xml"));
    }
}
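The listing above writes through a FileWriter, which encodes with the platform default charset, while the XML declaration carries whatever encoding the OutputFormat holds. One way to keep the two in sync is to hand XMLWriter an OutputStream so that it encodes the bytes itself. The sketch below shows that variant under a hypothetical class name, `SaveWithEncoding`; it writes to a temporary file and reads it back with SAXReader.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class SaveWithEncoding {

    // Write the document so that the bytes match the declared encoding.
    public static void save(Document document, File target, String encoding)
            throws IOException {
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding(encoding);
        OutputStream out = new FileOutputStream(target);
        try {
            // With an OutputStream, XMLWriter encodes using format's encoding.
            XMLWriter writer = new XMLWriter(out, format);
            writer.write(document);
            writer.flush();
        } finally {
            out.close();
        }
    }

    // Save a one-student document to a temp file, reload it, return the name.
    public static String roundTrip(String name, String encoding) {
        Document document = DocumentHelper.createDocument();
        Element student = document.addElement("students").addElement("student");
        student.addAttribute("sn", "01");
        student.addElement("name").setText(name);
        try {
            File file = File.createTempFile("students", ".xml");
            try {
                save(document, file, encoding);
                Document reloaded = new SAXReader().read(file);
                return reloaded.getRootElement()
                        .element("student").elementText("name");
            } finally {
                file.delete();
            }
        } catch (IOException | DocumentException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A non-ASCII name makes an encoding mismatch visible if there is one.
        String name = "\u6797";
        System.out.println(name.equals(roundTrip(name, "UTF-8")));
    }
}
```

Passing "GBK" instead of "UTF-8" should work the same way as long as the JDK in use ships that charset.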
3. Traversing an XML file
The file to traverse is as follows:
Source code of the traversal program:
// Needs java.io.File, java.util.Iterator, java.util.List, org.dom4j.Attribute,
// org.dom4j.Document, org.dom4j.DocumentException, org.dom4j.Element and
// org.dom4j.io.SAXReader. xmlFilePath is assumed to be a String field holding
// the path of the XML file.
public static void main(String[] args) {
    SAXReader xmlReader = new SAXReader();
    Document document = null;
    try {
        document = xmlReader.read(new File(xmlFilePath));
        // get the root element
        Element root = document.getRootElement();
        // get the root element's name, i.e. its tag name
        String rootName = root.getName();
        System.out.println(rootName);
        // get the root element's attribute named "class"
        Attribute rootAttribute = root.attribute("class");
        String rootAname = rootAttribute.getName();
        System.out.println("rootAname:" + rootAname);
        String rootAvalue = rootAttribute.getValue();
        System.out.println("rootAvalue :" + rootAvalue);
        // read the attribute value directly by name
        String rootValue = root.attributeValue("class");
        System.out.println("rootValue:" + rootValue);
        // get an attribute by index
        Attribute rootAttributeByIndex = root.attribute(0);
        System.out.println("rootAttributeByIndex name is :" + rootAttributeByIndex.getName());
        // get an iterator over the attributes
        Iterator rootAttributeIterator = root.attributeIterator();
        // get the list of attributes
        List<Attribute> rootAttributeList = root.attributes();
        // iterate over all child elements whose tag name is "property"
        for (Iterator it = root.elementIterator("property"); it.hasNext();) {
            Element element = (Element) it.next();
            String tagName = element.getName();
            System.out.println("tagName :" + tagName);
            Attribute attribute = element.attribute("name");
            String aName = attribute.getName();
            System.out.println("aName is :" + aName);
            String aValue = attribute.getValue();
            System.out.println("aValue is " + aValue);
            // get a specific child element by tag name
            Element child = element.element("value");
            String childName = child.getName();
            System.out.println("childName is " + childName);
            String childText = child.getText();
            System.out.println("childText is :" + childText);
            System.out.println("--------next childElement-------");
        }
    } catch (DocumentException e) {
        e.printStackTrace();
    }
}
The program above prints:
object
rootAname:class
rootAvalue :com.qunar.fresh.Student
rootValue:com.qunar.fresh.Student
rootAttributeByIndex name is :class
tagName :property
aName is :name
aValue is name
childName is value
childText is :${name}
--------next childElement-------
tagName :property
aName is :name
aValue is age
childName is value
childText is :${age}
--------next childElement-------
tagName :property
aName is :name
aValue is birth
childName is value
childText is :${birth}
--------next childElement-------
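The traversal above calls `attribute(...)` and `element(...)` and then dereferences the result, which throws a NullPointerException when an attribute or child is missing. The sketch below does the same walk with `attributeValue` and `elementText`, which return null (or a default) instead. The input string is reconstructed from the printed output and is only an assumption about what the real file looks like; the class name `PropertyWalk` is made up.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class PropertyWalk {

    // Reconstructed from the program output; the original file may differ.
    public static final String SAMPLE =
            "<object class=\"com.qunar.fresh.Student\">"
            + "<property name=\"name\"><value>${name}</value></property>"
            + "<property name=\"age\"><value>${age}</value></property>"
            + "<property name=\"birth\"><value>${birth}</value></property>"
            + "</object>";

    // Return one "name=value" entry per <property> child of the root.
    public static List<String> readProperties(String xml) {
        Document document;
        try {
            document = DocumentHelper.parseText(xml);
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        Element root = document.getRootElement();
        List<String> result = new ArrayList<String>();
        for (Object item : root.elements("property")) {
            Element property = (Element) item;
            // attributeValue with a default never returns null here
            String name = property.attributeValue("name", "(no name)");
            // elementText returns null when there is no <value> child
            String value = property.elementText("value");
            result.add(name + "=" + (value == null ? "(no value)" : value));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : readProperties(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Iterating over `elements("property")` as `Object` and casting keeps the loop compatible with dom4j versions that return a raw List as well as those that return `List<Element>`.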
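That is roughly what traversing a document with dom4j looks like. When I first used it, I spent quite a while on reading a node's text value. The main complication was that I was reading the XML together with a properties file: on the first run I set the text of each element from the properties file, but later, when I read and wrote the XML file on its own, I still saw the values that had been assigned from the properties file. That was not what I expected, and I could not find the cause for a long time. At the time I suspected some kind of caching in dom4j. The experiment below looks at this in more detail.
Note: after repeated experiments, the code is as follows: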
public static void main(String[] args) {
    // helper (not shown) that reads the properties file at proFilePath
    AnalyzeProperties analyzeProperties = new AnalyzeProperties(proFilePath);
    SAXReader xmlReader = new SAXReader();
    Document document = null;
    try {
        document = xmlReader.read(new File(xmlFilePath));
        // get the root element
        Element root = document.getRootElement();
        // get the root element's name, i.e. its tag name
        String rootName = root.getName();
        System.out.println(rootName);
        // get the root element's attribute named "class"
        Attribute rootAttribute = root.attribute("class");
        String rootAname = rootAttribute.getName();
        System.out.println("rootAname:" + rootAname);
        String rootAvalue = rootAttribute.getValue();
        System.out.println("rootAvalue :" + rootAvalue);
        // read the attribute value directly by name
        String rootValue = root.attributeValue("class");
        System.out.println("rootValue:" + rootValue);
        // get an attribute by index
        Attribute rootAttributeByIndex = root.attribute(0);
        System.out.println("rootAttributeByIndex name is :" + rootAttributeByIndex.getName());
        // get an iterator over the attributes
        Iterator rootAttributeIterator = root.attributeIterator();
        // get the list of attributes
        List<Attribute> rootAttributeList = root.attributes();
        // iterate over all child elements whose tag name is "property"
        for (Iterator it = root.elementIterator("property"); it.hasNext();) {
            Element element = (Element) it.next();
            String tagName = element.getName();
            System.out.println("tagName :" + tagName);
            Attribute attribute = element.attribute("name");
            String aName = attribute.getName();
            System.out.println("aName is :" + aName);
            String aValue = attribute.getValue();
            System.out.println("aValue is " + aValue);
            // get a specific child element by tag name
            Element child = element.element("value");
            String childName = child.getName();
            System.out.println("childName is " + childName);
            String childText = child.getText();
            System.out.println("childText is :" + childText);
            // assign the value
            // child.setText((String) analyzeProperties.getValue(aValue)); // (1111)
            System.out.println("after setText ,the text is :" + child.getText());
            System.out.println("--------next childElement-------");
        }
    } catch (DocumentException e) {
        e.printStackTrace();
    }
}
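By repeatedly enabling and commenting out the statement on the line marked (1111), I confirmed that the text value changes accordingly, so dom4j should not have a caching problem. My earlier results probably differed because the program was large: either I forgot to comment out a statement somewhere, or one of the program's threads had not exited. In principle, once the program terminates, no in-memory cache can survive. The lesson for me is that when I run into an unfamiliar jar's API, I should write a small standalone demo to try it out and summarize it, rather than testing it inside a larger business system; otherwise the results can be quite strange. That is all for now. If you have suggestions or additions, feel free to leave a comment.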
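A standalone demo of the kind recommended here could look like the sketch below. It parses the same XML string twice and calls setText on the first parse only, to check that the change lives in that in-memory Document and is not visible to a fresh parse. The class name `SetTextDemo` and the one-property input are invented for the demo.

```java
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class SetTextDemo {

    // Hypothetical minimal input: a single property with a placeholder value.
    public static final String XML =
            "<property name=\"name\"><value>${name}</value></property>";

    // Parse XML from scratch and return its <value> element.
    static Element parseValue() {
        try {
            return DocumentHelper.parseText(XML)
                    .getRootElement().element("value");
        } catch (DocumentException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns { text before setText, text after setText, text of a fresh parse }.
    public static String[] run() {
        Element first = parseValue();
        String before = first.getText();
        first.setText("sam");
        String after = first.getText();
        String fresh = parseValue().getText();
        return new String[] { before, after, fresh };
    }

    public static void main(String[] args) {
        String[] texts = run();
        System.out.println("before setText: " + texts[0]);
        System.out.println("after setText:  " + texts[1]);
        System.out.println("fresh parse:    " + texts[2]);
    }
}
```

If the fresh parse ever showed the modified text, that would point to state shared outside dom4j's Document objects, for example a file that had been rewritten on disk.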