Parsing XML Documents in Java (Part 1): DOM

Article link: https://blog.youkuaiyun.com/weixin_39935092/article/details/114047041

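I. Overview

DOM (Document Object Model) represents an XML document as a tree of node objects.

In one real project I had to process a very large XML file, a little over 200 MB. Loading it with DOM failed with java.lang.OutOfMemoryError, because DOM builds the whole document tree in memory; for files of that size, SAX parsing is the better fit.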
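As a rough illustration of the SAX alternative mentioned above, the sketch below counts elements as the parser streams through the input, without building a tree. The class name SaxPlanCounter, its helper methods, and the inline XML string are hypothetical and chosen only for this example; a real 200 MB file would be passed in as a FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original article.
class SaxPlanCounter {

    // Counts start tags named tagName while streaming; no document tree is kept.
    static int countElements(InputStream in, final String tagName) {
        final int[] count = {0};
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    if (tagName.equals(qName)) {
                        count[0]++;
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return count[0];
    }

    // Convenience wrapper so the sketch can run on an in-memory string.
    static int countInString(String xml, String tagName) {
        return countElements(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), tagName);
    }

    public static void main(String[] args) {
        String xml = "<plan-info>"
                + "<plan category=\"PPO\"><name>PPO 1</name></plan>"
                + "<plan><name>POS 1</name></plan>"
                + "</plan-info>";
        System.out.println("plan elements: " + countInString(xml, "plan")); // prints 2
    }
}
```

Because the handler only keeps a counter, memory use stays roughly constant regardless of file size, which is the property that makes SAX suitable where DOM runs out of heap.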

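II. DOM Programming

1. Packages used (they ship with the JDK, so no extra libraries are needed):

org.w3c.dom: provides the interfaces for the Document Object Model (DOM)

javax.xml.parsers: parser factories and builders

For example: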

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

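2. Parsing flow

First instantiate a document builder factory and obtain a document builder (the parser) from it. The builder parses the XML file, or an input stream opened on it, and builds an org.w3c.dom.Document instance. From there, use the DOM API to work with the Document object.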
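The flow just described can be sketched in a few lines. This is a minimal illustration, assuming the XML arrives as an in-memory string rather than a file; the class name DomFlowSketch and its parse helper are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of the original article.
class DomFlowSketch {

    // factory -> builder -> parse(input stream) -> Document
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            return builder.parse(in);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<plan-info><plan category=\"PPO\"/></plan-info>");
        // Once the Document exists, everything else goes through the DOM API.
        System.out.println(doc.getDocumentElement().getTagName());        // plan-info
        System.out.println(doc.getElementsByTagName("plan").getLength()); // 1
    }
}
```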

3. Example code

(1) XML file: plan.xml

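<plan-info>
  <plan category="PPO">
    <carrier-id>10001</carrier-id>
    <name>PPO 1</name>
    <year>2005</year>
    <rate>30.00</rate>
  </plan>
  <plan>
    <carrier-id>20001</carrier-id>
    <name>POS 1</name>
    <rate>29.99</rate>
  </plan>
  <plan>
    <carrier-id>30001</carrier-id>
    <name>HMO 1</name>
    <rate>49.99</rate>
  </plan>
  <plan>
    <carrier-id>40001</carrier-id>
    <name>EPO 1</name>
    <rate>39.95</rate>
  </plan>
</plan-info>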

(2) Parsing code

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ParseXMLByDOM {

    public static void main(String[] args) {
        File file = new File("plan.xml");
        if (!file.exists()) {
            System.out.println("File is not found!");
            return;
        }
        DocumentBuilderFactory fac = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = fac.newDocumentBuilder();
            // InputStream is = new FileInputStream("plan.xml");
            System.out.println("File name:" + file.getName());
            Document doc = builder.parse(file);
            // or: builder.parse(is)
            Element root = doc.getDocumentElement();
            System.out.println("root name:" + root.getTagName());
            NodeList nodeList = doc.getElementsByTagName("plan");
            System.out.println("nodeList length:" + nodeList.getLength());
            Node fNodes = nodeList.item(0); // the first <plan> element
            System.out.println("father name:" + fNodes.getNodeName());
            NamedNodeMap attrs = fNodes.getAttributes(); // all attributes of the parent node
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attribute = attrs.item(i);
                System.out.println("father key:" + attribute.getNodeName()
                        + ",value:" + attribute.getNodeValue());
            }
            NodeList cNodes = fNodes.getChildNodes();
            System.out.println("childList length:" + cNodes.getLength()); // prints 9; see note 1) in the analysis below
            for (int j = 0; j < cNodes.getLength(); j++) {
                Node cNode = cNodes.item(j);
                // Skip the whitespace text nodes; only elements carry data here.
                if (Node.ELEMENT_NODE == cNode.getNodeType()) {
                    System.out.println("child key:" + cNode.getNodeName()
                            + ",value:" + cNode.getFirstChild().getNodeValue());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:

File name:plan.xml
root name:plan-info
nodeList length:4
father name:plan
father key:category,value:PPO
childList length:9
child key:carrier-id,value:10001
child key:name,value:PPO 1
child key:year,value:2005
child key:rate,value:30.00

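(3) Code analysis

1) During DOM parsing, each line break between tags becomes a whitespace text node that is a child of the enclosing element. That is why cNodes holds 9 nodes: the 4 child elements plus 5 text nodes that contain only line breaks.

2) To check the node type, use if (Node.ELEMENT_NODE == cNode.getNodeType()); cNode instanceof Element works as well.

3) An element's text content is itself a child node (a Text node), so the value is read with cNode.getFirstChild().getNodeValue() rather than cNode.getNodeValue(), which returns null for an element.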
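The three notes above can be checked with a small self-contained sketch. It assumes a cut-down copy of the first plan element held in a string; the class name DomChildNotes and its helper methods are invented for this illustration. It also shows getTextContent(), a standard org.w3c.dom.Node method, as an alternative way to read an element's text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the original article.
class DomChildNotes {

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Note 1: every child node, whitespace text nodes included.
    static int rawChildCount(String xml) {
        return root(xml).getChildNodes().getLength();
    }

    // Note 2: only element children, using the instanceof check.
    static int elementChildCount(String xml) {
        NodeList children = root(xml).getChildNodes();
        int n = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                n++;
            }
        }
        return n;
    }

    // Note 3: getNodeValue() on the element itself (null for elements).
    static String nodeValueOf(String xml, String tag) {
        return root(xml).getElementsByTagName(tag).item(0).getNodeValue();
    }

    // Note 3: the text lives in the element's child Text node.
    static String textOf(String xml, String tag) {
        Node element = root(xml).getElementsByTagName(tag).item(0);
        return element.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        String xml = "<plan category=\"PPO\">\n<carrier-id>10001</carrier-id>\n"
                + "<name>PPO 1</name>\n<year>2005</year>\n<rate>30.00</rate>\n</plan>";
        System.out.println(rawChildCount(xml));       // 9
        System.out.println(elementChildCount(xml));   // 4
        System.out.println(nodeValueOf(xml, "name")); // null
        System.out.println(textOf(xml, "name"));      // PPO 1
        // getTextContent() returns the same text without touching the child node.
        System.out.println(root(xml).getElementsByTagName("name").item(0).getTextContent()); // PPO 1
    }
}
```

One design point: getFirstChild() returns null for an empty element such as <year/>, so getTextContent() may be the safer choice when empty elements are possible.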


2010-03-18 11:15