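In the previous post we discussed how the parser turns an XML file into a Document (I have a rough idea of how the parsing works, but I still need to study the underlying code properly later). Now let's look at how element nodes are retrieved from the Document once it has been built.
Before reading the code, I need some background knowledge:
java.util.Vector: as I understand it, it is essentially a resizable array, with a set of methods layered on top for storing and retrieving data.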
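To make the Vector description concrete, here is a minimal sketch that uses the same Vector calls that show up in the Xerces code further down (addElement, elementAt, lastElement, size). The class name VectorDemo and the string values are made up for illustration.

```java
import java.util.Vector;

class VectorDemo {
    /** Builds a small Vector; the element values are hypothetical placeholders. */
    static Vector<String> build() {
        Vector<String> nodes = new Vector<>();   // starts empty and grows on demand
        nodes.addElement("book-1");
        nodes.addElement("book-2");
        nodes.addElement("book-3");
        return nodes;
    }

    public static void main(String[] args) {
        Vector<String> nodes = build();
        System.out.println(nodes.size());        // number of stored elements
        System.out.println(nodes.elementAt(0));  // indexed read, like an array
        System.out.println(nodes.lastElement()); // the most recently appended element
    }
}
```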
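From the previous post we know that Document document = builder.parse(inputStream); returns a DeferredDocumentImpl object.
Let's continue through our example code line by line: Element element = document.getDocumentElement();
This calls the method defined in the parent class com.sun.org.apache.xerces.internal.dom.CoreDocumentImpl: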
/**
 * Convenience method, allowing direct access to the child node
 * which is considered the root of the actual document content. For
 * HTML, where it is legal to have more than one Element at the top
 * level of the document, we pick the one with the tagName
 * "HTML". For XML there should be only one top-level
 *
 * (HTML not yet supported.)
 */
public Element getDocumentElement() {
    if (needsSyncChildren()) {
        synchronizeChildren();
    }
    return docElement;
}
The object finally returned is a com.sun.org.apache.xerces.internal.dom.ElementImpl.
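One way to check this yourself is to print the runtime class of the root element. The sketch below is only an illustration: the XML string is a hypothetical stand-in for the article's input file, and the exact class printed depends on the parser in use and on whether namespace awareness is enabled, so I deliberately do not show an expected class name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RootElementDemo {
    // Hypothetical sample input, not the file used in the article.
    static final String XML = "<books><book>Java</book><book>XML</book></books>";

    /** Parses the given XML text and returns its document element. */
    static Element rootOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return document.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = rootOf(XML);
        System.out.println(root.getTagName());
        // Implementation class; varies with the parser and its configuration.
        System.out.println(root.getClass().getName());
    }
}
```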
Next, let's look at this line: NodeList bookNodes = element.getElementsByTagName("book");
/**
 * Returns a NodeList of all descendent nodes (children,
 * grandchildren, and so on) which are Elements and which have the
 * specified tag name.
 * <p>
 * Note: NodeList is a "live" view of the DOM. Its contents will
 * change as the DOM changes, and alterations made to the NodeList
 * will be reflected in the DOM.
 *
 * @param tagname The type of element to gather. To obtain a list of
 * all elements no matter what their names, use the wild-card tag
 * name "*".
 *
 * @see DeepNodeListImpl
 */
public NodeList getElementsByTagName(String tagname) {
    return new DeepNodeListImpl(this,tagname);
}
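The "live" remark in that Javadoc is easy to observe from the outside. The following sketch uses only the standard org.w3c.dom API: it asks for the length of the list, appends one more book element to the tree, and asks again on the same NodeList object. The XML string and the class name are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class LiveNodeListDemo {
    /** Returns {length before append, length after append} for one NodeList instance. */
    static int[] lengthsBeforeAndAfterAppend() {
        String xml = "<books><book>Java</book><book>XML</book></books>"; // sample input
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            NodeList books = root.getElementsByTagName("book");
            int before = books.getLength();
            root.appendChild(doc.createElement("book")); // change the tree afterwards
            int after = books.getLength();               // same list object, queried again
            return new int[] { before, after };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        int[] lengths = lengthsBeforeAndAfterAppend();
        System.out.println(lengths[0] + " -> " + lengths[1]); // prints "2 -> 3"
    }
}
```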
Now this line: bookNodes.getLength(). Both getLength() and the item() method it relies on live in DeepNodeListImpl:
/** Returns the length of the node list. */
public int getLength() {
    // Preload all matching elements. (Stops when we run out of subtree!)
    item(java.lang.Integer.MAX_VALUE);
    return nodes.size();
}
/** Returns the node at the specified index. */
public Node item(int index) {
    Node thisNode;
    // Tree changed. Do it all from scratch!
    if(rootNode.changes() != changes) {
        nodes = new Vector();
        changes = rootNode.changes();
    }
    // In the cache
    if (index < nodes.size())
        return (Node)nodes.elementAt(index);
    // Not yet seen
    else {
        // Pick up where we left off (Which may be the beginning)
        if (nodes.size() == 0)
            thisNode = rootNode;
        else
            thisNode=(NodeImpl)(nodes.lastElement());
        // Add nodes up to the one we're looking for
        while(thisNode != null && index >= nodes.size()) {
            thisNode=nextMatchingElementAfter(thisNode);
            if (thisNode != null)
                nodes.addElement(thisNode);
        }
        // Either what we want, or null (not avail.)
        return thisNode;
    }
} // item(int):Node
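The caching idea in item() and getLength() can be isolated from the DOM. Below is a simplified sketch of the same pattern, not the Xerces code: an int array stands in for the subtree, "is even" stands in for "matches the tag name", and the invalidation via the changes counter is left out. All names (LazyMatchList, scanned) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LazyMatchList {
    private final int[] source;                            // stands in for the DOM subtree
    private final List<Integer> cache = new ArrayList<>(); // stands in for `nodes`
    private int cursor = 0;                                // where the last scan stopped
    int scanned = 0;                                       // source entries examined so far

    LazyMatchList(int[] source) {
        this.source = source;
    }

    /** Returns the index-th even value, scanning only as far as needed; null if absent. */
    Integer item(int index) {
        if (index < cache.size()) {
            return cache.get(index);           // already in the cache
        }
        // Pick up where we left off and stop as soon as the cache is long enough.
        while (cursor < source.length && index >= cache.size()) {
            int value = source[cursor++];
            scanned++;
            if (value % 2 == 0) {
                cache.add(value);
            }
        }
        return index < cache.size() ? cache.get(index) : null;
    }

    /** Forces a full scan by asking for an unreachable index, then reports the cache size. */
    int getLength() {
        item(Integer.MAX_VALUE);
        return cache.size();
    }

    public static void main(String[] args) {
        LazyMatchList list = new LazyMatchList(new int[] {1, 2, 3, 4, 5, 6});
        System.out.println(list.item(0) + " after scanning " + list.scanned); // 2 after scanning 2
        System.out.println(list.getLength() + " after scanning " + list.scanned); // 3 after scanning 6
    }
}
```

The point of the design is that asking for an early index is cheap, while getLength() always pays for a walk over everything that has not been scanned yet.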
In the method below, the Node current that is first passed in is an instance of DeferredElementImpl:
/**
 * Iterative tree-walker. When you have a Parent link, there's often no
 * need to resort to recursion. NOTE THAT only Element nodes are matched
 * since we're specifically supporting getElementsByTagName().
 */
protected Node nextMatchingElementAfter(Node current) {
    Node next;
    while (current != null) {
        // Look down to first child.
        if (current.hasChildNodes()) {
            current = (current.getFirstChild());
        }
        // Look right to sibling (but not from root!)
        else if (current != rootNode && null != (next = current.getNextSibling())) {
            current = next;
        }
        // Look up and right (but not past root!)
        else {
            next = null;
            for (; current != rootNode; // Stop when we return to starting point
                 current = current.getParentNode()) {
                next = current.getNextSibling();
                if (next != null)
                    break;
            }
            current = next;
        }
        // Have we found an Element with the right tagName?
        // ("*" matches anything.)
        if (current != rootNode
            && current != null
            && current.getNodeType() == Node.ELEMENT_NODE) {
            if (!enableNS) {
                if (tagName.equals("*") ||
                    ((ElementImpl) current).getTagName().equals(tagName))
                {
                    return current;
                }
            } else {
                // DOM2: Namespace logic.
                if (tagName.equals("*")) {
                    if (nsName != null && nsName.equals("*")) {
                        return current;
                    } else {
                        ElementImpl el = (ElementImpl) current;
                        if ((nsName == null
                             && el.getNamespaceURI() == null)
                            || (nsName != null
                                && nsName.equals(el.getNamespaceURI())))
                        {
                            return current;
                        }
                    }
                } else {
                    ElementImpl el = (ElementImpl) current;
                    if (el.getLocalName() != null
                        && el.getLocalName().equals(tagName)) {
                        if (nsName != null && nsName.equals("*")) {
                            return current;
                        } else {
                            if ((nsName == null
                                 && el.getNamespaceURI() == null)
                                || (nsName != null &&
                                    nsName.equals(el.getNamespaceURI())))
                            {
                                return current;
                            }
                        }
                    }
                }
            }
        }
        // Otherwise continue walking the tree
    }
    // Fell out of tree-walk; no more instances found
    return null;
} // nextMatchingElementAfter(int):Node
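The walk order (down to the first child, else right to the next sibling, else up and right without passing the root) can be reproduced with the public DOM API alone. This is my own simplified sketch of that traversal, without the tag-name and namespace matching; TreeWalkDemo, next and elementNames are hypothetical names, and the XML is a made-up sample.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class TreeWalkDemo {
    /** Returns the node that follows `current` in document order inside `root`, or null. */
    static Node next(Node current, Node root) {
        if (current.hasChildNodes()) {
            return current.getFirstChild();           // look down
        }
        while (current != null && !current.isSameNode(root)) {
            Node sibling = current.getNextSibling();  // look right
            if (sibling != null) {
                return sibling;
            }
            current = current.getParentNode();        // look up, never past root
        }
        return null;                                  // fell out of the subtree
    }

    /** Collects the names of all element descendants of the document element. */
    static List<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Node root = doc.getDocumentElement();
            List<String> names = new ArrayList<>();
            for (Node n = next(root, root); n != null; n = next(n, root)) {
                if (n.getNodeType() == Node.ELEMENT_NODE) { // text nodes are skipped
                    names.add(n.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b><c/></b><d/>text<e/></a>")); // [b, c, d, e]
    }
}
```

As in the original method, the root itself is never reported; only its descendants are.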
First, let's see what the call current.hasChildNodes() does in the class com.sun.org.apache.xerces.internal.dom.ParentNode:
/**
 * Test whether this node has any children. Convenience shorthand
 * for (Node.getFirstChild()!=null)
 */
public boolean hasChildNodes() {
    if (needsSyncChildren()) {
        synchronizeChildren();
    }
    return firstChild != null;
}
The synchronizeChildren() method called here is implemented by the concrete class DeferredElementImpl as follows:
protected final void synchronizeChildren() {
    DeferredDocumentImpl ownerDocument =
        (DeferredDocumentImpl) ownerDocument();
    ownerDocument.synchronizeChildren(this, fNodeIndex);
} // synchronizeChildren()
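The DeferredDocumentImpl method it delegates to is where the text object gets assigned to the element, in the statement p.firstChild = firstNode; which is why it can be retrieved later: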
/**
 * Synchronizes the node's children with the internal structure.
 * Fluffing the children at once solves a lot of work to keep
 * the two structures in sync. The problem gets worse when
 * editing the tree -- this makes it a lot easier.
 * This is not directly used in this class but this method is
 * here so that it can be shared by all deferred subclasses of ParentNode.
 */
protected final void synchronizeChildren(ParentNode p, int nodeIndex) {
    // we don't want to generate any event for this so turn them off
    boolean orig = getMutationEvents();
    setMutationEvents(false);
    // no need to sync in the future
    p.needsSyncChildren(false);
    // create children and link them as siblings
    ChildNode firstNode = null;
    ChildNode lastNode = null;
    for (int index = getLastChild(nodeIndex);
         index != -1;
         index = getPrevSibling(index)) {
        ChildNode node = (ChildNode) getNodeObject(index);
        if (lastNode == null) {
            lastNode = node;
        }
        else {
            firstNode.previousSibling = node;
        }
        node.ownerNode = p;
        node.isOwned(true);
        node.nextSibling = firstNode;
        firstNode = node;
    }
    if (lastNode != null) {
        p.firstChild = firstNode;
        firstNode.isFirstChild(true);
        p.lastChild(lastNode);
    }
    // set mutation events flag back to its original value
    setMutationEvents(orig);
} // synchronizeChildren(ParentNode,int):void
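The loop above starts at the last child and follows previous-sibling indexes backwards, yet ends up with a forward-linked sibling chain. A stripped-down sketch of just that linking step may help; it is an illustration under my own assumptions (a plain int array as the previous-sibling table, a tiny Child class instead of ChildNode), not the Xerces data structures.

```java
class ReverseLinkDemo {
    /** Minimal stand-in for ChildNode: a name plus sibling links. */
    static final class Child {
        final String name;
        Child previousSibling;
        Child nextSibling;

        Child(String name) {
            this.name = name;
        }
    }

    /**
     * Walks from lastChild back through prevSibling (-1 ends the chain) and
     * returns the first child of the resulting forward-linked list.
     */
    static Child link(String[] names, int[] prevSibling, int lastChild) {
        Child first = null;
        for (int index = lastChild; index != -1; index = prevSibling[index]) {
            Child node = new Child(names[index]);
            if (first != null) {
                first.previousSibling = node; // the node created earlier comes after this one
            }
            node.nextSibling = first;
            first = node;                     // the newest node is the earliest sibling so far
        }
        return first;
    }

    /** Joins the names by following nextSibling from the first child. */
    static String order(Child first) {
        StringBuilder sb = new StringBuilder();
        for (Child c = first; c != null; c = c.nextSibling) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(c.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] names = {"title", "author", "price"}; // hypothetical children
        int[] prevSibling = {-1, 0, 1};                // price -> author -> title -> end
        System.out.println(order(link(names, prevSibling, 2))); // title,author,price
    }
}
```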
We can see that when type == 3 (that is, Node.TEXT_NODE), a new DeferredTextImpl object is created:
/** Instantiates the requested node object. */
public DeferredNode getNodeObject(int nodeIndex) {
    // is there anything to do?
    if (nodeIndex == -1) {
        return null;
    }
    // get node type
    int chunk = nodeIndex >> CHUNK_SHIFT;
    int index = nodeIndex & CHUNK_MASK;
    int type = getChunkIndex(fNodeType, chunk, index);
    if (type != Node.TEXT_NODE && type != Node.CDATA_SECTION_NODE) {
        clearChunkIndex(fNodeType, chunk, index);
    }
    // create new node
    DeferredNode node = null;
    switch (type) {
        //
        // Standard DOM node types
        //
        case Node.ATTRIBUTE_NODE: {
            if (fNamespacesEnabled) {
                node = new DeferredAttrNSImpl(this, nodeIndex);
            } else {
                node = new DeferredAttrImpl(this, nodeIndex);
            }
            break;
        }
        case Node.CDATA_SECTION_NODE: {
            node = new DeferredCDATASectionImpl(this, nodeIndex);
            break;
        }
        case Node.COMMENT_NODE: {
            node = new DeferredCommentImpl(this, nodeIndex);
            break;
        }
        // NOTE: Document fragments can never be "fast".
        //
        // The parser will never ask to create a document
        // fragment during the parse. Document fragments
        // are used by the application *after* the parse.
        //
        // case Node.DOCUMENT_FRAGMENT_NODE: { break; }
        case Node.DOCUMENT_NODE: {
            // this node is never "fast"
            node = this;
            break;
        }
        case Node.DOCUMENT_TYPE_NODE: {
            node = new DeferredDocumentTypeImpl(this, nodeIndex);
            // save the doctype node
            docType = (DocumentTypeImpl)node;
            break;
        }
        case Node.ELEMENT_NODE: {
            if (DEBUG_IDS) {
                System.out.println("getNodeObject(ELEMENT_NODE): "+nodeIndex);
            }
            // create node
            if (fNamespacesEnabled) {
                node = new DeferredElementNSImpl(this, nodeIndex);
            } else {
                node = new DeferredElementImpl(this, nodeIndex);
            }
            // save the document element node
            if (docElement == null) {
                docElement = (ElementImpl)node;
            }
            // check to see if this element needs to be
            // registered for its ID attributes
            if (fIdElement != null) {
                int idIndex = binarySearch(fIdElement, 0,
                                           fIdCount-1, nodeIndex);
                while (idIndex != -1) {
                    if (DEBUG_IDS) {
                        System.out.println(" id index: "+idIndex);
                        System.out.println(" fIdName["+idIndex+
                                           "]: "+fIdName[idIndex]);
                    }
                    // register ID
                    String name = fIdName[idIndex];
                    if (name != null) {
                        if (DEBUG_IDS) {
                            System.out.println(" name: "+name);
                            System.out.print("getNodeObject()#");
                        }
                        putIdentifier0(name, (Element)node);
                        fIdName[idIndex] = null;
                    }
                    // continue if there are more IDs for
                    // this element
                    if (idIndex + 1 < fIdCount &&
                        fIdElement[idIndex + 1] == nodeIndex) {
                        idIndex++;
                    }
                    else {
                        idIndex = -1;
                    }
                }
            }
            break;
        }
        case Node.ENTITY_NODE: {
            node = new DeferredEntityImpl(this, nodeIndex);
            break;
        }
        case Node.ENTITY_REFERENCE_NODE: {
            node = new DeferredEntityReferenceImpl(this, nodeIndex);
            break;
        }
        case Node.NOTATION_NODE: {
            node = new DeferredNotationImpl(this, nodeIndex);
            break;
        }
        case Node.PROCESSING_INSTRUCTION_NODE: {
            node = new DeferredProcessingInstructionImpl(this, nodeIndex);
            break;
        }
        case Node.TEXT_NODE: {
            node = new DeferredTextImpl(this, nodeIndex);
            break;
        }
        //
        // non-standard DOM node types
        //
        case NodeImpl.ELEMENT_DEFINITION_NODE: {
            node = new DeferredElementDefinitionImpl(this, nodeIndex);
            break;
        }
        default: {
            throw new IllegalArgumentException("type: "+type);
        }
    } // switch node type
    // store and return
    if (node != null) {
        return node;
    }
    // error
    throw new IllegalArgumentException();
} // createNodeObject(int):Node
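There is one thing I can't figure out here: as soon as the statement node = new DeferredElementImpl(this, nodeIndex); has executed, the name attribute inside node already has a value.
After stepping through it many more times, I noticed that the nodeIndex passed in is different on every call, and I wonder whether that is where the explanation lies.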
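While I have not confirmed the answer, the first lines of getNodeObject at least show how a nodeIndex is used: it is split into a chunk number and an offset inside that chunk, which together address one slot in a table such as fNodeType. The sketch below illustrates only that arithmetic. The value of CHUNK_SHIFT is an assumption chosen for the example (the real constant is defined in DeferredDocumentImpl), and ChunkIndexDemo with its fixed four chunks is a hypothetical stand-in for the real tables.

```java
class ChunkIndexDemo {
    // Assumed value for illustration only; check DeferredDocumentImpl for the real constant.
    static final int CHUNK_SHIFT = 8;
    static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;
    static final int CHUNK_MASK = CHUNK_SIZE - 1;

    // A small two-level table: chunks are allocated only when first written to.
    private final int[][] table = new int[4][];

    static int chunkOf(int nodeIndex) {
        return nodeIndex >> CHUNK_SHIFT;  // which chunk the slot lives in
    }

    static int offsetOf(int nodeIndex) {
        return nodeIndex & CHUNK_MASK;    // position inside that chunk
    }

    /** Stores a value in the slot addressed by nodeIndex. */
    void set(int nodeIndex, int value) {
        int chunk = chunkOf(nodeIndex);
        if (table[chunk] == null) {
            table[chunk] = new int[CHUNK_SIZE];
        }
        table[chunk][offsetOf(nodeIndex)] = value;
    }

    /** Reads the slot addressed by nodeIndex, or -1 if its chunk was never allocated. */
    int get(int nodeIndex) {
        int[] chunk = table[chunkOf(nodeIndex)];
        return chunk == null ? -1 : chunk[offsetOf(nodeIndex)];
    }

    public static void main(String[] args) {
        ChunkIndexDemo types = new ChunkIndexDemo();
        types.set(3, 1);    // pretend node 3 is an element (Node.ELEMENT_NODE == 1)
        types.set(300, 3);  // pretend node 300 is a text node (Node.TEXT_NODE == 3)
        System.out.println(chunkOf(300) + " / " + offsetOf(300)); // 1 / 44
        System.out.println(types.get(3) + " " + types.get(300));  // 1 3
    }
}
```

Each distinct nodeIndex lands on its own slot, so it is at least plausible that a deferred node only needs to remember its index in order to look up its data later.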