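SAX is based on an event-callback model. It is generally faster than DOM (Document) and uses less memory, because it never builds the whole document tree, and it gives you more control over the parsing process. In Java you can use the JDK's built-in javax.xml.parsers.SAXParser, or the SAX parser that ships with Apache Xerces.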
Parsing comes down to two things: a parser (SAXParser, XMLReader) and an event handler, also called a callback.
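As a minimal sketch of those two pieces, the following pairs a SAXParser with a DefaultHandler subclass that just counts elements. The names ElementCounter, CountingHandler and countElements, and the sample XML, are made up for illustration and are not part of the code in this post.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementCounter {
    // Piece 2: the event handler. The parser calls startElement once per start tag.
    static class CountingHandler extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            count++;
        }
    }

    // Parses the given XML text and returns how many elements it contains.
    static int countElements(String xml) {
        try {
            // Piece 1: the parser, obtained from the JDK's factory.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            CountingHandler handler = new CountingHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.count;
        } catch (Exception e) {
            // Checked SAX/IO exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // rss, channel and two item elements: prints 4
        System.out.println(countElements("<rss><channel><item/><item/></channel></rss>"));
    }
}
```

The parser drives everything; the handler only reacts to the events it cares about.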
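Points to note:
1:
When the parser reads a start tag it calls public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {}. Any text that follows the tag is then delivered through public void characters(char[] ch, int start, int length) throws SAXException {}.
2:
When the parser reads an end tag it calls
public void endElement(String uri, String localName, String qName) throws SAXException {}. If whitespace (a line break or indentation) sits between that end tag and the next tag, public void characters(char[] ch, int start, int length) throws SAXException {} is called again.
The chars delivered at that point are whitespace only and carry no element text. A parser may also deliver a single run of text in several characters calls, so a handler should not assume one call per element.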
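To see the callback order for yourself, here is a small tracing sketch. SaxEventTrace and trace are hypothetical names; the handler records every callback and merges adjacent characters chunks so the result does not depend on how a particular parser splits text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventTrace {
    // Returns the callbacks in the order the parser fired them.
    // Line breaks inside text are shown as the two characters \n.
    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // Emits the buffered text (if any) as one "characters" event.
            private void flushText() {
                if (text.length() > 0) {
                    events.add("characters:" + text.toString().replace("\n", "\\n"));
                    text.setLength(0);
                }
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                flushText();
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only buffer here; one text run may arrive in several chunks.
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace("<item>\n<title>Hello</title>\n</item>")) {
            System.out.println(event);
        }
    }
}
```

For that input the trace is start:item, characters:\n, start:title, characters:Hello, end:title, characters:\n, end:item. The two whitespace-only characters events are the ones described in notes 1 and 2.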
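3: The parsed objects do not need to be handed back out of the callback object. The list is passed in through the handler's constructor, and because Java passes the object reference, the handler and the calling code work on the same list. The code below illustrates this.
The following is the callback class (also called the handler, or the business object):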
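import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RssItemHandler extends DefaultHandler {
    // Child elements of <item> whose text is collected. A Set avoids the false
    // matches that a substring search on "link|pubDate|..." would allow.
    private static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("link", "pubDate", "description", "title"));
    // The caller's list; the handler fills it in place.
    private final List<RssItem> items;
    private RssItem it = null;
    private String currentTag = null;
    // characters() may deliver one text run in several chunks, so buffer it.
    private final StringBuilder text = new StringBuilder();

    public RssItemHandler(List<RssItem> items) {
        super();
        this.items = items;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if ("item".equals(qName)) {
            it = new RssItem();
        } else if (it != null && ALLOWED.contains(qName)) {
            currentTag = qName;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Only collect text while inside one of the allowed child elements.
        if (currentTag != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if ("item".equals(qName)) {
            System.out.println(it.toString());
            items.add(it);
            it = null;
            currentTag = null;
        } else if (qName.equals(currentTag)) {
            // The element is complete, so the buffered text is now whole.
            String content = text.toString().trim();
            if ("title".equals(qName)) {
                it.setTitle(content);
            } else if ("link".equals(qName)) {
                try {
                    it.setUrl(new URL(content));
                } catch (MalformedURLException e) {
                    e.printStackTrace();
                }
            } else if ("description".equals(qName)) {
                it.setDescription(content);
            }
            // pubDate is accepted but not stored, as in the original code.
            currentTag = null;
        }
    }
}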
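The handler relies on an RssItem class that this post never shows. Here is a minimal sketch of what it might look like, assuming only the three setters the handler calls (setTitle, setUrl, setDescription) plus a toString. The field names, the getters and the toString format are guesses.

```java
import java.net.URL;

// Hypothetical value class; only the setters are implied by the handler code.
class RssItem {
    private String title;
    private URL url;
    private String description;

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public URL getUrl() {
        return url;
    }

    public void setUrl(URL url) {
        this.url = url;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    @Override
    public String toString() {
        return "RssItem{title=" + title + ", url=" + url
                + ", description=" + description + "}";
    }
}
```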
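import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The original shows only main; the enclosing class name here is arbitrary.
public class RssReader {
    public static void main(String[] args) throws IOException {
        String xmluri = "http://news.youkuaiyun.com/rss_news.html";
        // try-with-resources (Java 7) closes the network stream even if parsing fails.
        try (InputStream in = new URL(xmluri).openStream()) {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            InputSource is = new InputSource(in);
            // The handler fills this list in place; nothing is read back from the handler.
            List<RssItem> its = new ArrayList<>();
            DefaultHandler dh = new RssItemHandler(its);
            parser.parse(is, dh);
            System.out.println(its.size());
            for (RssItem ri : its) {
                System.out.println("[RV]" + ri.toString());
            }
        } catch (SAXException | ParserConfigurationException e) {
            e.printStackTrace();
        }
    }
}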
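There is no need to expose the items field (the List<RssItem>) from the DefaultHandler subclass. Unlike the approach described at http://www.iteye.com/topic/763895, which hands its List<Book> books field back out of the handler, the caller here already holds the list it passed in.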
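The same idea in a smaller, self-contained form: the caller creates the list, the handler fills it, and the caller reads it after parse returns, with no getter on the handler. SharedListDemo, TitleCollector and collectTitles are hypothetical names used only for this sketch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SharedListDemo {
    static class TitleCollector extends DefaultHandler {
        private final List<String> titles; // the same object the caller holds
        private StringBuilder text;        // non-null only while inside <title>

        TitleCollector(List<String> titles) {
            this.titles = titles;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("title".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName) && text != null) {
                titles.add(text.toString()); // visible to the caller immediately
                text = null;
            }
        }
    }

    static List<String> collectTitles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new TitleCollector(titles));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return titles; // filled by the handler; the handler exposes nothing
    }

    public static void main(String[] args) {
        String xml = "<rss><item><title>A</title></item><item><title>B</title></item></rss>";
        System.out.println(collectTitles(xml)); // prints [A, B]
    }
}
```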
4: For the first argument of the parse method, InputSource is recommended.
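One likely reason for that advice is that a single InputSource type can wrap a byte stream, a character stream, or a system ID, and can carry an encoding hint, so the parsing code stays the same whatever the input is. Here is a sketch under that assumption; InputSourceDemo, rootName, fromString and fromBytes are hypothetical helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class InputSourceDemo {
    // One parsing routine for every kind of input: returns the root element's name.
    static String rootName(InputSource source) {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName; // the first start tag is the root
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return root[0];
    }

    // Character stream: the text is already decoded, so no encoding is needed.
    static InputSource fromString(String xml) {
        return new InputSource(new StringReader(xml));
    }

    // Byte stream plus an explicit encoding hint for the parser.
    static InputSource fromBytes(byte[] data, String encoding) {
        InputSource source = new InputSource(new ByteArrayInputStream(data));
        source.setEncoding(encoding);
        return source;
    }

    public static void main(String[] args) {
        System.out.println(rootName(fromString("<feed/>")));
        byte[] bytes = "<rss/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(rootName(fromBytes(bytes, "UTF-8")));
    }
}
```

The main method in this post does the same with a network stream: new InputSource(new URL(xmluri).openStream()).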
Finally, the runtime environment: Java 7.