API documentation:
SGMLParser.reset()
Reset the instance. Loses all unprocessed data. This is called implicitly at instantiation time.
Explanation:
reset() is called by SGMLParser.__init__, so any per-instance initialization should be done in reset().
Example:
#!/usr/bin/env python
# coding=utf-8
# Python 2 only: sgmllib was removed in Python 3.
import urllib
from sgmllib import SGMLParser

class URLLister(SGMLParser):
    def reset(self):
        # Called by SGMLParser.__init__, so self.urls exists on every new instance.
        SGMLParser.reset(self)
        self.urls = []

    def start_a(self, attrs):
        # attrs is a list of (name, value) pairs for the <a> tag.
        href = [v for k, v in attrs if k == 'href']
        if href:
            self.urls.extend(href)

usock = urllib.urlopen('http://www.baidu.com')
parser = URLLister()
parser.feed(usock.read())
usock.close()
parser.close()
for url in parser.urls:
    print url
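
The example above runs only on Python 2, because sgmllib was removed in Python 3. As a rough sketch of the same idea on Python 3, the standard-library html.parser.HTMLParser can stand in for SGMLParser: its __init__ also calls reset(), so the same override pattern applies. This is a substitute, not the sgmllib API. The SAMPLE_HTML string and the collected variable are made up for illustration, and the markup is parsed from a literal rather than fetched over the network.

```python
from html.parser import HTMLParser


class URLLister(HTMLParser):
    """Collect the href value of every <a> tag (Python 3 sketch)."""

    def reset(self):
        # HTMLParser.__init__ calls reset(), so state created here
        # exists as soon as the object is constructed.
        HTMLParser.reset(self)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; HTMLParser has no
        # per-tag start_a hook, so the tag name is checked by hand.
        if tag == "a":
            self.urls.extend(v for k, v in attrs if k == "href" and v)


# Hypothetical sample markup, used here instead of a network fetch.
SAMPLE_HTML = (
    '<p><a href="http://example.com/a">A</a> '
    '<a name="x">no link</a> <a href="/b">B</a></p>'
)

parser = URLLister()
print(parser.urls)        # the list already exists before any data is fed

parser.feed(SAMPLE_HTML)
parser.close()
collected = list(parser.urls)
for url in collected:
    print(url)

parser.reset()            # drops unprocessed data and re-runs our initialization
print(parser.urls)
```

Putting self.urls in reset() rather than in __init__ means that calling reset() by hand to reuse the parser also clears the previously collected URLs.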