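After ten years of lukewarm growth, Python has suddenly caught fire. The reason is that Guido van Rossum, the father of Python, joined Google, and Google badly wants a next-generation language that can stand alongside Java (backed by Sun and IBM) and Microsoft's C#. Python looks to have the potential to become that language. So, as an independent IT enthusiast who would also very much like to earn a high salary at Google, I decided to study Python properly after seeing Google's hiring requirements on the “智廉” job site.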
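I have studied Perl. Unlike Perl's motto, "There's More Than One Way to Do It", Python's motto is "There should be one-- and preferably only one --obvious way to do it." It is hard to say which philosophy is right, but Python really is somewhat easier to read and use than Perl.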
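The ASUS WL500GD wireless router can take an external portable hard drive and double as an FTP server. Since I bought one last October, it has been running the FTP service 24x7. FTP and the other services are configured and managed through the browser interface at http://192.168.1.1/, which also lets you check FTP access activity, as shown in the figure below: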
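Going through the browser this way takes many clicks, which is tedious, and an IP address alone tells you little about where a visitor comes from. I usually have to look the address up at http://www.ip138.com/ as well, with the inevitable copying and pasting. Since I was learning Python anyway, I decided to write a nicer program that solves all of this in one go.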
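On closer inspection, this small Python program uses quite a few techniques: the urllib2 module to submit the IP data and fetch the result page, and to fetch a page that requires HTTP basic authentication (the router log); the sgmllib module to parse the fetched HTML pages and extract the needed information; the anydbm module to cache the geographic location of each IP address; and regular expressions plus some string methods to process the text.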
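The caching and text-handling steps can be tried in isolation, without a router or a network connection. The following is only a minimal sketch, written for Python 3 (where the anydbm module is called dbm). The helper names last_field, cached_location and fake_lookup are hypothetical, fake_lookup merely stands in for the real web query, and the cache is created in a temporary directory for illustration.

```python
import dbm
import os
import re
import tempfile

# A dotted quad such as 219.129.83.12 (the octet range is not checked).
IP_PATTERN = re.compile(r'\d+\.\d+\.\d+\.\d+$')


def last_field(line):
    """Return the text after the last space in a log line."""
    return line[line.rfind(' ') + 1:]


def cached_location(db, ip, lookup):
    """Return the location for ip, calling lookup(ip) only on a cache miss."""
    if not IP_PATTERN.match(ip):
        return ""
    key = ip.encode("ascii")
    if key not in db:
        db[key] = lookup(ip).encode("utf-8")
    return db[key].decode("utf-8")


calls = []


def fake_lookup(ip):
    """Hypothetical stand-in for the web query; records each call."""
    calls.append(ip)
    return "somewhere"


cache_path = os.path.join(tempfile.mkdtemp(), "ip_location_cache")
with dbm.open(cache_path, "c") as db:
    line = "Jun 25 02:43:53 FTP server: user anonymous logged in from 219.129.83.12"
    first = cached_location(db, last_field(line), fake_lookup)
    second = cached_location(db, last_field(line), fake_lookup)  # served from the cache
    skipped = cached_location(db, last_field("ntp client: time is synchronized to time.nist.gov"), fake_lookup)
```

The second call never reaches fake_lookup, which is the whole point of the cache: each distinct address is looked up on the web at most once.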
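Well, if you happen to own an ASUS WL500GD too, give this program a try. The same principle applies to other wireless routers, with slight modifications: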
Check_WL500GD_FTP_Log.py
import urllib, urllib2, sgmllib, anydbm, re


class LogsParser(sgmllib.SGMLParser):
    """Collect the text inside <textarea class="content_log_td">."""

    def __init__(self):
        sgmllib.SGMLParser.__init__(self)
        self.in_textarea = False
        self.log = ""

    def start_textarea(self, attributes):
        for name, value in attributes:
            if name == 'class' and value == 'content_log_td':
                self.in_textarea = True

    def end_textarea(self):
        self.in_textarea = False

    def handle_data(self, data):
        if self.in_textarea:
            self.log += data


class LocationParser(sgmllib.SGMLParser):
    """Collect the text inside <ul class="ul1"> on the ip138 result page."""

    def __init__(self):
        sgmllib.SGMLParser.__init__(self)
        self.in_location = False
        self.location_info = ""

    def start_ul(self, attributes):
        for name, value in attributes:
            if name == 'class' and value == 'ul1':
                self.in_location = True

    def end_ul(self):
        self.in_location = False

    def handle_data(self, data):
        if self.in_location:
            self.location_info += data

def Get_Logs():
    # Create a password manager and add the username and password.
    # If we knew the realm, we could use it instead of None.
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    top_level_url = "http://192.168.1.1/"
    password_mgr.add_password(None, top_level_url,
                              "admin", "password")  # Change password to fit your router

    handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    # Create an "opener" (OpenerDirector instance).
    opener = urllib2.build_opener(handler)
    # Use the opener to fetch a URL.
    opener.open(top_level_url)
    # Install the opener. Now all calls to urllib2.urlopen use our opener.
    urllib2.install_opener(opener)

    p = LogsParser()
    f = urllib2.urlopen("http://192.168.1.1/Main_LogStatus_Content.asp")
    data = f.read()
    f.close()
    if not data:
        return []  # nothing fetched; the caller can still iterate
    p.feed(data)
    p.close()
    return p.log.split('\n')


def Get_Location_from_IP138(ip):
    l = LocationParser()
    submit_data = {'ip': ip, 'action': '2'}
    # Passing data as the second argument makes urlopen send a POST request.
    f = urllib2.urlopen("http://www.ip138.com/ips8.asp", urllib.urlencode(submit_data))
    data = f.read()
    f.close()
    if not data:
        return ""  # nothing fetched
    l.feed(data)
    l.close()
    ipinfo = l.location_info
    # Keep only the text that follows the last result marker.
    return ipinfo[ipinfo.rindex("查询结果3:")+11:]


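class Location:
    """Look up IP locations, caching the results in a dbm file."""

    def __init__(self):
        self.ip_db = anydbm.open('ip_location_cache.dbm', 'c')
        self.ip_match = re.compile(r'\d+\.\d+\.\d+\.\d+')

    def __del__(self):
        self.ip_db.close()

    def info(self, ip):
        # Match from the start of the string so that addresses with a
        # single-digit first octet are recognized as well.
        if self.ip_match.match(ip) is None:
            return ""
        if not self.ip_db.has_key(ip):  # no IP info in cache
            location = Get_Location_from_IP138(ip)
            if not location:
                return ""  # do not cache a failed lookup
            self.ip_db[ip] = location
        return self.ip_db[ip]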

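logs = Get_Logs()
loc = Location()

for line in logs:
    line = line.strip()
    if line:
        # The last space-separated field is the client IP, when there is one.
        print line, loc.info(line[line.rfind(' ')+1:])
raw_input()  # wait for Enter so the console window stays open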
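The listing above is Python 2 code. In Python 3 the sgmllib module no longer exists, urllib2 has become urllib.request, and anydbm has become dbm. As a rough sketch of how the log-extraction step might look there, the parser below uses html.parser from the standard library instead of sgmllib. The sample_page string is made up for illustration and is not the router's actual markup.

```python
from html.parser import HTMLParser


class LogsParser(HTMLParser):
    """Collect the text inside <textarea class="content_log_td">."""

    def __init__(self):
        super().__init__()
        self.in_textarea = False
        self.log = ""

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "textarea" and ("class", "content_log_td") in attrs:
            self.in_textarea = True

    def handle_endtag(self, tag):
        if tag == "textarea":
            self.in_textarea = False

    def handle_data(self, data):
        if self.in_textarea:
            self.log += data


# Made-up page standing in for the router's log page.
sample_page = (
    '<html><body><p>ignored</p>'
    '<textarea class="content_log_td">line one\nline two\n</textarea>'
    '</body></html>'
)

parser = LogsParser()
parser.feed(sample_page)
parser.close()
log_lines = [entry for entry in parser.log.split("\n") if entry.strip()]
```

Unlike sgmllib, html.parser does not dispatch to per-tag methods such as start_textarea, so the tag name is checked inside handle_starttag and handle_endtag instead.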

This program runs correctly under CPython. For the log shown in the web browser interface, its output is as follows:
Jun 25 02:43:53 FTP server: user anonymous logged in from 219.129.83.12 广东省韶关市 电信ADSL
Jun 25 02:43:53 FTP server: user anonymous logged in from 219.129.83.12 广东省韶关市 电信ADSL
Jun 25 03:08:58 ntp client: time is synchronized to time.nist.gov
Jun 25 05:09:04 ntp client: time is synchronized to time.nist.gov
Jun 25 07:09:09 ntp client: time is synchronized to time.nist.gov
Jun 25 08:48:11 FTP server: user anonymous logged in from 219.146.252.21 山东省青岛市 广电网
Jun 25 08:48:44 FTP server: user anonymous logged in from 219.146.252.21 山东省青岛市 广电网
Jun 25 08:58:28 FTP server: user anonymous quit by session timeout
Jun 25 09:09:09 ntp client: time is synchronized to time.nist.gov
Jun 25 09:41:20 FTP server: user anonymous logged in from 222.244.8.189 湖南省长沙市 (宁乡县)电信
Jun 25 09:41:22 FTP server: user anonymous logged in from 222.244.8.189 湖南省长沙市 (宁乡县)电信

Is Python fun? Yep, cool.

