Someone on a forum asked how to reformat a log file.
I happen to be learning Python, so I used the question for practice:
09:55:54: ERROR1 /tmp/error/log.3 50 times
Mon Jun 28 00:00:53 2009
09:55:54: ERROR1 /tmp/error/log.3 50 times
09:56:09: ERROR1 /tmp/error/log.14 50 times
10:56:12: ERROR1 /tmp/error/log.14 100 times
10:56:23: ERROR2 /tmp/error/log.5 50 times
11:56:26: ERROR2 /tmp/error/log.1 50 times
11:56:27: ERROR2 /tmp/error/log.5 100 times
Mon Jun 29 00:00:53 2009
15:56:29: ERROR3 /tmp/error/log.1 100 times
15:56:32: ERROR3 /tmp/error/log.1 150 times
16:56:33: ERROR4 /tmp/error/log.6 50 times
16:56:36: ERROR4 /tmp/error/log.6 100 times
16:56:40: ERROR4 /tmp/error/log.12 50 times
Mon Jun 30 00:00:53 2009
which should be formatted into:
2009|Jun|28|10|ERROR1|1
2009|Jun|28|10|ERROR2|1
2009|Jun|29|16|ERROR4|3
2009|Jun|29|15|ERROR3|2
Python source code:
#!/usr/bin/env python
import re
import sys


def format_log(logf):
    # log lines start with a two-digit hour, e.g. "09:55:54:"
    re_hour = re.compile(r'^\d\d:')
    re_sp_split = re.compile(r'\s+')
    # extract the day from a date line: year|month|day
    make_day = lambda d: d[4] + '|' + d[1] + '|' + d[2]
    # date | hour | err_type
    make_key = lambda d, c: d + '|' + c[0][0:2] + '|' + c[1]
    cur_day = 'n/a|n/a|n/a'
    log_cnt = dict()
    for line in logf:
        m = re_hour.match(line)
        cols = re_sp_split.split(line)
        if len(cols) < 3:
            # not a valid log
            continue
        if m is not None:
            # this is a log line, not a date
            key = make_key(cur_day, cols)
            if key not in log_cnt:
                log_cnt[key] = 1
            else:
                log_cnt[key] += 1
        else:
            if len(cols) < 5:
                # not a valid date line
                continue
            cur_day = make_day(cols)
    # sorted() and print() work on both Python 2 and Python 3
    for key in sorted(log_cnt):
        print('%s|%d' % (key, log_cnt[key]))


if __name__ == '__main__':
    format_log(sys.stdin)
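Python is fairly easy to read, and in that respect it is much better than Perl.
Even someone who doesn't know Python can follow about 60% of the code above.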
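As a variation on the script above, here is a minimal sketch of the same counting done with collections.Counter. It is not the code from the thread: the function name count_errors, the LOG_LINE pattern, and the sample list are my own assumptions for illustration, and the pattern is stricter than the original (it requires a full hh:mm:ss: prefix followed by an error type). It returns the formatted rows rather than printing them, which makes it easier to check.

```python
import re
from collections import Counter

# Assumed pattern: "hh:mm:ss:" followed by the error type.
LOG_LINE = re.compile(r'^(\d\d):\d\d:\d\d:\s+(\S+)')


def count_errors(lines):
    """Count log entries per day, hour and error type.

    Hypothetical helper for illustration; not part of the original script.
    """
    cur_day = 'n/a|n/a|n/a'  # same placeholder the script uses before any date line
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            hour, err_type = m.groups()
            counts['%s|%s|%s' % (cur_day, hour, err_type)] += 1
            continue
        cols = line.split()
        if len(cols) >= 5:
            # date line such as "Mon Jun 29 00:00:53 2009" -> "2009|Jun|29"
            cur_day = '|'.join((cols[4], cols[1], cols[2]))
    return ['%s|%d' % (key, n) for key, n in sorted(counts.items())]


# Sample taken from the Jun 29 part of the log above.
sample = [
    'Mon Jun 29 00:00:53 2009',
    '15:56:29: ERROR3 /tmp/error/log.1 100 times',
    '15:56:32: ERROR3 /tmp/error/log.1 150 times',
    '16:56:33: ERROR4 /tmp/error/log.6 50 times',
    '16:56:36: ERROR4 /tmp/error/log.6 100 times',
    '16:56:40: ERROR4 /tmp/error/log.12 50 times',
]
for row in count_errors(sample):
    print(row)
```

Counter removes the "if key not in log_cnt" branch, since missing keys start at zero. Passing sys.stdin instead of sample should behave like the original script for well-formed input.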
Thread link:
http://bbs.chinaunix.net/viewthread.php?tid=1497342&extra=&page=1