Chapter 9: Dictionaries (Assignment)

License: CC 4.0 BY-SA

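This article walks through a Python script that reads email addresses from a specified file and reports the address that appears most often, together with its count.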

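name = raw_input("Enter file:")
# Fall back to the default file when the user just presses Enter
if len(name) < 1 : name = "mbox-short.txt"
lst2=list()
lst=list()
handle = open(name)
for line in handle:
    line=line.rstrip()
    # Only 'From ' lines (note the trailing space) carry the sender address
    if not line.startswith('From '):continue
    lst=line.split()
    # The address is the second whitespace-separated field
    lst2.append(lst[1])
# Count how many times each address occurs
words=dict()
for word in lst2:
    words[word]=words.get(word,0)+1

# Find the address with the largest count
bignum=None
bigcount=None
for num,count in words.items():
    if bigcount is None or bigcount<count:
        bigcount=count
        bignum=num

print bignum,bigcount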

The code can be read as two parts.
Part 1:

name = raw_input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
lst2=list()
lst=list()
handle = open(name)
for line in handle:
    line=line.rstrip()
    if not line.startswith('From '):continue
    lst=line.split()
    lst2.append(lst[1])

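The address-extraction code at the start is essentially the same as in the previous assignment. The one difference worth noting: last time I took a shortcut and printed each address as the loop found it, without ever storing all the addresses in a list. This time I create a new list, lst2, and call append() to collect every address.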

Incidentally, lst[1] here is a string (str), because split() returns a list of strings.
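
To make the list-building step concrete, here is a minimal sketch of Part 1 written in Python 3 syntax (the listing above targets Python 2, with raw_input and the print statement). The function name extract_senders and the sample lines are illustrative assumptions, not part of the original script; the sample lines only imitate the mbox format.

```python
def extract_senders(lines):
    """Collect the second field of every line that starts with 'From '."""
    senders = []
    for line in lines:
        line = line.rstrip()
        # 'From ' (with the trailing space) skips 'From:' header lines
        if not line.startswith('From '):
            continue
        fields = line.split()
        senders.append(fields[1])  # fields[1] is a str
    return senders


# Hypothetical sample input in mbox style
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From: alice@example.com\n",
    "Subject: hello\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
]
senders = extract_senders(sample)
print(senders)           # ['alice@example.com', 'bob@example.com']
print(type(senders[0]))  # <class 'str'>
```

Passing an open file handle instead of the sample list should behave the same way, since iterating over a file also yields one line at a time.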

Part 2:

words=dict()
for word in lst2:
    words[word]=words.get(word,0)+1

bignum=None
bigcount=None
for num,count in words.items():
    if bigcount is None or bigcount<count:
        bigcount=count
        bignum=num

print bignum,bigcount
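
Part 2 does two things: words.get(word,0)+1 builds a histogram (get returns 0 for an address not seen yet), and the loop over words.items() keeps the pair with the largest count seen so far. Below is a minimal sketch of the same idea in Python 3 syntax. The function name most_common_sender and the addresses are made up for illustration. The final line cross-checks the result against collections.Counter from the standard library, which is one possible shortcut rather than what the script itself uses.

```python
from collections import Counter


def most_common_sender(senders):
    """Return (address, count) for the most frequent address, or (None, None)."""
    counts = {}
    for sender in senders:
        # get() returns 0 the first time an address is seen
        counts[sender] = counts.get(sender, 0) + 1

    best, best_count = None, None
    for sender, count in counts.items():
        # strict '<' keeps the first address reached on ties
        if best_count is None or best_count < count:
            best, best_count = sender, count
    return best, best_count


# Made-up addresses for illustration
senders = ["a@example.com", "b@example.com", "a@example.com",
           "c@example.com", "a@example.com"]
print(most_common_sender(senders))      # ('a@example.com', 3)

# Counter.most_common(1) returns the same pair inside a one-element list
print(Counter(senders).most_common(1))  # [('a@example.com', 3)]
```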