Python: A Dictionary Exercise

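This post walks through a Python program that reads a mail data file and finds the person who sent the most messages. The program reads the file line by line, splits each line with split(), checks whether it is a 'From' line, and extracts the sender's mail address. It counts the messages per sender in a dictionary, then loops over the dictionary to find the sender with the highest count.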

Write a program to read through the mbox-short.txt and figure out who has sent the greatest number of mail messages. The program looks for 'From ' lines and takes the second word of those lines as the person who sent the mail. The program creates a Python dictionary that maps the sender's mail address to a count of the number of times they appear in the file. After the dictionary is produced, the program reads through the dictionary using a maximum loop to find the most prolific committer.

 

name = input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
handle = open(name)
counts = dict()
persons = list()

for line in handle:
    line = line.rstrip()
    clist = line.split()
    # Guardian pattern: skip blank lines so clist[0] cannot fail
    if line == '': continue
    if clist[0] == 'From':
        persons.append(clist[1])

# Histogram: sender address -> number of messages
for person in persons:
    counts[person] = counts.get(person, 0) + 1

# Maximum loop: keep the largest count seen so far
bigcount = None
bigperson = None
for p, c in counts.items():
    if bigcount is None or c > bigcount:
        bigcount = c
        bigperson = p
print(bigperson, bigcount)
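For comparison, the same count can be sketched with collections.Counter from the standard library, which is a dict subclass built for tallying. This is an alternative technique, not the one the exercise asks for. The function name top_sender and the sample lines are made up for illustration, and the extra length check on the word list is an added assumption.

```python
from collections import Counter

def top_sender(lines):
    """Return (address, count) for the most frequent 'From ' sender, or None."""
    senders = Counter()
    for line in lines:
        words = line.split()
        # Guard: need at least two words before reading words[1]
        if len(words) >= 2 and words[0] == 'From':
            senders[words[1]] += 1
    most = senders.most_common(1)  # list holding at most one (address, count) pair
    return most[0] if most else None

# Hypothetical sample lines standing in for the contents of the mail file
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "",
    "From bob@example.com Sat Jan  5 10:00:00 2008",
    "From alice@example.com Sat Jan  5 11:00:00 2008",
]
print(top_sender(sample))  # ('alice@example.com', 2)
```

The 'From:' header line is ignored because its first word is 'From:' rather than 'From', which matches the exercise's rule of counting only 'From ' lines.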

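Python data structures, a combined exercise:

  1. Each code block implements exactly one task, and the tasks should not be tangled together:
  2. Read the file line by line
  3. .split() turns a string into a list; if the first element of the list is From, take list[1] (the second word) and store it in a new list, persons
  4. Use a dictionary as a histogram, with (key, value) pairs of (sender, count)
  5. Loop over the dictionary to find the maximum

With the work split up this way, the code is easier to read and easier to debug.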
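One way to make the "one task per block" idea concrete is to give each step its own function. The sketch below is only an illustration: the function names (extract_senders, count_items, find_max) and the sample lines are hypothetical, and it works on a list of strings instead of an open file so that each step can be checked on its own.

```python
def extract_senders(lines):
    """Steps 2 and 3: scan lines and collect the second word of each 'From ' line."""
    senders = []
    for line in lines:
        words = line.split()
        if not words:            # guardian: a blank line splits to an empty list
            continue
        if words[0] == 'From' and len(words) > 1:
            senders.append(words[1])
    return senders

def count_items(items):
    """Step 4: build a histogram dictionary mapping item -> count."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

def find_max(counts):
    """Step 5: maximum loop over the dictionary; returns (key, count)."""
    best_key, best_count = None, None
    for key, count in counts.items():
        if best_count is None or count > best_count:
            best_key, best_count = key, count
    return best_key, best_count

# Hypothetical sample lines standing in for the contents of the mail file
lines = [
    "From carol@example.org Mon Jan  7 08:00:00 2008",
    "Subject: hello",
    "",
    "From dave@example.org Mon Jan  7 09:00:00 2008",
    "From carol@example.org Mon Jan  7 10:00:00 2008",
]
senders = extract_senders(lines)
counts = count_items(senders)
print(find_max(counts))  # ('carol@example.org', 2)
```

Because each function takes plain data in and returns plain data out, a wrong result can be traced by printing the intermediate values senders and counts.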

 

Guardian pattern:

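It guards against blank lines. Without it, clist[0] raises an IndexError (list index out of range), because splitting a blank line gives an empty list, which has no element at index 0.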
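A minimal sketch of the failure and the guard, using a hypothetical helper named first_word_is_from:

```python
line = ""              # what a blank line looks like after rstrip()
words = line.split()   # splitting it gives an empty list

try:
    words[0]           # unguarded access to the first element
except IndexError as err:
    print("unguarded access failed:", err)

def first_word_is_from(line):
    """Return True only when the line's first word is exactly 'From'."""
    words = line.split()
    # Guardian: make sure the list has an element before indexing it
    if len(words) < 1:
        return False
    return words[0] == 'From'

print(first_word_is_from(""))                     # False
print(first_word_is_from("From a@example.com"))   # True
```

The guard has to come before the index expression; placing it after would still let the blank line reach words[0].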

 

There is a good debugging walkthrough:

Python data structures - week 4 - Assignment Chapter 8 - Worked exercise (watch how the teaching assistant tracks down the errors)

Reposted from: https://www.cnblogs.com/Jessiezyr/p/10468314.html
