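def getText():
    # Read the whole file; the with-block closes it even if reading fails.
    with open("./hamlet.txt", "r") as f:
        txt = f.read()
    txt = txt.lower()
    # Delete punctuation so that "be," and "be" count as the same word.
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_{|}~':
        txt = txt.replace(ch, "")
    return txt

hamletTxt = getText()
words = hamletTxt.split()
# Count how many times each word occurs.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
# Sort (word, count) pairs by count, highest first.
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)
# Print the ten most frequent words; slicing avoids an IndexError on short texts.
for word, count in items[:10]:
    print("{0:<10}{1:>5}".format(word, count))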

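This post shows a simple Python script that reads a text file of Hamlet and counts how often each word occurs. The script converts all characters to lowercase and removes punctuation, then lists the ten most frequent words in descending order of frequency.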
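
The same steps (lowercase, strip punctuation, count, take the most frequent words) can also be sketched with the standard library's `collections.Counter`. This is an alternative, not the script above: the function name `top_words` and the sample sentence are made up for illustration, and `string.punctuation` is assumed as the punctuation set, which also removes apostrophes and backticks that the original character list leaves in place.

```python
import string
from collections import Counter


def top_words(text, n=10):
    """Return up to n (word, count) pairs, most frequent first."""
    # Translation table that deletes every ASCII punctuation character.
    # Assumption: string.punctuation replaces the hand-written list above.
    table = str.maketrans("", "", string.punctuation)
    cleaned = text.lower().translate(table)
    # Counter tallies the words; most_common sorts them by count.
    return Counter(cleaned.split()).most_common(n)


# Hypothetical sample text, used here instead of hamlet.txt.
sample = "To be, or not to be: that is the question."
for word, count in top_words(sample, 3):
    print("{0:<10}{1:>5}".format(word, count))
```

To run it on the play itself, read the file first and pass its contents in, for example `top_words(open("./hamlet.txt").read())`. Because `most_common` returns at most as many pairs as there are distinct words, short inputs need no special handling.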



