Counting the 10 most frequent words in Hamlet

zigzag_gy

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/zigzag_gy/article/details/88081906

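This article shows how to use Python to count word frequencies in the English text of Hamlet. Punctuation is replaced with spaces and the text is converted to lowercase so that variants of the same word are counted together, and the 10 most frequent words are then printed.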

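First download an English-language copy of Hamlet and save it as Hamlet.txt, then use Python to find the words that appear most often in it.
The code is as follows: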

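def getText():
    # Read the file, lowercase it, and replace punctuation with spaces.
    # encoding="utf-8" is an assumption; change it if your copy differs.
    with open("Hamlet.txt", "r", encoding="utf-8") as f:
        txt = f.read()
    txt = txt.lower()
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':
        txt = txt.replace(ch, " ")
    return txt

hamletTxt = getText()        # call the function defined above
words = hamletTxt.split()    # split the text into words on whitespace
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1    # tally each word
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)  # highest count first
for word, count in items[:10]:                # slice is safe for short texts
    print("{0:<10}{1:>5}".format(word, count))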
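The same counting can also be sketched with the standard library's collections.Counter and string.punctuation. This is only an alternative sketch, not the method used above: the helper name top_words and the sample sentence are made up for illustration, and string.punctuation also includes the apostrophe, so contractions such as "don't" would be split differently than in the code above.

```python
import string
from collections import Counter

def top_words(text, n=10):
    """Return the n most common lowercase words in text (hypothetical helper)."""
    # Build a table that maps every ASCII punctuation character to a space
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = text.lower().translate(table).split()
    # most_common returns (word, count) pairs, highest count first
    return Counter(words).most_common(n)

if __name__ == "__main__":
    # Sample sentence for illustration; pass the contents of Hamlet.txt instead
    sample = "To be, or not to be: that is the question."
    for word, count in top_words(sample, 3):
        print("{0:<10}{1:>5}".format(word, count))
```

To run it on the whole play, read Hamlet.txt into a string and pass that string to top_words.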

The output is shown below:
[Image: screenshot of the program output]