1. Read in the string to analyze 2. Split it into words 3. Count with a dictionary 4. Exclude grammatical words 5. Sort 6. Print the top 20...

1. Read in the string to analyze

 

 1 fo=open('test.txt','w')
 2 fo.write('''You gotta go and get angry at all of my honesty
 3 You know I try but I don’t do too well with apologies
 4 I hope I don’t run out of time, could someone call a referee?
 5 Cause I just need one more shot at forgiveness
 6 I know you know that I made those mistakes maybe once or twice
 7 By once or twice I mean maybe a couple a hundred times
 8 So let me, oh let me redeem, oh redeem, oh myself tonight
 9 Cause I just need one more shot at second chances
10 Yeah, is it too late now to say sorry?
11 Cause I’m missing more than just your body
12 Is it too late now to say sorry?
13 Yeah I know that I let you down
14 Is it too late to say I’m sorry now?
15 I’m sorry, yeah
16 Sorry, yeah
17 Sorry
18 Yeah I know that I let you down
19 Is it too late to say sorry now?
20 I’ll take every single piece of the blame if you want me to
21 But you know that there is no innocent one in this game for two
22 I’ll go, I’ll go and then you go, you go out and spill the truth
23 Can we both say the words and forget this?
24 Is it too late now to say sorry?
25 Cause I’m missing more than just your body
26 Is it too late now to say sorry?
27 Yeah I know that I let you down
28 Is it too late to say I’m sorry now?
29 I’m not just trying to get you back on me
30 Cause I’m missing more than just your body
31 Is it too late now to say sorry?
32 Yeah I know that I let you down
33 Is it too late to say sorry now?
34 I’m sorry, yeah
35 Sorry, oh
36 Sorry
37 Yeah I know that I let you down
38 Is it too late to say sorry now?
39 I’m sorry, yeah
40 Sorry, oh
41 Sorry
42 Yeah I know that I let you down
43 Is it too late to say sorry now?''')
44 fo.close()

fo=open('test.txt','r')  # reopen the file for reading
sorry=fo.read()  # read the whole text into one string
fo.close()
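The same write-then-read step can also be written with `with` blocks, which close the file automatically. This is only a minimal sketch: the file name `sample.txt`, the two-line sample text, and the explicit `encoding='utf-8'` are assumptions for illustration, whereas the code above relies on the platform default encoding.

```python
# Hypothetical short sample; the post writes the full lyrics instead.
sample = "Is it too late now to say sorry?\nYeah I know that I let you down\n"

# 'with' closes the file even if an error occurs inside the block.
with open('sample.txt', 'w', encoding='utf-8') as fo:
    fo.write(sample)

with open('sample.txt', 'r', encoding='utf-8') as fo:
    sorry = fo.read()

print(len(sorry))
```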

 

2. Split the text into words

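sorry=sorry.lower()  # lowercase so 'Sorry' and 'sorry' count as the same word
for i in '?,':
    sorry=sorry.replace(i,' ')  # turn punctuation into spaces
words=sorry.split()  # split on any whitespace, including line breaks, with no empty strings
print(words)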
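The loop above only handles `?` and `,`. As an alternative sketch (not the author's method), a regular expression can pull the words out directly, whatever punctuation surrounds them. The pattern and the sample string are assumptions chosen for illustration.

```python
import re

# Hypothetical sample text containing a curly apostrophe and line breaks.
text = "I’m sorry, yeah\nSorry, oh\nIs it too late now to say sorry?"

# Lowercase first, then keep runs of letters; a straight or curly
# apostrophe is allowed inside a word so "i’m" stays one token.
words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
print(words)
```

Because the pattern matches words instead of removing separators, new punctuation marks in the input do not need to be listed one by one.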

 

 

3. Count words with a dictionary

4. Exclude grammatical words

 

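dic={}  # start with an empty dictionary
words.sort()  # sort the split words (not required for counting)
d=set(words)  # the set d holds each distinct word once
for i in d:
    dic[i]=words.count(i)  # map each word i to its number of occurrences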
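The counting code above does not filter anything yet. A minimal sketch of excluding grammatical words follows. The `stop_words` set is a small hand-picked illustration rather than a list from the original post, and the sample `words` list stands in for the real one.

```python
# Hypothetical sample; in the post, 'words' comes from splitting the lyrics.
words = ['is', 'it', 'too', 'late', 'now', 'to', 'say', 'sorry',
         'i', 'know', 'that', 'i', 'let', 'you', 'down', 'sorry']

# Assumed, hand-picked stop words; extend as needed.
stop_words = {'i', 'you', 'it', 'is', 'to', 'that', 'a', 'the', 'and', 'of'}

dic = {}
for w in words:
    if w in stop_words:
        continue  # skip grammatical words
    dic[w] = dic.get(w, 0) + 1  # count in a single pass over the list

print(dic)
```

Counting with `dic.get(w, 0) + 1` also avoids calling `words.count()` once per distinct word, which rescans the whole list each time.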

 

5. Sort

wc=list(dic.items())  # list of (word, count) pairs
wc.sort(key=lambda x:x[1],reverse=True)  # sort by count, highest first
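The standard library's `collections.Counter` can do the counting and the sorting together. This is an alternative sketch, not the approach used in the post, and the sample list is an assumption.

```python
from collections import Counter

# Hypothetical sample list of already-split words.
words = ['sorry', 'late', 'sorry', 'now', 'late', 'sorry']

counts = Counter(words)
# most_common(n) returns up to n (word, count) pairs, highest count first.
top = counts.most_common(2)
print(top)
```

`most_common(20)` returns fewer pairs when there are fewer than 20 distinct words, so it does not fail on short inputs.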

 

6. Print the top 20

for item in wc[:20]:  # slicing avoids an IndexError when there are fewer than 20 words
    print(item)

Output:

 

Reposted from: https://www.cnblogs.com/zhuyinyinyin/p/7595206.html
