The jieba Tokenizer

Author: 不写代码的程序员~zs

Category: Natural Language Processing. Tags: natural language processing, nlp

Article link: https://blog.youkuaiyun.com/m0_57064565/article/details/119453603

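Precise mode is the default: calling jieba.cut without cut_all (or with cut_all=False) splits the text into non-overlapping words.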

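Search-engine mode segmentation: jieba.cut_for_search additionally splits long words into shorter ones, which improves recall when building a search index.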

Keyword extraction

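import jieba  # required for jieba.cut; importing jieba.analyse under an alias does not bind the name jieba
import jieba.analyse as analyse

# Sample sentence to segment (an assumption; the original did not show how text was defined)
text = "我来到北京清华大学"

# Precise mode (cut_all=False) is the default
seg_list = jieba.cut(text, cut_all=False)
print("Segmentation result:")
print(" ".join(seg_list))

# Extract keywords
with open('data/nba.txt', encoding='utf8') as f:
    lines = f.read()  # read the whole file into a single string
withWeight = True
# topK=20 keeps the 20 highest-weighted keywords; allowPOS=() disables part-of-speech filtering
tags = analyse.extract_tags(lines, topK=20, withWeight=withWeight, allowPOS=())

if withWeight:
    print(tags)  # list of (keyword, weight) tuples
    print("-------")
    for tag, weight in tags:
        print("tag: %s\t\t weight: %f" % (tag, weight))
else:
    print(tags)  # list of keyword strings
    print("--------")
    print(" ".join(tags))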