
Natural Language Processing
HeatDeath
Learn by doing!
A basic experiment with jieba ("结巴") Chinese word segmentation
In [1]: import jieba
In [2]: a = jieba.cut("我来到了清华大学", cut_all=True)
In [3]: a
Out[3]: <generator object Tokenizer.cut at 0x000001E8E9CBFDB0>
In [4]: list(a)
Building prefix dict from the default dictionar…

Original · 2017-04-08 01:38:06 · 5756 views · 1 comment
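The preview cuts off mid-log, but the calls it shows are jieba's standard API: jieba.cut returns a lazy generator (hence the Out[3] repr), and the tokens only materialize once it is consumed. A minimal runnable sketch of the same idea, here using the list-returning convenience wrapper jieba.lcut:

    # -*- coding: utf-8 -*-
    # Minimal sketch of the segmentation shown in the preview above.
    import jieba

    sentence = "我来到了清华大学"

    # Full mode: emit every dictionary word found, overlaps included.
    full = jieba.lcut(sentence, cut_all=True)
    print("Full mode:", "/".join(full))

    # Accurate mode (the default): one non-overlapping segmentation,
    # usually what you want for downstream frequency counting.
    accurate = jieba.lcut(sentence)
    print("Accurate mode:", "/".join(accurate))

Full mode tends to produce overlapping candidates (e.g. both a compound and its parts), which is why the post passes cut_all=True to see everything the dictionary matches.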
An attempt at word segmentation, word-frequency counting, and word-cloud drawing with wordcloud, jieba, PIL, matplotlib, and numpy
#coding=utf-8
from wordcloud import WordCloud
import jieba
import PIL
import matplotlib.pyplot as plt
import numpy as np

def wordcloudplot(txt):
    path = r'ancient_style.ttf'
    # path = unicode(pat…

Original · 2017-04-08 15:36:45 · 3174 views · 1 comment
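The snippet cuts off inside wordcloudplot, so the following is not the post's full code but a sketch of the pipeline it names: segment with jieba, join the tokens with spaces so WordCloud can count frequencies, then render with matplotlib. The font file name 'ancient_style.ttf' is taken from the preview (any Chinese-capable font works); the input text is a placeholder.

    # -*- coding: utf-8 -*-
    # Sketch: jieba segmentation -> WordCloud frequency counting -> plot.
    import jieba
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    text = "我来到了清华大学,自然语言处理很有趣。"  # placeholder input

    # WordCloud splits on whitespace, so join the segmented tokens
    # with spaces to make Chinese word frequencies countable.
    tokens = " ".join(jieba.cut(text))

    wc = WordCloud(font_path="ancient_style.ttf",  # needed for CJK glyphs
                   background_color="white",
                   width=800, height=600).generate(tokens)

    plt.imshow(wc, interpolation="bilinear")
    plt.axis("off")
    plt.show()

Without font_path pointing at a font that contains CJK glyphs, the rendered cloud shows empty boxes instead of Chinese characters, which is presumably why the original code carries a .ttf path at the top of wordcloudplot.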