
NLP-Python
fxjtoday
POS Tagging
POS tagging is part-of-speech tagging; the tags are also called word classes or lexical categories. The names vary, but they all mean labeling each word with its part of speech. NLTK's off-the-shelf tools make it easy to POS-tag a piece of text:
>>> text = nltk.word_tokenize("And now for something completely different")
>>> nltk.pos_tag(text)
[('And', 'C…
Original · 2010-08-26 17:35:00 · 25038 views · 1 comment
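The excerpt above relies on `nltk.word_tokenize` and `nltk.pos_tag`, both of which need model data fetched with `nltk.download`. As a minimal sketch of what a tagger produces, the example below instead uses `nltk.RegexpTagger`, which needs no downloaded data. The patterns are illustrative assumptions, not the rules behind `nltk.pos_tag`, and the whitespace split is a stand-in for real tokenization.

```python
import nltk

# Hand-written patterns (assumptions for illustration); they are tried
# in order and the first one that matches a token decides its tag.
patterns = [
    (r"^([Aa]nd|[Oo]r|[Bb]ut)$", "CC"),  # a few coordinating conjunctions
    (r".*ly$", "RB"),                    # words ending in -ly: adverb
    (r".*", "NN"),                       # fallback: tag everything else as a noun
]
tagger = nltk.RegexpTagger(patterns)

# Plain whitespace split instead of nltk.word_tokenize (no model data needed).
tokens = "And now for something completely different".split()
tagged = tagger.tag(tokens)

for word, tag in tagged:
    print(word, tag)
```

The result has the same shape as the output of `nltk.pos_tag`: a list of `(word, tag)` tuples. The fallback rule is crude and mislabels words such as "now" and "different", which is why a trained tagger is normally preferred.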
Classify Text With NLTK
Classification is the task of choosing the correct class label for a given input. A classifier is called supervised if it is built from training corpora that contain the correct label for each input. Here, an example shows how to use NLTK to train a classifier and…
Original · 2010-09-03 18:09:00 · 9698 views · 0 comments
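To make the idea of a supervised classifier concrete, here is a minimal sketch using `nltk.NaiveBayesClassifier`. The tiny name list, the labels, and the `name_features` helper are hypothetical and hand-made for illustration; they are not the example from the post, which is truncated above.

```python
import nltk

def name_features(name):
    """Hypothetical feature extractor: describe a name by its last letter."""
    return {"last_letter": name[-1].lower()}

# Tiny hand-labeled training set (an assumption for illustration only).
labeled_names = [
    ("Anna", "female"), ("Maria", "female"), ("Julia", "female"), ("Emma", "female"),
    ("John", "male"), ("Mark", "male"), ("Peter", "male"), ("David", "male"),
]

# Supervised training: each input is paired with its correct label.
train_set = [(name_features(name), label) for name, label in labeled_names]
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Classify names that were not in the training data.
print(classifier.classify(name_features("Sophia")))
print(classifier.classify(name_features("Oliver")))
```

With so little data the model only learns which final letters co-occur with which label; a realistic setup would train on a labeled corpus and hold out part of it to measure accuracy.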
Extracting Information from Text With NLTK
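Most real-world data is either "unstructured", such as ordinary txt documents, or "semi-structured", such as HTML, and it takes some technique to extract useful information from data like this. If all data were "structured", such as XML or a relational database, no special extraction would be needed: you could retrieve whatever you wanted based on the metadata. So let's look at how to extract information from text with NLTK. First, the raw text of the document is split into sentences using a sentence segmenter, and ea…
Original · 2010-09-08 16:58:00 · 5567 views · 1 comment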
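The excerpt is cut off after the sentence-segmentation step, so the following is only a rough sketch under stated assumptions. It splits raw text into sentences with a naive regular expression (a stand-in for NLTK's sentence segmenter, which needs downloaded Punkt data), then pulls noun phrases out of one hand-tagged sentence with `nltk.RegexpParser`. The sample text, the hand-supplied tags, and the chunk grammar are all assumptions for illustration and may differ from what the post goes on to describe.

```python
import re
import nltk

raw = "The little yellow dog barked at the cat. The cat ran away."

# Step 1: naive sentence segmentation, splitting after ., ! or ?
# (a simplification; it breaks on abbreviations such as "Dr.").
sentences = re.split(r"(?<=[.!?])\s+", raw.strip())

# Step 2: a hand-tagged version of the first sentence. A real pipeline
# would tokenize and POS-tag each sentence instead of hard-coding tags.
tagged = [
    ("The", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
    ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN"),
]

# Step 3: chunk noun phrases: optional determiner, any adjectives, a noun.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN>}")
tree = chunker.parse(tagged)

noun_phrases = [
    " ".join(word for word, _tag in subtree.leaves())
    for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
]
print(sentences)
print(noun_phrases)
```

The extracted noun phrases are candidate entities; relating them to one another would be a further step that this sketch does not attempt.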