Chinese Word Segmentation in .NET
This example uses:
Lucene.Net.dll http://www.apache.org/dist/incubator/lucene.net/binaries/2.9.4g-incubating/
PanGu.dll http://pangusegment.codeplex.com/releases/view/50811
PanGu.Lucene.Analyzer.dll
and the dictionary files http://pangusegment.codeplex.com/releases/view/31531
Sample code:
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.PanGu;

namespace FcCApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sentence to segment (mixed Chinese and Latin text).
            string text = "基于java语言开发的轻量级的中文分词工具包";

            // Use the PanGu segmenter through its Lucene.Net analyzer.
            Analyzer analyzer = new PanGuAnalyzer();
            StringReader reader = new StringReader(text);

            // The field name is not needed here, so an empty string is passed.
            TokenStream ts = analyzer.ReusableTokenStream("", reader);

            // Read tokens until the stream is exhausted and print each term followed by "|".
            Token t = null;
            while ((t = ts.Next()) != null)
            {
                Console.Write(t.TermText() + "|");
            }
        }
    }
}
Output:
基于|java|语言|开发|的|轻量级|的|中文|分词|工具包|
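If the terms are needed as a collection rather than printed directly, the same loop can be wrapped in a small helper. The following is only a sketch: `SegmentHelper` and `Segment` are hypothetical names that belong to neither PanGu nor Lucene.Net. It relies on the same `ReusableTokenStream`, `Next()`, and `TermText()` calls as the sample above, so it assumes the same Lucene.Net 2.9.x-era API. Later Lucene.Net releases moved to an attribute-based token API, so it may not compile against them unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.PanGu;

namespace FcCApp
{
    // Hypothetical helper for illustration; not part of PanGu or Lucene.Net.
    static class SegmentHelper
    {
        // Runs the given analyzer over the text and returns the terms in order.
        public static List<string> Segment(Analyzer analyzer, string text)
        {
            List<string> terms = new List<string>();
            using (StringReader reader = new StringReader(text))
            {
                // Same calls as the sample: empty field name, then pull tokens until null.
                TokenStream ts = analyzer.ReusableTokenStream("", reader);
                Token t;
                while ((t = ts.Next()) != null)
                {
                    terms.Add(t.TermText());
                }
            }
            return terms;
        }
    }
}
```

A call such as `string.Join("|", SegmentHelper.Segment(new PanGuAnalyzer(), text).ToArray())` should then yield the same terms as the output above, without the trailing separator.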
Sample download:
http://download.youkuaiyun.com/detail/lijun7788/4412762