This article uses Paoding 2.0.4.
1. Preparation
Required files:
paoding-analysis.jar
dic (the dictionary directory)
paoding-analysis.properties
2. Installation and import
Put paoding-analysis.jar in a directory on the classpath and add it to the project.
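A quick way to confirm the jar is really visible to the program is to try loading the analyzer class by name. The sketch below is only a diagnostic helper; the class name ClasspathCheck and its method are made up for this illustration, and only the PaodingAnalyzer class name comes from Paoding itself.

```java
// Hypothetical helper: reports whether a class can be loaded from the current classpath.
class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // "false" means: load the class but do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // The class itself is not on the classpath.
            return false;
        } catch (NoClassDefFoundError e) {
            // The class was found, but something it depends on is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "net.paoding.analysis.analyzer.PaodingAnalyzer";
        System.out.println(name + " loadable: " + isOnClasspath(name));
    }
}
```

If this prints false, the jar (or a library it depends on, such as Lucene) is probably not on the classpath yet.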
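Edit paoding-analysis.properties and set paoding.dic.home to the directory that holds the dictionaries. For example, if the dictionaries sit on the classpath, use "paoding.dic.home=classpath:dic". If they are stored somewhere else, set it along the lines of
paoding.dic.home=/paodingdic/paoding/dic
Then add the following lines to the configuration file: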
paoding.knife.class.letterKnife=net.paoding.analysis.knife.LetterKnife
paoding.knife.class.numberKnife=net.paoding.analysis.knife.NumberKnife
paoding.knife.class.cjkKnife=net.paoding.analysis.knife.CJKKnife
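To double-check what was written into the file, the two forms of paoding.dic.home can be told apart with a few lines of plain Java. This is a minimal sketch using only java.util.Properties. It does not reproduce how Paoding resolves the value internally, and DicHomeCheck with its methods is a hypothetical name chosen for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: shows which kind of location a paoding.dic.home value names.
class DicHomeCheck {

    // Parses properties text; a real program would read the file instead.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("could not read properties text", e);
        }
        return props;
    }

    // Distinguishes the "classpath:" form from an ordinary filesystem path.
    static String describe(String dicHome) {
        if (dicHome == null || dicHome.trim().isEmpty()) {
            return "missing";
        }
        String value = dicHome.trim();
        if (value.startsWith("classpath:")) {
            return "classpath resource: " + value.substring("classpath:".length());
        }
        return "filesystem path: " + value;
    }

    public static void main(String[] args) {
        Properties props = parse("paoding.dic.home=classpath:dic\n");
        System.out.println(describe(props.getProperty("paoding.dic.home")));
    }
}
```

Note that a backslash is an escape character in .properties files, so a Windows path is safer written with forward slashes or doubled backslashes.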
3. Usage
First add import net.paoding.analysis.analyzer.PaodingAnalyzer;
Then replace the analyzer the code used before, for example StandardAnalyzer, with PaodingAnalyzer.
Running the program may produce an error like the following:
error in handler path=file:/C:/Users/Administrator/Workspaces/MyEclipse 8.5/lucene0623/lib/paoding-analysis.jar!/paoding-analysis.properties
error in handler jarPath=/C:/Users/Administrator/Workspaces/MyEclipse 8.5/lucene0623/lib/paoding-analysis.jar!/
net.paoding.analysis.exception.PaodingAnalysisException: java.io.FileNotFoundException: C:\Users\Administrator\Workspaces\MyEclipse 8.5\lucene0623\lib\paoding-analysis.jar! (系统找不到指定的文件。)
at net.paoding.analysis.knife.PaodingMaker.getProperties(PaodingMaker.java:138)
at net.paoding.analysis.analyzer.PaodingAnalyzer.init(PaodingAnalyzer.java:70)
at net.paoding.analysis.analyzer.PaodingAnalyzer.<init>(PaodingAnalyzer.java:59)
at net.paoding.analysis.analyzer.PaodingAnalyzer.<init>(PaodingAnalyzer.java:52)
at crawler.newcrawler.crawler(newcrawler.java:131)
at crawler.newcrawler.main(newcrawler.java:490)
Caused by: java.io.FileNotFoundException: C:\Users\Administrator\Workspaces\MyEclipse 8.5\lucene0623\lib\paoding-analysis.jar! (系统找不到指定的文件。)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:114)
at java.util.jar.JarFile.<init>(JarFile.java:133)
at java.util.jar.JarFile.<init>(JarFile.java:97)
at net.paoding.analysis.knife.PaodingMaker.getFileLastModified(PaodingMaker.java:242)
at net.paoding.analysis.knife.PaodingMaker.loadProperties(PaodingMaker.java:207)
at net.paoding.analysis.knife.PaodingMaker.getProperties(PaodingMaker.java:129)
... 5 more
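This error occurs because the path to the Paoding jar and configuration file contains a space, here the one in "MyEclipse 8.5". (The Chinese text in the trace is the Windows message "The system cannot find the file specified.") Rename the folder to remove the space and the program runs normally.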
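The space problem can be caught before the analyzer is created by looking at where the configuration file is loaded from. The following is a rough sketch, not part of Paoding: PathSpaceCheck and its methods are hypothetical names, and it only tests for a literal space or its URL-encoded form %20.

```java
import java.net.URL;

// Hypothetical helper: warns when a resource location contains a space.
class PathSpaceCheck {

    // True if the location holds a literal space or the URL-encoded form "%20".
    static boolean hasSpace(String location) {
        return location != null
                && (location.indexOf(' ') >= 0 || location.contains("%20"));
    }

    // Returns the URL of a classpath resource as text, or null if it is not found.
    static String locate(String resourceName) {
        URL url = PathSpaceCheck.class.getClassLoader().getResource(resourceName);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        String location = locate("paoding-analysis.properties");
        if (location == null) {
            System.out.println("paoding-analysis.properties not found on the classpath");
        } else if (hasSpace(location)) {
            System.out.println("Path contains a space, consider moving the project: " + location);
        } else {
            System.out.println("Path looks fine: " + location);
        }
    }
}
```

Running it from the same project as the crawler shows the location the class loader reports, which makes a stray space in a parent folder easy to spot.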
On a normal run the console prints messages like these:
09:04:38,703 INFO PaodingMaker:134 - config paoding analysis from: E:\search\all\eclipse\workspace\lucenedemo\file:\E:\search\all\eclipse\workspace\lucenedemo\lib\paoding-analysis.jar!\paoding-analysis.properties;E:\search\all\eclipse\workspace\lucenedemo\file:\E:\search\all\eclipse\workspace\lucenedemo\lib\paoding-analysis.jar!\paoding-analysis-default.properties;E:\search\all\eclipse\workspace\lucenedemo\file:\E:\search\all\eclipse\workspace\lucenedemo\lib\paoding-analysis.jar!\paoding-analyzer.properties;E:\search\all\eclipse\workspace\lucenedemo\file:\E:\search\all\eclipse\workspace\lucenedemo\lib\paoding-analysis.jar!\paoding-dic-home.properties;E:\paoding2.0.4\paoding\dic\paoding-dic-names.properties;E:\search\all\eclipse\workspace\lucenedemo\file:\E:\search\all\eclipse\workspace\lucenedemo\lib\paoding-analysis.jar!\paoding-knives.properties;E:\search\all\eclipse\workspace\lucenedemo\file:\E:\search\all\eclipse\workspace\lucenedemo\lib\paoding-analysis.jar!\paoding-knives-user.properties
09:04:38,750 INFO PaodingMaker:427 - add knike: net.paoding.analysis.knife.CJKKnife
09:04:38,765 INFO PaodingMaker:427 - add knike: net.paoding.analysis.knife.LetterKnife
09:04:38,765 INFO PaodingMaker:427 - add knike: net.paoding.analysis.knife.NumberKnife