mmseg4j 1.8.3 supports Solr 1.4.1. The latest release at the time of writing, mmseg4j 1.8.5, is too new and does not support Solr 1.4.1.
Create two folders, lib and dic, under $SOLR_HOME.
Download mmseg4j-1.8.3.zip to D:/solrworkspace/mmseg4j-1.8.3.zip.
Extract D:/solrworkspace/mmseg4j-1.8.3.zip to D:/solrworkspace/mmseg4j-1.8.3 (referred to below as $MMSEG_HOME).
Copy the *.dic files from $MMSEG_HOME/data into $SOLR_HOME/dic.
Copy $MMSEG_HOME/mmseg4j-all-1.8.3.jar into $SOLR_HOME/lib.
Edit $SOLR_HOME/conf/schema.xml.
Make three copies of the existing "text" field type:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true" />
......
</analyzer>
</fieldType>
and change the copies as follows, one for each mmseg4j mode (complex, max-word, simple):
<fieldType name="text_mmseg_complex" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true" />
......
</analyzer>
</fieldType>
<fieldType name="text_mmseg_max_word" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true" />
......
</analyzer>
</fieldType>
<fieldType name="text_mmseg_simple" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="simple" dicPath="dic"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true" />
......
</analyzer>
</fieldType>
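The three field types above only show an index-time analyzer, and the rest of each definition is elided. As a minimal sketch, assuming you want mmseg4j to segment query text the same way as indexed text, a field type can declare both analyzers explicitly. The name text_mmseg_complex_both is hypothetical, and the filter list is reduced to the single stop filter shown above.

```xml
<!-- Sketch only: "text_mmseg_complex_both" is a made-up name for illustration. -->
<fieldType name="text_mmseg_complex_both" class="solr.TextField" positionIncrementGap="100">
  <!-- Applied to field values when documents are indexed -->
  <analyzer type="index">
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
  <!-- Applied to the query string at search time -->
  <analyzer type="query">
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>
```

Using the same tokenizer and mode on both sides keeps query tokens aligned with the indexed tokens. A different mode for one side is possible, but whether it helps depends on your data.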
Add the following three fields, one per field type:
<fields>
<field name="textMmsegComplex" type="text_mmseg_complex" indexed="true" stored="false"/>
<field name="textMmsegMaxWord" type="text_mmseg_max_word" indexed="true" stored="false"/>
<field name="textMmsegSimple" type="text_mmseg_simple" indexed="true" stored="false"/>
</fields>
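The fields above are declared with stored="false", and these notes do not say how they get their content. One common option, shown here only as a sketch, is to fill them from an existing field with copyField. The source field name title is an assumption and must be replaced with a field that actually exists in your schema.

```xml
<!-- Sketch only: "title" is a hypothetical source field. -->
<!-- copyField elements go directly under <schema>, outside the <fields> element. -->
<copyField source="title" dest="textMmsegComplex"/>
<copyField source="title" dest="textMmsegMaxWord"/>
<copyField source="title" dest="textMmsegSimple"/>
```

With this in place, each document's title text would be indexed three times, once per segmentation mode, which makes it easy to compare the modes against the same content.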
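Open http://localhost:8080/solr/admin/analysis.jsp
In the Field drop-down, select name, then enter one of textMmsegComplex, textMmsegMaxWord, or textMmsegSimple in the text box after it. These three values correspond to the three mmseg4j segmentation modes.
Field value (Index) is the text to segment as it would be at index time; Field value (Query) is the text to segment as it would be at query time.
In the text boxes after these labels, enter the sentence or phrase you want segmented.
Click Analyze to see the resulting tokens for indexing and for querying.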