Lucene: Field.Index and Field.Store

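This article explains how to configure the store and index properties of a Lucene Field, including what each combination means and when to use it, to help you understand how to make document retrieval effective.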

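Many examples online use Lucene 1.4.3. In newer versions of Lucene, calls such as doc.add(new Field("content", curArt.getContent(), Field.Store.NO, Field.Index.TOKENIZED)); differ considerably from the old versions.
A Field has two configurable properties: store and index. The store property controls whether the Field's value is stored; the index property controls whether the Field is indexed. That may sound like stating the obvious, but combining the two properties correctly matters a great deal.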
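Field.Index      Field.Store   Description
TOKENIZED        YES           Tokenized, indexed, and stored
TOKENIZED        NO            Tokenized and indexed, but not stored
NO               YES           Cannot be searched; it only accompanies the searched content (e.g. a URL)
UN_TOKENIZED     YES/NO        Not tokenized; searched as a whole value, so searching for part of it finds nothing
NO               NO            Not a valid combination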
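
The combinations in the table can be restated as a small self-contained check. This is only a sketch: the `Store` and `Index` enums below are hypothetical stand-ins that mirror the names in the table rather than Lucene's own classes, and `describe` is an illustrative helper, not a Lucene method.

```java
public class FieldOptionTable {
    // Hypothetical stand-ins mirroring the names in the table; these are NOT Lucene classes.
    public enum Store { YES, NO }
    public enum Index { TOKENIZED, UN_TOKENIZED, NO }

    /** Describes one Index/Store combination, rejecting the one the table marks as invalid. */
    public static String describe(Index index, Store store) {
        if (index == Index.NO && store == Store.NO) {
            // Neither searchable nor retrievable: such a field would serve no purpose.
            throw new IllegalArgumentException("a field must be indexed, stored, or both");
        }
        String search;
        if (index == Index.TOKENIZED) {
            search = "searchable by individual terms";
        } else if (index == Index.UN_TOKENIZED) {
            search = "searchable only by its whole value";
        } else {
            search = "not searchable";
        }
        String retrieval = (store == Store.YES)
                ? "value retrievable from hits"
                : "value not retrievable from hits";
        return search + "; " + retrieval;
    }

    public static void main(String[] args) {
        // Walk every combination and print how the table classifies it.
        for (Index index : Index.values()) {
            for (Store store : Store.values()) {
                try {
                    System.out.println(index + " + " + store + ": " + describe(index, store));
                } catch (IllegalArgumentException e) {
                    System.out.println(index + " + " + store + ": rejected (" + e.getMessage() + ")");
                }
            }
        }
    }
}
```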

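To search on a Field, Field.Index must be set to TOKENIZED or UN_TOKENIZED. TOKENIZED tokenizes the Field's content; UN_TOKENIZED does not, so the Field matches only when the query equals its entire value.
If Field.Store is NO, the field's value cannot be read back from the index data in the search results; it will be null.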
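
To make the search and retrieval behavior concrete, here is a toy model in plain Java. It is a sketch under stated assumptions, not Lucene itself: `ToyFieldIndex`, its enums, and its methods are all hypothetical, and a lowercase whitespace split stands in for a real analyzer. It only illustrates why a partial value does not match an un-tokenized field and why a non-stored field reads back as null.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ToyFieldIndex {
    // Hypothetical stand-ins for the options discussed above; NOT Lucene classes.
    public enum Store { YES, NO }
    public enum Index { TOKENIZED, UN_TOKENIZED, NO }

    // "field:term" -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> (field name -> stored value)
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();

    public void addField(int docId, String name, String value, Store store, Index index) {
        if (store == Store.NO && index == Index.NO) {
            throw new IllegalArgumentException("a field must be indexed, stored, or both");
        }
        if (index == Index.TOKENIZED) {
            // Assumption: a lowercase whitespace split replaces a real analyzer.
            for (String term : value.toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    post(name, term, docId);
                }
            }
        } else if (index == Index.UN_TOKENIZED) {
            // The whole value becomes a single term.
            post(name, value, docId);
        }
        if (store == Store.YES) {
            stored.computeIfAbsent(docId, k -> new HashMap<>()).put(name, value);
        }
    }

    private void post(String field, String term, int docId) {
        postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
    }

    /** Exact term lookup: returns the ids of documents whose field contains the term. */
    public Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.emptySet());
    }

    /** Returns the stored value of a field, or null if it was not stored. */
    public String get(int docId, String field) {
        Map<String, String> fields = stored.get(docId);
        return fields == null ? null : fields.get(field);
    }

    public static void main(String[] args) {
        ToyFieldIndex idx = new ToyFieldIndex();
        idx.addField(1, "content", "Lucene field options", Store.NO, Index.TOKENIZED);
        idx.addField(1, "id", "doc-001", Store.YES, Index.UN_TOKENIZED);
        idx.addField(1, "url", "http://example.com/a", Store.YES, Index.NO);

        System.out.println(idx.search("content", "lucene"));          // [1]  single term matches
        System.out.println(idx.search("id", "doc"));                  // []   part of the value does not match
        System.out.println(idx.search("id", "doc-001"));              // [1]  whole value matches
        System.out.println(idx.search("url", "http://example.com/a")); // []   not indexed, so not searchable
        System.out.println(idx.get(1, "content"));                    // null indexed but not stored
        System.out.println(idx.get(1, "url"));                        // http://example.com/a
    }
}
```

In this model the `content` field can be found by a single term but cannot be read back, while the `url` field can be read back but never found, which matches the TOKENIZED/NO and NO/YES rows of the table.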
