Lucene in Action, Chapter 3 (2)

This post walks through Lucene's query types (TermQuery, TermRangeQuery, NumericRangeQuery, PrefixQuery, BooleanQuery, PhraseQuery, WildcardQuery and FuzzyQuery) and shows, with examples, how to use each of them for searching.


Lucene's various query types
1. TermQuery
   The simplest Query type: it matches documents in which a given field contains a term with a given value.

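2. TermRangeQuery
   Terms in the index are stored in lexicographic order, so a TermRangeQuery can match every term that falls within a range.
For example:
Query query = new TermRangeQuery("city", "aa", "am", true, true);
TopDocs hits = searcher.search(query, 20);

This matches the terms that sort between "aa" and "am": aa*, ab*, ... al*, and "am" itself. The two trailing boolean arguments say whether the lower bound "aa" and the upper bound "am" are themselves included. Note that a longer term such as "amsterdam" sorts after "am" and therefore falls outside the range.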
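
To make the range boundaries concrete, here is a minimal sketch in plain Java, not Lucene code. It assumes that ordinary String ordering is a fair stand-in for the index's term order, and that the analyzer has lowercased the city names from the full listing; TermRangeSketch and inRange are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper for illustration only; this is not part of Lucene.
final class TermRangeSketch {

    // Returns the terms between lower and upper in String order.
    // includeLower / includeUpper decide whether the bounds themselves count.
    static SortedSet<String> inRange(SortedSet<String> terms, String lower, String upper,
                                     boolean includeLower, boolean includeUpper) {
        TreeSet<String> sorted = new TreeSet<String>(terms);
        return sorted.subSet(lower, includeLower, upper, includeUpper);
    }

    public static void main(String[] args) {
        // Assumed lowercased forms of the "city" values used in this post.
        SortedSet<String> cityTerms = new TreeSet<String>(
                Arrays.asList("aeijing", "amsterdam", "venice"));
        System.out.println(inRange(cityTerms, "aa", "am", true, true));
    }
}
```

Under these assumptions only "aeijing" lies in the range, because "amsterdam" compares greater than "am".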

3. NumericRangeQuery
Matches a numeric range. The field being searched must have been indexed as a NumericField.
Query query = NumericRangeQuery.newIntRange("intID", from, to, true, true);
TopDocs hits = searcher.search(query, 20);

4. PrefixQuery (prefix search)
    Matches terms that start with a given prefix.
    For example, with prefix = "bri", both "bridge" and "bright" match.
Term t = new Term(field, prefix);
Query query = new PrefixQuery(t);
TopDocs hits = searcher.search(query, 20);
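
Because terms are kept in sorted order, all terms sharing a prefix sit next to each other. The sketch below shows that idea in plain Java; it is not how Lucene is implemented, and PrefixSketch and withPrefix are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper for illustration only; this is not part of Lucene.
final class PrefixSketch {

    // Collects every term that starts with the prefix from a sorted term set.
    static List<String> withPrefix(SortedSet<String> terms, String prefix) {
        List<String> out = new ArrayList<String>();
        // tailSet starts at the first term >= prefix; matching terms are contiguous,
        // so the scan can stop at the first term that does not start with the prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<String>(
                Arrays.asList("a", "bridge", "bridges", "bright", "brown", "canals"));
        System.out.println(withPrefix(terms, "bri"));
    }
}
```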

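5. BooleanQuery (combining several queries)
Term t = new Term("contents", "bri");
Query query1 = new PrefixQuery(t);
Query query2 = NumericRangeQuery.newIntRange("intID", 1, 3, true, true);

// create a boolean query
BooleanQuery query = new BooleanQuery();
query.add(query1, BooleanClause.Occur.SHOULD);
query.add(query2, BooleanClause.Occur.MUST);

TopDocs hits = searcher.search(query, 20);
Note that BooleanClause.Occur.MUST means AND, BooleanClause.Occur.SHOULD means OR, and BooleanClause.Occur.MUST_NOT means NOT. When a query contains a MUST clause, as above, the MUST clause decides which documents match; the SHOULD clause is optional and only raises the score of documents that also satisfy it.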
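
The way the three Occur values combine can be sketched with plain sets of document numbers. This is a simplified model rather than Lucene's implementation: it ignores scoring, and BooleanSketch, Clause and evaluate are hypothetical names. The document numbers in main are an assumption derived from the sample data in the full listing (documents 0 and 2 contain a term starting with "bri"; documents 1 and 2 have an intID between 1 and 3).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model for illustration only; this is not part of Lucene.
final class BooleanSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final Occur occur;
        final Set<Integer> docs; // documents matched by the wrapped query

        Clause(Occur occur, Integer... docs) {
            this.occur = occur;
            this.docs = new TreeSet<Integer>(Arrays.asList(docs));
        }
    }

    static Set<Integer> evaluate(List<Clause> clauses) {
        Set<Integer> must = null;
        Set<Integer> should = new TreeSet<Integer>();
        Set<Integer> mustNot = new TreeSet<Integer>();
        for (Clause c : clauses) {
            switch (c.occur) {
                case MUST: // every MUST clause has to match: intersect
                    if (must == null) {
                        must = new TreeSet<Integer>(c.docs);
                    } else {
                        must.retainAll(c.docs);
                    }
                    break;
                case SHOULD: // any SHOULD clause may match: union
                    should.addAll(c.docs);
                    break;
                case MUST_NOT: // excluded documents
                    mustNot.addAll(c.docs);
                    break;
            }
        }
        // With a MUST clause present, SHOULD clauses do not restrict the result.
        // Without one, at least one SHOULD clause has to match.
        Set<Integer> result = (must != null) ? must : should;
        result.removeAll(mustNot);
        return result;
    }

    public static void main(String[] args) {
        List<Clause> clauses = Arrays.asList(
                new Clause(Occur.SHOULD, 0, 2),  // assumed hits of the prefix query
                new Clause(Occur.MUST, 1, 2));   // assumed hits of the numeric range query
        System.out.println(evaluate(clauses));
    }
}
```

In this model the SHOULD clause does not add document 0 to the result, since the MUST clause already fixes the matching set.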

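6. PhraseQuery (phrase search)
     Suppose we want a phrase search that matches "fox quick", "quick fox", "quick brown fox" or "quick red fox".
     PhraseQuery can do this. It measures a match with an edit distance: the total number of edit operations needed to turn one sequence into the other. For a phrase, the units being moved are whole terms, and each one-position move counts as one slop. Use setSlop to cap the slop a match is allowed to need.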
[Figure: illustration of the edit distance between phrases]

For example, matching "quick fox" against "quick [xxx] fox" needs a slop of 1.
Matching "fox quick" against "quick [xxx] fox" needs a slop of 3: two moves to swap fox and quick into the right order, and one more to step over the extra term xxx.
PhraseQuery query = new PhraseQuery();

// set max slop to 10
query.setSlop(10);
query.add(new Term("contents", "quick"));
query.add(new Term("contents", "fox"));
TopDocs hits = searcher.search(query, 20);
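
The slop figures quoted above can be reproduced with a small calculation. The sketch below is a simplified model and not Lucene's scorer: it assumes every query term occurs exactly once in the document, and SlopSketch and requiredSlop are hypothetical names. The idea is to shift each term's document position back by its index in the query, then measure how far apart the shifted positions are.

```java
// Hypothetical model for illustration only; this is not part of Lucene.
final class SlopSketch {

    // docPositions[i] is the position in the document of the i-th query term.
    // Returns the slop this occurrence would need under the simplified model.
    static int requiredSlop(int[] docPositions) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < docPositions.length; i++) {
            // Where the phrase would have to start for term i to sit at this position.
            int shifted = docPositions[i] - i;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        // An exact phrase gives identical shifted positions, so the result is 0.
        return max - min;
    }

    public static void main(String[] args) {
        // Document: quick(0) xxx(1) fox(2)
        System.out.println(requiredSlop(new int[] {0, 2})); // query "quick fox"
        System.out.println(requiredSlop(new int[] {2, 0})); // query "fox quick"
    }
}
```

Under this model "quick fox" needs a slop of 1 and "fox quick" needs 3, which agrees with the two examples in the text.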

7. WildcardQuery (wildcard search)

    A PrefixQuery is a special case of a WildcardQuery.
    * matches zero or more characters, and ? matches exactly one character.
// use wildcard "?ridg*"
WildcardQuery query = new WildcardQuery(new Term("contents", "?ridg*"));
TopDocs hits = searcher.search(query, 20);
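
To see which terms a pattern such as "?ridg*" accepts, the wildcard rules can be translated into a regular expression. This is a plain-Java sketch of the matching rules only, not the way Lucene evaluates the query; WildcardSketch, toRegex and matches are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; this is not part of Lucene.
final class WildcardSketch {

    // Translates a wildcard pattern into an equivalent regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");   // zero or more characters
            } else if (c == '?') {
                regex.append('.');    // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // True when the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("?ridg*", "bridge"));
        System.out.println(matches("?ridg*", "bridges"));
        System.out.println(matches("?ridg*", "ridge"));
    }
}
```

The last call returns false because ? requires one character before "ridg", while "bri*" would behave like the prefix "bri".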

8. FuzzyQuery (fuzzy search)
    FuzzyQuery is similar to PhraseQuery in that both rely on edit distance. The difference is that FuzzyQuery measures it between the characters inside a term, while PhraseQuery measures it between terms.
  For example, FuzzyQuery query = new FuzzyQuery(new Term("contents", "Amsteedam")); finds Amsterdam, because the edit distance between the two is 1.
As follows:
IndexSearcher searcher = new IndexSearcher(dir);
// "Amsterdam" is similar to "Amsteedam"
FuzzyQuery query = new FuzzyQuery(new Term("contents", "Amsteedam"));
TopDocs hits = searcher.search(query, 20);
showResult(hits, searcher);
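
The character-level edit distance mentioned above is the classic Levenshtein distance. Below is a minimal standalone implementation for checking distances by hand; it is not Lucene's code, it does not model how FuzzyQuery turns a distance into a similarity threshold, and EditDistanceSketch and distance are hypothetical names.

```java
// Hypothetical helper for illustration only; this is not part of Lucene.
final class EditDistanceSketch {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions and substitutions that turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = cur[j - 1] + 1;
                int deletion = prev[j] + 1;
                cur[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] swap = prev;
            prev = cur;
            cur = swap;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("amsteedam", "amsterdam"));
    }
}
```

For "amsteedam" and "amsterdam" a single substitution (e to r) is enough, so the distance is 1.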

package charpter3;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.TermVector;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Querys {
    private IndexWriter writer;
    protected String[] ids = { "1", "2", "3" };
    protected String[] unindexed = { "Netherlands", "Italy", "China" };
    protected String[] unstored = { "Amsterdam has a lot of bridge",
            "Venice has lots of canals", "Amsterddam bridges are a lot" };
    protected String[] text = { "Amsterdam", "Venice", "Aeijing" };

    private Directory dir = null;
    private IndexReader indexReader = null;

    public Querys(String indexDir) throws IOException {
        dir = FSDirectory.open(new File(indexDir));
        this.writer = new IndexWriter(dir, new StandardAnalyzer(
                Version.LUCENE_36), true, IndexWriter.MaxFieldLength.UNLIMITED);
        this.writer.setInfoStream(System.out);

        // create an index reader instance
        indexReader = IndexReader.open(dir);
    }

    public void addDocuments() throws CorruptIndexException, IOException {
        for (int i = 0; i < ids.length; i++) {
            Document doc = new Document();

            NumericField nfield = new NumericField("intID", 10);
            nfield.setIntValue(i);
            doc.add(nfield);

            doc.add(new Field("id", ids[i], Field.Store.YES,
                    Field.Index.NOT_ANALYZED));
            doc.add(new Field("country", unindexed[i], Field.Store.YES,
                    Field.Index.NO));
            doc.add(new Field("contents", unstored[i], Field.Store.YES,
                    Field.Index.ANALYZED));
            doc.add(new Field("city", text[i], Field.Store.YES,
                    Field.Index.ANALYZED));
            writer.addDocument(doc);

        }

        System.out.println("docs = " + writer.numDocs());

    }

    public void index() throws CorruptIndexException, IOException {
        this.addDocuments();
        this.commit();
    }

    public void expressionQuery() throws CorruptIndexException, IOException,
            ParseException {

        IndexSearcher searcher = new IndexSearcher(this.indexReader);

        QueryParser praser = new QueryParser(Version.LUCENE_CURRENT,
                "contents", new StandardAnalyzer(Version.LUCENE_CURRENT));

        // note
        Query query = praser.parse("+bridge -Amsterdam");
        System.out.println("query = " + query.toString());
        TopDocs hits = searcher.search(query, 20);
        showResult(hits, searcher);

    }

    public void termQuery(String fieldName, String q)
            throws CorruptIndexException, IOException, ParseException {
        // IndexSearcher searcher = new IndexSearcher(dir);

        // build an IndexSearcher on an IndexReader
        IndexSearcher searcher = new IndexSearcher(this.indexReader);

        Term t = new Term(fieldName, q.toLowerCase());
        Query query = new TermQuery(t);
        TopDocs hits = searcher.search(query, 20);
        showResult(hits, searcher);
    }

    public void termRangeQuery(String fieldName, String q)
            throws CorruptIndexException, IOException, ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);

        Query query = new TermRangeQuery("city", "aa", "am", true, true);
        TopDocs hits = searcher.search(query, 20);
        showResult(hits, searcher);
    }

    public void numericRangeQuery(int from, int to)
            throws CorruptIndexException, IOException, ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);

        Query query = NumericRangeQuery.newIntRange("intID", from, to, true, true);
        TopDocs hits = searcher.search(query, 20);
        showResult(hits, searcher);
    }

    public void prefixQuery(String field, String prefix)
            throws CorruptIndexException, IOException, ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);

        Term t = new Term(field, prefix);
        Query query = new PrefixQuery(t);
        TopDocs hits = searcher.search(query, 20);
        showResult(hits, searcher);
    }

    public void booleanQuery() throws CorruptIndexException, IOException,
            ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);

        Term t = new Term("contents", "bri");
        Query query1 = new PrefixQuery(t);

        Query query2 = NumericRangeQuery.newIntRange("intID", 1, 3, true, true);

        // create a boolean query
        BooleanQuery query = new BooleanQuery();
        query.add(query1, BooleanClause.Occur.SHOULD);
        query.add(query2, BooleanClause.Occur.MUST);

        TopDocs hits = searcher.search(query, 20);

        showResult(hits, searcher);

    }

    public void phraseQuery() throws CorruptIndexException, IOException,
            ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);
        PhraseQuery query = new PhraseQuery();

        // set max slop to 10
        query.setSlop(10);
        query.add(new Term("contents", "lot"));
        query.add(new Term("contents", "bridges"));
        TopDocs hits = searcher.search(query, 20);

        showResult(hits, searcher);

    }

    public void wildCardQuery() throws CorruptIndexException, IOException,
            ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);

        // use wildcard "?ridg*"
        WildcardQuery query = new WildcardQuery(new Term("contents", "?ridg*"));
        TopDocs hits = searcher.search(query, 20);

        showResult(hits, searcher);
    }

    public void fuzzyQuery() throws CorruptIndexException, IOException,
            ParseException {
        IndexSearcher searcher = new IndexSearcher(dir);

        // "Amsterdam" is similar to "Amsteedam"
        FuzzyQuery query = new FuzzyQuery(new Term("contents", "Amsteedam"));
        TopDocs hits = searcher.search(query, 20);
        showResult(hits, searcher);

    }

    public void testReopen() throws ParseException, IOException {

        IndexSearcher searcher = new IndexSearcher(this.indexReader);

        QueryParser praser = new QueryParser(Version.LUCENE_CURRENT,
                "contents", new StandardAnalyzer(Version.LUCENE_CURRENT));

        // note
        Query query = praser.parse("+bridge -Amsterdam");
        System.out.println("query = " + query.toString());

        TopDocs hits = searcher.search(query, 20);

        // reopen the index so that the reader sees its latest modifications
        IndexReader newReader = indexReader.reopen();
        if (indexReader != newReader) {
            indexReader = newReader;

            // if indexReader has changed, the searcher must be rebuilt
            searcher.close();
            searcher = null;
            searcher = new IndexSearcher(this.indexReader);
        }

        hits = searcher.search(query, 20);

        showResult(hits, searcher);

    }

    public void testTopDocs() throws CorruptIndexException, IOException {
        IndexSearcher searcher = new IndexSearcher(dir);

        // "Amsterdam" is similar to "Amsteedam"
        FuzzyQuery query = new FuzzyQuery(new Term("contents", "Amsteedam"));
        TopDocs hits = searcher.search(query, 20);

        System.out.println("search result:");

        for (ScoreDoc doc : hits.scoreDocs) {
            // fetch the matched document
            Document d = searcher.doc(doc.doc);
            System.out.println(d.get("contents"));
        }
    }

    public void commit() throws CorruptIndexException, IOException {
        this.writer.commit();
    }

    public void showResult(TopDocs hits, IndexSearcher searcher) {

        try {
            System.out.println("search result:");

            for (ScoreDoc doc : hits.scoreDocs) {
                // fetch the matched document
                Document d = searcher.doc(doc.doc);
                System.out.println(d.get("contents"));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException, ParseException {
        // TODO Auto-generated method stub
        Querys ci = new Querys("charpter2-1");
        ci.index();
        System.out.println("----------termQuery--------------");
        ci.termQuery("city", "Venice");

        System.out.println("----------termRangeQuery--------------");
        ci.termRangeQuery(null, null);

        System.out.println("----------numericRangeQuery--------------");
        ci.numericRangeQuery(1, 5);

        System.out.println("----------prefixQuery--------------");
        ci.prefixQuery("contents", "bri");

        System.out.println("----------booleanQuery--------------");
        ci.booleanQuery();

        System.out.println("----------phraseQuery--------------");
        ci.phraseQuery();

        System.out.println("----------wildCardQuery--------------");
        ci.wildCardQuery();

        System.out.println("----------fuzzyQuery--------------");
        ci.fuzzyQuery();

        System.out.println("----------expressionQuery--------------");
        ci.expressionQuery();

        System.out.println("----------testreopen--------------");
        ci.testReopen();

    }

}
