[Lucene] Highlighting Example

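Below is a complete highlighting example based on Lucene 8.5.0 that can be copied and run as-is (JDK 8 or later).

The example demonstrates:

1. Building the index. The field must store the original text. The example also enables term vectors, although the classic Highlighter can work without them.
2. Highlighting the search results with QueryScorer + Highlighter.
3. Printing the highlighted fragments wrapped in `<font color='red'>` tags.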

 

---

 

① Maven dependencies

```xml
<!-- Lucene 8.5.0 -->
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>8.5.0</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-analyzers-common</artifactId>
  <version>8.5.0</version>
</dependency>
<!-- Needed for QueryParser, which the code below uses -->
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-queryparser</artifactId>
  <version>8.5.0</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-highlighter</artifactId>
  <version>8.5.0</version>
</dependency>
```

 

---

 

② Complete code

```java
package demo;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.search.highlight.*;
import org.apache.lucene.store.*;

import java.nio.file.Paths;

public class Lucene85HighlighterDemo {

    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(Paths.get("idx_highlight"));

        /* ---------- 1. Build the index ---------- */
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig cfg = new IndexWriterConfig(analyzer);
        // Recreate the index on every run so repeated runs do not pile up duplicate documents
        cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

        // Stored text is what the highlighter marks up. Term vectors are optional for the
        // classic Highlighter used below (it re-analyzes the stored text); with positions
        // and offsets they are what FastVectorHighlighter would require.
        FieldType myType = new FieldType(TextField.TYPE_STORED);
        myType.setStoreTermVectors(true);
        myType.setStoreTermVectorPositions(true);
        myType.setStoreTermVectorOffsets(true);
        myType.freeze();

        try (IndexWriter w = new IndexWriter(dir, cfg)) {
            Document doc1 = new Document();
            doc1.add(new StringField("id", "1", Field.Store.YES));
            doc1.add(new Field("content", "Apache Lucene is a high-performance search engine library.", myType));
            w.addDocument(doc1);

            Document doc2 = new Document();
            doc2.add(new StringField("id", "2", Field.Store.YES));
            doc2.add(new Field("content", "Lucene powers Elasticsearch and Solr to provide amazing search features.", myType));
            w.addDocument(doc2);
        }

        /* ---------- 2. Search + highlight ---------- */
        try (DirectoryReader r = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(r);
            Query query = new QueryParser("content", analyzer).parse("lucene");

            TopDocs docs = searcher.search(query, 10);

            // Highlighting components
            SimpleHTMLFormatter fmt = new SimpleHTMLFormatter("<font color='red'>", "</font>");
            QueryScorer scorer = new QueryScorer(query);
            Highlighter hl = new Highlighter(fmt, scorer);
            // Target fragment size in characters; 100 keeps each sample sentence in one
            // fragment (a size of 50 would cut both sentences short)
            hl.setTextFragmenter(new SimpleFragmenter(100));

            for (ScoreDoc sd : docs.scoreDocs) {
                Document doc = searcher.doc(sd.doc);
                String raw = doc.get("content");

                // Re-analyzes the stored text and returns the best-scoring fragment
                String best = hl.getBestFragment(analyzer, "content", raw);
                System.out.println("id=" + doc.get("id") + ", score=" + sd.score);
                System.out.println("highlight=" + best);
            }
        }
    }
}
```
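The classic `Highlighter` does not depend on term vectors, or even on an index: it only needs the raw text, an analyzer, and the query. The sketch below illustrates that by highlighting a plain string with `<span>` tags. The class name `NoTermVectorHighlightDemo`, the helper `highlight`, and the CSS class `hl` are assumptions made for this illustration. A `TermQuery` is used instead of `QueryParser`, so the term has to be passed already lowercased, the way `StandardAnalyzer` would index it.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;

public class NoTermVectorHighlightDemo {

    /** Highlights {@code term} in {@code text} without touching any index (hypothetical helper). */
    static String highlight(String text, String term) throws Exception {
        try (Analyzer analyzer = new StandardAnalyzer()) {
            // TermQuery bypasses analysis, so "term" must already be lowercased
            Query query = new TermQuery(new Term("content", term));
            Highlighter hl = new Highlighter(
                    new SimpleHTMLFormatter("<span class=\"hl\">", "</span>"),
                    new QueryScorer(query, "content"));
            // The analyzer re-tokenizes the text here; no term vectors are involved
            return hl.getBestFragment(analyzer, "content", text);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(highlight("Lucene powers Elasticsearch and Solr.", "lucene"));
    }
}
```

Running it should print the sentence with `Lucene` wrapped in the `<span>`; `getBestFragment` returns `null` when the text contains no match.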

 

---

 

③ Sample output

The scores are omitted because they depend on the similarity in use and on the index contents. Only the matched query term `lucene` is wrapped in the tag:

```
id=1, score=...
highlight=Apache <font color='red'>Lucene</font> is a high-performance search engine library.
id=2, score=...
highlight=<font color='red'>Lucene</font> powers Elasticsearch and Solr to provide amazing search features.
```
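The highlighted fragment keeps the original casing (`Lucene`) even though the query term is `lucene`. That is because matching happens on analyzed tokens, while the markup is applied at the token's offsets in the original text, which is why the original text has to be available. The following Lucene-free sketch illustrates the idea only; the regex tokenizer, the class `OffsetHighlightSketch`, and its `highlight` method are assumptions for illustration and not how Lucene implements it.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetHighlightSketch {

    /** Wraps every token of {@code text} whose lowercased form equals {@code term}. */
    static String highlight(String text, String term, String pre, String post) {
        // Crude stand-in for an analyzer: tokens are runs of letters and digits
        Matcher tokens = Pattern.compile("[\\p{L}\\p{N}]+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0; // end offset of the original text copied so far
        while (tokens.find()) {
            String normalized = tokens.group().toLowerCase(Locale.ROOT);
            if (normalized.equals(term)) {
                out.append(text, last, tokens.start());               // text before the match, unchanged
                out.append(pre).append(tokens.group()).append(post);  // original casing preserved
                last = tokens.end();
            }
        }
        out.append(text.substring(last)); // whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Apache <font color='red'>Lucene</font> is a search library.
        System.out.println(highlight("Apache Lucene is a search library.", "lucene",
                "<font color='red'>", "</font>"));
    }
}
```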

 

---

 

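④ Key points recap

| Item | Requirement |
| --- | --- |
| Original text must be stored | `Field.Store.YES`, or a stored `FieldType` such as `TextField.TYPE_STORED` |
| Term vectors | Optional for `Highlighter`, which re-analyzes the stored text; required with positions and offsets (`setStoreTermVectors(true)` etc.) for `FastVectorHighlighter` |
| Highlighter | `Highlighter` (works on plain text) / `FastVectorHighlighter` (usually faster on large documents, but needs the term vector configuration above) |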
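For comparison, here is a minimal sketch of the `FastVectorHighlighter` route, which reads the term vectors (with positions and offsets) instead of re-analyzing the text. It assumes the same `lucene-highlighter` 8.5.0 dependency; the class name, the helper `highlightFirstHit`, and the in-memory `ByteBuffersDirectory` are choices made for this illustration. The default tags of `FastVectorHighlighter` are `<b>` and `</b>`.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.vectorhighlight.FastVectorHighlighter;
import org.apache.lucene.search.vectorhighlight.FieldQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class FastVectorHighlighterSketch {

    /** Indexes one document in memory and highlights the first hit (hypothetical helper). */
    static String highlightFirstHit(String text, String term) throws Exception {
        // FastVectorHighlighter needs term vectors with positions AND offsets
        FieldType withVectors = new FieldType(TextField.TYPE_STORED);
        withVectors.setStoreTermVectors(true);
        withVectors.setStoreTermVectorPositions(true);
        withVectors.setStoreTermVectorOffsets(true);
        withVectors.freeze();

        try (Directory dir = new ByteBuffersDirectory();
             Analyzer analyzer = new StandardAnalyzer()) {
            try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                Document doc = new Document();
                doc.add(new Field("content", text, withVectors));
                w.addDocument(doc);
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                // TermQuery bypasses analysis, so "term" must already be lowercased
                Query query = new TermQuery(new Term("content", term));
                ScoreDoc[] hits = new IndexSearcher(reader).search(query, 1).scoreDocs;
                if (hits.length == 0) {
                    return null;
                }
                FastVectorHighlighter fvh = new FastVectorHighlighter();
                FieldQuery fieldQuery = fvh.getFieldQuery(query, reader);
                // Last argument is the target fragment size in characters
                return fvh.getBestFragment(fieldQuery, reader, hits[0].doc, "content", 100);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(highlightFirstHit(
                "Apache Lucene is a high-performance search engine library.", "lucene"));
    }
}
```

The trade-off is a larger index (term vectors are stored per document) in exchange for not re-tokenizing the text at query time.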

 

---

 

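One-sentence summary

> As long as the field stores its original text, Lucene 8.5 can highlight keywords with Highlighter + QueryScorer alone, wrapping matches in `<font>`, `<span>`, or any other tag without needing Elasticsearch; term vectors are optional on this path and only become mandatory with FastVectorHighlighter.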
