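Below is a complete highlighting example based on Lucene 8.5.0 that can be copied and run as-is (JDK 8 or later).
The example demonstrates:
1. Building the index. The field stores the original text so it can be read back at highlight time, and also enables term vectors (optional for `Highlighter`, required for `FastVectorHighlighter`).
2. Highlighting the search results with `QueryScorer` + `Highlighter`.
3. Printing the highlighted fragments wrapped in `<font color='red'>` tags.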
---
① Maven dependencies
```xml
<!-- Lucene 8.5.0 -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>8.5.0</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-common</artifactId>
    <version>8.5.0</version>
</dependency>
<!-- Needed for org.apache.lucene.queryparser.classic.QueryParser -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-queryparser</artifactId>
    <version>8.5.0</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-highlighter</artifactId>
    <version>8.5.0</version>
</dependency>
```
---
② Complete code
```java
package demo;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.search.highlight.*;
import org.apache.lucene.store.*;

import java.nio.file.Paths;

public class Lucene85HighlighterDemo {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(Paths.get("idx_highlight"));

        /* ---------- 1. Build the index ---------- */
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig cfg = new IndexWriterConfig(analyzer);
        // Recreate the index on every run so that re-running the demo does not add duplicates
        cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

        // Stored text is what the Highlighter reads back. Term vectors are optional for
        // Highlighter (it can re-analyze the text) but required for FastVectorHighlighter.
        FieldType myType = new FieldType(TextField.TYPE_STORED);
        myType.setStoreTermVectors(true);
        myType.setStoreTermVectorPositions(true);
        myType.setStoreTermVectorOffsets(true);
        myType.freeze();

        try (IndexWriter w = new IndexWriter(dir, cfg)) {
            Document doc1 = new Document();
            doc1.add(new StringField("id", "1", Field.Store.YES));
            doc1.add(new Field("content", "Apache Lucene is a high-performance search engine library.", myType));
            w.addDocument(doc1);

            Document doc2 = new Document();
            doc2.add(new StringField("id", "2", Field.Store.YES));
            doc2.add(new Field("content", "Lucene powers Elasticsearch and Solr to provide amazing search features.", myType));
            w.addDocument(doc2);
        }

        /* ---------- 2. Search + highlight ---------- */
        try (DirectoryReader r = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(r);
            Query query = new QueryParser("content", analyzer).parse("lucene");
            TopDocs docs = searcher.search(query, 10);

            // Highlighting components
            SimpleHTMLFormatter fmt = new SimpleHTMLFormatter("<font color='red'>", "</font>");
            QueryScorer scorer = new QueryScorer(query);
            Highlighter hl = new Highlighter(fmt, scorer);
            hl.setTextFragmenter(new SimpleFragmenter(50)); // fragments of roughly 50 characters

            for (ScoreDoc sd : docs.scoreDocs) {
                Document doc = searcher.doc(sd.doc);
                String raw = doc.get("content");
                // Re-analyzes the stored text and returns the best-scoring fragment
                String best = hl.getBestFragment(analyzer, "content", raw);
                System.out.println("id=" + doc.get("id") + ", score=" + sd.score);
                System.out.println("highlight=" + best);
            }
        }
    }
}
```
---
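③ Sample output
The output below is illustrative. Only the query term `lucene` is highlighted. Scores are left out because they depend on the similarity in use, and `SimpleFragmenter(50)` cuts each fragment at a token boundary near 50 characters, so the exact cut-off may differ from what is shown.
```
id=1, score=...
highlight=Apache <font color='red'>Lucene</font> is a high-performance search engine
id=2, score=...
highlight=<font color='red'>Lucene</font> powers Elasticsearch and Solr to provide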
```
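
To make the highlighted lines easier to read, here is a small plain-Java sketch of the idea behind them: split the text into tokens, compare each token with the query term ignoring case, and wrap the matches in the pre/post tags. This is not Lucene code and not how `Highlighter` is implemented internally; the class `HighlightSketch` and its `highlight` method are hypothetical names, and the sketch does no scoring or fragmenting.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Conceptual sketch only: plain Java, not Lucene code. All names are hypothetical.
class HighlightSketch {

    // Wraps every token equal to `term` (case-insensitive) in the given tags.
    static String highlight(String text, String term, String preTag, String postTag) {
        String wanted = term.toLowerCase(Locale.ROOT);
        // A token here is simply a run of letters or digits.
        Matcher tokens = Pattern.compile("[\\p{L}\\p{N}]+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0; // end offset of the previously copied token
        while (tokens.find()) {
            // Copy the untouched text between the previous token and this one.
            out.append(text, last, tokens.start());
            String token = tokens.group();
            if (token.toLowerCase(Locale.ROOT).equals(wanted)) {
                out.append(preTag).append(token).append(postTag);
            } else {
                out.append(token);
            }
            last = tokens.end();
        }
        out.append(text, last, text.length()); // trailing punctuation, if any
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Apache Lucene is a high-performance search engine library.";
        System.out.println(highlight(text, "lucene", "<font color='red'>", "</font>"));
    }
}
```

The original casing of the matched token is preserved because the comparison is done on a lowercased copy, which mirrors why `Lucene` stays capitalised in the highlighted output even though the query was `lucene`.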
---
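④ Key points

| Item | Requirement |
|---|---|
| Original text | `Highlighter` needs the original text at highlight time; storing the field (`Field.Store.YES` or `TextField.TYPE_STORED`) is the simplest way to provide it |
| Term vectors | Optional for `Highlighter`, which can re-analyze the text; required, with positions and offsets, for `FastVectorHighlighter` |
| Highlighter choice | `Highlighter` (plain text, no special index options) / `FastVectorHighlighter` (usually faster on large documents, but needs term vectors with positions and offsets) |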
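
One more point worth keeping in mind alongside the key points above: the highlighted string is usually dropped into an HTML page, so any `<`, `>` or `&` in the source text should be escaped before the tags are added. In Lucene this can be done by passing a `SimpleHTMLEncoder` to the three-argument `Highlighter(Formatter, Encoder, Scorer)` constructor. The plain-Java sketch below only illustrates the idea; `EscapeSketch`, `escapeHtml` and `highlightEscaped` are hypothetical names, and the whitespace-based word split is a simplifying assumption.

```java
// Conceptual sketch only: plain Java, not Lucene code. All names are hypothetical.
class EscapeSketch {

    // Replaces the characters that are significant in HTML with entities.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Escapes each space-separated word, then wraps the words equal to `term`.
    static String highlightEscaped(String text, String term) {
        StringBuilder out = new StringBuilder();
        String[] words = text.split(" ");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) out.append(' ');
            String safe = escapeHtml(words[i]);
            if (words[i].equalsIgnoreCase(term)) {
                out.append("<span class=\"hit\">").append(safe).append("</span>");
            } else {
                out.append(safe);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The markup in the source text is escaped; only the highlight tag stays live.
        System.out.println(highlightEscaped("Lucene <b>bold</b> & Solr", "lucene"));
    }
}
```

Escaping happens before wrapping, so the only live markup in the result is the highlight tag itself.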
---
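Summary in one sentence
> As long as the original text is available (for example, from a stored field), Lucene 8.5 can highlight keywords with `Highlighter` + `QueryScorer` and produce `<font>` or `<span>` markup without needing Elasticsearch; term vectors are only required if you switch to `FastVectorHighlighter`.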