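Elasticsearch APIs and Java Client Operations Explained
Elasticsearch is a distributed search and analytics engine whose RESTful API exposes a rich set of operations. It also ships official Java clients (the Java High-Level REST Client, deprecated since 7.15, and the newer Elasticsearch Java API Client) so that Java applications can talk to a cluster directly. This article walks through the core Elasticsearch APIs at two levels, the index level and the document level, and shows how to build the corresponding requests with the Java client. The Java examples use RestHighLevelClient.
I. Index-Level Operations
An index is the logical unit in which Elasticsearch stores data, loosely comparable to a database in a relational system. Index-level operations cover creating, managing, and deleting indices.
1. Create Index
Elasticsearch creates an index with PUT /<index>, and the request can optionally specify mappings and settings.
REST API example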
```json
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "date": { "type": "date" }
    }
  }
}
```
Java client implementation
Create the index with RestHighLevelClient:
```java
import java.io.IOException;

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
// The client.indices variants are the ones accepted by the non-deprecated create(...) overload
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.common.settings.Settings;
// Newer 7.x releases move this class to org.elasticsearch.xcontent
import org.elasticsearch.common.xcontent.XContentType;

public void createIndex(RestHighLevelClient client) throws IOException {
    CreateIndexRequest request = new CreateIndexRequest("my_index");

    // Index settings: 3 primary shards, 1 replica per primary
    request.settings(Settings.builder()
            .put("index.number_of_shards", 3)
            .put("index.number_of_replicas", 1));

    // Typeless mapping supplied as a JSON string
    request.mapping(
            "{\"properties\":{\"title\":{\"type\":\"text\"},\"date\":{\"type\":\"date\"}}}",
            XContentType.JSON);

    CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println("Index created: " + response.isAcknowledged());
}
```
2. Delete Index
Delete an index with DELETE /<index>.
REST API example
```json
DELETE /my_index
```
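Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public void deleteIndex(RestHighLevelClient client) throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("my_index");

    // Deleting an index removes its documents, shards, and metadata
    AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println("Index deleted: " + response.isAcknowledged());
}
```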
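Both index-level calls are plain HTTP requests, so they can also be expressed without any Elasticsearch library. The following is a minimal sketch using only the JDK (Java 11+). It builds the requests but does not send them. The class name `IndexRequests` and the base URL `http://localhost:9200` are assumptions made for illustration, not part of the original examples.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) the index-level REST requests with the JDK HTTP API. */
final class IndexRequests {
    // Assumption: a local, unsecured node listening on the default HTTP port
    static final String BASE_URL = "http://localhost:9200";

    /** PUT /<index> with a JSON body containing settings and mappings. */
    static HttpRequest createIndex(String index, String bodyJson) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(bodyJson))
                .build();
    }

    /** DELETE /<index> */
    static HttpRequest deleteIndex(String index) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + index))
                .DELETE()
                .build();
    }
}
```

A request built this way could be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. A secured cluster would additionally need authentication headers and TLS configuration.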
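II. Document-Level Operations
2. Search Documents
Search is the core feature of Elasticsearch, and its query DSL (Domain Specific Language) covers everything from simple matches to complex aggregations. The REST API uses GET /<index>/_search, while the Java client builds queries with SearchRequest and SearchSourceBuilder.
Basic query recap
Start with a simple query:
REST API example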
```json
GET /my_index/_search
{
  "query": {
    "match": { "title": "Elasticsearch" }
  }
}
```
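Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void basicSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("title", "Elasticsearch"));
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // In the 7.x client getTotalHits() returns a TotalHits object; .value holds the count
    System.out.println("Total hits: " + response.getHits().getTotalHits().value);
}
```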
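Real-world queries are usually more complex. The following sections cover the common cases one by one.
Complex query scenarios
2.1 Boolean Query
A bool query combines several clauses (must, should, must_not, filter) to control precisely which documents match.
REST API example
Find documents whose title contains "Elasticsearch" and whose date is on or after 2025-01-01, excluding those whose title contains "Basics":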
```json
GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" } }
      ],
      "filter": [
        { "range": { "date": { "gte": "2025-01-01" } } }
      ],
      "must_not": [
        { "match": { "title": "Basics" } }
      ]
    }
  }
}
```
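Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void booleanSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
            .must(QueryBuilders.matchQuery("title", "Elasticsearch"))     // scored clause
            .filter(QueryBuilders.rangeQuery("date").gte("2025-01-01"))   // unscored clause
            .mustNot(QueryBuilders.matchQuery("title", "Basics"));        // exclusion
    sourceBuilder.query(boolQuery);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boolean query hits: " + response.getHits().getTotalHits().value);
    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsString());
    }
}
```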
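To make the way the clauses combine more concrete, here is a small in-memory model of the same query: a document matches when it satisfies must AND filter AND NOT must_not. This is only a sketch of the matching logic. It ignores scoring and the should clause, uses a rough whitespace/punctuation tokenizer in place of a real analyzer, and the names `BoolQueryModel` and `Doc` are hypothetical.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** In-memory model of bool matching: must AND filter AND NOT must_not (no scoring). */
final class BoolQueryModel {
    /** Hypothetical document with the two fields from the example mapping. */
    static final class Doc {
        final String title;
        final LocalDate date;

        Doc(String title, LocalDate date) {
            this.title = title;
            this.date = date;
        }
    }

    /** Rough stand-in for a match query: case-insensitive whole-token match on the title. */
    static Predicate<Doc> titleContains(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        return d -> Arrays.asList(d.title.toLowerCase(Locale.ROOT).split("\\W+")).contains(needle);
    }

    /** Stand-in for a range query with gte: date is on or after min. */
    static Predicate<Doc> dateGte(LocalDate min) {
        return d -> !d.date.isBefore(min);
    }

    static List<Doc> search(List<Doc> docs, Predicate<Doc> must,
                            Predicate<Doc> filter, Predicate<Doc> mustNot) {
        return docs.stream()
                .filter(must.and(filter).and(mustNot.negate()))
                .collect(Collectors.toList());
    }
}
```

In this model must and filter behave identically. In Elasticsearch the difference is that only must contributes to the relevance score.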
2.2 Multi-Match Query
Use multi_match to search across several fields at once.
REST API example
Find documents that contain "Elasticsearch" in either the title or the content:
```json
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title", "content"],
      "type": "best_fields"
    }
  }
}
```
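Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void multiMatchSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // best_fields scores each document by its best-matching field
    MultiMatchQueryBuilder multiMatchQuery = QueryBuilders
            .multiMatchQuery("Elasticsearch", "title", "content")
            .type(MultiMatchQueryBuilder.Type.BEST_FIELDS);
    sourceBuilder.query(multiMatchQuery);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Multi-match hits: " + response.getHits().getTotalHits().value);
}
```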
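2.3 Match Phrase Query
Requires the query terms to appear in the field in the same order and next to each other (with the default slop of 0), which suits exact phrase search.
REST API example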
```json
GET /my_index/_search
{
  "query": {
    "match_phrase": { "title": "Elasticsearch Basics" }
  }
}
```
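Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void phraseSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchPhraseQuery("title", "Elasticsearch Basics"));
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Phrase query hits: " + response.getHits().getTotalHits().value);
}
```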
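The ordering requirement of a phrase query can be illustrated with a small in-memory check: the phrase tokens must occur consecutively and in order among the field tokens. This is a simplified sketch that assumes lowercase tokenization on non-word characters and a slop of 0. The class `PhraseMatch` is hypothetical and does not reflect how Elasticsearch stores term positions internally.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Simplified model of match_phrase with slop 0. */
final class PhraseMatch {
    /** Lowercases the text and splits it on runs of non-word characters. */
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\W+"));
    }

    /** True if the phrase tokens occur consecutively, in order, within the field text. */
    static boolean matchesPhrase(String fieldText, String phrase) {
        return Collections.indexOfSubList(tokens(fieldText), tokens(phrase)) >= 0;
    }
}
```

With this model, "Elasticsearch Basics for Beginners" matches the phrase "Elasticsearch Basics", while "Basics of Elasticsearch" and "Elasticsearch: The Basics" do not, even though both contain the two words.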
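2.4 Fuzzy Query
Tolerates a limited number of spelling errors, which makes it useful for typo-tolerant search. The fuzzy query is a term-level query, so the search value itself is not analyzed.
REST API example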
```json
GET /my_index/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "Elastcsearch",
        "fuzziness": "AUTO"
      }
    }
  }
}
```
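Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void fuzzySearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // "Elastcsearch" is deliberately misspelled; AUTO picks the allowed edit distance
    sourceBuilder.query(QueryBuilders.fuzzyQuery("title", "Elastcsearch").fuzziness(Fuzziness.AUTO));
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Fuzzy query hits: " + response.getHits().getTotalHits().value);
}
```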
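The idea behind fuzziness AUTO can be sketched in a few lines: the allowed edit distance depends on the length of the query term (0 edits for 1 to 2 characters, 1 edit for 3 to 5, 2 edits for longer terms), and an indexed term matches if it lies within that distance. The sketch below uses the classic Levenshtein distance as a simplification. Elasticsearch by default also counts a transposition of two adjacent characters as a single edit, and options such as prefix_length are ignored here. The class `FuzzyModel` is hypothetical.

```java
/** Simplified model of a fuzzy query with fuzziness AUTO. */
final class FuzzyModel {
    /** Edits allowed by AUTO with its default length thresholds (3 and 6). */
    static int autoFuzziness(String term) {
        int len = term.length();
        if (len <= 2) return 0;
        if (len <= 5) return 1;
        return 2;
    }

    /** Classic Levenshtein distance: insertions, deletions, substitutions. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if the indexed term is within the AUTO edit distance of the query term. */
    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return levenshtein(queryTerm, indexedTerm) <= autoFuzziness(queryTerm);
    }
}
```

Because the query value is not analyzed, "Elastcsearch" differs from an indexed lowercase token "elasticsearch" by two edits in this model (the capital letter and the missing "i"), which is still within the limit of 2 for a 12-character term.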
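2.5 Aggregations
Aggregations provide statistical analysis over the matching documents, such as grouping by a field or computing an average.
REST API example
Count documents per date value: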
```json
GET /my_index/_search
{
  "aggs": {
    "by_date": {
      "terms": { "field": "date" }
    }
  }
}
```
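Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void aggregationSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // One bucket per distinct value of the date field
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("by_date").field("date");
    sourceBuilder.aggregation(aggregation);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    Terms byDateAgg = response.getAggregations().get("by_date");
    for (Terms.Bucket bucket : byDateAgg.getBuckets()) {
        System.out.println("Date: " + bucket.getKeyAsString() + ", Count: " + bucket.getDocCount());
    }
}
```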
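Conceptually, a terms aggregation is a group-by-and-count over a field's values. The following sketch reproduces that on an in-memory list of date strings, ordering buckets by descending count and breaking ties by key. The class `TermsAggModel` and the sample values are hypothetical, and the sketch leaves out details of the real aggregation such as the bucket size limit and approximate counts across shards.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** In-memory model of a terms aggregation: one bucket per distinct value with its count. */
final class TermsAggModel {
    static Map<String, Long> termsCounts(List<String> values) {
        // Group identical values and count them
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()));

        // Order buckets by count (descending), then by key (ascending)
        Comparator<Map.Entry<String, Long>> byCountThenKey =
                Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey());

        Map<String, Long> ordered = new LinkedHashMap<>();
        counts.entrySet().stream()
                .sorted(byCountThenKey)
                .forEach(e -> ordered.put(e.getKey(), e.getValue()));
        return ordered;
    }
}
```

Each entry of the returned map corresponds to one bucket: the key plays the role of `bucket.getKeyAsString()` and the value the role of `bucket.getDocCount()`.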
Relevance scoring and tuning
Elasticsearch computes relevance scores with the BM25 algorithm by default. Scoring can be tuned in the following ways:
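As background for the tuning options that follow, here is a sketch of the textbook BM25 formula for a single term in a single field, using the default parameters k1 = 1.2 and b = 0.75. It is meant to show which inputs drive the score (term frequency, document length, term rarity), not to reproduce Elasticsearch's exact numbers: recent Lucene versions, for instance, drop the constant (k1 + 1) factor, which does not change the ranking. The class `Bm25` and its parameter names are hypothetical.

```java
/** Textbook BM25 for one term in one field. */
final class Bm25 {
    static final double K1 = 1.2;   // term-frequency saturation
    static final double B = 0.75;   // strength of length normalization

    /** Inverse document frequency: rarer terms get a higher weight. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /**
     * @param termFreq  occurrences of the term in the field
     * @param docLen    length of the field in terms
     * @param avgDocLen average field length across the index
     * @param docCount  number of documents that have the field
     * @param docFreq   number of documents that contain the term
     */
    static double score(double termFreq, double docLen, double avgDocLen,
                        long docCount, long docFreq) {
        double lengthNorm = K1 * (1.0 - B + B * docLen / avgDocLen);
        return idf(docCount, docFreq) * termFreq * (K1 + 1.0) / (termFreq + lengthNorm);
    }
}
```

Three properties follow directly from the formula: more occurrences of the term raise the score with diminishing returns, longer-than-average fields are penalized, and terms that appear in fewer documents weigh more.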
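2.6 Boosting
Increase the weight of a specific field or clause. In the example below, title^2 makes matches in the title count twice as much as matches in the content.
REST API example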
```json
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title^2", "content"]
    }
  }
}
```
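Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void boostedSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    MultiMatchQueryBuilder query = QueryBuilders.multiMatchQuery("Elasticsearch")
            .field("title", 2.0f)   // boost the title field (title^2)
            .field("content");      // default boost of 1.0
    sourceBuilder.query(query);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boosted query hits: " + response.getHits().getTotalHits().value);
}
```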
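The effect of a field boost on a best_fields multi_match query can be sketched numerically. Each field's score is multiplied by its boost, the highest boosted score wins, and the remaining fields contribute only through tie_breaker (0.0 by default). The per-field scores in the usage note are made-up inputs, and the class `BestFieldsModel` is hypothetical.

```java
import java.util.Map;

/** Model of how best_fields combines boosted per-field scores. */
final class BestFieldsModel {
    /**
     * @param fieldScores raw score of the query against each field
     * @param boosts      boost per field (fields without an entry default to 1.0)
     * @param tieBreaker  share of the non-best fields added to the best score
     */
    static double bestFields(Map<String, Double> fieldScores,
                             Map<String, Double> boosts, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : fieldScores.entrySet()) {
            double boosted = e.getValue() * boosts.getOrDefault(e.getKey(), 1.0);
            max = Math.max(max, boosted);
            sum += boosted;
        }
        // Best field counts fully, the others only via the tie breaker
        return max + tieBreaker * (sum - max);
    }
}
```

For a document that scores 1.0 on title and 1.5 on content, the unboosted result is 1.5 (content wins). With a boost of 2 on title, the title score becomes 2.0 and determines the result instead.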
2.7 Function Score Query
Customizes the scoring logic, for example by factoring in a numeric field value or a distance.
REST API example
```json
GET /my_index/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "title": "Elasticsearch" } },
      "functions": [
        {
          "field_value_factor": {
            "field": "views",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        }
      ]
    }
  }
}
```
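Java client implementation
```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.lucene.search.function.FieldValueFactorFunction;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void functionScoreSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // Combine the match score with sqrt(1.5 * views)
    FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
            QueryBuilders.matchQuery("title", "Elasticsearch"),
            ScoreFunctionBuilders.fieldValueFactorFunction("views")
                    .factor(1.5f)
                    .modifier(FieldValueFactorFunction.Modifier.SQRT));
    sourceBuilder.query(functionScoreQuery);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Function score hits: " + response.getHits().getTotalHits().value);
}
```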
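The arithmetic behind this query is short enough to spell out. With the sqrt modifier, field_value_factor yields sqrt(factor * field value), and with the default boost_mode of multiply that value is multiplied by the score of the inner query. The sketch below assumes exactly those defaults. The class `FieldValueFactorModel` and the sample numbers are hypothetical.

```java
/** Arithmetic of field_value_factor with modifier sqrt and boost_mode multiply. */
final class FieldValueFactorModel {
    /** The function value: sqrt(factor * fieldValue). */
    static double sqrtFactor(double fieldValue, double factor) {
        return Math.sqrt(factor * fieldValue);
    }

    /** Final score when the function value is multiplied with the query score. */
    static double finalScore(double queryScore, double views, double factor) {
        return queryScore * sqrtFactor(views, factor);
    }
}
```

A document with a query score of 2.0 and 6 views ends up with 2.0 * sqrt(1.5 * 6) = 6.0. A document with 0 views ends up with a score of 0 under these settings, and a document without the field needs the `missing` parameter to supply a fallback value.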
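Query performance recommendations
- Prefer filter over must when scoring is not needed: filter clauses do not compute relevance scores and their results can be cached, so they are usually faster.
- Pagination: use the from and size parameters to control which slice of the results is returned.
- Field selection: use stored_fields or _source filtering to limit the returned fields and reduce the amount of data transferred.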
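For the pagination recommendation, the from value is derived from a page number as (page - 1) * size. The sketch below also rejects requests that would exceed 10,000 results, which is the default value of the index.max_result_window setting. The helper class `Paging` and its 1-based page convention are assumptions made for illustration.

```java
/** Computes the from offset for page-based navigation with from/size. */
final class Paging {
    // Default of index.max_result_window; from + size may not exceed it
    static final int MAX_RESULT_WINDOW = 10_000;

    /**
     * @param page 1-based page number
     * @param size number of hits per page
     * @return the value to use for the from parameter
     */
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException(
                    "from + size exceeds the result window of " + MAX_RESULT_WINDOW);
        }
        return (int) from;
    }
}
```

The result can be passed to the search request as `sourceBuilder.from(Paging.fromOffset(page, size)).size(size)`. For paging deeper than the result window, search_after is the usual alternative to from and size.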