Overview
About this document
- Everything below is based on Elasticsearch 7.x.
- All examples run against the index_learn_test index.
- All code examples are written with Java's RestHighLevelClient.
- Index DSL:
{
"index_learn_test" : {
"mappings" : {
"properties" : {
"age" : {
"type" : "keyword",
"fields" : {
"number" : {
"type" : "integer"
}
}
},
"departmentId" : {
"type" : "keyword"
},
"departmentIdLeve1" : {
"type" : "keyword"
},
"departmentIdLeve2" : {
"type" : "keyword"
},
"departmentIdLeve3" : {
"type" : "keyword"
},
"departmentIdLeve4" : {
"type" : "keyword"
},
"departmentIdLeve5" : {
"type" : "keyword"
},
"departmentIdLeve6" : {
"type" : "keyword"
},
"departmentIdLeve7" : {
"type" : "keyword"
},
"departmentIds" : {
"type" : "keyword"
},
"departmentJoin" : {
"type" : "join",
"eager_global_ordinals" : true,
"relations" : {
"department" : "user"
}
},
"id" : {
"type" : "long"
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"resume" : {
"type" : "wildcard"
},
"sex" : {
"type" : "keyword"
}
}
}
}
}
Field types
Indices
List all indices and see how much space each one uses
GET /_cat/indices?v
View an index's settings (including defaults)
GET /index_learn_test/_settings?include_defaults=true
Create an index
PUT /index_learn_test
{
"mappings": {
"properties": {
"id": {
"type": "long"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"sex": {
"type": "keyword"
},
"age": {
"type": "keyword",
"fields": {
"number": {
"type": "integer"
}
}
}
}
}
}
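The fields entry here maps the same field under additional types (multi-fields). To use one, address it by its sub-field name, for example name.keyword.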
View the index mapping
GET /index_learn_test/_mapping
Delete an index
DELETE /index_learn_test
Add a field to an index
PUT /index_learn_test/_mapping
{
"properties":{
"departmentIds":{
"type":"keyword"
}
}
}
Copy index data (reindex)
POST /_reindex
{
"dest": {
"index": "index_learn_test2"
},
"source": {
"query": {
"bool": {
"must": [
{
"term": {
"name": {
"value": "正"
}
}
}
]
}
},
"index": "index_learn_test"
} ,
"max_docs":1
}
- dest: the target index
- source: the data source
- query: filters which documents are copied
- max_docs: the maximum number of documents to copy
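Create, update, delete
Making updates visible immediately
In ES, writes do not become visible to search right away; they appear after the next refresh, which by default happens every 1s. If a change must be searchable as soon as the call returns, see the Java code below.
To check the interval, run GET /index_learn_test/_settings?include_defaults=true and look at the refresh_interval setting in the response.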
java:
UpdateRequest updateRequest = new UpdateRequest(getIndex(),userData.getId().toString())
// Updates become searchable only after a refresh (see refresh_interval), so force an immediate refresh here
.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
Create (insert)
es:
PUT /index_learn_test/_doc/${id}
PUT /index_learn_test/_doc/12
{
"id": 12,
"age": 42,
"sex": "女",
"name": "厍振",
"resume": "我是厍振,大家好!",
"departmentId": "A"
}
Java code:
public void insert() throws IOException {
List<DepartmentData> list = DepartmentUtil.getDepartment(DOC_PARENT_NAME);
String[] sex = new String[]{"男", "女"};
UserData userData = new UserData();
userData.setId(12L);
userData.setAge(RandomUtil.randomInt(19, 60));
userData.setSex(sex[RandomUtil.randomInt(2)]);
userData.setName(RandNameUtil.randName());
userData.setResume("我是" + userData.getName() + ",大家好!");
userData.setDepartmentId(list.get(RandomUtil.randomInt(list.size())).getDepartmentId());
IndexRequest indexRequest = new IndexRequest(getIndex())
.id(userData.getId().toString()).source(JsonUtils.toJsonString(userData), XContentType.JSON);
restHighLevelClient.index(indexRequest,RequestOptions.DEFAULT);
}
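Update (update)
To replace the whole document, simply reuse the insert request above. To change only some fields, use a partial update as follows.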
es:
POST /index_learn_test/_update/12?retry_on_conflict=10
{
"doc": {
"age":12
}
}
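retry_on_conflict is the number of retries allowed. Under concurrent updates, different threads may have read different versions of the document, which makes an update fail with a version conflict.
Java code: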
public void update() throws IOException {
UserData userData = new UserData();
userData.setId(12L);
userData.setAge(RandomUtil.randomInt(19, 60));
UpdateRequest updateRequest = new UpdateRequest(getIndex(),userData.getId().toString())
// Updates become searchable only after a refresh (see refresh_interval), so force an immediate refresh here
.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE)
.retryOnConflict(10)
.doc(JsonUtils.toJsonString(userData), XContentType.JSON);
restHighLevelClient.update(updateRequest,RequestOptions.DEFAULT);
}
Delete (delete)
DELETE /index_learn_test/_doc/${id}
es:
DELETE /index_learn_test/_doc/12
Java code:
public void delete() throws IOException {
UserData userData = new UserData();
userData.setId(12L);
DeleteRequest request = new DeleteRequest(getIndex(),userData.getId().toString())
.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
restHighLevelClient.delete(request,RequestOptions.DEFAULT);
}
Batch processing (bulk)
es:
PUT /index_learn_test/_bulk
{"delete":{"_index":"index_learn_test","_id":"12"}}
{"update":{"_index":"index_learn_test","_id":"20"}}
{"doc":{"age":23}}
{"create":{"_index":"index_learn_test","_id":"30"}}
{"age":34,"id":30}
{"index":{"_index":"index_learn_test","_id":"40"}}
{"age":34,"id":40}
Java code:
public void bulk() throws IOException {
BulkRequest bulkRequest = new BulkRequest();
DeleteRequest deleteRequest = new DeleteRequest(getIndex()).id("12");
bulkRequest.add(deleteRequest);
UpdateRequest updateRequest = new UpdateRequest(getIndex(), "20").doc(Collections.singletonMap("age", 30));
bulkRequest.add(updateRequest);
//…… remaining operations omitted
restHighLevelClient.bulk(bulkRequest,RequestOptions.DEFAULT);
}
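The bulk body above is newline-delimited JSON: one action line per operation, followed by a source line for every operation except delete. The following is a minimal sketch of that layout in plain Java, with no Elasticsearch dependency. The class BulkBodySketch and its methods are hypothetical names made up for this illustration; they are not part of the RestHighLevelClient API, and the JSON is concatenated naively without escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles the text body of a _bulk request line by line.
final class BulkBodySketch {
    private final List<String> lines = new ArrayList<>();

    // delete needs only the action line; it carries no source document
    BulkBodySketch delete(String index, String id) {
        lines.add(action("delete", index, id));
        return this;
    }

    // update: action line, then the partial document wrapped in "doc"
    BulkBodySketch update(String index, String id, String partialDocJson) {
        lines.add(action("update", index, id));
        lines.add("{\"doc\":" + partialDocJson + "}");
        return this;
    }

    // index: action line, then the full source document
    BulkBodySketch index(String index, String id, String sourceJson) {
        lines.add(action("index", index, id));
        lines.add(sourceJson);
        return this;
    }

    private static String action(String op, String index, String id) {
        return "{\"" + op + "\":{\"_index\":\"" + index + "\",\"_id\":\"" + id + "\"}}";
    }

    // Every line, including the last one, is terminated by a newline
    String build() {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```

Calling new BulkBodySketch().delete("index_learn_test", "12").update("index_learn_test", "20", "{\"age\":23}").build() yields the first three lines of the es example. In real code the BulkRequest shown above does this assembly for you.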
Queries
Java code for query conditions
Almost every query type can reuse the code below.
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
// Almost every query condition can be built through the QueryBuilders class
QueryBuilder wildcardQueryBuilder = QueryBuilders.wildcardQuery("resume", "*我是王*,大家*");
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(wildcardQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}",JsonUtils.toJsonString(searchResponse.getHits().getHits()));
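Scoring
The score is the relevance of a document to the query; it is mainly used for sorting.
Elapsed time
The took field in the search response gives the query time in ms.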
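Returning the total hit count (track_total_hits)
Avoid exact totals unless you really need them, for the reasons below.
In a test on a single shard holding 5 GB of data, the impact on ordinary queries was small, roughly 20ms. The impact on parent-child document queries was much larger, roughly 800ms.
- true: returns the exact total. Every matching document has to be visited, so this is the slowest option.
- A value >= 0: returns the exact total up to that value; if more documents match, the configured value is returned as a lower bound. The maximum is 2147483647. Only as many documents as the configured value need to be counted, so the cost scales with the value.
- -1: no total is returned. This is the fastest option.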
es:
GET /index_learn_test/_search
{
"track_total_hits": true
}
Java code:
SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource();
searchSourceBuilder.trackTotalHits(true);
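To make the three track_total_hits modes concrete, here is a small plain-Java model of what the response reports for a given number of matching documents. It is a simplified sketch, not Elasticsearch code: the class TotalHitsSketch, its constants, and the "eq"/"gte"/"none" strings are assumptions chosen for this illustration (the real response carries a total value together with a relation).

```java
// Hypothetical model of how the reported total depends on track_total_hits.
final class TotalHitsSketch {
    static final int DISABLED = -1;                 // "track_total_hits": -1
    static final int ACCURATE = Integer.MAX_VALUE;  // "track_total_hits": true

    // Returns "none", "eq <n>" (exact count) or "gte <n>" (lower bound only)
    static String reportedTotal(long actualMatches, int trackUpTo) {
        if (trackUpTo == DISABLED) {
            return "none";                          // counting is skipped entirely
        }
        if (actualMatches > trackUpTo) {
            return "gte " + trackUpTo;              // counting stopped at the threshold
        }
        return "eq " + actualMatches;               // threshold not reached: exact
    }
}
```

With 120 matching documents, a threshold of 100 reports "gte 100", while ACCURATE reports "eq 120" at the cost of counting every match.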
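Returning only some fields
includes returns only the listed fields; excludes returns every field except the listed ones. When both are given they combine as AND: a field is returned only if it matches includes and does not match excludes.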
es:
GET /index_learn_test/_search
{
"_source": {
"includes": [
"name",
"age"
],
"excludes": [
"name"
]
}
}
Java code:
@Test
public void _source() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource().fetchSource(new String[]{"name","age"}, new String[]{"name"}));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
Sorting
Using SQL as the reference: sort by age descending, then by sex ascending.
SQL:
select * from index_learn_test order by age desc, sex asc
es:
GET /index_learn_test/_search
{
"sort": [
{
"age": {
"order": "desc"
}
},
{
"sex": {
"order": "asc"
}
}
]
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.sort("age",SortOrder.DESC)
.sort("sex",SortOrder.ASC));
Exact search
Get a single document by id
GET /index_learn_test/_doc/${id}
GET /index_learn_test/_doc/20
Get multiple documents by id
es:
GET /index_learn_test/_search
{
"query": {
"ids": {
"values": [
"35",
"333"
]
}
}
}
Single exact value: term (scored)
Similar to = in MySQL.
es:
GET /index_learn_test/_search
{
"query": {
"term": {
"name.keyword": {
"value": "王正年2"
}
}
}
}
Multiple exact values: terms (scored)
Similar to in in MySQL.
es:
GET /index_learn_test/_search
{
"query": {
"terms": {
"age": [
26,
27
]
}
}
}
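Fuzzy (non-exact) search
wildcard (scored)
Similar to like in MySQL.
This document maps resume as the wildcard type, which is designed for this kind of pattern matching. The wildcard query also runs against keyword fields, but patterns with a leading wildcard tend to be slow there.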
es:
GET /index_learn_test/_search
{
"query": {
"wildcard": {
"resume": {
"wildcard": "*我是王*,大家*"
}
}
}
}
Java code:
public void wildcard() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(QueryBuilders.wildcardQuery("resume", "*我是王*,大家*")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
match (scored)
A query that matches on analyzed terms. To search for a phrase, use match_phrase instead.
es:
GET /index_learn_test/_search
{
"query": {
"match": {
"introduce": "齐,今年"
}
}
}
Java code:
@Test
public void match() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
QueryBuilder queryBuilder = QueryBuilders.matchQuery("introduce", "齐,今年");
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(queryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
match_phrase (scored)
Phrase matching. The slop parameter sets how many positions apart the matched terms may be.
es:
GET /index_learn_test/_search
{
"query": {
"match_phrase": {
"introduce":{
"query": "齐,今年",
"slop": 3
}
}
}
}
Java code:
@Test
public void matchPhrase() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
QueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("introduce", "齐,今年").slop(3);
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(queryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
Compound queries: bool
filter (AND, not scored)
Unless you need relevance scores for sorting, use filter.
es:
GET /index_learn_test/_search
{
"query": {
"bool": {
"filter": [
{
"range": {
"age": {
"gte": 50,
"lte": 60
}
}
},
{
"term": {
"departmentId": "F"
}
}
]
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
Java code:
@Test
public void filter() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.filter(QueryBuilders.rangeQuery("age").from(50, true).to(60,true));
boolQueryBuilder.filter(QueryBuilders.termQuery("departmentId","F"));
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(boolQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
must (AND, scored, slower than filter)
Unless you need relevance scores for sorting, use filter instead.
es:
GET /index_learn_test/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"age": {
"gte": 50,
"lte": 60,
"include_lower": true,
"include_upper": true
}
}
},
{
"term": {
"departmentId": "F"
}
}
]
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
Java code:
@Test
public void must() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.rangeQuery("age").from(50, true).to(60,true));
boolQueryBuilder.must(QueryBuilders.termQuery("departmentId","F"));
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(boolQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
must_not (NOT, not scored)
es:
GET /index_learn_test/_search
{
"query": {
"bool": {
"must_not": [
{
"range": {
"age": {
"gte": 50,
"lte": 60,
"include_lower": true,
"include_upper": true
}
}
},
{
"term": {
"departmentId": "F"
}
}
]
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
Java code:
@Test
public void mustNot() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.mustNot(QueryBuilders.rangeQuery("age").from(50, true).to(60,true));
boolQueryBuilder.mustNot(QueryBuilders.termQuery("departmentId","F"));
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(boolQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
should (OR, scored; nest it inside filter to skip scoring)
es:
GET /index_learn_test/_search
{
"query": {
"bool": {
"should": [
{
"range": {
"age": {
"gte": 40,
"lte": 50,
"include_lower": true,
"include_upper": true
}
}
},
{
"term": {
"departmentId": "F"
}
}
]
}
},
"sort": [
{
"departmentId": {
"order": "asc"
}
}
]
}
Java code:
@Test
public void should() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.should(QueryBuilders.rangeQuery("age").from(40, true).to(50,true));
boolQueryBuilder.should(QueryBuilders.termQuery("departmentId","F"));
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(boolQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
A more complex example
Using SQL as the reference.
SQL:
select * from index_learn_test
where departmentId = 'E'
and (resume like '%冉晶菊%' or age =30)
es:
GET /index_learn_test/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"departmentId": "E"
}
},
{
"bool": {
"should":[
{
"term":{
"age":30
}
},
{
"wildcard":{
"resume":"*冉晶菊*"
}
}
]
}
}
]
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
Java code:
@Test
public void complexQuery() throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
//departmentId = E and (resume like '%冉晶菊%' or age =30)
boolQueryBuilder.filter(QueryBuilders.termQuery("departmentId", "E"));
BoolQueryBuilder shouldBool = QueryBuilders.boolQuery();
shouldBool.should(QueryBuilders.termQuery("age", 30));
shouldBool.should(QueryBuilders.wildcardQuery("resume", "*冉晶菊*"));
boolQueryBuilder.filter(shouldBool);
searchRequest.source(SearchSourceBuilder
.searchSource()
.query(boolQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
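The nesting in this example can be read as ordinary boolean logic: the outer filter clauses are ANDed, and the inner bool that contains only should clauses matches when at least one of them matches. The sketch below expresses the same condition as a plain Java Predicate over an in-memory object, purely to illustrate the semantics. The User class and BoolQuerySketch are hypothetical names for this illustration, and String.contains stands in for the *冉晶菊* wildcard pattern.

```java
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for an indexed document.
final class User {
    final String departmentId;
    final int age;
    final String resume;

    User(String departmentId, int age, String resume) {
        this.departmentId = departmentId;
        this.age = age;
        this.resume = resume;
    }
}

final class BoolQuerySketch {
    // departmentId = 'E' AND (age = 30 OR resume contains "冉晶菊")
    static Predicate<User> complexQuery() {
        Predicate<User> inDepartmentE = u -> "E".equals(u.departmentId); // outer filter: term
        Predicate<User> ageIs30 = u -> u.age == 30;                      // inner should: term
        Predicate<User> resumeMatches = u -> u.resume.contains("冉晶菊"); // inner should: wildcard
        // Outer filter clauses combine with AND; inner should clauses combine with OR
        return inDepartmentE.and(ageIs30.or(resumeMatches));
    }
}
```

A user in department E aged 25 whose resume mentions 冉晶菊 passes, while a 30-year-old in department F does not, which mirrors what the es request above returns.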
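Aggregations
A query placed alongside an aggregation filters the set of documents that the aggregation runs over.
Grouping
size limits how many buckets are returned. Before running a grouping query, have a rough idea of how much data it covers: a size that is too large can exhaust memory.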
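Grouping by one field (terms)
Similar to group by field1 in SQL.
In order, _count sorts the buckets by document count, while _key sorts them by the value of the grouped field. The aggregation name (the Chinese string in the examples, which reads "name it anything") is arbitrary.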
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是指标名字随便取": {
"terms": {
"field": "departmentId",
"size": 10,
"order": {
"_count": "desc"
}
}
}
}
}
java:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.terms("这是一个名字随便取")
.order(BucketOrder.count(false)).field("departmentId")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
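Grouping by several fields (composite with terms sources)
Similar to group by field1, field2 in SQL.
Note that order here sorts by the grouping keys themselves, not by the aggregated results.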
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是一个名字随便取": {
"composite": {
"size": 10,
"sources": [
{
"groupByAge": {
"terms": {
"field": "age",
"order": "asc"
}
}
},
{
"groupBySex": {
"terms": {
"field": "sex",
"order": "asc"
}
}
}
]
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
TermsValuesSourceBuilder groupByAge = new TermsValuesSourceBuilder("groupByAge").order(SortOrder.ASC).field("age");
TermsValuesSourceBuilder groupBySex = new TermsValuesSourceBuilder("groupBySex").order(SortOrder.ASC).field("sex");
CompositeAggregationBuilder compositeAggregationBuilder = AggregationBuilders.composite("这是一个名字随便取", Lists.newArrayList(groupByAge, groupBySex));
compositeAggregationBuilder.size(10);
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(compositeAggregationBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
Distinct count (cardinality)
Similar to count(distinct field) in SQL. The result is an approximation rather than an exact count.
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是一个名字随便取": {
"cardinality": {
"field": "name.keyword"
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.cardinality("这是一个名字随便取").field("name.keyword")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
Value count (value_count)
Counts the non-empty values of a field, similar to count(field) in SQL.
es:
GET /index_learn_test/_search
{
"query": {
"term": {
"departmentId": {
"value": "E"
}
}
},
"size": 0,
"aggs": {
"这是一个名字随便取": {
"value_count": {
"field": "name.keyword"
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.count("这是一个名字随便取").field("name.keyword")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
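Maximum (max)
Returns the largest value of a field, similar to max(field) in SQL. The field must be mapped as a numeric type, which is why the examples use the age.number sub-field.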
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是指标名字随便取": {
"max": {
"field": "age.number"
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.max("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
Minimum (min)
Returns the smallest value of a field, similar to min(field) in SQL. The field must be mapped as a numeric type.
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是指标名字随便取": {
"min": {
"field": "age.number"
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.min("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
Average (avg)
Returns the average value of a field, similar to avg(field) in SQL. The field must be mapped as a numeric type.
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是指标名字随便取": {
"avg": {
"field": "age.number"
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.avg("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
Sum (sum)
Returns the sum of a field's values, similar to sum(field) in SQL. The field must be mapped as a numeric type.
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是指标名字随便取": {
"sum": {
"field": "age.number"
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.sum("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
Filtered count (filter)
Similar to the SQL
select sum(age), count(case when a=1 then 1 end) from table where age=30
It computes a statistic over a chosen subset of the documents that the query has already selected.
es:
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是指标名字随便取": {
"filter": {
"bool": {
"filter":[
{
"term":{
"departmentId":"F"
}
},
{
"term":{
"age":30
}
},
{
"term":{
"sex":"女"
}
}
]
}
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.filter(QueryBuilders.termQuery("departmentId", "F"));
boolQueryBuilder.filter(QueryBuilders.termQuery("age", "30"));
boolQueryBuilder.filter(QueryBuilders.termQuery("sex", "女"));
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(AggregationBuilders.filter("这是一个名字随便取",boolQueryBuilder)));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
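Pagination
Paginating ordinary queries
from + size (page jumps allowed)
Suited to UI listings. This method can jump to any page, but deep pages are slow, and from + size cannot exceed the index's max_result_window setting, which defaults to 10,000.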
max_result_window:
// View the current setting
GET /index_learn_test/_settings?include_defaults=true
// Change the setting
PUT index_learn_test/_settings
{
"index":{
"max_result_window":30000
}
}
es:
GET /index_learn_test/_search
{
"from": 0,
"size": 10,
"sort": [
{
"id": {
"order": "desc"
}
}
]
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(10).from(0)
.sort("id", SortOrder.DESC));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",JsonUtils.toJsonString(searchResponse.getHits().getHits()));
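When a UI works with page numbers rather than offsets, the from value has to be derived, and it is worth checking the max_result_window limit before sending the request. The following is a minimal plain-Java sketch of that calculation. FromSizeSketch and fromFor are hypothetical names for this illustration, and the limit is passed in by the caller rather than read from the cluster.

```java
// Hypothetical helper: converts a 1-based page number into a from offset
// and rejects requests that would exceed max_result_window.
final class FromSizeSketch {
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    static int fromFor(int page, int size, int maxResultWindow) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;   // long arithmetic avoids int overflow
        if (from + size > maxResultWindow) {
            throw new IllegalArgumentException(
                "from + size = " + (from + size) + " exceeds max_result_window = " + maxResultWindow);
        }
        return (int) from;
    }
}
```

For example, fromFor(3, 10, FromSizeSketch.DEFAULT_MAX_RESULT_WINDOW) gives 20, which would be passed to .from(20) together with .size(10). Pages beyond the window are better served by search_after.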
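Deep pagination: search_after (no page jumps)
Similar to the SQL
select * from table where id >123 order by id asc limit 10
- The sort must include a field with unique values as a tiebreaker.
- Jumping to an arbitrary page is not possible.
- The results reflect the live index.
- Save the sort values of the last hit in each response.
- Pass those sort values back in search_after. Neither their order nor their values may change; they must match what the previous search returned.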
es:
GET /index_learn_test/_search
{
"from": 0,
"size": 10,
"sort": [
{
"age": {
"order": "desc"
}
},
{
"id": {
"order": "desc"
}
}
],
"search_after": [
"59",
25335465
]
}
java:
@Test
public void pageSearchAfter() throws IOException {
pageSearchAfter(pageSearchAfter(null));
}
public Object[] pageSearchAfter(Object[] sort) throws IOException {
SearchRequest searchRequest = new SearchRequest(getIndex());
SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder
.searchSource()
.size(10).from(0)
.sort("age", SortOrder.DESC)
.sort("id", SortOrder.DESC);
if(Objects.nonNull(sort)){
searchSourceBuilder.searchAfter(sort);
}
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
// Return the sort values of the last hit, or null when the page is empty
SearchHit[] hits = searchResponse.getHits().getHits();
return hits.length == 0 ? null : hits[hits.length - 1].getSortValues();
}
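The cursor behaviour of search_after can be illustrated without a cluster. The sketch below pages through an in-memory list sorted by age descending and then id descending, as in the request above, and returns only the documents that sort strictly after the cursor. Doc, SearchAfterSketch and page are hypothetical names for this illustration; a real search does not re-sort the whole data set on every call the way this simplified version does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SearchAfterSketch {
    // Hypothetical in-memory stand-in for a hit with its two sort values
    static final class Doc {
        final int age;
        final long id;

        Doc(int age, long id) {
            this.age = age;
            this.id = id;
        }
    }

    // Same order as the request: age desc, then the unique id desc as tiebreaker
    static final Comparator<Doc> ORDER =
        Comparator.comparingInt((Doc d) -> d.age).reversed()
            .thenComparing(Comparator.comparingLong((Doc d) -> d.id).reversed());

    // Returns up to size docs that sort strictly after the cursor (null = first page)
    static List<Doc> page(List<Doc> all, Doc after, int size) {
        List<Doc> sorted = new ArrayList<>(all);
        sorted.sort(ORDER);
        List<Doc> out = new ArrayList<>();
        for (Doc d : sorted) {
            if (after != null && ORDER.compare(d, after) <= 0) {
                continue; // at or before the cursor: already returned on an earlier page
            }
            out.add(d);
            if (out.size() == size) {
                break;
            }
        }
        return out;
    }
}
```

Each call passes the last Doc of the previous page as the cursor, and an empty result signals the end. Because id is unique, two documents never compare as equal, which is why the unique tiebreaker is required: without it, documents that tie on age with the cursor would be skipped.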
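Deep pagination: scroll (for data that does not need to be fresh; no page jumps)
Suited to exports (though not recommended). Scrolling is very resource-intensive, so release the scroll as soon as you are done with it.
- Get a scroll_id
- 10m keeps the scroll_id alive for ten minutes
- from must start at 0
- Save the _scroll_id value from the response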
GET /index_learn_test/_search?scroll=10m
{
"from": 0,
"size": 10,
"sort": [
{
"id": {
"order": "desc"
}
}
]
}
- Fetch the next page with the scroll_id
GET /_search/scroll
{
"scroll_id" : "FGluY2x1……",
"scroll": "10m"
}
- Release the resources with the scroll_id
DELETE _search/scroll/${scroll_id}
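Paginating aggregation results
from + size pagination (page jumps allowed)
Deep pages are slow.
The size of composite sets how many buckets are available for paging. It must be at least the from + size used by bucket_sort further down, otherwise no data comes back. size must not be too large or the request fails; its default is 10 and its maximum is 65535.
The sort in bucket_sort can likewise only order the buckets inside that size window. So if you need sorted results and group by a single field, use terms instead: it sorts the buckets first and then returns size of them.
es: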
GET /index_learn_test/_search
{
"size": 0,
"aggs": {
"这是一个名字随便取": {
"composite": {
"size": 1000,
"sources": [
{
"groupByAge": {
"terms": {
"field": "age",
"order": "asc"
}
}
},
{
"groupBySex": {
"terms": {
"field": "sex",
"order": "asc"
}
}
}
]
},
"aggs": {
"统计数量":{
"value_count": {
"field": "id"
}
},
"聚合分页": {
"bucket_sort": {
"sort": [{"统计数量":{"order":"desc"}}],
"from": 20,
"size": 10
}
}
}
}
}
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
TermsValuesSourceBuilder groupByAge = new TermsValuesSourceBuilder("groupByAge").order(SortOrder.ASC).field("age");
TermsValuesSourceBuilder groupBySex = new TermsValuesSourceBuilder("groupBySex").order(SortOrder.ASC).field("sex");
CompositeAggregationBuilder compositeAggregationBuilder = AggregationBuilders.composite("这是一个名字随便取", Lists.newArrayList(groupByAge, groupBySex));
// Number of buckets returned by the grouping
compositeAggregationBuilder.size(1000);
// Pagination object
List<FieldSortBuilder> fieldSortBuilders = new ArrayList<>();
BucketSortPipelineAggregationBuilder pipelineAggregationBuilder = new BucketSortPipelineAggregationBuilder("聚合分页", fieldSortBuilders);
// from + size must not exceed the bucket count set above
pipelineAggregationBuilder.from(20);
pipelineAggregationBuilder.size(10);
compositeAggregationBuilder.subAggregation(pipelineAggregationBuilder);
// Sort field
fieldSortBuilders.add( new FieldSortBuilder("统计数量").order(SortOrder.DESC));
AggregationBuilder sortField = AggregationBuilders.count("统计数量").field("id");
compositeAggregationBuilder.subAggregation(sortField);
searchRequest.source(SearchSourceBuilder
.searchSource()
.size(0)
.aggregation(compositeAggregationBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());
Miscellaneous
Checking disk space usage
GET /_cat/allocation?v