                .value(
                    (new MultiValuesSourceFieldConfig.Builder())
                        .setFieldName("num")
                        .setMissing(0)
                        .build()
                )
                .weight(
                    (new MultiValuesSourceFieldConfig.Builder())
                        .setFieldName("num")
                        .setMissing(1)
                        .build()
                )
                // .valueType(ValueType.LONG)
                ;
        avg.toString();
        sourceBuilder.aggregation(avg);
        sourceBuilder.size(0);
        sourceBuilder.query(
            QueryBuilders.termQuery("sellerId", 24)
        );
        searchRequest.source(sourceBuilder);
        SearchResponse result = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}
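The effect of the two setMissing calls can be illustrated without a cluster. The following is a minimal plain-Java sketch, assuming the usual weighted average formula sum(value * weight) / sum(weight). The class WeightedAvgSketch and the helper weightedAvg are hypothetical names used only for illustration and are not part of the Elasticsearch API; a null entry stands for a document in which the field is missing.

```java
import java.util.Arrays;
import java.util.List;

class WeightedAvgSketch {

    // Hypothetical helper, not an Elasticsearch API.
    // values.get(i) and weights.get(i) belong to the same document;
    // null means the field is missing in that document.
    static double weightedAvg(List<Double> values, List<Double> weights,
                              double missingValue, double missingWeight) {
        double weightedSum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < values.size(); i++) {
            // Substitute the configured defaults for missing fields.
            double v = values.get(i) != null ? values.get(i) : missingValue;
            double w = weights.get(i) != null ? weights.get(i) : missingWeight;
            weightedSum += v * w;
            weightSum += w;
        }
        return weightedSum / weightSum;
    }

    public static void main(String[] args) {
        // Value and weight both come from the same "num" field, as in the builder above.
        List<Double> num = Arrays.asList(2.0, 4.0, null);
        // Missing value defaults to 0, missing weight defaults to 1.
        System.out.println(weightedAvg(num, num, 0.0, 1.0));
    }
}
```

In this sketch the third document contributes nothing to the numerator (its value defaults to 0) but still adds 1 to the denominator, which is how the missing settings pull the average down.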
Cardinality aggregation: values are deduplicated first and then counted, similar to COUNT(DISTINCT ...) in a relational database.
For example:
POST /sales/_search?size=0
{
    "aggs" : {
        "type_count" : {
            "cardinality" : {
                "field" : "type"
            }
        }
    }
}
The corresponding Java example:
public static void test_Cardinality_Aggregation() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        SearchRequest searchRequest = new SearchRequest();
        searchRequest.indices("aggregations_index02");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // Count the distinct buyerId values; the aggregation is named "buyerid_count".
        AggregationBuilder aggregationBuild = AggregationBuilders.cardinality("buyerid_count").field("buyerId");
        sourceBuilder.aggregation(aggregationBuild);
        // size(0): return only the aggregation result, not the matching documents.
        sourceBuilder.size(0);
        // Restrict the aggregation to documents whose sellerId is 24.
        sourceBuilder.query(
            QueryBuilders.termQuery("sellerId", 24)
        );
        searchRequest.source(sourceBuilder);
        SearchResponse result = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}
The response is as follows:
{
    "took":30,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":39,
        "max_score":0,
        "hits":[
        ]
    },
    "aggregations":{
        "cardinality#buyerid_count":{
            "value":11
        }
    }
}
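The implementation above is roughly equivalent to the SQL statement SELECT COUNT(DISTINCT buyerId) FROM es_order_tmp WHERE sellerId = 24; it returns the number of distinct buyers who bought from the seller whose ID is 24.
Its core parameters are as follows: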
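To make the filter-then-count-distinct semantics concrete, here is a minimal in-memory sketch in plain Java. The Order class, the DistinctBuyerSketch class, and the sample data are assumptions made up for illustration; they stand in for documents in the index and are not part of any Elasticsearch API.

```java
import java.util.Arrays;
import java.util.List;

class DistinctBuyerSketch {

    // Hypothetical in-memory order record standing in for an indexed document.
    static final class Order {
        final long sellerId;
        final long buyerId;

        Order(long sellerId, long buyerId) {
            this.sellerId = sellerId;
            this.buyerId = buyerId;
        }
    }

    // Equivalent of: SELECT COUNT(DISTINCT buyerId) ... WHERE sellerId = ?
    static long countDistinctBuyers(List<Order> orders, long sellerId) {
        return orders.stream()
                .filter(o -> o.sellerId == sellerId) // the term query on sellerId
                .mapToLong(o -> o.buyerId)           // the field being aggregated
                .distinct()                          // deduplicate first
                .count();                            // then count
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order(24, 1), new Order(24, 1), new Order(24, 2), new Order(7, 3));
        // Buyer 1 ordered twice from seller 24 but is counted once.
        System.out.println(countDistinctBuyers(orders, 24)); // prints 2
    }
}
```

This sketch always counts exactly. The cardinality aggregation, by contrast, may return an approximate count on large value sets, which is what the precision_threshold parameter controls.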
- precision_threshold
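Controls precision. Below this threshold, counts are expected to be close to accurate; above it, counts may become less accurate. The maximum supported value is 40000, and any threshold above that has the same effect as 40000. The default value is 3000.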
The value 11 returned in the example above is exact. If the code is rewritten as follows, the result becomes inaccurate:
field(“buyerId