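The previous article covered some basic ES search usage (match, term, fuzzy, matchPhraseQuery, and so on). This article focuses on aggregation queries.
Assume ES contains the following data:
1. group by / count: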
select team,count(*) from table group by team;
1) Code:
public static void aggre1Query2(String indexName, String indexType) {
    SearchRequestBuilder srb = transportClient.prepareSearch(indexName).setTypes(indexType);
    // Return aggregation results only, no search hits.
    srb.setSearchType(SearchType.COUNT);
    // Bucket documents by the "team" field; each bucket carries its document count.
    TermsBuilder teamAgg = AggregationBuilders.terms("player_count").field("team");
    srb.addAggregation(teamAgg);
    System.out.println(srb.toString()); // print the generated DSL
    SearchResponse searchResponse = srb.execute().actionGet();
    Aggregations aggregations = searchResponse.getAggregations();
    Map<String, Aggregation> asMap = aggregations.asMap();
    Terms terms = (Terms) asMap.get("player_count");
    List<Bucket> buckets = terms.getBuckets();
    for (Bucket bt : buckets) {
        logger.info(bt.getKeyAsString() + " :: " + bt.getDocCount());
    }
}
2) Generated DSL:
{
  "aggregations" : {
    "player_count" : {
      "terms" : {
        "field" : "team"
      }
    }
  }
}
3) Output:
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:461] war :: 3
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:461] cav :: 2
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:461] tim :: 1
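To make the semantics of the terms aggregation concrete, here is a small plain-Java sketch that computes the same group by / count in memory, without talking to Elasticsearch. The class name, method names, and the team list are all hypothetical; the sample values were simply chosen to match the counts in the log output above and are not the article's actual index data.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TeamCountSketch {
    // Hypothetical "team" values, one per document (assumed sample data).
    static List<String> sampleTeams() {
        return Arrays.asList("war", "cav", "war", "tim", "cav", "war");
    }

    // In-memory equivalent of: select team, count(*) from table group by team,
    // with buckets ordered by document count, largest first.
    static Map<String, Long> countByTeam(List<String> teams) {
        Map<String, Long> counts = teams.stream()
                .collect(Collectors.groupingBy(t -> t, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        countByTeam(sampleTeams())
                .forEach((team, count) -> System.out.println(team + " :: " + count));
    }
}
```

Each map entry plays the role of one bucket: the key corresponds to bt.getKeyAsString() and the value to bt.getDocCount().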
2. group by on multiple fields / count:
1) Code:
public static void aggreQuery1(String indexName, String indexType) {
    SearchRequestBuilder srb = transportClient.prepareSearch(indexName).setTypes(indexType);
    srb.setSearchType(SearchType.COUNT);
    TermsBuilder teamAgg = AggregationBuilders.terms("team_count").field("team");
    TermsBuilder positionAgg = AggregationBuilders.terms("position_count").field("position");
    // Nest the position aggregation inside every team bucket.
    teamAgg.subAggregation(positionAgg);
    srb.addAggregation(teamAgg);
    System.out.println(srb.toString());
    SearchResponse searchResponse = srb.execute().actionGet();
    Aggregations aggregations = searchResponse.getAggregations(); // team agg
    Terms terms = aggregations.get("team_count");
    List<Bucket> buckets = terms.getBuckets();
    for (Bucket bt : buckets) {
        logger.info(bt.getKeyAsString() + " :: " + bt.getDocCount());
        Aggregations aggregations2 = bt.getAggregations(); // position agg
        Terms terms2 = aggregations2.get("position_count");
        List<Bucket> buckets2 = terms2.getBuckets();
        for (Bucket bt2 : buckets2) {
            logger.info("---" + bt2.getKeyAsString() + " :: " + bt2.getDocCount());
        }
    }
}
2) Generated DSL:
{
  "aggregations" : {
    "team_count" : {
      "terms" : {
        "field" : "team"
      },
      "aggregations" : {
        "position_count" : {
          "terms" : {
            "field" : "position"
          }
        }
      }
    }
  }
}
3) Output:
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:430] war :: 3
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pf :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pg :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---sg :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:430] cav :: 2
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pg :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---sf :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:430] tim :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pf :: 1
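4) Note:
A group by on multiple fields is nested level by level: every team bucket contains its own list of position buckets. That is why the code uses two nested for loops to print the result.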
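The level-by-level structure can be illustrated with a plain-Java sketch that builds the same two-level grouping in memory. The Player class, the method names, and the sample rows are hypothetical; the rows were picked to be consistent with the log output above, not taken from the article's index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedGroupSketch {
    // Hypothetical document with only the two fields used for grouping.
    static final class Player {
        private final String team;
        private final String position;

        Player(String team, String position) {
            this.team = team;
            this.position = position;
        }

        String team() { return team; }
        String position() { return position; }
    }

    // Assumed sample data, chosen to match the counts in the log output.
    static List<Player> samplePlayers() {
        return Arrays.asList(
                new Player("war", "pf"), new Player("war", "pg"), new Player("war", "sg"),
                new Player("cav", "pg"), new Player("cav", "sf"),
                new Player("tim", "pf"));
    }

    // Outer grouping by team, inner grouping by position with a count per position.
    static Map<String, Map<String, Long>> countByTeamThenPosition(List<Player> players) {
        return players.stream().collect(
                Collectors.groupingBy(Player::team,
                        Collectors.groupingBy(Player::position, Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> byTeam = countByTeamThenPosition(samplePlayers());
        // Two nested loops, one per grouping level. Iteration order of the
        // default maps is not guaranteed, unlike the ordered ES buckets.
        for (Map.Entry<String, Map<String, Long>> team : byTeam.entrySet()) {
            long total = team.getValue().values().stream().mapToLong(Long::longValue).sum();
            System.out.println(team.getKey() + " :: " + total);
            for (Map.Entry<String, Long> pos : team.getValue().entrySet()) {
                System.out.println("---" + pos.getKey() + " :: " + pos.getValue());
            }
        }
    }
}
```

The outer map corresponds to the team_count buckets and each inner map to the position_count buckets returned by bt.getAggregations().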
3. Query condition + max/min/sum/avg aggregation + sorting:
1) Code:
public static void aggre1Query3(String indexName, String indexType) {
    boolean min = true;
    boolean avg = true;
    String groupCol = "team";
    SearchRequestBuilder srb = transportClient.prepareSearch(indexName).setTypes(indexType);
    srb.setSearchType(SearchType.COUNT);
    // Query condition: only documents with 10 <= age <= 39 are aggregated.
    QueryBuilder qb = QueryBuilders.rangeQuery("age").from(10).to(39);
    // Group by team and order the buckets by the minSalary sub-aggregation (false = descending).
    AggregationBuilder aggBuilder = AggregationBuilders.terms("group_name").field(groupCol)
            .order(Order.aggregation("minSalary", false));
    if (min) {
        aggBuilder.subAggregation(AggregationBuilders.min("minSalary").field("salary"));
    }
    if (avg) {
        aggBuilder.subAggregation(AggregationBuilders.avg("avgSalary").field("salary"));
    }
    srb.setQuery(qb).addAggregation(aggBuilder);
    System.out.println(srb.toString());
    SearchResponse searchResponse = srb.execute().actionGet();
    Aggregations aggregations = searchResponse.getAggregations();
    Map<String, Aggregation> asMap = aggregations.asMap();
    Terms terms = (Terms) asMap.get("group_name");
    List<Bucket> buckets = terms.getBuckets();
    for (Bucket bt : buckets) {
        logger.info(bt.getKeyAsString() + " :: " + bt.getDocCount());
        Aggregations aggregations2 = bt.getAggregations();
        Min minSalary = aggregations2.get("minSalary");
        Avg avgSalary = aggregations2.get("avgSalary");
        logger.info("min:" + minSalary.value() + ",avg:" + avgSalary.value());
    }
}
2) Generated DSL:
{
  "query" : {
    "range" : {
      "age" : {
        "from" : 10,
        "to" : 39,
        "include_lower" : true,
        "include_upper" : true
      }
    }
  },
  "aggregations" : {
    "group_name" : {
      "terms" : {
        "field" : "team",
        "order" : {
          "minSalary" : "desc"
        }
      },
      "aggregations" : {
        "minSalary" : {
          "min" : {
            "field" : "salary"
          }
        },
        "avgSalary" : {
          "avg" : {
            "field" : "salary"
          }
        }
      }
    }
  }
}
3) Output:
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:499] cav :: 2
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:504] min:2000.0,avg:2500.0
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:499] war :: 3
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:504] min:1000.0,avg:1666.6666666666667
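4) Note:
This example groups by a single field, so one for loop is enough. Inside the loop, each bucket also exposes its metric sub-aggregations, here minSalary (min) and avgSalary (avg). Because of Order.aggregation("minSalary", false), the buckets come back sorted by minSalary in descending order.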
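As a rough in-memory analogue of the filter + group + metrics + sort pipeline, the following plain-Java sketch applies the same steps to a list of objects. The Player class, the method names, and every sample value are hypothetical; the numbers were chosen to be consistent with the log output above and do not come from the article's index.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SalaryStatsSketch {
    // Hypothetical document with the three fields the query touches.
    static final class Player {
        private final String team;
        private final int age;
        private final double salary;

        Player(String team, int age, double salary) {
            this.team = team;
            this.age = age;
            this.salary = salary;
        }

        String team() { return team; }
        int age() { return age; }
        double salary() { return salary; }
    }

    // Assumed sample data; the "tim" player is given an age outside the range.
    static List<Player> samplePlayers() {
        return Arrays.asList(
                new Player("war", 25, 1000), new Player("war", 28, 1500),
                new Player("war", 30, 2500),
                new Player("cav", 32, 2000), new Player("cav", 26, 3000),
                new Player("tim", 40, 4000));
    }

    // Filter by inclusive age range, group by team, compute salary statistics,
    // then order the groups by minimum salary, largest first.
    static Map<String, DoubleSummaryStatistics> salaryStatsByTeam(
            List<Player> players, int minAge, int maxAge) {
        Map<String, DoubleSummaryStatistics> stats = players.stream()
                .filter(p -> p.age() >= minAge && p.age() <= maxAge)
                .collect(Collectors.groupingBy(Player::team,
                        Collectors.summarizingDouble(Player::salary)));
        Map<String, DoubleSummaryStatistics> ordered = new LinkedHashMap<>();
        stats.entrySet().stream()
                .sorted((a, b) -> Double.compare(b.getValue().getMin(), a.getValue().getMin()))
                .forEach(e -> ordered.put(e.getKey(), e.getValue()));
        return ordered;
    }

    public static void main(String[] args) {
        salaryStatsByTeam(samplePlayers(), 10, 39).forEach((team, s) -> {
            System.out.println(team + " :: " + s.getCount());
            System.out.println("min:" + s.getMin() + ",avg:" + s.getAverage());
        });
    }
}
```

DoubleSummaryStatistics also carries max and sum, which loosely mirrors adding further metric sub-aggregations to the same terms bucket.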
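4. Number of buckets returned by an aggregation:
By default, a terms aggregation returns only the top 10 buckets. To return more, set size when building the TermsBuilder:
TermsBuilder teamAgg = AggregationBuilders.terms("team").field("team").size(15);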
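The effect of size can be sketched in plain Java as a "top N buckets" cut applied after counting. Everything here is hypothetical (class name, method names, and the twelve generated team values); the tie-break by key is an assumption of this sketch rather than a statement about how Elasticsearch orders equal counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BucketSizeSketch {
    // Mirrors the default of 10 buckets described above.
    static final int DEFAULT_SIZE = 10;

    // Hypothetical data: twelve distinct team values, one document each.
    static List<String> sampleValues() {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            values.add("team" + i);
        }
        return values;
    }

    // Count per value, order by count descending (ties by key, an assumption),
    // and keep at most `size` buckets.
    static List<Map.Entry<String, Long>> topBuckets(List<String> values, int size) {
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = sampleValues();
        System.out.println("default size: " + topBuckets(values, DEFAULT_SIZE).size());
        System.out.println("size 15: " + topBuckets(values, 15).size());
    }
}
```

With twelve distinct values, the default cut drops two buckets, while a size of 15 keeps all of them.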