ES 7.6: group first, then compute the sum and count of multiple fields

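This snippet shows how to use the Elasticsearch Java API to compute per-day statistics for one user's articles: the sums of likes, retweets, and comments, plus a count of `user_id` values (effectively the number of matching articles per day). It builds the query with `SearchRequest` and `AggregationBuilders` and returns the daily totals for the given user.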
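import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.metrics.Sum;
import org.elasticsearch.search.aggregations.metrics.SumAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.ValueCount;
import org.elasticsearch.search.aggregations.metrics.ValueCountAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// Assumes the enclosing class holds a RestHighLevelClient field named restHighLevelClient.
public void statByUserIdByDay(String userId){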
        try {
            SearchRequest searchRequest = new SearchRequest("article_info");
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().size(0);

            searchSourceBuilder.query(QueryBuilders.termQuery("user_id", userId));
            DateHistogramAggregationBuilder dateAggregation = AggregationBuilders
                    .dateHistogram("agg_pubtime")
                    .field("pubtime")
                    .fixedInterval(DateHistogramInterval.DAY)
                    .format("yyyy-MM-dd")
                    .minDocCount(0);

            SumAggregationBuilder likesAggregation = AggregationBuilders.sum("likes_sum").field("like_count");
            SumAggregationBuilder retweetsAggregation = AggregationBuilders.sum("rtt_sum").field("rtt_count");
            SumAggregationBuilder commentsAggregation = AggregationBuilders.sum("comments_sum").field("comment_count");
            ValueCountAggregationBuilder countAggregation = AggregationBuilders.count("count").field("user_id");

            dateAggregation.subAggregation(likesAggregation)
                    .subAggregation(retweetsAggregation)
                    .subAggregation(commentsAggregation)
                    .subAggregation(countAggregation);

            searchSourceBuilder.aggregation(dateAggregation);

            searchRequest.source(searchSourceBuilder);

            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

            Histogram agg_pubtime = searchResponse.getAggregations().get("agg_pubtime");

            for (Histogram.Bucket bucket : agg_pubtime.getBuckets()) {
                String pubtime = bucket.getKeyAsString();

                Aggregations aggregations = bucket.getAggregations();
                Sum likesSum = aggregations.get("likes_sum");
                Sum forwardsSum = aggregations.get("rtt_sum");
                Sum commentsSum = aggregations.get("comments_sum");
                ValueCount count = aggregations.get("count");

                System.out.println("userId="+userId +"  pubtime="+pubtime+" likesSum ="+likesSum.getValue() +"  forwardsSum="+forwardsSum.getValue() +"    commentsSum="+commentsSum.getValue() + "   count ="+count.getValue());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
In SQL terms (pseudo-SQL), the aggregation above corresponds to:

select pubtime, sum(field1), sum(field2), sum(field3), count(userId) from table where userId = xxxx group by day(pubtime);
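To make the grouping semantics concrete, here is a minimal plain-Java sketch of what the query computes, with no Elasticsearch dependency. The `Article` and `DayStats` classes and the sample data are hypothetical stand-ins, not part of the original code. The gap-filling loop mimics `minDocCount(0)`, which makes the date histogram return empty buckets for days between the first and last bucket.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DailyStatsSketch {

    /** Hypothetical in-memory stand-in for one document of the article_info index. */
    static final class Article {
        final String userId;
        final LocalDate pubDay;
        final long likes;
        final long retweets;
        final long comments;

        Article(String userId, LocalDate pubDay, long likes, long retweets, long comments) {
            this.userId = userId;
            this.pubDay = pubDay;
            this.likes = likes;
            this.retweets = retweets;
            this.comments = comments;
        }
    }

    /** One daily bucket: three running sums plus a document count. */
    static final class DayStats {
        long likesSum;
        long retweetsSum;
        long commentsSum;
        long count;
    }

    static Map<LocalDate, DayStats> statByUserIdByDay(List<Article> articles, String userId) {
        TreeMap<LocalDate, DayStats> buckets = new TreeMap<>();
        for (Article a : articles) {
            if (!a.userId.equals(userId)) {
                continue; // plays the role of the term query on user_id
            }
            DayStats s = buckets.computeIfAbsent(a.pubDay, d -> new DayStats());
            s.likesSum += a.likes;       // sum("likes_sum")
            s.retweetsSum += a.retweets; // sum("rtt_sum")
            s.commentsSum += a.comments; // sum("comments_sum")
            s.count++;                   // count("count")
        }
        // Mimic minDocCount(0): add empty buckets for days between the first and last hit.
        if (!buckets.isEmpty()) {
            LocalDate last = buckets.lastKey();
            for (LocalDate d = buckets.firstKey(); d.isBefore(last); d = d.plusDays(1)) {
                buckets.putIfAbsent(d, new DayStats());
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Article> articles = new ArrayList<>();
        articles.add(new Article("u1", LocalDate.of(2020, 3, 1), 5, 1, 2));
        articles.add(new Article("u1", LocalDate.of(2020, 3, 1), 3, 0, 1));
        articles.add(new Article("u1", LocalDate.of(2020, 3, 3), 7, 2, 0));
        articles.add(new Article("u2", LocalDate.of(2020, 3, 2), 9, 9, 9));

        statByUserIdByDay(articles, "u1").forEach((day, s) ->
                System.out.println("pubtime=" + day + " likesSum=" + s.likesSum
                        + " forwardsSum=" + s.retweetsSum + " commentsSum=" + s.commentsSum
                        + " count=" + s.count));
    }
}
```

The sketch uses a `TreeMap` so that buckets come back in date order, as date histogram buckets normally do. It ignores time zones; the real aggregation assigns documents to days in UTC unless a time zone is set on the builder.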

In Elasticsearch, a multi-field `GROUP BY` through the Java API is usually built with a `TermsAggregationBuilder` combined with a `Script`. Unlike single-field grouping, multi-field grouping uses the script to combine several fields into one unique key, and the terms aggregation then buckets on that key.

The following example implements multi-field grouping with the Elasticsearch Java API:

```java
import java.util.Collections;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// Build the query; this one matches all documents
QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();

// Terms aggregation keyed by a script that concatenates two fields.
// The params map must be non-null, so pass an empty map rather than null.
Script script = new Script(ScriptType.INLINE, "painless",
        "doc['field1.keyword'].value + '-' + doc['field2.keyword'].value",
        Collections.emptyMap());
TermsAggregationBuilder aggregation = AggregationBuilders.terms("multi_group_by")
        .script(script)
        .size(100); // adjust size to the actual number of groups

// Build the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(queryBuilder);
sourceBuilder.aggregation(aggregation);
sourceBuilder.size(0); // return only the aggregation results, no documents

// Build the search request
SearchRequest searchRequest = new SearchRequest("your_index_name");
searchRequest.source(sourceBuilder);

// Execute the search request (client is a RestHighLevelClient)
SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);

// Process the response
Terms multiGroupAggregation = response.getAggregations().get("multi_group_by");
for (Terms.Bucket entry : multiGroupAggregation.getBuckets()) {
    System.out.println("Key: " + entry.getKey() + " | Doc count: " + entry.getDocCount());
}
```

In the code above, the `Script` concatenates the values of `field1` and `field2` into a single string that serves as the grouping key, which yields multi-field group statistics. The script approach is described as more efficient than a recursive one and as suitable for more complex grouping logic[^1].

In practice, the script can be extended as needed, for example by adding more fields, using a different separator, or handling missing values. Other aggregation types (such as `avg` and `sum`) can also be attached to each group for further statistics.

For nested grouping at a higher level (group by field A, then by field B within each group), add a sub-aggregation with the `subAggregation` method to build a multi-level grouping structure.
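The separator and missing-value handling mentioned above matter more than they may seem. The following plain-Java sketch (no Elasticsearch dependency) shows why: with `-` as the separator, two different field pairs can collapse into the same key. The `compositeKey` and `countByKey` helpers and the `__missing__` placeholder are hypothetical names chosen for illustration. In a Painless script, a guard such as `doc['field1.keyword'].size() == 0` would presumably play the role of the null check shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompositeKeySketch {

    /** Placeholder for a missing value; the name is an arbitrary choice for this sketch. */
    static final String MISSING = "__missing__";

    /** Joins the field values into one grouping key, substituting MISSING for nulls. */
    static String compositeKey(String separator, String... values) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                key.append(separator);
            }
            key.append(values[i] == null ? MISSING : values[i]);
        }
        return key.toString();
    }

    /** Counts rows per composite key, like a terms aggregation's doc_count. */
    static Map<String, Long> countByKey(List<String[]> rows, String separator) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            counts.merge(compositeKey(separator, row), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"a-b", "c"},
                new String[] {"a", "b-c"},
                new String[] {"a", null});

        // With "-" the first two rows share one key even though their fields differ.
        System.out.println(countByKey(rows, "-"));
        // A separator that cannot occur in the data keeps the groups apart.
        System.out.println(countByKey(rows, "|"));
    }
}
```

Choosing a separator that cannot appear in the field values also makes it possible to split a bucket key back into its original fields when processing the response.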