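Elasticsearch Aggregation Queries

1. Aggregations vs. search

    Put simply: a search finds specific documents, while an aggregation computes statistics over the documents that the search matched.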

 

2. Core concepts

    Buckets: collections of documents that satisfy a given condition.

    Metrics: statistics computed over the documents in a bucket (for example min, sum, max).
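
    To make the two terms concrete, here is a minimal plain-Java sketch that involves no Elasticsearch at all. It borrows the colors and prices of the eight sample cars from section 3.1; the class and method names (BucketMetricSketch, bucket, priceMetrics) are hypothetical and exist only for this illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class BucketMetricSketch {
    // Colors and prices of the eight sample cars from section 3.1 (parallel arrays).
    static final String[] COLOR = {"red", "red", "green", "blue", "green", "red", "red", "blue"};
    static final int[] PRICE = {10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000};

    // Bucket: the documents (here, their indexes) that satisfy a condition.
    static int[] bucket(String color) {
        return IntStream.range(0, COLOR.length)
                .filter(i -> COLOR[i].equals(color))
                .toArray();
    }

    // Metrics: count, min, max, sum and average computed over one bucket.
    static IntSummaryStatistics priceMetrics(int[] bucket) {
        return IntStream.of(bucket).map(i -> PRICE[i]).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics red = priceMetrics(bucket("red"));
        // prints "4 cars, min=10000, max=80000, avg=32500.0"
        System.out.println(red.getCount() + " cars, min=" + red.getMin()
                + ", max=" + red.getMax() + ", avg=" + red.getAverage());
    }
}
```

    The filter plays the role of the bucket and the summary statistics play the role of the metrics; Elasticsearch does the same work on the server side.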

 

3. Worked example: aggregations over car sales data (index = cars; type = transactions)

3.1 Step one: index the sample data

POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }

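    Note: the aggregations below run on text fields, so fielddata must be enabled first. Otherwise Elasticsearch rejects the request with the following message from the official documentation:

    Fielddata is disabled on text fields by default. Set fielddata=true on [your_field_name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

    How to enable it (shown here for color; the make field aggregated on in 3.4 needs the same setting):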

PUT cars/_mapping/transactions/
{
  "properties": {
    "color": { 
      "type":     "text",
      "fielddata": true
    }
  }
}

3.2 In practice: which car color sells best?

3.2.1 Query over the HTTP REST API

GET /cars/transactions/_search
{
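    "size" : 0, // no documents need to be returned, so set this to 0; it also speeds up the query
    "aggs" : { // short for "aggregations"; either spelling is accepted
        "popular_colors" : { // a name you choose for the aggregation, much as you would name a Java method; separating words with '_' is recommended
            "terms" : { // bucket type: terms
              "field" : "color" // bucket by the color field, similar to GROUP BY color in SQL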
            }
        }
    }
}

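3.2.2 Query with the Java API

public void aggsTermsQuery(){
        SearchResponse response = transportClient.prepareSearch("cars")
                .setTypes("transactions")
                .addAggregation(
                        // terms aggregation named "popular_colors", bucketing by the color field
                        AggregationBuilders.terms("popular_colors")
                                .field("color"))
                .setSize(0) // return no documents, only the aggregation result
                .get();
        // look the result up by the same name that was used in the request
        Aggregation popular_colors = response.getAggregations().get("popular_colors");
    }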

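3.2.3 Response

    Note: the counts below are exactly twice what the eight documents from 3.1 would produce (hits.total is 16), which suggests the bulk request had been executed twice when this response was captured. Averages, minimums and maximums are not affected by the duplication.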

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 16,
    "max_score": 0,
    "hits": []  // empty because size was set to 0, so no documents are included in the response
  },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 8
        },
        {
          "key": "blue",
          "doc_count": 4
        },
        {
          "key": "green",
          "doc_count": 4
        }
      ]
    }
  }
}
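
    As a cross-check, the terms aggregation can be mimicked locally. The sketch below is plain Java, not the Elasticsearch API, and the names TermsCountSketch and countByColor are hypothetical. It counts the colors of the eight sample documents from 3.1, each taken once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class TermsCountSketch {
    // Colors of the eight sample documents, in bulk-request order.
    static final List<String> COLORS = Arrays.asList(
            "red", "red", "green", "blue", "green", "red", "red", "blue");

    // One bucket per distinct value, each holding its document count,
    // which corresponds to the "key" and "doc_count" pairs in the response.
    static Map<String, Long> countByColor(List<String> colors) {
        return colors.stream().collect(
                Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // prints "{blue=2, green=2, red=4}" (a TreeMap sorts by key, not by count)
        System.out.println(countByColor(COLORS));
    }
}
```

    Unlike this sketch, the terms aggregation orders its buckets by descending doc_count by default, which is why red comes first in the response.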

 

3.3 In practice: adding a metric to the aggregation above, the average price (avg)

3.3.1 HTTP request

GET /cars/transactions/_search
{
   "size" : 0,
   "aggs": {
      "colors": {
         "terms": {
            "field": "color"
         },
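         "aggs": { // a nested aggs level that holds the metric
            "avg_price": { // name of the metric; the response stores the value under this same name
               "avg": { // metric type: average
                  "field": "price" // the field to average is 'price'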
               }
            }
         }
      }
   }
}

3.3.2 Java API query

@Test
    public void setMetricsQuery(){
        SearchResponse response = transportClient.prepareSearch("cars")
                .setTypes("transactions")
                .addAggregation(
                        AggregationBuilders.terms("colors")
                                .field("color")
                                // add the metric
                                .subAggregation(AggregationBuilders
                                        .avg("avg_price")
                                        .field("price")
                                )
                )
                .setSize(0)
                .get();
        Aggregation colors = response.getAggregations().get("colors");
    }

3.3.3 Response

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 16,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 8,
          "avg_price": {
            "value": 32500
          }
        },
        {
          "key": "blue",
          "doc_count": 4,
          "avg_price": {
            "value": 20000
          }
        },
        {
          "key": "green",
          "doc_count": 4,
          "avg_price": {
            "value": 21000
          }
        }
      ]
    }
  }
}
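
    The averages in this response can be reproduced by hand from the sample data in 3.1. The following is a small plain-Java sketch under that assumption; AvgPriceSketch and avgPriceByColor are hypothetical names and nothing here talks to Elasticsearch.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class AvgPriceSketch {
    // Colors and prices of the eight sample cars from section 3.1 (parallel arrays).
    static final String[] COLOR = {"red", "red", "green", "blue", "green", "red", "red", "blue"};
    static final int[] PRICE = {10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000};

    // Bucket by color, then compute the average price inside each bucket,
    // mirroring a terms aggregation with an avg sub-aggregation.
    static Map<String, Double> avgPriceByColor() {
        return IntStream.range(0, COLOR.length).boxed().collect(
                Collectors.groupingBy(i -> COLOR[i], TreeMap::new,
                        Collectors.averagingInt(i -> PRICE[i])));
    }

    public static void main(String[] args) {
        // prints "{blue=20000.0, green=21000.0, red=32500.0}"
        System.out.println(avgPriceByColor());
    }
}
```

    For red, for example, (10000 + 20000 + 20000 + 80000) / 4 = 32500, which matches the avg_price value in the red bucket.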

 

3.4 In practice: nesting buckets. Building on the above, bucket the cars by color first, then by manufacturer

3.4.1 HTTP request

GET /cars/transactions/_search
{
   "size" : 0,
   "aggs": {
      "colors": {
         "terms": {
            "field": "color"
         },
         "aggs": {
            "avg_price": { 
               "avg": {
                  "field": "price"
               }
            },
            "make": { // name of the nested bucket aggregation
                "terms": {
                    "field": "make" // bucket again by the 'make' field
                }
            }
         }
      }
   }
}

3.4.2 Java API request

@Test
    public void subBucketsQuery(){
        SearchResponse response = transportClient.prepareSearch("cars")
                .setTypes("transactions")
                .addAggregation(
                        AggregationBuilders.terms("colors")
                                .field("color")
                                .subAggregation(AggregationBuilders
                                        .avg("avg_price")
                                        .field("price")
                                )
                                .subAggregation(AggregationBuilders
                                        .terms("make") // name of the nested bucket aggregation
                                        .field("make") // field to bucket by
                                )
                )
                .setSize(0)
                .get();
        Aggregation colors = response.getAggregations().get("colors");
    }

3.4.3 Response

{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 16,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 8,
          "avg_price": {
            "value": 32500
          },
          "make": {   // name of the nested bucket aggregation
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "honda",
                "doc_count": 6
              },
              {
                "key": "bmw",
                "doc_count": 2
              }
            ]
          }
        },
        {
          "key": "blue",
          "doc_count": 4,
          "avg_price": {
            "value": 20000
          },
          "make": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ford",
                "doc_count": 2
              },
              {
                "key": "toyota",
                "doc_count": 2
              }
            ]
          }
        },
        {
          "key": "green",
          "doc_count": 4,
          "avg_price": {
            "value": 21000
          },
          "make": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ford",
                "doc_count": 2
              },
              {
                "key": "toyota",
                "doc_count": 2
              }
            ]
          }
        }
      ]
    }
  }
}
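
    Nested buckets amount to a two-level grouping. The sketch below shows the idea in plain Java over the eight sample documents from 3.1, each taken once; NestedBucketsSketch and countByColorThenMake are hypothetical names, not part of any Elasticsearch API.

```java
import java.util.Map;
import java.util.TreeMap;

class NestedBucketsSketch {
    // Colors and makes of the eight sample cars from section 3.1 (parallel arrays).
    static final String[] COLOR = {"red", "red", "green", "blue", "green", "red", "red", "blue"};
    static final String[] MAKE = {"honda", "honda", "ford", "toyota", "toyota", "honda", "bmw", "ford"};

    // Outer bucket: color. Inner bucket: make. Value: number of documents.
    static Map<String, Map<String, Long>> countByColorThenMake() {
        Map<String, Map<String, Long>> result = new TreeMap<>();
        for (int i = 0; i < COLOR.length; i++) {
            result.computeIfAbsent(COLOR[i], k -> new TreeMap<>())
                  .merge(MAKE[i], 1L, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints "{blue={ford=1, toyota=1}, green={ford=1, toyota=1}, red={bmw=1, honda=3}}"
        System.out.println(countByColorThenMake());
    }
}
```

    Each inner map corresponds to one "make" object in the response, scoped to the documents of its parent color bucket.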

 

3.5 In practice: on top of the previous result, add two more metrics: the highest and the lowest car price for each manufacturer within each color

3.5.1 HTTP request

GET /cars/transactions/_search
{
   "size" : 0,
   "aggs": {
      "colors": {
         "terms": {
            "field": "color"
         },
         "aggs": {
            "avg_price": { "avg": { "field": "price" }
            },
            "make" : {
                "terms" : {
                    "field" : "make"
                },
                "aggs" : { 
                    "min_price" : { // a name you choose
                        "min": { // metric type: minimum
                            "field": "price"
                        }
                    },
                    "max_price" : {
                        "max": { // metric type: maximum
                            "field": "price"
                        }
                    }
                }
            }
         }
      }
   }
}

3.5.2 Java API request

@Test
    public void minMaxMetricsQuery(){
        SearchResponse response = transportClient.prepareSearch("cars")
                .setTypes("transactions")
                .addAggregation(
                        AggregationBuilders.terms("colors")
                                .field("color")
                                .subAggregation(AggregationBuilders
                                        .avg("avg_price")
                                        .field("price")
                                )
                                .subAggregation(AggregationBuilders
                                        .terms("make")
                                        .field("make")
                                        .subAggregation(AggregationBuilders
                                                        .max("max_price")
                                                        .field("price")
                                        )
                                        .subAggregation(AggregationBuilders
                                                .min("min_price")
                                                .field("price")
                                        )
                                )
                )
                .setSize(0)
                .get();
        Aggregation colors = response.getAggregations().get("colors");
    }

3.5.3 Response

{
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 16,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 8,
          "avg_price": {
            "value": 32500
          },
          "make": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "honda",
                "doc_count": 6,
                "max_price": {
                  "value": 20000
                },
                "min_price": {
                  "value": 10000
                }
              },
              {
                "key": "bmw",
                "doc_count": 2,
                "max_price": {
                  "value": 80000
                },
                "min_price": {
                  "value": 80000
                }
              }
            ]
          }
        },
        {
          "key": "blue",
          "doc_count": 4,
          "avg_price": {
            "value": 20000
          },
          "make": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ford",
                "doc_count": 2,
                "max_price": {
                  "value": 25000
                },
                "min_price": {
                  "value": 25000
                }
              },
              {
                "key": "toyota",
                "doc_count": 2,
                "max_price": {
                  "value": 15000
                },
                "min_price": {
                  "value": 15000
                }
              }
            ]
          }
        },
        {
          "key": "green",
          "doc_count": 4,
          "avg_price": {
            "value": 21000
          },
          "make": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ford",
                "doc_count": 2,
                "max_price": {
                  "value": 30000
                },
                "min_price": {
                  "value": 30000
                }
              },
              {
                "key": "toyota",
                "doc_count": 2,
                "max_price": {
                  "value": 12000
                },
                "min_price": {
                  "value": 12000
                }
              }
            ]
          }
        }
      ]
    }
  }
}
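
    The min_price and max_price values above can be checked against the sample data from 3.1 with a short plain-Java sketch. The names MinMaxSketch and priceStatsByColorThenMake are hypothetical, and the code only imitates the aggregation locally.

```java
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;

class MinMaxSketch {
    // The eight sample cars from section 3.1 (parallel arrays).
    static final String[] COLOR = {"red", "red", "green", "blue", "green", "red", "red", "blue"};
    static final String[] MAKE = {"honda", "honda", "ford", "toyota", "toyota", "honda", "bmw", "ford"};
    static final int[] PRICE = {10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000};

    // color bucket -> make bucket -> price statistics (min and max among them)
    static Map<String, Map<String, IntSummaryStatistics>> priceStatsByColorThenMake() {
        Map<String, Map<String, IntSummaryStatistics>> result = new TreeMap<>();
        for (int i = 0; i < COLOR.length; i++) {
            result.computeIfAbsent(COLOR[i], k -> new TreeMap<>())
                  .computeIfAbsent(MAKE[i], k -> new IntSummaryStatistics())
                  .accept(PRICE[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        IntSummaryStatistics redHonda = priceStatsByColorThenMake().get("red").get("honda");
        // prints "red/honda min=10000 max=20000"
        System.out.println("red/honda min=" + redHonda.getMin() + " max=" + redHonda.getMax());
    }
}
```

    Buckets that contain a single distinct price, such as red/bmw, report the same value for both metrics, just as in the response.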

 
