Important Elasticsearch Queries

zxc123e

License: CC 4.0 BY-SA

Category: elasticsearch. Tags: Elasticsearch, query

Article link: https://blog.youkuaiyun.com/zxc123e/article/details/79268361

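This article covers a range of Elasticsearch query techniques, including simple queries, conditional queries, aggregation queries, wildcard queries, and sub-condition queries, and explains in detail the difference between query context and filter context.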

I. Basic Queries

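All operations below run against the same 11 documents. The data in the book index, novel type, is as follows:
[Screenshot: documents in the book index]
Unless stated otherwise, every query below is sent to the following URL.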

http://192.168.124.128:9200/book/_search

1. Simple queries

http://192.168.124.128:9200/book/novel/1
http://192.168.124.128:9200/book/novel/1?version=1

2. Conditional queries

Every query body uses query as its top-level keyword.

Query all documents

{
    "query":{
        "match_all":{}
    }
}

A query returns 10 documents by default. The took value in the response is the number of milliseconds the query took to execute.

{
    "query":{
        "match_all":{}
    },
    "from":1,
    "size":2
}

from is a zero-based offset, so from: 1 skips the first hit and starts at the second; size: 2 returns at most 2 documents.
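
To make the offset semantics concrete, here is a minimal Python sketch of how from and size select a window of hits. It only imitates the slicing on an in-memory list; the function name page and the sample IDs are illustrative assumptions, not part of Elasticsearch.

```python
def page(hits, from_=0, size=10):
    """Return the window of hits that from/size would select.

    from_ is a zero-based offset; size is the maximum number of hits.
    The defaults (0 and 10) mirror the search API defaults.
    """
    return hits[from_:from_ + size]


# Hypothetical, already-sorted hit IDs standing in for a search result.
all_hits = ["a", "b", "c", "d", "e"]
window = page(all_hits, from_=1, size=2)  # skips "a"
```

With 11 documents and no from or size, the same slicing returns only the first 10, which is why the default response omits one document.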

Keyword query

{
    "query":{
        "match":{
            "title":"Elasticsearch"
        }
    }
}

This finds documents whose title contains Elasticsearch. The result is as follows:
[Screenshot: query result]

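The match query is a full-text query. When handling a full-text search, Elasticsearch first analyzes the query string, then builds the query from the resulting terms, and finally returns the results. For the condition "title": "Quick Foxes!", Elasticsearch analyzes the query string and splits it into two lowercase terms. A document is returned if its title field contains either term, quick or foxes.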
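
The analyze-then-match step can be imitated in a few lines of Python. This is a rough sketch, assuming English text and a much simpler analyzer than the real standard analyzer (it just lowercases and splits on non-word characters); analyze and match_any are hypothetical helper names.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return [t for t in re.split(r"\W+", text.lower()) if t]


def match_any(field_value, query_text):
    """True if the field shares at least one term with the query (default OR behaviour)."""
    return bool(set(analyze(field_value)) & set(analyze(query_text)))


query_terms = analyze("Quick Foxes!")
```

Here query_terms holds the two lowercase terms, and match_any returns True for any title that contains at least one of them.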

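The same result can also be obtained with a URL-only request:
[Screenshot: URL search request]

It returns two results about Elasticsearch, sorted by _score in descending order.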

Specifying a sort order

{
    "query":{
        "match":{
            "title":"Elasticsearch"
        }
    },
    "sort":{
        "publish_date":{"order":"desc"}
    }
}

The result is as follows:
[Screenshot: sorted query result]

3. Aggregation queries

Single group aggregation

{
    "aggs":{
        "group_by_word_count":{
            "terms":{
                "field":"word_count"
            }
        }
    }
}

The name group_by_word_count is user-defined; any name works.
[Screenshot: aggregation result]
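
Conceptually, a terms aggregation groups documents by a field's value and counts each group. The Python sketch below imitates that on an in-memory list; terms_buckets and the sample documents are illustrative assumptions, and the real aggregation has further options (bucket size limits, ordering) that are not modelled here.

```python
from collections import Counter


def terms_buckets(docs, field):
    """Group documents by a field's value and count each group, largest first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common()]


# Illustrative documents, not the article's full data set.
sample_docs = [
    {"title": "A", "word_count": 1000},
    {"title": "B", "word_count": 3000},
    {"title": "C", "word_count": 3000},
]
buckets = terms_buckets(sample_docs, "word_count")
```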

Multiple group aggregations

{
    "aggs":{
        "group_by_word_count":{
            "terms":{
                "field":"word_count"
            }
        },
        "group_by_publish_date":{
            "terms":{
                "field":"publish_date"
            }
        }
    }
}

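The response now also contains a second set of buckets, grouped by date.

A stats aggregation computes summary statistics (count, min, max, avg, sum) for a field: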

{
    "aggs":{
        "grades_word_count":{
            "stats":{
                "field":"word_count"
            }
        }
    }
}

[Screenshot: stats aggregation result]
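
As a rough illustration of what the stats aggregation reports, the following Python sketch computes the same five values over an in-memory list. The function name stats and the sample values are assumptions for illustration, and the handling of an empty input is a simplification.

```python
def stats(docs, field):
    """Compute the five summary values a stats aggregation reports."""
    values = [doc[field] for doc in docs if field in doc]
    if not values:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


# Illustrative values only.
summary = stats(
    [{"word_count": 1000}, {"word_count": 2000}, {"word_count": 3000}],
    "word_count",
)
```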

4. Wildcard queries

Prefix query

{
    "query":{
        "prefix":{
            "title":"python"
        }

    }
}

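This is safe when the field has a small set of distinct terms, but it scales poorly and can put heavy load on the cluster. A longer prefix limits the impact by reducing the number of terms that must be visited.

Wildcard query

? matches any single character; * matches zero or more characters.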

{
    "query":{
        "wildcard":{
            "title":"*web"
        }

    }
}

web must be lowercase here, because the terms of an analyzed field are stored in lowercase.
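
The two wildcard characters and the lowercase requirement can be checked with a small Python sketch. It assumes the title "Java Web" was analyzed into the lowercase terms java and web, and it matches the pattern against each term separately; wildcard_to_regex and wildcard_matches are hypothetical helpers, not Elasticsearch APIs.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? (one character) and * (zero or more characters) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(terms, pattern):
    """True if any single indexed term matches the whole pattern."""
    regex = wildcard_to_regex(pattern)
    return any(regex.fullmatch(term) for term in terms)


# "Java Web" analyzed into lowercased terms (assumed analyzer output).
java_web_terms = ["java", "web"]
```

Under these assumptions, "*web" matches while "*Web" does not, since no stored term contains an uppercase W.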

"hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "10",
        "_score": 1,
        "_source": {
          "author": "charmingfst",
          "title": "Java Web",
          "word_count": 1000,
          "publish_date": "2000-08-01"
        }
      }
    ]

Regular expression query

{
    "query":{
        "regexp":{
            "title":"E[0-9].+"
        }
    }
}

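This regular expression requires the term to start with E, followed by any single digit from 0 to 9, followed by one or more other characters.

The regexp query can be used on analyzed string fields. When running it on a not_analyzed field, pay attention to letter case, because the values are indexed exactly as written.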

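The following explanation comes from Elasticsearch: The Definitive Guide:

wildcard and regexp queries work exactly like the prefix query: they also need to scan the term list in the inverted index to find all matching terms, and then collect the document IDs for each of those terms. The only difference from the prefix query is that they support more complex patterns.

prefix, wildcard, and regexp queries operate on terms. When run against an analyzed field, they check each term in the field rather than the field as a whole.

For example, a title field containing "Quick brown fox" produces the terms quick, brown, and fox.

It matches this query: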

{ "regexp": { "title": "br.*" }}

But it does not match either of these two queries:

{ "regexp": { "title": "Qu.*" }}  // the term in the index is quick, not Quick.
{ "regexp": { "title": "quick br*" }} // quick and brown are separate terms in the term list.
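
The three cases above can be reproduced with a short Python sketch that applies each pattern to one term at a time. It uses Python's re module, whose syntax differs from Lucene's regular expressions in general but agrees for these simple patterns; regexp_matches is a hypothetical helper.

```python
import re


def regexp_matches(terms, pattern):
    """True if any single term matches the entire pattern (patterns are anchored)."""
    return any(re.fullmatch(pattern, term) for term in terms)


# Terms the article lists for the title "Quick brown fox".
title_terms = ["quick", "brown", "fox"]
results = {p: regexp_matches(title_terms, p) for p in ["br.*", "Qu.*", "quick br*"]}
```

Only "br.*" succeeds: "Qu.*" fails on case, and "quick br*" fails because no single term contains both words.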

II. Sub-condition Queries

These query a specific value in a specific field.

Sub-condition queries fall into two kinds: query context and filter context.

1. Query Context

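In query context, besides deciding whether a document satisfies the condition, Elasticsearch also computes a _score that indicates how well the document matches the query.

Term-level queries: a term-level query does not analyze the query string; a document matches only when an indexed term equals the query string exactly. On a not-analyzed field, documents whose value equals the query string exactly are returned. On an analyzed field, the query string must be a single lowercase term, otherwise nothing matches.

Full-text queries: Elasticsearch first analyzes the query string and splits it into lowercase terms. A document matches if its analyzed field contains any one of the terms, or all of them; a document that contains none of the terms does not match.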

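1) Full-text queries

These target text data.

Fuzzy matching

[Screenshot: match query on title]
This returns every document whose title contains Elasticsearch or 入门: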

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 2.0794415,
    "hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "7",
        "_score": 2.0794415,
        "_source": {
          "author": "孙七",
          "title": "Elasticsearch入门",
          "word_count": 3000,
          "publish_date": "2017-10-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "3",
        "_score": 1.219939,
        "_source": {
          "author": "李三",
          "title": "Python入门",
          "word_count": 2000,
          "publish_date": "2005-10-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "9",
        "_score": 0.96669346,
        "_source": {
          "author": "chm",
          "title": "Elasticsearch精髓",
          "word_count": 3000,
          "publish_date": "2017-08-01"
        }
      }
    ]
  }
}

Phrase query

{
    "query":{
        "match_phrase":{
            "title":"Elasticsearch入门"
        }
    }
}

This treats Elasticsearch入门 as a single phrase. The result:

{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 2.0794415,
    "hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "7",
        "_score": 2.0794415,
        "_source": {
          "author": "孙七",
          "title": "Elasticsearch入门",
          "word_count": 3000,
          "publish_date": "2017-10-01"
        }
      }
    ]
  }
}

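For match_phrase to return a document, both of the following conditions must hold:
1) every term produced by analyzing the query appears in the field;
2) the terms appear in the field in the same order and, with the default slop of 0, next to each other.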
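
The two conditions amount to finding the query terms as a consecutive run inside the field's terms. A minimal Python sketch, assuming both sides are already analyzed into term lists and ignoring slop; phrase_matches is a hypothetical helper name.

```python
def phrase_matches(field_terms, query_terms):
    """True if query_terms occur in field_terms consecutively and in order (slop 0)."""
    n = len(query_terms)
    if n == 0:
        return False
    return any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


# Illustrative analyzed terms for the title "Quick brown fox".
field = ["quick", "brown", "fox"]
```

With this field, ["quick", "brown"] matches, while ["brown", "quick"] (wrong order) and ["quick", "fox"] (not adjacent) do not.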

{
    "query":{
        "match_phrase_prefix":{
            "title":{
                "query":"道路",
                "max_expansions":50
            }
        }
    }
}

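match_phrase_prefix works like match_phrase, except that the last term of the query text is matched as a prefix. The max_expansions parameter controls how many terms the last term's prefix may expand to; the default is 50.

When matching, use match if you want as many results as possible;
use match_phrase if you want to match the analyzed terms as precisely as possible;
use match_phrase_prefix if you are matching a phrase and are worried about missing results.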

Multi-field match query

{
    "query":{
        "multi_match":{
            "query":"chm",
            "fields":["author","title"]
        }
    }
}

This returns all documents whose author or title field contains chm.

Query string syntax

{
    "query":{
        "query_string":{
            "query":"(Elasticsearch AND 入门)  OR charmingfst"
        }
    }
}

The result:

"hits": {
    "total": 2,
    "max_score": 2.0592237,
    "hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "7",
        "_score": 2.0592237,
        "_source": {
          "author": "孙七",
          "title": "Elasticsearch入门",
          "word_count": 3000,
          "publish_date": "2017-10-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "10",
        "_score": 1.261305,
        "_source": {
          "author": "charmingfst",
          "title": "Java Web",
          "word_count": 1000,
          "publish_date": "2000-08-01"
        }
      }
    ]
}

Query documents whose author or title contains "Elasticsearch" or "张三":

{
    "query":{
        "query_string":{
            "query":"Elasticsearch OR 张三",
            "fields":["author", "title"]
        }
    }
}

The result:

"hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "9",
        "_score": 0.96669346,
        "_source": {
          "author": "chm",
          "title": "Elasticsearch精髓",
          "word_count": 3000,
          "publish_date": "2017-08-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "1",
        "_score": 0.6931472,
        "_source": {
          "author": "张三",
          "title": "kafka权威指南",
          "word_count": 1000,
          "publish_date": "2018-01-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {
          "author": "孙七",
          "title": "Elasticsearch入门",
          "word_count": 3000,
          "publish_date": "2017-10-01"
        }
      }
    ]

2) Term-level queries

These target structured data such as numbers and dates.

{
    "query":{
        "term":{
            "author":"张三"
        }
    }
}

The term query is used for exact-value matching. It does not analyze the input text, so it looks up the given value exactly as written.
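
The effect of skipping analysis is easiest to see by comparing the same source string indexed in two ways. The Python sketch below is an illustration under assumed index contents; term_matches and the two term lists are hypothetical.

```python
def term_matches(indexed_terms, value):
    """A term query compares the value, unanalyzed, against the indexed terms."""
    return value in indexed_terms


# Assumed index contents for the same source string "Quick Foxes!".
analyzed_terms = ["quick", "foxes"]      # analyzed field: lowercased, split into terms
not_analyzed_terms = ["Quick Foxes!"]    # not_analyzed field: stored as one exact value
```

On the analyzed field only the lowercase single term quick matches; on the not_analyzed field only the full original string matches.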

Term-level queries also support range queries

{
    "query":{
        "range":{
            "publish_date":{
                "gt":"2017-01-01",
                "lte":"now"
            }
        }
    }
}
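
The bounds gt, gte, lt, and lte behave like the usual comparison operators. A minimal Python sketch of the check for one value follows; in_range is a hypothetical helper, and date.today() stands in for now, which Elasticsearch resolves itself at query time.

```python
from datetime import date


def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Check a value against the four range bounds; unset bounds are ignored."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


# Bounds from the query above, expressed as Python dates.
lower = date(2017, 1, 1)
now = date.today()
```

A publish_date of 2017-10-01 passes the check, while 2017-01-01 fails because gt excludes the bound itself.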

2. Filter context

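In filter context, Elasticsearch only decides whether a document satisfies the condition, yes or no; it does not care how well the document matches.
To find documents whose word_count is 1000, combine the filter with the bool keyword: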

{
    "query":{
        "bool":{
            "filter":{
                "term":{
                    "word_count":1000
                }
            }
        }
    }
}

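Elasticsearch can cache filter results and skips scoring for them, so filter context is usually somewhat faster than query context.

III. Compound Queries

These combine sub-condition queries with boolean logic.

Constant score query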

{
    "query":{
        "constant_score":{
            "filter":{
                "match":{
                    "title":"Elasticsearch"
                }
            }
        }
    }
}

The result:

"hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "9",
        "_score": 1,
        "_source": {
          "author": "chm",
          "title": "Elasticsearch精髓",
          "word_count": 3000,
          "publish_date": "2017-08-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "7",
        "_score": 1,
        "_source": {
          "author": "孙七",
          "title": "Elasticsearch入门",
          "word_count": 3000,
          "publish_date": "2017-10-01"
        }
      }
    ]
}

Every _score is 1. The boost keyword sets the score assigned to the results:

{
    "query":{
        "constant_score":{
            "filter":{
                "match":{
                    "title":"Elasticsearch"
                }
            },
            "boost":2
        }
    }
}

Now every _score in the result is 2.

Bool query

Using the should keyword

{
    "query":{
        "bool":{
            "should":[
                {"match":{"author":"张三"}},
                {"match":{"title":"Elasticsearch"}}
            ]
        }
    }
}

The query returns the following three hits:

"hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "9",
        "_score": 0.96669346,
        "_source": {
          "author": "chm",
          "title": "Elasticsearch精髓",
          "word_count": 3000,
          "publish_date": "2017-08-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "1",
        "_score": 0.6931472,
        "_source": {
          "author": "张三",
          "title": "kafka权威指南",
          "word_count": 1000,
          "publish_date": "2018-01-01"
        }
      },
      {
        "_index": "book",
        "_type": "novel",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {
          "author": "孙七",
          "title": "Elasticsearch入门",
          "word_count": 3000,
          "publish_date": "2017-10-01"
        }
      }
    ]

Using the must keyword

{
    "query":{
        "bool":{
            "must":[
                {"match":{"author":"chm"}},
                {"match":{"title":"Elasticsearch"}}
            ]
        }
    }
}

"hits": [
      {
        "_index": "book",
        "_type": "novel",
        "_id": "9",
        "_score": 2.1706662,
        "_source": {
          "author": "chm",
          "title": "Elasticsearch精髓",
          "word_count": 3000,
          "publish_date": "2017-08-01"
        }
      }
    ]

should behaves like OR, and must behaves like AND.
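
The OR/AND behaviour can be sketched in Python by treating each clause as a predicate on a document. This is a simplified model under stated assumptions: scoring and filter clauses are ignored, a substring test stands in for match, and bool_matches and field_contains are hypothetical helpers.

```python
def bool_matches(doc, must=(), should=(), must_not=()):
    """Evaluate bool clauses given as predicates (functions from doc to bool)."""
    if any(clause(doc) for clause in must_not):
        return False
    if not all(clause(doc) for clause in must):
        return False
    # With no must clause, at least one should clause has to match.
    if should and not must:
        return any(clause(doc) for clause in should)
    return True


def field_contains(field, text):
    """Simplified stand-in for a match clause: substring test on one field."""
    return lambda doc: text in doc.get(field, "")


# Two documents from the article's result listings.
doc9 = {"author": "chm", "title": "Elasticsearch精髓"}
doc1 = {"author": "张三", "title": "kafka权威指南"}
clauses = [field_contains("author", "张三"), field_contains("title", "Elasticsearch")]
```

Passing clauses as should accepts both documents, since each satisfies one clause; passing the same clauses as must rejects both, since neither satisfies them all.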

must can also be combined with filter

{
    "query":{
        "bool":{
            "must":[
                {"match":{"author":"张三"}},
                {"match":{"title":"Elasticsearch"}}
            ],
            "filter":[
                {"term":{"word_count":1000}}
            ]

        }
    }
}

You can also query for documents that must not satisfy a given condition

{
    "query":{
        "bool":{
            "must_not":{
                "term":{
                    "author":"王五"
                }
            }
        }
    }
}