Elasticsearch Query Methods Summary ~ Chapter 4

吃橘子的汤圆

Published 2022-08-18 15:50:46

Article link: https://blog.youkuaiyun.com/qq_43751598/article/details/126381888

Contents

Note: read before you start
I. Sample Code
II. URI Search
III. Query DSL

Note: read before you start

Environment: Elasticsearch 7.1
Link: setting up the environment and learning the basic concepts

I. Sample Code

1- Sample code 1

	DELETE my_index

	POST _bulk
	{"create":{"_index":"my_index","_id":1}}
	{"name":"Ruan Yiming","about":"swift, elasticsearch","age":10}
	{"create":{"_index":"my_index","_id":2}}
	{"name":"zhang san","about":"my name zhangsan","age":20}
	{"create":{"_index":"my_index","_id":3}}
	{"name":"li si","about":"my name lisi","age":30}
	{"create":{"_index":"my_index","_id":4}}
	{"name":"wang wu","about":"my name wangwu","age":50}
	{"create":{"_index":"my_index","_id":5}}
	{"name":"zhao liu","about":"my name zhaoliu","age":70}
	{"create":{"_index":"my_index","_id":6}}
	{"name":"wangwu haha","about":"wo jiao haha","age":80}
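
Outside the Dev Tools console, the same _bulk payload has to be sent as newline-delimited JSON: one action line, one source line, and a final trailing newline. A minimal Python sketch of how such a body could be assembled; the helper name bulk_create_body is hypothetical, and only two of the sample documents are shown:

```python
import json

def bulk_create_body(index, docs):
    """Build an NDJSON body for POST _bulk from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: create the document with an explicit _id.
        lines.append(json.dumps({"create": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

body = bulk_create_body("my_index", [
    (1, {"name": "Ruan Yiming", "about": "swift, elasticsearch", "age": 10}),
    (2, {"name": "zhang san", "about": "my name zhangsan", "age": 20}),
])
print(body)
```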

2- Sample code 2

DELETE blogs
PUT /blogs/_doc/1
{
    "title": "Quick brown rabbits",
    "body":  "Brown rabbits are commonly seen."
}

PUT /blogs/_doc/2
{
    "title": "Keeping pets healthy",
    "body":  "My quick brown fox eats rabbits on a regular basis."
}

3- Sample code 3

DELETE products
POST /products/_bulk
{ "index": { "_id": 1 }}
{ "price" : 10,"avaliable":true,"date":"2018-01-01", "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20,"avaliable":true,"date":"2019-01-01", "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30,"avaliable":true, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30,"avaliable":false, "productID" : "QQPX-R-3956-#aD8" }

4- Sample code 4

DELETE news
POST /news/_bulk
{ "index": { "_id": 1 }}
{ "content":"Apple Mac" }
{ "index": { "_id": 2 }}
{ "content":"Apple iPad" }
{ "index": { "_id": 3 }}
{ "content":"Apple employee like Apple Pie and Apple Juice" }

II. URI Search

1- Query by ID

# Uses sample code 1

GET my_index/_doc/1

2- Query all documents

# Uses sample code 1

GET my_index/_search

3- Specify a query condition (q)

# Uses sample code 1
# Find documents with age=10

GET my_index/_search?q=age:10

4- Multiple conditions (AND, OR)

# Uses sample code 1
# Find documents with age=20 and name "li si"
GET my_index/_search?q=age:20 AND name:"li si"

# Find documents with age=20 or name "li si"
GET my_index/_search?q=age:20 OR name:"li si"
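
Outside the Dev Tools console, a q value that contains spaces, quotes, or colons needs to be percent-encoded before it goes into the URL. A minimal sketch with the Python standard library; the host http://localhost:9200 is an assumption and build_uri_search is a hypothetical helper name:

```python
from urllib.parse import quote, urlencode

def build_uri_search(index, query, host="http://localhost:9200"):
    """Return a URI Search URL with the q parameter percent-encoded."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    params = urlencode({"q": query}, quote_via=quote)
    return f"{host}/{index}/_search?{params}"

url = build_uri_search("my_index", 'age:20 AND name:"li si"')
print(url)
```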

5- Sorting: sort

# Uses sample code 1
# Sort by age using the sort keyword

GET my_index/_search?sort=age:desc

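6- Free-text, phrase, and bool queries

# Uses sample code 1
# "profile" shows how the query is actually executed
# Free-text query: only "my" is restricted to about; "wangwu" is searched in all fields, so the hits include document 6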
GET my_index/_search?q=about:my wangwu
{
  "profile":"true"
}
# Phrase query: no document contains the phrase "my wangwu", so nothing is returned; use "my name" instead to get hits
GET my_index/_search?q=about:"my wangwu"
{
  "profile":"true"
}
# Bool query: searches about:my about:wangwu
GET my_index/_search?q=about:(my wangwu)
{
  "profile":"true"
}

Profile example: if the three queries are hard to tell apart, run them yourself and compare the profile output.

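7- Range queries: TO, >, >=, <, <=

# Uses sample code 1
# Find documents with about:my and age from 10 to 30 (inclusive)
GET my_index/_search?q=about:my AND age:[10 TO 30]
# Find documents with age greater than 10
GET my_index/_search?q=age:>10
# Find documents with age less than or equal to 10
GET my_index/_search?q=age:<=10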

8- Pagination: from=&size=

# Uses sample code 1
# Find documents with age greater than 10; first page, one document only
GET my_index/_search?q=age:>10&from=0&size=1

9- Select fields: _source=field,field

# Uses sample code 1
# Find documents with age greater than 10; first page, one document only, returning just name
GET my_index/_search?q=age:>10&from=0&size=1&_source=name

10- Proximity matching

# Uses sample code 1
# Equivalent to slop in a DSL match_phrase query
GET /my_index/_search?q=about:"my zhaoliu"~1
{
  "profile": "true"
}

III. Query DSL

1- Fetch multiple documents: mget

# Uses sample code 1
# Fetch the two documents with ID 1 and ID 2
GET my_index/_mget
{
   "ids":[1,2]
}

# Without specifying an index in the URL
GET _mget
{
    "docs":[
        {
          "_index":"my_index",
          "_id":1
        },
        {
          "_index":"my_index",
          "_id":2,
          "_source":{
            "include":["age"]
          }
        }
      ]
}

2- Query everything: match_all

# Uses sample code 1
# Match all documents
GET my_index/_search
{
  "query": {"match_all": {}}
}

3- Structured queries

3.1- Exact matching: term, terms

# Uses sample code 1
# term: find documents with age equal to 50
GET my_index/_search
{
  "query":{
    "term": {
      "age": {
        "value": 50
      }
    }
  }
}
# terms: find documents whose about field contains the token swift or haha
GET my_index/_search
{
  "query":{
    "terms": {
      "about": [
        "swift",
        "haha"
      ]
    }
  }
}

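3.2- No scoring: constant_score

constant_score wraps a query in filter context so that no relevance score is computed (skipping scoring can improve performance)
boost: the constant score assigned to every matching document

# Uses sample code 1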
POST my_index/_search
{
  "profile": "true",
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "age": 50
        }
      },
      "boost": 1.2
    }
  }
}

3.3- Range matching: range

range: range query

Keyword	Description
gte	greater than or equal to
lte	less than or equal to
gt	greater than
lt	less than

# Uses sample code 1
POST my_index/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 70
      }
    }
  }
}
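
To check by hand which sample ages a range clause should return, the four keywords can be mapped onto ordinary comparisons. This is only a local sketch of the semantics, not how Elasticsearch evaluates the query; in_range is a hypothetical helper name:

```python
import operator

# Map each range keyword onto the comparison it stands for.
RANGE_OPS = {
    "gte": operator.ge,
    "lte": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
}

def in_range(value, bounds):
    """Return True if value satisfies every bound, e.g. {"gte": 20, "lte": 70}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())

ages = [10, 20, 30, 50, 70, 80]  # ages from sample code 1
matched = [age for age in ages if in_range(age, {"gte": 20, "lte": 70})]
print(matched)  # [20, 30, 50, 70]
```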

3.4- exists

Equivalent to MySQL: select * from my_index where name is not null;

POST my_index/_search
{
  "query": {
  "exists": {
    "field": "name"
   }
  }
}

4- Full-text matching: match

# Uses sample code 1
# match: find documents whose about field contains the token name
GET my_index/_search
{
  "query": {
    "match": {
    "about": "name"
     }
  }
}

# match: find documents whose about field contains all of the tokens my, name, and wangwu
GET my_index/_search
{
  "profile": "true", 
  "query": {
    "match": {
      "about":{
        "query": "my name wangwu",
        "operator": "and"
      }
    }
  }
}
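
The effect of operator on a match query can be sketched locally. The sketch below assumes a simplified analyzer (lowercase, split on non-word characters), which is close enough to the default analyzer for this sample data but is not the real implementation; match_about is a hypothetical helper name:

```python
import re

# about values from sample code 1, keyed by document ID.
ABOUT = {
    1: "swift, elasticsearch",
    2: "my name zhangsan",
    3: "my name lisi",
    4: "my name wangwu",
    5: "my name zhaoliu",
    6: "wo jiao haha",
}

def tokens(text):
    """Simplified analysis: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())

def match_about(query, operator="or"):
    """Return IDs whose about field matches any ("or") or all ("and") query tokens."""
    wanted = tokens(query)
    combine = all if operator == "and" else any
    return [doc_id for doc_id, text in ABOUT.items()
            if combine(t in tokens(text) for t in wanted)]

print(match_about("my name wangwu"))          # [2, 3, 4, 5]
print(match_about("my name wangwu", "and"))   # [4]
```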

5- Phrase matching: match_phrase, slop

# Uses sample code 1
# match_phrase: find documents whose about field contains the phrase "wo jiao"
GET my_index/_search
{
  "profile": "true", 
  "query": {
    "match_phrase": {
      "about": {
        "query": "wo jiao"
      }
    }
  }
}

# To match "wo haha" even though one token sits between the two words, add slop
GET my_index/_search
{
  "profile": "true", 
  "query": {
    "match_phrase": {
      "about": {
        "query": "wo haha",
        "slop":1
      }
    }
  }
}

6- Multi-field queries: multi_match

Syntax

GET <index>/_search
{
  "query": {
    "multi_match": {
      "query": "<query keyword>",
      "type": "<multi_match_type>",
      "fields": [
        "<field_a>",
        "<field_b>"
      ]
    }
  }
}

Practice examples

# Uses sample code 1
# Find documents whose about or name contains wangwu

GET my_index/_search
{
  "profile": "true", 
  "query": {
   "multi_match": {
     "query": "wangwu",
     "fields": ["name","about"]
   }
  }
}

# To make one field count more toward the score, boost it; here about is boosted
GET my_index/_search
{
  "profile": "true", 
  "query": {
   "multi_match": {
     "query": "wangwu",
     "fields": ["name","about^2"]
   }
  }
}

6.1- best_fields

# Uses sample code 2
# best_fields (the multi_match default) finds documents that match any field, but takes the _score from the best-matching field
POST blogs/_search
{
  "query": {
    "multi_match": {
      "type": "best_fields",
      "query": "Quick pets",
      "fields": ["title","body"],
      "tie_breaker": 0.2
    }
  }
}

# The query above is equivalent to
POST blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Quick pets" }},
                { "match": { "body":  "Quick pets" }}
            ],
            "tie_breaker": 0.2
        }
    }
}

6.2- most_fields

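The most_fields type is most useful when querying multiple fields that contain the same text analyzed in different ways. For example, the main field may contain synonyms, stemming, and terms without diacritics; a second field may contain the original terms, and a third field may contain shingles. By combining the scores from all three fields, we can match as many documents as possible with the main field, while using the second and third fields to push the most similar results to the top of the list.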

POST blogs/_search
{
  "profile": "true", 
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "Quick pets",
      "fields": ["title","body"]
    }
  }
}

# The query above is rewritten as
POST blogs/_search
{
   "profile": "true", 
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "Quick pets"
          }
        },
         {
          "match": {
            "body": "Quick pets"
          }
        }
      ]
    }
  }
}

6.3- cross_fields

# Term-centric: each term may match in either field, i.e. (title:brown OR body:brown) OR (title:fox OR body:fox) with the default or operator
POST blogs/_search
{
  "profile": "true", 
  "query": {
    "multi_match": {
      "type": "cross_fields",
      "query": "brown fox",
      "fields": ["title","body"]
    }
  }
}

6.4- phrase

POST blogs/_search
{
  "profile": "true", 
  "query": {
    "multi_match": {
      "type": "phrase",
      "query": "Brown fox",
      "fields": ["title","body"]
    }
  }
}

# The query above is rewritten as
POST blogs/_search
{
  "profile": "true",
  "query": {
    "dis_max": {
      "queries": [
        {
          "match_phrase": {
            "title": "Brown fox"
          }
        },
         {
          "match_phrase": {
            "body": "Brown fox"
          }
        }
      ]
    }
  }
}

6.5- phrase_prefix

POST blogs/_search
{
  "profile": "true", 
  "query": {
    "multi_match": {
      "type": "phrase_prefix",
      "query": "Brown fox",
      "fields": ["title","body"]
    }
  }
}

# The query above is rewritten as
POST blogs/_search
{
  "profile": "true",
  "query": {
    "dis_max": {
      "queries": [
        {
          "match_phrase_prefix": {
            "title": "Brown fox"
          }
        },
         {
          "match_phrase_prefix": {
            "body": "Brown fox"
          }
        }
      ]
    }
  }
}

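6.6- Difference between cross_fields and most_fields (part one)

The official documentation lists two differences; I did not understand the other one: documentation link

# most_fields: with operator=and the query becomes
# (+body:brown +body:eats +body:pets) (+title:brown +title:eats +title:pets)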
POST blogs/_search
{
  "profile": "true", 
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "brown eats pets",
      "fields": ["title","body"],
      "operator": "and"
    }
  }
}

# cross_fields: with operator=and the query becomes
# +(body:brown | title:brown) +(body:eats | title:eats) +(body:pets | title:pets)
POST blogs/_search
{
  "profile": "true", 
  "query": {
    "multi_match": {
      "type": "cross_fields",
      "query": "brown eats pets",
      "fields": ["title","body"],
      "operator": "and"
    }
  }
}

7- match_phrase_prefix matching

# Similar to match_phrase, but the last token is treated as a prefix and expanded against the terms in the inverted index
GET my_index/_search
{
  "profile": "true", 
  "query": {
    "match_phrase_prefix": {
      "name":{
        "query": "w",
        "max_expansions": 10
      }
    }
  }
}

8- query_string matching

GET my_index/_search
{
  "profile": "true", 
  "query": {
    "query_string": {
      "default_field": "about",
      "query": "wo AND haha"
    }
  }
}

9- simple_query_string matching

POST my_index/_search
{ "profile": "true", 
  "query": {
    "simple_query_string": {
      "query": "wo haha",
      "fields": ["about"],
      "default_operator": "AND"
    }
  }
}

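9- wildcard matching

Symbol	Meaning
*	matches zero or more characters
?	matches exactly one character

# Similar to LIKE in MySQL
# This returns nothing: the pattern must match the entire about.keyword value, and no value is exactly "wo"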
GET /my_index/_search
{
  "profile": "true",
  "query": {
    "wildcard": {
      "about.keyword": {
        "value": "wo"
      }
    }
  }
}

# Adding a trailing * makes it match
GET /my_index/_search
{
  "profile": "true",
  "query": {
    "wildcard": {
      "about.keyword": {
        "value": "wo*"
      }
    }
  }
}
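
Why "wo" finds nothing while "wo*" does can be reproduced locally by translating the pattern into a regular expression that has to match the whole keyword value. A minimal sketch; wildcard_match is a hypothetical helper, and backslash escaping of * and ? is deliberately left out:

```python
import re

def wildcard_to_regex(pattern):
    """Translate * (zero or more chars) and ? (exactly one char) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, keyword_value):
    """A wildcard pattern must cover the entire keyword value, not a substring."""
    return wildcard_to_regex(pattern).fullmatch(keyword_value) is not None

value = "wo jiao haha"  # about.keyword of document 6
print(wildcard_match("wo", value))         # False
print(wildcard_match("wo*", value))        # True
print(wildcard_match("wo?jiao*", value))   # True
```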

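10- dis_max query

dis_max: returns documents that match one or more of the wrapped queries, scored by their best-matching clause

queries: (required, array of query objects) one or more query clauses. Returned documents must match at least one of these queries. If a document matches several of them, Elasticsearch uses the highest relevance score.
tie_breaker: (optional, float) a number between 0 and 1.0 used to raise the relevance score of documents that match multiple query clauses. Defaults to 0.0.
You can use tie_breaker to give a higher relevance score to documents that contain the same term in multiple fields.
If a document matches multiple clauses, the dis_max query computes its relevance score as follows:

Take the relevance score from the matching clause with the highest score.
Multiply the score of every other matching clause by the tie_breaker value.
Add the highest score to the multiplied scores.

If tie_breaker is greater than 0.0, all matching clauses count, but the clause with the highest score counts the most.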
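
The three scoring steps can be written out as a small function. This is a sketch of the arithmetic only; the clause scores below are made-up numbers, not real BM25 output, and dis_max_score is a hypothetical helper name:

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Combine per-clause scores the way the steps above describe.

    clause_scores holds one entry per clause; None means the clause did not match.
    Returns None when no clause matched (the document is not a hit).
    """
    matched = [s for s in clause_scores if s is not None]
    if not matched:
        return None
    best = max(matched)                # step 1: highest clause score
    others = sum(matched) - best       # all other matching clauses
    return best + tie_breaker * others # steps 2 and 3

# Hypothetical scores: title clause 1.0, body clause 0.5.
print(dis_max_score([1.0, 0.5]))                   # 1.0
print(dis_max_score([1.0, 0.5], tie_breaker=0.2))
print(dis_max_score([1.0, None], tie_breaker=0.2)) # 1.0
```

With tie_breaker at 0.0 a document that matches two fields scores no higher than one that matches only its best field; any positive tie_breaker lets the second field break the tie.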

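# Uses sample code 2
# Plain bool query: it returns both documents, but the body of document 2 clearly matches better and should rank first; dis_max achieves that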
POST /blogs/_search
{
    "query": {
        "bool": {
            "should": [
                { "match": { "title": "Brown fox" }},
                { "match": { "body":  "Brown fox" }}
            ]
        }
    }
}
# With dis_max, document 2 ranks first
POST /blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Brown fox" }},
                { "match": { "body":  "Brown fox" }}
            ]
        }
    }
}

# dis_max only takes the score of the single best-matching field; in the query below, document 2 should rank first
POST blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Quick pets" }},
                { "match": { "body":  "Quick pets" }}
            ]
        }
    }
}

# Use tie_breaker to fix this
POST blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Quick pets" }},
                { "match": { "body":  "Quick pets" }}
            ],
            "tie_breaker": 0.2
        }
    }
}

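11- bool query

must: the clause (query) must appear in matching documents and contributes to the score. (Results must match the condition, and a score is computed.)
filter: the clause (query) must appear in matching documents. Unlike must, however, the score of the query is ignored. Filter clauses run in filter context, which means scoring is skipped and the clauses are considered for caching. (Results must match the condition; unlike must, no score is computed.)
should: the clause (query) should appear in matching documents. (When there is no must or filter clause, results must match at least one should clause; minimum_should_match sets the minimum number of clauses to satisfy. A score is computed.)
must_not: the clause (query) must not appear in matching documents. The clause runs in filter context, which means scoring is skipped and the clause is considered for caching. Because scoring is ignored, a score of 0 is returned for all documents. (Results must not match the condition.)

# Uses sample code 3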
POST /products/_search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "price" : "30" }
      },
      "filter": {
        "term" : { "avaliable" : "true" }
      },
      "must_not" : {
        "range" : {
          "price" : { "lte" : 10 }
        }
      },
      "should" : [
        { "term" : { "productID.keyword" : "JODL-X-1937-#pV7" } },
        { "term" : { "productID.keyword" : "XHDK-A-1293-#fJ3" } }
      ],
      "minimum_should_match" :1
    }
  }
}

11.1- minimum_should_match

minimum_should_match sets the number of should clauses a document must match

# Uses sample code 3
# minimum_should_match = 1: at least one clause must match (any one is enough), so two documents are returned
# productID.keyword="JODL-X-1937-#pV7" or productID.keyword="XHDK-A-1293-#fJ3"

# minimum_should_match = 2: both clauses must match; no document equals both values, so nothing is returned
# productID.keyword="JODL-X-1937-#pV7" and productID.keyword="XHDK-A-1293-#fJ3"

POST /products/_search
{
  "query": {
    "bool" : {
      "should" : [
        { "term" : { "productID.keyword" : "JODL-X-1937-#pV7" } },
        { "term" : { "productID.keyword" : "XHDK-A-1293-#fJ3" } }
      ],
      "minimum_should_match" :1
    }
  }
}

# minimum_should_match = 2: both clauses must match; one document satisfies both
# productID.keyword="JODL-X-1937-#pV7" and price=30

POST /products/_search
{
  "query": {
    "bool" : {
      "should" : [
        { "term" : { "productID.keyword" : "JODL-X-1937-#pV7" } },
        { "term" : { "price" : 30 } }
      ],
      "minimum_should_match" :2
    }
  }
}


# Without minimum_should_match: if the bool query contains must or filter, documents are returned even when no should clause matches
POST /products/_search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "price" : "30" }
      },
      "should" : [
        { "term" : { "productID.keyword" : "1" } },
        { "term" : { "productID.keyword" : "2" } }
      ]
    }
  }
}
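
The matching side of these examples (ignoring scores) can be checked with a small local simulation over the sample code 3 data. It assumes the default that minimum_should_match is 1 when there is no must or filter clause and 0 otherwise; bool_query and the lambda predicates are hypothetical stand-ins for term clauses:

```python
PRODUCTS = [
    {"_id": 1, "price": 10, "productID": "XHDK-A-1293-#fJ3"},
    {"_id": 2, "price": 20, "productID": "KDKE-B-9947-#kL5"},
    {"_id": 3, "price": 30, "productID": "JODL-X-1937-#pV7"},
    {"_id": 4, "price": 30, "productID": "QQPX-R-3956-#aD8"},
]

def bool_query(docs, should, must=(), minimum_should_match=None):
    """Return IDs of docs passing every must clause and enough should clauses."""
    if minimum_should_match is None:
        # Assumed default: 1 without must/filter, otherwise 0.
        minimum_should_match = 0 if must else 1
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue
        if sum(1 for clause in should if clause(doc)) >= minimum_should_match:
            hits.append(doc["_id"])
    return hits

is_jodl = lambda d: d["productID"] == "JODL-X-1937-#pV7"
is_xhdk = lambda d: d["productID"] == "XHDK-A-1293-#fJ3"
price_30 = lambda d: d["price"] == 30
never = lambda d: False  # stands in for a should clause that matches nothing

print(bool_query(PRODUCTS, [is_jodl, is_xhdk], minimum_should_match=1))  # [1, 3]
print(bool_query(PRODUCTS, [is_jodl, is_xhdk], minimum_should_match=2))  # []
print(bool_query(PRODUCTS, [is_jodl, price_30], minimum_should_match=2)) # [3]
print(bool_query(PRODUCTS, [never, never], must=[price_30]))             # [3, 4]
```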

12- negative

negative: matching documents stay in the results but are ranked lower, i.e. their score is reduced

# Find documents containing apple
POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match":{"content":"apple"}
      }
    }
  }
}
# Find documents containing apple but exclude those containing pie
POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match":{"content":"apple"}
      },
      "must_not": {
        "match":{"content":"pie"}
      }
    }
  }
}
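# Find documents containing apple while de-emphasising pie: the pie document also contains apple and may still be relevant, so rank it lower instead of excluding it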
POST news/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "content": "apple"
        }
      },
      "negative": {
        "match": {
          "content": "pie"
        }
      },
      "negative_boost": 0.5
    }
  }
}
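
How negative_boost moves a document down the list can be sketched as follows. The positive score is a flat stand-in of 1.0 for every document containing the positive term, not a real relevance score, and boosting_rank is a hypothetical helper name:

```python
NEWS = {
    1: "Apple Mac",
    2: "Apple iPad",
    3: "Apple employee like Apple Pie and Apple Juice",
}

def boosting_rank(docs, positive, negative, negative_boost):
    """Rank docs containing `positive`; scale the score down if `negative` also matches."""
    ranked = []
    for doc_id, content in docs.items():
        words = content.lower().split()
        if positive not in words:
            continue  # only the positive query decides whether a doc is a hit
        score = 1.0   # stand-in positive score (assumption, not BM25)
        if negative in words:
            score *= negative_boost
        ranked.append((doc_id, score))
    # sorted() is stable, so equal scores keep their original order.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

print(boosting_rank(NEWS, "apple", "pie", 0.5))
# [(1, 1.0), (2, 1.0), (3, 0.5)]
```

Unlike must_not, the document that mentions pie is still returned; it just ends up at the bottom.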

13- Pagination & sorting: from, size, sort

POST my_index/_search
{
  "profile": "true", 
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 3,
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}
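
What from, size, and sort select can be mirrored locally: sort first, then skip from documents and take size of them. A minimal sketch over the ages in sample code 1; search_page is a hypothetical helper name:

```python
PEOPLE = [
    {"_id": 1, "age": 10}, {"_id": 2, "age": 20}, {"_id": 3, "age": 30},
    {"_id": 4, "age": 50}, {"_id": 5, "age": 70}, {"_id": 6, "age": 80},
]

def search_page(docs, from_, size, sort_field, order="desc"):
    """Sort by sort_field, then return the window [from_, from_ + size)."""
    ordered = sorted(docs, key=lambda d: d[sort_field], reverse=(order == "desc"))
    return ordered[from_:from_ + size]

first_page = search_page(PEOPLE, from_=0, size=3, sort_field="age")
second_page = search_page(PEOPLE, from_=3, size=3, sort_field="age")
print([d["age"] for d in first_page])   # [80, 70, 50]
print([d["age"] for d in second_page])  # [30, 20, 10]
```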

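14- That is all for this chapter; there is too much left to fit here, so the rest waits for the next one.

The next chapter covers aggregations and joins (similar to MySQL GROUP BY, MAX, and JOIN).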