Elasticsearch Skill Pack

Article link: https://blog.youkuaiyun.com/weixin_44685655/article/details/101107226

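Origins of ES

Problems with traditional databases that ES addresses:
- They cannot store massive data sets, for example at PB scale
- It is unclear where to store unstructured data
- Relevance-based matching queries are hard to express

Relationship to Lucene

ES is built on Lucene. Using Lucene directly requires custom development and integration work, plus an understanding of its internals. ES hides that complexity behind a RESTful API, which makes full-text search simple.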

Sample request body for creating an index (the equivalent of creating a table):

{
   "settings":{
      "number_of_shards":5
   },
   "mappings":{
      "userinfos":{
         "properties":{
            "username":{"type":"keyword"},
            "birthday":{"type":"date","format":"yyyy-MM-dd"},
            "say":{"type":"text"},
            "age":{"type":"integer"},
            "address":{
               "properties":{
                  "person":{"type":"keyword"},
                  "city":{"type":"keyword"}
               }
            }
         }
      }
   }
}
Of course, you can also send requests with a tool such as Postman.
Request format: POST http://192.168.10.196/demo/_search
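As an alternative to Postman, the same kind of request can be scripted. The sketch below only builds the HTTP request that would create an index with the sample mapping; it does not send anything. The base URL http://localhost:9200, the index name demo, and the helper name build_create_index_request are assumptions made for illustration.

```python
import json
import urllib.request

# Assumed base URL for illustration; point it at your own cluster.
BASE_URL = "http://localhost:9200"


def build_create_index_request(index, body, base_url=BASE_URL):
    """Build (but do not send) a PUT request that creates an index."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


index_body = {
    "settings": {"number_of_shards": 5},
    "mappings": {
        "userinfos": {  # mapping type name, copied from the sample
            "properties": {
                "username": {"type": "keyword"},
                "birthday": {"type": "date", "format": "yyyy-MM-dd"},
                "say": {"type": "text"},
                "age": {"type": "integer"},
                "address": {
                    "properties": {
                        "person": {"type": "keyword"},
                        "city": {"type": "keyword"},
                    }
                },
            }
        }
    },
}

req = build_create_index_request("demo", index_body)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the method, URL, and body before touching a real cluster.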

Other methods: PUT adds, GET reads, POST creates or submits, DELETE removes.

Bulk insert: bulk (_bulk)  						Multi-document get: mget (_mget)
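The bulk API expects a newline-delimited body: one action line followed by one source line per document, with a trailing newline at the end. A minimal sketch of assembling such a body, plus a body for mget, is shown here. The helper name build_bulk_body and the two sample documents are hypothetical.

```python
import json


def build_bulk_body(index, docs):
    """Return an NDJSON body for _bulk: an action line, then a source line, per doc."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample documents.
bulk_body = build_bulk_body("megacorp", [
    ("1", {"last_name": "Smith", "age": 25}),
    ("2", {"last_name": "Fir", "age": 35}),
])

# Body for an mget request against the same index, fetching both documents by id.
mget_body = {"ids": ["1", "2"]}
```

The bulk body would be sent to /_bulk and the mget body to /megacorp/_mget; on older versions that still use mapping types, the type may also appear in the path.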

GET /megacorp/employee/_search

Query everything, using a query expression: GET /_search

{
  "query": {
    "match_all": { }
  }
}

GET /megacorp/employee/_search?q=last_name:Smith
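When this query-string form is generated from code, the q parameter should be URL-encoded. A small sketch using only the standard library follows; the helper name uri_search_path is hypothetical.

```python
from urllib.parse import urlencode


def uri_search_path(index, doc_type, field, value):
    """Build the path and query string for a lightweight q= search."""
    query = urlencode({"q": f"{field}:{value}"})
    return f"/{index}/{doc_type}/_search?{query}"


path = uri_search_path("megacorp", "employee", "last_name", "Smith")
print(path)  # the colon is percent-encoded as %3A
```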

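Exact matching: on an exact-value field such as a number, match looks for exactly the given value

{ "match": { "age": 26 }}

multi_match runs the same match query on multiple fields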

     {
          "multi_match": {
              "query":    "full text search",
              "fields":   [ "title", "body" ]
          }
      }

range finds numbers or dates that fall within a specified interval

      {
          "range": {
              "age": {
                  "gte":  20,
                  "lt":   30
              }
          }
      }

  gt: greater than   gte: greater than or equal to   lt: less than   lte: less than or equal to
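To make the four operators concrete, here is a small sketch that builds a range clause and also evaluates the same bounds locally. Both helper names (range_query, in_range) are hypothetical; only the operator names come from the query above.

```python
def range_query(field, **bounds):
    """Build a range clause, accepting only gt / gte / lt / lte."""
    allowed = {"gt", "gte", "lt", "lte"}
    unknown = set(bounds) - allowed
    if unknown:
        raise ValueError(f"unsupported range operators: {sorted(unknown)}")
    return {"range": {field: bounds}}


def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Check a single value against the same bounds, in plain Python."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


clause = range_query("age", gte=20, lt=30)
print(clause)
print(in_range(20, gte=20, lt=30))  # lower bound is inclusive
print(in_range(30, gte=20, lt=30))  # upper bound is exclusive
```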

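The term query is used for exact-value matching. The exact values may be numbers, dates, booleans, or not_analyzed strings

{ "term": { "age": 26 }}

The terms query is the same as the term query, but it lets you specify multiple values to match. If the field contains any one of the specified values, the document matches: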

{ "terms": { "tag": [ "search", "full_text", "nosql" ] }}
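The matching rule for term and terms can be modelled in a few lines of plain Python. This is only a sketch of the semantics on exact values; it ignores analysis and the inverted index, and the function names and sample documents are hypothetical.

```python
def term_matches(doc, field, value):
    """True if the field holds exactly this value (or a list containing it)."""
    field_value = doc.get(field)
    values = field_value if isinstance(field_value, list) else [field_value]
    return value in values


def terms_matches(doc, field, candidates):
    """True if the field contains at least one of the candidate values."""
    return any(term_matches(doc, field, candidate) for candidate in candidates)


# Hypothetical sample documents.
doc_a = {"age": 26, "tag": ["search", "open_source"]}
doc_b = {"age": 40, "tag": ["sql"]}

wanted = ["search", "full_text", "nosql"]
print(term_matches(doc_a, "age", 26))
print(terms_matches(doc_a, "tag", wanted))
print(terms_matches(doc_b, "tag", wanted))
```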

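The exists and missing queries find documents in which the specified field has a value (exists) or has no value (missing)

These queries are often used to apply one condition when a field is present and a different one when it is absent.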
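A sketch of both clauses as Python dictionaries follows. Note that the standalone missing query is not available in Elasticsearch 5.0 and later; the usual replacement, shown here, wraps exists in a bool must_not. The helper names are hypothetical.

```python
def exists_query(field):
    """Match documents where the field has a value."""
    return {"exists": {"field": field}}


def missing_query(field):
    """Match documents where the field has no value (bool + must_not + exists)."""
    return {"bool": {"must_not": exists_query(field)}}


has_title = exists_query("title")
no_title = missing_query("title")
print(has_title)
print(no_title)
```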

Combined search

    {
        "query" : {
            "filtered" : {
                "filter" : {
                    "range" : {
                        "age" : { "gt" : 30 } <1>
                    }
                },
                "query" : {
                    "match" : {
                        "last_name" : "smith" <2>
                    }
                }
            }
        }
    }

<1> This part is a range filter. It finds all records with an age greater than 30 (gt is short for "greater than").

<2> This part is the same match query used earlier.
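The filtered query shown above belongs to older Elasticsearch releases; it was removed in 5.0. On newer versions the same combination is normally written as a bool query with a must clause and a filter clause. A minimal sketch, with a hypothetical helper name:

```python
import json


def filtered_as_bool(query, filter_clause):
    """Combine a scoring query with a non-scoring filter using bool."""
    return {"query": {"bool": {"must": query, "filter": filter_clause}}}


body = filtered_as_bool(
    query={"match": {"last_name": "smith"}},        # contributes to the score
    filter_clause={"range": {"age": {"gt": 30}}},   # only includes or excludes
)
print(json.dumps(body, indent=4))
```

The filter clause does not affect relevance scores, which matches the role the range filter plays in the original request.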

- Full-text match on the about field, with results ranked by relevance score
    GET /megacorp/employee/_search
    {
        "query" : {
            "match" : {
                "about" : "rock climbing"
            }
        }
    }

- match_phrase: phrase query

    GET /megacorp/employee/_search
    {
        "query" : {
            "match_phrase" : {
                "about" : "rock climbing"
            }
        }
    }
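The difference between the two queries can be illustrated with a toy model: match accepts a document containing any of the query terms, while match_phrase requires the terms to appear next to each other in order. The tokenizer here is a plain lowercase split, which is far simpler than a real analyzer, and the two sample sentences are hypothetical data.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(text, query):
    """True if the text contains at least one of the query terms."""
    tokens = set(tokenize(text))
    return any(term in tokens for term in tokenize(query))


def match_phrase(text, query):
    """True if the query terms occur consecutively and in order."""
    tokens = tokenize(text)
    phrase = tokenize(query)
    if not phrase:
        return False
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


# Hypothetical sample texts.
climber = "I love to go rock climbing"
collector = "I like to collect rock albums"

print(match(climber, "rock climbing"), match(collector, "rock climbing"))
print(match_phrase(climber, "rock climbing"), match_phrase(collector, "rock climbing"))
```

Both sentences satisfy match because each contains "rock", but only the first satisfies match_phrase.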

Highlighted search

    GET /megacorp/employee/_search
    {
        "query" : {
            "match_phrase" : {
                "about" : "rock climbing"
            }
        },
        "highlight": {
            "fields" : {
                "about" : {}
            }
        }
    }
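In the response, matching fragments come back with the matched terms wrapped in tags, <em> and </em> by default. The toy function below imitates that wrapping on a plain string so the effect is easy to see; it is not how Elasticsearch implements highlighting, and the function name and sample text are hypothetical.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap each whole-word occurrence of a query term in the given tags."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre}{m.group(0)}{post}", text)


fragment = highlight("I love to go rock climbing", "rock climbing")
print(fragment)
```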

Aggregation search

    GET /megacorp/employee/_search
    {
        "aggs" : {
            "all_interests" : {
                "terms" : { "field" : "interests" },
                "aggs" : {
                    "avg_age" : {
                        "avg" : { "field" : "age" }
                    }
                }
            }
        }
    }
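To show what this nested aggregation computes, the sketch below reproduces it over a small in-memory list: one bucket per distinct interest, each with a document count and the average age of its members. The function name and the three sample employees are hypothetical, and sorting ties by key is a choice made here for deterministic output.

```python
from collections import defaultdict


def terms_with_avg(docs, terms_field, avg_field):
    """Bucket docs by each value of terms_field and average avg_field per bucket."""
    values_by_key = defaultdict(list)
    for doc in docs:
        # set() so a document counts once per bucket even if a value repeats
        for key in set(doc.get(terms_field, [])):
            values_by_key[key].append(doc[avg_field])
    buckets = [
        {"key": key, "doc_count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in values_by_key.items()
    ]
    # Largest buckets first; ties broken by key.
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


# Hypothetical sample data.
employees = [
    {"age": 25, "interests": ["sports", "music"]},
    {"age": 32, "interests": ["music"]},
    {"age": 35, "interests": ["forestry"]},
]

buckets = terms_with_avg(employees, "interests", "age")
for bucket in buckets:
    print(bucket)
```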

For details, see the guide: https://es.xiaoleilu.com/

Filters (Logstash)

filter {
	grok {
		patterns_dir => ["./patterns"]
		match => {
			"message" => "%{USERUID}"
		}
	}
	grok {
		patterns_dir => ["./patterns"]
		match => {
			"message" => "%{EQUIPMENT}"
		}
	}
	grok {
		match => {
			"message" => "%{TIMESTAMP_ISO8601:time}\t%{DATA:request}\t%{WORD:method}\t%{NUMBER:status}\t%{IP:uip}\t(?:-|%{IP:sip})\t(?:-|%{URI:prepend})\t(?:-|(?<userAgent>[^\r]+))"
		}
	}
}

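Note: time, request, status and so on are field names
- Built-in patterns
  The grok plugin ships with ready-made patterns such as IP, NUMBER, DATA, WORD, URI, TIMESTAMP_ISO8601, and so on
- Custom patterns
  In a file under patterns_dir, write a pattern name followed by its expression, which may mix built-in patterns with named captures: PATTERN_NAME %{DATA}(?<field_name>the pattern here)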
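To see what the third grok block extracts, the same tab-separated layout can be approximated with a Python regular expression that uses named groups. The sub-expressions below are simplified stand-ins for grok's built-in patterns, not their exact definitions, and the sample log line is hypothetical.

```python
import re

# Simplified stand-ins for TIMESTAMP_ISO8601, DATA, WORD, NUMBER, IP and URI.
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"
LOG_LINE = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?)\t"
    r"(?P<request>.*?)\t"
    r"(?P<method>\w+)\t"
    r"(?P<status>\d+)\t"
    rf"(?P<uip>{IPV4})\t"
    rf"(?:-|(?P<sip>{IPV4}))\t"
    r"(?:-|(?P<prepend>\w+://\S+))\t"
    r"(?:-|(?P<userAgent>[^\r]+))"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None


# Hypothetical sample line; "-" marks an absent optional field.
sample = "2019-09-20T10:15:30+08:00\t/index.html\tGET\t200\t192.168.10.1\t-\t-\tMozilla/5.0"
fields = parse_line(sample)
print(fields)
```

As in the grok version, the optional fields (sip, prepend, userAgent) are left empty when the log contains a dash in that position.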