Note:Elasticsearch

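This article introduces the basic concepts and operations of Elasticsearch, covering index management, document create/read/update/delete, query syntax, and aggregations, and takes a closer look at index mappings and bulk indexing.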

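Concepts

This article does not cover Elasticsearch in depth or dwell on theory; it focuses on the points a newcomer usually wants to know first. For details, see Elasticsearch: The Definitive Guide.

localhost:9200/<index>/<type>/<id>
Index: roughly equivalent to a database
Document: roughly equivalent to a row; made up of multiple fields
Document type (type): one index can hold several kinds of objects, and the type distinguishes them; documents with different structures belong to different types, so a type is roughly equivalent to a table
Identifier (id): uniquely identifies a document within its index and type
Field: one part of a document, consisting of a name and a value
Term: a unit of search, representing one word from the text
Token: an occurrence of a term in a field's text, made up of the term text, its start and end offsets, and a type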
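
As a small illustration of the addressing scheme above, the following Python sketch assembles a document address from an index, a type, and an id. The helper name `doc_url` and the default base address are assumptions made for this example; they are not part of Elasticsearch.

```python
from urllib.parse import quote

def doc_url(index, doc_type, doc_id, base="http://localhost:9200"):
    """Build the REST address of one document: base/index/type/id."""
    # Percent-encode each path segment so unusual ids stay valid in a URL.
    parts = [quote(str(p), safe="") for p in (index, doc_type, doc_id)]
    return base + "/" + "/".join(parts)

print(doc_url("blog", "article", 1))
```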

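Working with data through the REST API

  • Get basic information
    curl -XGET http://localhost:9200/
  • Get information about all nodes in the cluster
    curl -XGET http://localhost:9200/_cluster/state/nodes/
  • Cluster health:
    curl -XGET 'http://localhost:9200/_cluster/health?pretty'
  • Shut down the cluster by sending a shutdown request to every node (this API exists in Elasticsearch 1.x and was removed in 2.0)
    curl -XPOST http://localhost:9200/_cluster/nodes/_shutdown
  • Shut down a single node
    curl -XPOST http://localhost:9200/_cluster/nodes/<node-id>/_shutdown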
Create, read, update, delete
  • Create a document
    curl -XPUT http://localhost:9200/blog/article/1 -d '{"title":"let me use ElasticSearch","content":"make a storage way","tags":["skill","storage","elasticSearch"]}'
  • Retrieve a document
    curl -XGET http://localhost:9200/blog/article/1
  • Update a document
    curl -XPOST http://localhost:9200/blog/article/1/_update -d '{"script":"ctx._source.content=\"new version\""}'
  • Upsert (update the document, or insert it if it does not exist)
    curl -XPOST http://localhost:9200/blog/article/1/_update -d '{"script":"ctx._source.counter+=1","upsert":{"counter":0}}'
  • Delete a document
    curl -XDELETE http://localhost:9200/blog/article/1
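
The same requests can be prepared from a script. The sketch below uses only the Python standard library and builds the requests without sending them; the helper name `es_request` and the `BASE` address are assumptions for illustration.

```python
import json
from urllib.request import Request

BASE = "http://localhost:9200"  # assumed local node, as in the curl examples

def es_request(method, path, body=None):
    """Prepare (but do not send) an HTTP request for Elasticsearch."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(BASE + path, data=data, method=method)

# The upsert from the curl example: increment counter, or start it at 0.
upsert = es_request("POST", "/blog/article/1/_update",
                    {"script": "ctx._source.counter+=1", "upsert": {"counter": 0}})
delete = es_request("DELETE", "/blog/article/1")

print(upsert.get_method(), upsert.full_url)
print(delete.get_method(), delete.full_url)
# Passing a prepared request to urllib.request.urlopen would actually send it.
```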
Searching with URI requests
  • Look up an index mapping
    curl -XGET 'http://localhost:9200/blog/_mapping?pretty'
  • Search requests are sent to the _search endpoint
    curl -XGET 'http://localhost:9200/blog/_search?pretty'
    curl -XGET 'http://localhost:9200/blog,index2/_search?pretty'
    curl -XGET 'http://localhost:9200/blog/article/_search?pretty'
    curl -XGET 'http://localhost:9200/_search?pretty'
  1. Query response
    curl -XGET 'localhost:9200/blog/_search?pretty&q=title:elastic'
  2. Query analysis
    curl -XGET 'localhost:9200/blog/_analyze?field=title' -d 'elasticsearch server'
  3. Query parameters
    curl -XGET 'localhost:9200/blog/_search?pretty&q=title:elasticsearch&df=title&explain=true&default_operator=AND&fields=title&sort=title:asc&size=2&from=2'
Parameters
q: the query, comparable to a SQL WHERE clause
df: the default field, used when a term in q does not name a field
analyzer: the analyzer to apply to the query
default_operator: the Boolean operator, OR by default
explain: adds an explanation of the scoring to each hit
fields: the fields to return
sort: how to sort the results
timeout: the search timeout
size, from: the result window (page size and offset)
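
When the query string is assembled by a program, the parameter values need URL encoding. The following sketch shows one way to do that with the standard library; `search_url` is a hypothetical helper, and because `from` is a reserved word in Python it is passed through a dictionary.

```python
from urllib.parse import urlencode, parse_qs

def search_url(index, **params):
    """Build a URI search address for the given index."""
    # urlencode escapes the values, e.g. the colon in "title:elasticsearch".
    return "http://localhost:9200/%s/_search?%s" % (index, urlencode(params))

url = search_url("blog", q="title:elasticsearch", default_operator="AND",
                 sort="title:asc", size=2, **{"from": 2})
print(url)
```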

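Indexing

Elasticsearch is schemaless in the sense that it can infer a document's structure from the data you PUT, but you can also define the structure yourself.
1. Changing automatic index creation
To turn off automatic index creation, set action.auto_create_index: false in elasticsearch.yml
2. The default number of shards and replicas can be changed
curl -XPUT localhost:9200/blog/ -d '{"settings":{"number_of_shards":1,"number_of_replicas":2}}'
3. Index structure mapping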

curl -XPUT localhost:9200/blog/ -d @blog.json
//blog.json
{
    "mappings":{
        "article":{
            "dynamic":"false",// do not add new fields automatically
            "_index":{
                "enabled":true
            },
            "_id":{
                "index":"not_analyzed",// index without analysis
                "store":"no"// do not store the value
                //"path":"article_id"
            },
            "_routing":{
                "required":true,
                "path":"userID"
            },
            "properties":{
                  "id":
                  {"type":"long",
                  "store":"yes",
                  "precision_step":0,
                  "postings_format":"pulsing"// postings format, speeds up lookups
                },
                "userID":{"type":"long","store":"yes"},
                "name":{"type":"string","store":"yes","index":"analyzed"},
                "published":{"type":"date","store":"yes","precision_step":0,"format":"yyyy-MM-dd"},
                "contents":{"type":"string","store":"no","index":"analyzed"},
                "allowed":{"type":"boolean","store":"yes"},
                "image":{"type":"binary"},
                "address":{"type":"ip","store":"yes"},
                "votes":{
                  "type":"integer",
                  "doc_values_format":"memory"// doc values configuration, for efficient sorting and faceting
                },
                "fields":{
                    "type":"string",
                    "field":{"type":"string","store":"yes"}
                }
            }
        },
        "user":{
            "properties":{
                "id":{"type":"long","store":"yes"},
                "name":{"type":"string","store":"yes","index":"analyzed"}
            }
        }

    }
}
// Search using the routing parameter
curl -XGET 'localhost:9200/blog/_search?routing=12,13&q=title:elasticsearch'
4. Bulk indexing
curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary @article.json
//article.json
{"index":{"_index":"blog","_type":"article","_id":1}}
{"title":"elasticsearch","content":"it is amazing"}
{"create":{"_index":"blog","_type":"article","_id":2}}
{"title":"elasticsearch","content":"it is interesting"}
{"create":{"_index":"blog","_type":"article","_id":2}}
{"title":"elasticsearch","content":"it is funny"}
{"delete":{"_index":"blog","_type":"article","_id":2}}
// The maximum size of a bulk request body is 100 MB by default; it can be changed in elasticsearch.yml
http.max_content_length: 200mb
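
The bulk file format is line oriented: an action line, then (except for delete) a source line, each terminated by a newline. As a rough sketch of how such a file could be generated, the hypothetical helper `bulk_body` below serializes a list of operations with the standard `json` module.

```python
import json

def bulk_body(operations):
    """Serialize (action, metadata, source) triples into the bulk format."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}))
        if source is not None:  # delete actions carry no source line
            lines.append(json.dumps(source))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"

body = bulk_body([
    ("index", {"_index": "blog", "_type": "article", "_id": 1},
     {"title": "elasticsearch", "content": "it is amazing"}),
    ("delete", {"_index": "blog", "_type": "article", "_id": 2}, None),
])
print(body, end="")
```

Writing `body` to article.json would give a file that can be posted with --data-binary as shown above.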

Queries

curl -XGET 'localhost:9200/blog/article/_search?pretty=true' -d '{
    "fields":["title","id","year"],// like SELECT id, title, ...
    "min_score":0.01,// Elasticsearch scores every document; this sets the minimum score a hit must reach
    "from":1,
    "size":10,
    "script_fields":{// compute returned values with a script
        "changeYear":{
            "script":"_source.year-paramYear",
            "params":{
                "paramYear":100
            }
        }
    },
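    "query":{// the clauses below are alternatives; a single request takes only one of them
        "ids":{// query by identifier
            "values":["1","2","3"]
        },
        "prefix":{// prefix query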
            "title":{
                "value":"elastic"
            }
        },
        "query_string" : {
            "query":"title:elasticsearch"
        }
    },
    "sort":{
        "year":"desc"
    }
}'

// See how the query will be executed (which shards it is routed to)
curl -XGET 'localhost:9200/blog/article/_search_shards?pretty=true' -d '{
    "query":{"match_all":{}}
}'
// Validate a query
curl -XGET 'localhost:9200/blog/article/_validate/query?pretty&explain' -d '{
    "query":{"match_all":{}}
}'
  • Term query (the query text is not analyzed)
{
    "query":{
        "term":{
            "title":"elasticsearch"
        }
    }
}
{
    "query":{
        "term":{
            "title":{
                "value":"elasticsearch",
                "boost":10.0// boost, changes how much this term matters
            }
        }
    }
}
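// Multiple terms
{
    "query":{
        "terms":{
            "title":["elasticsearch","stored"],
            "minimum_should_match":2 // at least two terms must match
        }
    }
}
  • Common terms query: scores high-frequency and low-frequency terms separately for better performance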
{
    "query":{
        "common":{
            "title":{
                "query":"elasticsearch and store",// "and" is a high-frequency term
                "cutoff_frequency":0.001
            }
        }
    }
}
  • match
// match_all returns every document in the index
{
    "query":{
        "match_all":{}
    }
}
// match: unlike term, the query text is analyzed first
{
    "query":{
        "match":{
            "title":{
                "query":"elasticsearch and store",// analyzed into three terms that are matched against documents
                "operator":"and"// default is or
            }
        }
    }
}
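
The difference between term and match can be sketched in a few lines of Python. This is a toy model, not how Elasticsearch is implemented: the `analyze` function below (lowercase, split on whitespace) is an assumption standing in for a real analyzer.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (an assumption)."""
    return text.lower().split()

def term_match(indexed_terms, value):
    # term: the value is compared as-is with the indexed terms
    return value in indexed_terms

def match_query(indexed_terms, text, operator="or"):
    # match: the text is analyzed first, then its terms are combined
    hits = [t in indexed_terms for t in analyze(text)]
    return all(hits) if operator == "and" else any(hits)

indexed = set(analyze("Let me use ElasticSearch"))
print(term_match(indexed, "ElasticSearch"))                # False
print(match_query(indexed, "ElasticSearch"))               # True
print(match_query(indexed, "ElasticSearch store", "and"))  # False
```

The term lookup fails because the indexed term is the lowercased "elasticsearch", while the match query succeeds because its input goes through the same analysis.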
// match_phrase builds a phrase from the analyzed terms
{
    "query":{
        "match_phrase":{
            "title":{
                "query":"elasticsearch store",
                "slop":1 // allows one unknown term, such as "and", between elasticsearch and store
            }
        }
    }
}
// match_phrase_prefix works like match_phrase, but the last term is matched as a prefix
{
    "query":{
        "match_phrase_prefix":{
            "title":{
                "query":"elasticsearch st",
                "slop":1,
                "max_expansions":20
            }
        }
    }
}
// multi_match searches several fields
{
    "query":{
        "multi_match":{
            "query":"elasticsearch and store",
            "fields":["title","content"]
        }
    }
}
// query_string supports the full Apache Lucene query syntax
{
    "query":{
        "query_string":{
            "query":"title:elasticsearch -content:store"
        }
    }
}
  • Fuzzy queries
{
    "query":{
        "fuzzy":{// CPU-intensive
            "title":"elasearch"
        }
    }
}
{
    "query":{
        "fuzzy_like_this_field":{
            "title":{
                "like_text":"elasticsearch and "
            }
        }
    }
}
{
    "query":{
        "fuzzy_like_this":{
            "fields":["title","content"],
            "like_text":"elasticsearch and ",
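            "min_similarity":0.7// similarity threshold, default 0.5
        }
    }
}
  • Wildcard query: ? matches a single character, * matches any number of characters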
{
    "query":{
        "wildcard":{
            "title":"e?as*search"
        }
    }
}
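
To get a feel for which terms a pattern such as e?as*search accepts, the pattern can be tried out locally. The sketch below uses Python's `fnmatchcase`, which gives ? and * the same meaning; note that `fnmatch` also treats [...] specially, which is outside what this example relies on. The sample terms are made up.

```python
from fnmatch import fnmatchcase

terms = ["elasticsearch", "elasearch", "search"]  # made-up indexed terms
# ? stands for exactly one character, * for any run of characters.
matches = [t for t in terms if fnmatchcase(t, "e?as*search")]
print(matches)
```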
  • Regular expressions
{
    "query":{
        "regexp":{
            "title":{
                "value":"el.sear[abc]h"
            }
        }
    }
}
  • Ranges
{
    "query":{
        "range":{
            "year":{
                "gte":2012,
                "lte":2017
            }
        }
    }
}

Compound queries

  • Boolean query
{
    "query":{
        "bool":{
            "must":{
                "term":{"title":"elasticsearch"}
            },
            "should":{
                "range":{
                    "year":{"from":2010,"to":2017} }
            },
            "must_not":{
                "term":{"content":"struct"}
            }
        }
    }
}
  • Boosting query
{
    "query":{
        "boosting":{
            "positive":{
                "term":{"title":"elasticsearch"}
            },
            "negative":{// documents matching this have their score multiplied by negative_boost (0.333)
                "term":{"content":"struct"}
            },
            "negative_boost":0.333
        }
    }
}
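
The effect of negative_boost is multiplicative: a document that matches the negative query is demoted rather than excluded. A simplified sketch of that arithmetic, with a hypothetical function name and a made-up base score:

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Score of a hit under a boosting query (simplified sketch)."""
    # Documents that also match the negative query are demoted, not removed.
    if matches_negative:
        return positive_score * negative_boost
    return positive_score

print(boosting_score(2.0, False, 0.333))
print(boosting_score(2.0, True, 0.333))
```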
  • Constant score: wraps a query or a filter and gives every matching document the same score
{
    "query":{
        "constant_score":{
            "query":{
                "term":{"title":"elasticsearch"}
            },
            "boost":3.0
        }
    }
}
  • Indices query
{
    "query":{
        "indices":{
            "indices":["blog"],
            "query":{// runs against the listed indices
                "term":{"title":"elasticsearch"}
            },
            "no_match_query":{// runs against all other indices
                "term":{"title":"store"}
            }
        }
    }
}

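Filtering query results
The queries above all compute scores, which makes searching more complex and costs CPU. Filters do not affect the score. A filter is applied to the whole content of the index; its result is independent of the documents the query found and of any relationships between documents, which also makes filters easy to cache.

// The filter runs after the matching documents have been found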
{
    "query":{
        "term":{"title":"elasticsearch"}
    },
    "post_filter":{
        "term":{"year":2017}
    }
}
// The filter runs before documents are matched, which is more efficient
{
    "query":{
        "filtered":{
            "query":{
                "term":{"title":"elasticsearch"}
            },
            "filter":{
                "term":{"year":2017}
            }
        }
    }
}
  • Range filter
{
    "post_filter":{
        "range":{
            "year":{"gte":2010,"lte":2017}
        }
    }
}
  • exists
{
    "post_filter":{
        "exists":{"field":"title"}// filters out documents that have no value in this field
    }
}
  • missing, the counterpart of exists
{
    "post_filter":{
        "missing":{
            "field":"year",
            "null_value":0,// an additional value to treat as missing
            "existence":true
        }
    }
}
  • script
{
    "post_filter":{
        "script":{
            "_cache":true,// cache the filter result
            "script":"now-doc['year'].value>100",
            "params":{"now":2017}
        }
    }
}
  • type
{
    "post_filter":{
        "type":{"value":"article"}
    }
}
  • limit
{
    "post_filter":{
        "limit":{"value":1}// limits the number of documents returned per shard
    }
}
  • id
{
    "post_filter":{
        "ids":{
            //"type":["article"],
            "values":[1]
        }
    }
}
  • and, or, not: combining filters; and and or take an array of filters
{
    "post_filter":{
        "not":{
            "and":[
                {
                    "range":{
                        "year":{"gte":2010,"lte":2017}
                    }
                },
                {
                    "or":[
                        {"term":{"title":"elasticsearch"}},
                        {"term":{"title":"store"}}
                    ]
                }
            ]
        }
    }
}
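
Deeply nested filter bodies are easy to get wrong by hand, because every array element has to be its own object. One option is to build the body from small functions and let a JSON serializer produce the text. The helper names below (`term`, `and_`, `or_`, `not_`) are hypothetical and only wrap plain dictionaries.

```python
import json

def term(field, value):
    return {"term": {field: value}}

def and_(*filters):
    return {"and": list(filters)}

def or_(*filters):
    return {"or": list(filters)}

def not_(flt):
    return {"not": flt}

body = {
    "post_filter": not_(and_(
        {"range": {"year": {"gte": 2010, "lte": 2017}}},
        or_(term("title", "elasticsearch"), term("title", "store")),
    ))
}
# json.dumps always emits well-formed JSON, so brace mistakes cannot slip in.
print(json.dumps(body, indent=4))
```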

Highlighting

{
    "query":{
        "term":{"title":"elasticsearch"}
    },
    "highlight":{// global settings
        "pre_tags":["<b>"],// default <em>
        "post_tags":["</b>"],// default </em>
        "fields":{
            "title":{}
        }
    }
}
{
    "query":{
        "term":{"title":"elasticsearch"}
    },
    "highlight":{
        "require_field_match":true,
        "fields":{
            "title":{// per-field settings
                "pre_tags":["<b>"],// default <em>
                "post_tags":["</b>"]// default </em>
            },
            "content":{// per-field settings; if require_field_match is false, matches in content are highlighted too
                "pre_tags":["<b>"],// default <em>
                "post_tags":["</b>"]// default </em>
            }
        }
    }
}

Extending the index structure

// A non-flat (hierarchical) document
{
    "article":{
        "author":{
            "name":{
                "firstName":"lig",
                "lastName":"bee"
            }
        },
        "isbn":"1424123131",
        "year":2017,
        "tags":[
            {"headline":"elasticsearch"},
            {"headline":"store"}
        ],
        "copies":1
    }
}
// Mapping for this structure
{
    "article":{
        "properties":{
            "author":{
                "type":"object",// object type
                "properties":{
                    "name":{
                        "type":"object",
                        "properties":{
                            "firstName":{"type":"string","store":"yes"},
                            "lastName":{"type":"string","store":"yes"}
                        }
                    }
                }
            },
            "isbn":{"type":"string","store":"yes"},
            "year":{"type":"integer","store":"yes"},
            "tags":{
                "properties":{
                    "headline":{"type":"string","store":"yes"}
                }
            },
            "copies":{"type":"integer","store":"yes"}
        }
    }
}
  • Nested objects
{
    "name":"t-shirt",
    "kinds":[
        {"size":"M","color":"black"},
        {"size":"XXL","color":"white"}
    ]
}
// Mapping
{
    "cloth":{
        "properties":{
            "name":{"type":"string","store":"yes"},
            "kinds":{
                "type":"nested",
                "properties":{
                    "size":{"type":"string","store":"yes"},
                    "color":{"type":"string","store":"yes"}
                }
            }
        }
    }
}
// search
curl -XGET 'localhost:9200/shop/cloth/_search?pretty=true' -d '
{
    "query":{
        "nested":{
            "path":"kinds",// the nested object to search in
            "query":{
                "bool":{
                    "must":[
                        {"term":{
                            "kinds.size":"M"
                        }},
                        {"term":{
                            "kinds.color":"white"
                        }}
                    ]
                  }
            }
        }
    }
}
'
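
Why the nested type matters can be shown with the t-shirt document itself. The sketch below is a toy model in plain Python, not Elasticsearch code: it contrasts matching against flattened value lists with matching inside each object, for the combination size M and color white.

```python
doc = {"name": "t-shirt",
       "kinds": [{"size": "M", "color": "black"},
                 {"size": "XXL", "color": "white"}]}

def flattened_match(doc, size, color):
    # Without "nested", the array is flattened into independent value lists,
    # so size and color may come from different objects.
    sizes = {k["size"] for k in doc["kinds"]}
    colors = {k["color"] for k in doc["kinds"]}
    return size in sizes and color in colors

def nested_match(doc, size, color):
    # With "nested", both conditions must hold within the same object.
    return any(k["size"] == size and k["color"] == color for k in doc["kinds"])

print(flattened_match(doc, "M", "white"))  # True
print(nested_match(doc, "M", "white"))     # False
print(nested_match(doc, "M", "black"))     # True
```

There is no white t-shirt in size M, so only the nested behaviour gives the expected answer.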

Aggregations

Metric aggregations: take a set of documents and produce at least one statistic

  • min, max, sum, avg, value_count
{
    "aggs":{
        "min_year":{
            "min":{
                "field":"year"
            }
        }
    }
}

//script
{
    "aggs":{
        "min_year":{
            "min":{
                "field":"year",
                "script":"_value-100"
            }
        }
    }
}
  • stats: returns all of the metrics above (min, max, sum, avg, count) in one aggregation
{
    "aggs":{
        "all_agg":{
            "stats":{
                "field":"year"
            }
        }
    }
}
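
For reference, these are the figures that Elasticsearch's stats aggregation reports for a numeric field, computed here in plain Python over made-up values. The function is a local illustration of the arithmetic only.

```python
def stats(values):
    """Compute count, min, max, avg and sum for a list of numbers."""
    count = len(values)
    total = sum(values)
    return {
        "count": count,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "avg": total / count if count else None,
        "sum": total,
    }

years = [2010, 2012, 2017, 2017]  # made-up sample values
print(stats(years))
```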
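  • extended_stats: adds further statistics such as sum of squares, variance, and standard deviation

Bucket aggregations: group documents into subsets and return a count for each (like GROUP BY + COUNT)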

  • terms
{
    "aggs":{
        "all_term":{
            "terms":{
                "field":"year",
                "order":{"_term":"desc"}
            }
        }
    }
}
  • range
{
    "aggs":{
        "all_years":{
            "range":{
                "field":"year",
                "ranges":[
                    {"to":2000},
                    {"from":2001,"to":2011},
                    {"from":2012,"to":2017}
                ]
            }
        }
    }
}
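
In a range aggregation each bucket includes its from value and excludes its to value. The toy function below, with made-up sample years, applies that rule to the ranges used above and shows that values such as 2000 or 2011 fall between the buckets as they are defined.

```python
def range_buckets(values, ranges):
    """Count values per range; "from" is inclusive, "to" is exclusive."""
    counts = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        n = sum(1 for v in values
                if (lo is None or v >= lo) and (hi is None or v < hi))
        counts.append(n)
    return counts

ranges = [{"to": 2000}, {"from": 2001, "to": 2011}, {"from": 2012, "to": 2017}]
# 2000, 2011 and 2017 are not covered by any of the three ranges.
print(range_buckets([1999, 2000, 2005, 2011, 2017], ranges))
```

If every year should land in a bucket, the boundaries need to touch, for example "to":2001 followed by "from":2001.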
  • date_range: for date fields
{
    "aggs":{
        "all_date":{
            "date_range":{
                "field":"published",
                "format":"yyyy/MM/dd",
                "ranges":[
                    {"to":"2000/01/01"},
                    {"from":"2001/01/02","to":"2011/12/31"},
                    {"from":"2012/01/01","to":"2017/01/01"}
                ]
            }
        }
    }
}

{
    "aggs":{
        "all_date":{
            "date_range":{
                "field":"published",
                "format":"yyyy/MM/dd",
                "ranges":[
                    {"to":"2000/01/01"},
                    {"from":"now","to":"now+1y"}
                ]
            }
        }
    }
}
  • histogram: buckets of a fixed interval
{
    "aggs":{
        "years":{
            "histogram":{
                "field":"year",
                "interval":4
            }
        }
    }
}
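
A histogram assigns each value to the bucket whose key is the value rounded down to a multiple of the interval. A short Python sketch of that rule, using made-up years and the interval of 4 from the request above:

```python
from collections import Counter

def histogram(values, interval):
    """Group values into fixed-width buckets keyed by the bucket's lower bound."""
    # Floor division rounds each value down to a multiple of the interval.
    counts = Counter((v // interval) * interval for v in values)
    return dict(sorted(counts.items()))

years = [2010, 2012, 2013, 2017]  # made-up sample values
print(histogram(years, 4))
```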
  • date_histogram
{
    "aggs":{
        "publish":{
            "date_histogram":{
                "field":"published",
                "format":"yyyy-MM-dd HH:mm",
                "interval":"31d"
            }
        }
    }
}
  • ipv4
{
    "aggs":{
        "ip_access":{
            "ip_range":{
                "field":"ip",
                "ranges":[
                    {"from":"192.168.0.1","to":"192.168.0.254"},
                    {"mask":"192.168.0.0/24"}
                ]
            }
        }
    }
}
  • Nested
{
    "aggs":{
        "nested-agg":{
            "nested":{
                "path":"kinds"
            },
            "aggs":{
                "sizes":{
                    "terms":{ "field":"kinds.size" } }
            }
        }
    }
}

{
    "aggs":{
        "all_years":{
            "range":{
                "field":"year",
                "ranges":[
                    {"to":2000},
                    {"from":2001,"to":2011},
                    {"from":2012,"to":2017}
                ]
            },
            "aggs":{
                "status_all":{
                    "stats":{} }
            }
        }
    }
}
  • Bucket ordering and nested aggregations
{
    "aggs":{
        "all_term":{
            "terms":{
                "field":"copies",
                "order":{"defindNum.avg":"desc"}
            },
            "aggs":{
                "defindNum":{
                    "stats":{} }
            }
        }
    }
}