Basics
Check whether an index exists: es.indices.exists(index=name)
Delete an index: es.indices.delete(index=name, ignore=[400, 404])
Create an index: es.indices.create(index=name, body=wsku_index_body)
Delete the documents that match a query:
query = {'query': {'match': {'level': 'warning'}}}
es.delete_by_query(index='api_log', body=query)
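The index calls above are often combined into a "drop and recreate" step. The following is a minimal sketch, assuming `es` is an elasticsearch-py client that accepts the same keyword arguments as the calls above; `recreate_index` is a hypothetical helper name, not part of the library.

```python
def recreate_index(es, name, body):
    """Drop the index `name` if it exists, then create it again with `body`.

    `es` is assumed to be an elasticsearch-py client (hypothetical helper).
    """
    if es.indices.exists(index=name):
        # ignore=[400, 404] keeps the call from raising if the index is already gone
        es.indices.delete(index=name, ignore=[400, 404])
    return es.indices.create(index=name, body=body)
```

A call would look like `recreate_index(es, name, wsku_index_body)`. Note that this deletes all documents in the index, so it is only suitable for test data or for indices that can be rebuilt.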
Basic search: GET /api_log/_search?q=critical
GET /api_log/_search
{
"query": {
"bool": {
"must": [{
"match_phrase": {
"url": {
"query": "/api/v1.0/reseller/sms/active"
}
}
},
{
"range": {
"now": {
"gte": "2019-11-23"
}
}
}
]
}
}
}
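The same request body can be built from Python so that the URL and start date become parameters. This is only a sketch; `url_since_query` is a hypothetical helper, and the field names `url` and `now` are taken from the request above.

```python
import json


def url_since_query(url, since, date_field="now"):
    """Build a bool query: exact phrase on `url` AND `date_field` >= `since`."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"url": {"query": url}}},
                    {"range": {date_field: {"gte": since}}},
                ]
            }
        }
    }


body = url_since_query("/api/v1.0/reseller/sms/active", "2019-11-23")
print(json.dumps(body, indent=2))
```

With an elasticsearch-py client the body could then be passed as `es.search(index='api_log', body=body)`.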
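Boost the relevance of a field by a factor of three (here the summary field):
"fields": ["title", "summary^3"]
must is equivalent to AND, should to OR, and must_not to AND NOT. When a bool query also contains a must clause, the should clauses are optional and only affect scoring unless minimum_should_match is set.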
"bool": {
"should": [
{ "match": { "request": "order_id" }},
{ "match": { "request": "order_ids" }}
],
"must":
{ "match": { "request": "1" }
}
}
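If at least one of the should clauses has to match in addition to the must clause, `minimum_should_match` can be set. A minimal sketch follows; `bool_query` is a hypothetical helper that only assembles the dictionary.

```python
def bool_query(must=(), should=(), must_not=(), minimum_should_match=None):
    """Assemble a bool clause, leaving out the parts that are empty."""
    clause = {}
    if must:
        clause["must"] = list(must)
    if should:
        clause["should"] = list(should)
    if must_not:
        clause["must_not"] = list(must_not)
    if minimum_should_match is not None:
        clause["minimum_should_match"] = minimum_should_match
    return {"bool": clause}


# request must contain "1" AND (order_id OR order_ids)
q = bool_query(
    must=[{"match": {"request": "1"}}],
    should=[
        {"match": {"request": "order_id"}},
        {"match": {"request": "order_ids"}},
    ],
    minimum_should_match=1,
)
```

Without `minimum_should_match`, documents that match only the must clause are still returned; the should clauses then just raise the score of documents that match them.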
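"fuzziness": "AUTO" adds fuzzy matching that tolerates spelling mistakes, which is useful for product search. Regular expression operators such as ., * and [a-z] are available in regexp queries.
"type": "phrase" requires every term of the query string to be present in the document, in the same order as in the query string and adjacent to each other. "slop" controls how far apart the terms may be. Use it for exact lookups.
"query": "(saerch~1 algorithm~1) AND (grant ingersoll) OR (tom morton)" searches for several strings at once; ~1 runs a fuzzy match with an edit distance of 1, so the misspelled saerch still matches search.
simple_query_string replaces AND / OR / NOT with + / | / -, which can be simpler when users type into a search box.
"fields": ["_all", "summary^2"] boosts the summary field.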
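The options above can be combined into full request bodies. The sketch below assumes they are used inside a `multi_match` query; the helper names and the sample field names are hypothetical. Fuzziness and the phrase type are kept in separate helpers because `fuzziness` cannot be used together with `"type": "phrase"`.

```python
def fuzzy_multi_match(text, boosts, fuzziness="AUTO"):
    """multi_match over several fields; `boosts` maps field name -> boost factor."""
    fields = [name if boost == 1 else f"{name}^{boost}" for name, boost in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields, "fuzziness": fuzziness}}}


def phrase_multi_match(text, fields, slop=0):
    """Phrase search: terms must appear in order, at most `slop` positions apart."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields),
                                      "type": "phrase", "slop": slop}}}


fuzzy_body = fuzzy_multi_match("comprihensiv guide", {"title": 1, "summary": 3})
phrase_body = phrase_multi_match("search engine", ["title", "summary"], slop=3)
```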
Term query and terms (multi-term) query
"query": { "term" : { "publisher": "manning" } }
Sorting
"sort": [ { "publish_date": {"order":"desc"}}, { "title": { "order": "desc" }} ]
Range
"query": { "range" : { "publish_date": { "gte": "2015-01-01", "lte": "2015-12-31" } } }
Filter
"filter": { "range" : { "num_reviews": { "gte": 20 } } }
Query that restricts the returned fields with _source
POST /pro_product/_search
{
"_source": ["wcate_id"],
"query": {
"bool": {
"must": [{"terms": {"wcate_id": [6239,6240,6241,6242,6243,6244,6245,6246,6247,6248,6249]}}]
}
},
"sort": [
{"priority": "asc"},{"wcate_id": "desc"}
],
"size":20,
"search_after":[1,12802]
}
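The `search_after` value in the request above is the `sort` array of the last hit on the previous page. A sketch of a pagination loop built on that idea follows; `iter_hits` is a hypothetical helper, and `search` is any callable that takes a request body and returns the Elasticsearch response as a dict.

```python
def iter_hits(search, body, page_size=20):
    """Yield every hit, page by page, following search_after.

    `search` is assumed to be a callable such as
    lambda b: es.search(index="pro_product", body=b).
    """
    page = dict(body, size=page_size)
    page.pop("search_after", None)  # start from the first page
    while True:
        hits = search(page)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Each hit carries the sort values it was ordered by
        page["search_after"] = hits[-1]["sort"]
```

The sort should end with a field that is unique per document; if several documents share the same sort values, some of them may be skipped at a page boundary.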
OR / AND within a field (query string syntax)
title:(quick OR brown)
Exact phrase match
author:"John Smith"
Match when any of the fields book.title, book.content or book.date contains quick or brown:
book.\*:(quick brown)
Field has a non-null value:
_exists_:title
Single-character wildcard ? and multi-character wildcard *
qu?ck bro*
Regular expression patterns can be embedded in the query string by wrapping them in forward-slashes ("/"):
name:/joh?n(ath[oa]n)/
Ranges:
date:[2012-01-01 TO 2012-12-31]
count:[10 TO *]
age:(>=10 AND <20)
age:(+>=10 +<20)
The + operator marks a term that must be present and the - operator a term that must not be present.
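Because the characters used by this syntax are reserved, text that comes from a user-facing search box usually has to be escaped before it is placed in a query string. The sketch below follows the reserved-character list from the Elasticsearch query string documentation; `escape_query_string` is a hypothetical helper name. `<` and `>` cannot be escaped, so they are removed.

```python
import re

# Reserved in query_string syntax: + - = && || ! ( ) { } [ ] ^ " ~ * ? : \ /
_RESERVED = re.compile(r'(&&|\|\||[+\-=!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    """Backslash-escape reserved characters; drop < and >, which cannot be escaped."""
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)


print(escape_query_string("/api/v1.0/reseller/sms/active"))
```

The escaped text is treated as literal terms, so operators typed by the user (for example `title:(quick OR brown)`) lose their special meaning.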
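Steps to change a field type in Elasticsearch
text: the value is analyzed (tokenized) and then indexed; aggregations are not supported by default
supports full-text and fuzzy queries; an exact (term) query is compared against the individual tokens, not the original string
keyword: the value is not analyzed and is indexed as a single term; aggregations are supported
supports fuzzy and exact queries
1. Retrieve the structure of the existing index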
GET /pro_api_log
2. Create the new index
PUT /pro_api_log2
{
"settings": {
"index": {
"max_result_window": 30000,
"analysis": {
"analyzer": {
"custom_standard": {
"type": "custom",
"tokenizer": "standard",
"char_filter": ["filter_char_filter"],
"filter": "lowercase"
}
},
"char_filter": {
"filter_char_filter": {
"type": "mapping",
"mappings": [
"· => xxDOT1xx",
"+ => xxPLUSxx",
"- => xxMINUSxx",
"\" => xxQUOTATIONxx",
"( => xxLEFTBRACKET1xx",
") => xxRIGHTBRACKET1xx",
"& => xxANDxx",
"| => xxVERTICALxx",
"—=> xxUNDERLINExx",
"/=> xxSLASHxx",
"!=> xxEXCLAxx",
"•=> xxDOT2xx",
"【=>xxLEFTBRACKET2xx",
"】 => xxRIGHTBRACKET2xx",
"`=>xxapostrophexx",
".=>xxDOT3xx",
"#=>xxhashtagxx",
",=>xxcommaxx"
]
}
}
},
"number_of_shards": 3,
"number_of_replicas": 1
}
},
"mappings": {
"wemore": {
"properties": {
"now": { "type": "date", "index": true },
"ip": { "type": "keyword", "index": true },
"name": { "type": "keyword", "index": true },
"request": { "type": "text", "index": true },
"response": { "type": "text", "index": true },
"level": { "type": "keyword", "index": true },
"url": { "type": "keyword", "index": true },
"method": { "type": "keyword", "index": true },
"exception": { "type": "text", "index": true }
}
}
}
}
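The `mapping` char filter in the settings above replaces punctuation with letter-only placeholders before tokenization, presumably so that the standard tokenizer does not discard those characters. The Python sketch below only simulates that replacement step for single-character keys; it is not how Elasticsearch implements it, and it assumes whitespace around `=>` is trimmed. Both function names are hypothetical.

```python
def parse_rules(rules):
    """Turn 'key => value' strings into a dict, trimming whitespace on both sides."""
    parsed = {}
    for rule in rules:
        key, value = rule.split("=>", 1)
        parsed[key.strip()] = value.strip()
    return parsed


def apply_char_filter(text, rules):
    """Replace each character that has a rule; works for single-character keys only."""
    table = parse_rules(rules)
    return "".join(table.get(ch, ch) for ch in text)


rules = ["+ => xxPLUSxx", "/=> xxSLASHxx", "#=>xxhashtagxx"]
print(apply_char_filter("a+b", rules))  # axxPLUSxxb
```

After the replacement, `a+b` is one alphanumeric run, and the `lowercase` token filter in the analyzer then turns it into `axxplusxxb`. The same analyzer has to be applied at search time so that queries go through the same substitution.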
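3. Copy the old data into the new index. This can take a long time and the HTTP request may time out, but you can still query the new index and watch the document count grow. With "version_type": "external" the version from the source is preserved: missing documents are created, and documents whose version in the destination is older than in the source are overwritten.
POST _reindex
{
"source": {
"index": "pro_api_log"
},
"dest": {
"index": "pro_api_log2",
"version_type": "external"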
}
}
4. Check the reindex progress; watch total, updated and created
GET _tasks?detailed=true&actions=*reindex
5. Delete the old index
DELETE /pro_api_log
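6. Recreate the index under the original name with the changed field type; the rest of the body is the same as above. If documents are inserted while the old index is being deleted, the insert automatically creates the index with a default mapping first, so the new index ends up with the old field types. Make sure no writes interfere during this step.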
PUT /pro_api_log
7. Copy the data back
POST _reindex
{
"source": {
"index": "pro_api_log2"
},
"dest": {
"index": "pro_api_log",
"version_type": "external"
}
}
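The progress check in step 4 can be turned into a number. The sketch below assumes the response shape of `GET _tasks?detailed=true&actions=*reindex` (tasks grouped under `nodes`) and estimates progress as created + updated + deleted divided by total; `reindex_progress` and the sample task id are hypothetical.

```python
def reindex_progress(tasks_response):
    """Return {task_id: fraction_done} for every task in a GET _tasks response."""
    progress = {}
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            status = task.get("status", {})
            total = status.get("total", 0)
            done = (status.get("created", 0)
                    + status.get("updated", 0)
                    + status.get("deleted", 0))
            progress[task_id] = done / total if total else 0.0
    return progress


# Hand-written sample in the assumed response shape, not real output
sample = {"nodes": {"node-1": {"tasks": {"node-1:42": {
    "action": "indices:data/write/reindex",
    "status": {"total": 1000, "created": 200, "updated": 50, "deleted": 0},
}}}}}
print(reindex_progress(sample))  # {'node-1:42': 0.25}
```

When no reindex task is listed any more, the copy has finished (or failed), and the document counts of the two indices can be compared before deleting anything.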
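Problems solved
1. Apart from the API logs, other logs printed by our own code could not be inserted into ES
Logs printed by our own code do not carry every field. Testing showed that missing fields are fine for an ES index; any of the fields may be absent. However, Kibana is configured to sort by the time field now, so documents without this field do not show up there.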
POST /test_api_log/_doc/
{
"ip":"666.666.666.666",
"now":"2019-12-02T02:40:02.685213"
}
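One way to make sure every log document carries the time field is to fill it in at the point where the document is built. A minimal sketch follows; `make_log_doc` is a hypothetical helper, and UTC is used on the assumption that the `now` field has no explicit time zone (Elasticsearch treats such dates as UTC).

```python
from datetime import datetime, timezone


def make_log_doc(**fields):
    """Return a log document that always has the `now` time field."""
    doc = dict(fields)
    # Keep a caller-supplied timestamp; otherwise use the current UTC time
    # in the same zone-less ISO format as the sample document above.
    doc.setdefault("now", datetime.now(timezone.utc).replace(tzinfo=None).isoformat())
    return doc


doc = make_log_doc(level="warning", exception="something went wrong")
```

All other fields stay optional, which matches the observation that only the missing time field keeps a document out of the Kibana view.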
2. Purge logs periodically: delete the logs that are older than 30 days
POST /pro_api_log/_delete_by_query
{
"query": {
"range" : {
"now" : {
"lt" : "now-30d/d"
}
}
}
}
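For a scheduled job, the retention period can be made a parameter. This sketch only builds the request body; `purge_query` is a hypothetical helper. Note that the field is named `now` while the value `now-30d/d` is Elasticsearch date math, meaning the current time minus 30 days, rounded down to the start of the day.

```python
def purge_query(days, date_field="now"):
    """Body for _delete_by_query: everything older than `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    # f"now-{days}d/d" is date math, unrelated to the field that happens to be called "now"
    return {"query": {"range": {date_field: {"lt": f"now-{days}d/d"}}}}


body = purge_query(30)
```

With an elasticsearch-py client this could be sent as `es.delete_by_query(index='pro_api_log', body=body)`, for example from a daily cron job.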
3. Diagnosing why a value inside request cannot be found: use _analyze to see how the text is split into tokens
POST /pro_api_log/_analyze
{
"analyzer": "standard",
"text": """{"cod_charge":0,"product_id":[33005,33006,33007,33008,33009,33010],"COD":1}"""
}
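Regular expression query (the pattern has to match an entire term, so digits and commas are allowed on both sides of the id):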
{
"regexp": {
"request": "[0-9|,]*8697[0-9|,]*"
}
}
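The pattern above also matches longer numbers that merely contain 8697. The sketch below compares it with a stricter alternative, using Python's `re.fullmatch` as a stand-in for the whole-term matching of the regexp query. Python's regex engine is not Lucene's, so this only approximates the behaviour; `STRICT` and `term_matches` are hypothetical names, and the sample terms assume a token made of comma-separated ids.

```python
import re

LOOSE = r"[0-9|,]*8697[0-9|,]*"   # pattern from the query above
STRICT = r"(.*,)?8697(,.*)?"      # 8697 must be a complete comma-separated item


def term_matches(pattern, term):
    """True if `pattern` matches the whole term, as a regexp query requires."""
    return re.fullmatch(pattern, term) is not None


for term in ["33005,8697,33006", "8697", "186970"]:
    print(term, term_matches(LOOSE, term), term_matches(STRICT, term))
```

The loose pattern accepts `186970`, while the strict one accepts only terms in which 8697 stands alone between commas or at either end.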
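This article covers practical Elasticsearch operations: index management, searching and updating data, and query tuning. It shows how to create and delete indices, run complex queries, change field types, and manage and purge logs.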