E-commerce Project: Elasticsearch Notes

Latest recommended article published on 2025-09-21 03:56:10

License: CC 4.0 BY-SA

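This article covers practical Elasticsearch operations: index management, searching and updating data, and query tuning. It shows how to create and delete indices, run compound queries, change field types, and manage and purge logs.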

Basics

Check whether an index exists: es.indices.exists(index=name)

Delete an index: es.indices.delete(index=name, ignore=[400, 404])

Create an index: es.indices.create(index=name, body=wsku_index_body)

Delete documents by query:

query = {'query': {'match': {'level': 'warning'}}}
es.delete_by_query(index='api_log', body=query)

Basic search: GET /api_log/_search?q=critical
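
The index calls above are often combined into a "drop and recreate" helper. The following is a minimal sketch, assuming `es` is an elasticsearch-py client that accepts the 7.x style keywords used in the snippets above; the helper name `recreate_index` is hypothetical and not part of the library.

```python
def recreate_index(es, name, body):
    """Drop `name` if it already exists, then create it again with `body`.

    `es` is assumed to expose es.indices.exists / delete / create
    with the keyword arguments used in the notes above.
    Returns True when an existing index was dropped first.
    """
    existed = bool(es.indices.exists(index=name))
    if existed:
        es.indices.delete(index=name)
    es.indices.create(index=name, body=body)
    return existed
```

Checking for existence first avoids relying on ignored error codes, at the cost of one extra request.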

GET /api_log/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match_phrase": {
                        "url": {
                            "query": "/api/v1.0/reseller/sms/active"
                        }
                    }
                },
                {
                    "range": {
                        "now": {
                            "gte": "2019-11-23"
                        }
                    }
                }
            ]
        }
    }
}
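
When the same search is issued from application code, the body can be built as a plain dictionary. This is a small sketch; the function name `url_since_query` is hypothetical, and the field names `url` and `now` are taken from the request above.

```python
def url_since_query(url, since):
    """Build a bool query: phrase match on `url` AND `now` >= `since`."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"url": {"query": url}}},
                    {"range": {"now": {"gte": since}}},
                ]
            }
        }
    }


body = url_since_query("/api/v1.0/reseller/sms/active", "2019-11-23")
```

With the client style used earlier, the result would be passed as es.search(index='api_log', body=body).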

Boosting a field: the ^3 suffix makes matches in summary count three times as much toward the relevance score.

"fields": ["title", "summary^3"]

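must is equivalent to AND, should to OR, and must_not to AND NOT. Note that when a bool query also contains a must (or filter) clause, the should clauses become optional and only affect scoring unless minimum_should_match is set.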

"bool": {
    "should": [
        { "match": { "request": "order_id" }},
        { "match": { "request": "order_ids" }}
    ],
    "must": { "match": { "request": "1" }}
}
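
If the intent is "request contains 1 AND (order_id OR order_ids)", the OR part has to be made mandatory. The following sketch does this with minimum_should_match; the function name `request_query` is hypothetical.

```python
def request_query(required, any_of, field="request"):
    """must = AND, should = OR; at least one `should` clause has to match."""
    return {
        "bool": {
            "must": [{"match": {field: required}}],
            "should": [{"match": {field: term}} for term in any_of],
            # Without this setting the `should` clauses would only
            # influence scoring, because a `must` clause is present.
            "minimum_should_match": 1,
        }
    }


query = request_query("1", ["order_id", "order_ids"])
```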

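"fuzziness": "AUTO" enables fuzzy matching that tolerates spelling mistakes, which is useful for product search. Wildcard and regular-expression patterns such as ., * and [a-z] are also available.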

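"type": "phrase" requires every term of the query string to be present in the document, in the same order as in the query string, and adjacent to each other. "slop" controls how far apart the terms may be. Use it for exact lookups.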
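
The two options can be compared side by side as multi_match bodies. This is an illustrative sketch; the function names are hypothetical, and the field names title and summary are borrowed from the boosting example above.

```python
def fuzzy_multi_match(text, fields):
    """Full-text search that tolerates typos; ES picks the edit distance."""
    return {
        "multi_match": {
            "query": text,
            "fields": list(fields),
            "fuzziness": "AUTO",
        }
    }


def phrase_multi_match(text, fields, slop=0):
    """Phrase search; `slop` is how many positions the terms may drift."""
    return {
        "multi_match": {
            "query": text,
            "fields": list(fields),
            "type": "phrase",
            "slop": slop,
        }
    }


fuzzy = fuzzy_multi_match("comprihensiv guide", ["title", "summary"])
phrase = phrase_multi_match("search engine", ["title", "summary"], slop=3)
```

The two settings are kept in separate queries here because fuzziness is not applied to phrase-type matches.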

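"query": "(saerch~1 algorithm~1) AND (grant ingersoll) OR (tom morton)" searches for several strings at once using query_string syntax; ~1 makes a term fuzzy with an edit distance of 1, so the misspelled "saerch" still matches. The simple_query_string variant replaces AND / OR / NOT with + / | / -, which makes it the simpler choice when users type into a search box. "fields": ["_all", "summary^2"] boosts the summary field.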
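
For a user-facing search box, the simple_query_string form might be wrapped as follows. This is a sketch under assumptions: the function name `search_box_query` is hypothetical, and explicit fields are used instead of _all, which is not available in newer Elasticsearch versions.

```python
def search_box_query(user_input, fields):
    """Wrap raw search-box text in a simple_query_string query."""
    return {
        "query": {
            "simple_query_string": {
                "query": user_input,
                "fields": list(fields),
            }
        }
    }


# Same search as above, written with + and | instead of AND and OR.
body = search_box_query(
    "(saerch~1 algorithm~1) + (grant ingersoll) | (tom morton)",
    ["title", "summary^2"],
)
```

simple_query_string ignores invalid parts of the input instead of returning an error, which is the main reason to prefer it for text typed by users.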

Term and multi-term (terms) queries

"query": { "term" : { "publisher": "manning" } }

Sorting

"sort": [ { "publish_date": {"order":"desc"}}, { "title": { "order": "desc" }} ]

Range

"query": { "range" : { "publish_date": { "gte": "2015-01-01", "lte": "2015-12-31" } } }

Filter

"filter": { "range" : { "num_reviews": { "gte": 20 } } }
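
In recent Elasticsearch versions a filter like this normally sits inside a bool query next to the scoring clause. A minimal sketch, assuming the fields title and summary for the text part (they do not appear in the filter snippet itself) and a hypothetical function name:

```python
def filtered_search(text, min_reviews):
    """Score on the text match, then keep only well-reviewed documents."""
    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {"query": text, "fields": ["title", "summary"]}
                },
                # Filter context: narrows the result set without
                # contributing to the relevance score.
                "filter": {"range": {"num_reviews": {"gte": min_reviews}}},
            }
        }
    }


body = filtered_search("elasticsearch", 20)
```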

Query that restricts the returned fields with _source

POST /pro_product/_search
{
    "_source": ["wcate_id"],
    "query": {
        "bool": {
            "must": [{"terms": {"wcate_id": [6239,6240,6241,6242,6243,6244,6245,6246,6247,6248,6249]}}]
        }
    },
    "sort": [
        {"priority": "asc"}, {"wcate_id": "desc"}
    ],
    "size": 20,
    "search_after": [1, 12802]
}
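
The search_after values in the request above are the sort values of the last hit on the previous page. The paging loop can be sketched as follows; `search` is assumed to be any callable that takes a request body and returns a search response dictionary (for example a thin wrapper around es.search), and both function names are hypothetical.

```python
import copy


def next_page(body, last_hit):
    """Return a copy of `body` that continues after `last_hit`."""
    page = copy.deepcopy(body)
    page["search_after"] = last_hit["sort"]
    return page


def scan(search, body):
    """Yield every hit, fetching one page at a time with search_after."""
    page = copy.deepcopy(body)
    while True:
        hits = search(page)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        page = next_page(page, hits[-1])
```

search_after works reliably only when the sort values identify each document uniquely; if priority and wcate_id can tie, adding a unique tiebreaker field to the sort is advisable.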

OR / AND within a field

title:(quick OR brown)

Exact phrase match

author:"John Smith"

Match where any of the fields book.title, book.content or book.date contains quick or brown:

book.\*:(quick brown)

Field exists:

_exists_:title

Single-character wildcard ? and multi-character wildcard *

qu?ck bro*

Regular expression patterns can be embedded in the query string by wrapping them in forward-slashes ("/"):

name:/joh?n(ath[oa]n)/

Ranges:

date:[2012-01-01 TO 2012-12-31]

count:[10 TO *]

age:(>=10 AND <20)
age:(+>=10 +<20)

The + and - operators: + marks a term that must be present, - a term that must not be present.
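
Because so many characters carry meaning in this syntax, text typed by a user usually has to be escaped before it is placed in a query_string query. The following is a sketch, not a library function; the name `escape_query_string` is hypothetical. It backslash-escapes the reserved characters and drops < and >, which cannot be escaped.

```python
import re

# Characters with special meaning in query_string syntax.
_RESERVED = re.compile(r'([+\-=!(){}\[\]^"~*?:\\/&|])')


def escape_query_string(text):
    """Make `text` safe to embed literally in a query_string query."""
    # < and > cannot be escaped at all, so they are removed.
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)
```

For example, escape_query_string("title:(x)") produces title\:\(x\), so the input is searched as literal text rather than interpreted as a field query.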

Steps for changing a field type in Elasticsearch

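text: the value is analyzed (split into tokens) before it is indexed. It does not support aggregations by default.

It supports full-text and fuzzy matching; exact term queries match individual tokens, not the original string.

keyword: the value is not analyzed and is indexed as a single term. It supports aggregations.

It supports exact matching, and term-level fuzzy, wildcard and regexp queries against the whole value.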
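
When a field needs both behaviours, a common alternative to choosing one type is a multi-field mapping: a text field with a keyword sub-field. The snippet below is an illustrative sketch only; the sub-field name raw and the ignore_above value are assumptions, not settings used in this project.

```python
def text_with_keyword(subfield="raw", ignore_above=256):
    """Mapping for a text field that also indexes an exact keyword copy."""
    return {
        "type": "text",
        "fields": {
            subfield: {"type": "keyword", "ignore_above": ignore_above}
        },
    }


# Full-text search would target `request`; aggregations and exact
# matches would target `request.raw`.
properties = {"request": text_with_keyword()}
```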

1. Inspect the structure of the existing index

GET /pro_api_log

2. Create a new index

PUT /pro_api_log2
{
            "settings": {
                "index": {
                    "max_result_window": 30000,
                    "analysis": {
                        "analyzer": {
                            "custom_standard": {
                                "type": "custom",
                                "tokenizer": "standard",
                                "char_filter": ["filter_char_filter"],
                                "filter": ["lowercase"]
                            }
                        },
                        "char_filter": {
                            "filter_char_filter": {
                                "type": "mapping",
                                "mappings": [
                                    "· => xxDOT1xx",
                                    "+ => xxPLUSxx",
                                    "- => xxMINUSxx",
                                    "\" => xxQUOTATIONxx",
                                    "（ => xxLEFTBRACKET1xx",
                                    "） => xxRIGHTBRACKET1xx",
                                    "& => xxANDxx",
                                    "| => xxVERTICALxx",
                                    "—=> xxUNDERLINExx",
                                    "/=> xxSLASHxx",
                                    "！=> xxEXCLAxx",
                                    "•=> xxDOT2xx",
                                    "【=>xxLEFTBRACKET2xx",
                                    "】 => xxRIGHTBRACKET2xx",
                                    "`=>xxapostrophexx",
                                    ".=>xxDOT3xx",
                                    "#=>xxhashtagxx",
                                    "，=>xxcommaxx"
                                ]
                            }
                        }
                    },
                    "number_of_shards": 3,
                    "number_of_replicas": 1
                }
            },
            "mappings": {
                "wemore": {
                    "properties": {
                        "now": { "type": "date", "index": true },
                        "ip": { "type": "keyword", "index": true },
                        "name": { "type": "keyword", "index": true },
                        "request": { "type": "text", "index": true },
                        "response": { "type": "text", "index": true },
                        "level": { "type": "keyword", "index": true },
                        "url": { "type": "keyword", "index": true },
                        "method": { "type": "keyword", "index": true },
                        "exception": { "type": "text", "index": true }
                    }
                }
            }
}
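
The filter_char_filter in the settings above rewrites punctuation into letter sequences before tokenization, presumably so that symbols the standard tokenizer would otherwise discard remain searchable. Its effect can be approximated in plain Python. This is a simplified model that only handles single-character keys, which is all the settings above use; the function name is hypothetical.

```python
def apply_char_mappings(text, rules):
    """Approximate a `mapping` char filter with single-character keys."""
    table = {}
    for rule in rules:
        # Each rule has the form "key => replacement"; spacing is optional.
        key, _, value = rule.partition("=>")
        table[key.strip()] = value.strip()
    return "".join(table.get(ch, ch) for ch in text)


rules = ["+ => xxPLUSxx", "/=> xxSLASHxx", "#=>xxhashtagxx"]
rewritten = apply_char_mappings("a+b", rules)
```

Here rewritten is "axxPLUSxxb"; in the custom_standard analyzer the lowercase filter would then be applied to the resulting token.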

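3. Copy the old data into the new index. This can take a long time and the HTTP request may time out, but you can still query the new index and watch its document count grow.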

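With "version_type": "external", documents missing from the destination are created and documents that have an older version in the destination are updated; documents whose destination version is the same or newer are not overwritten.

POST _reindex
{
    "source": {
        "index": "pro_api_log"
    },
    "dest": {
        "index": "pro_api_log2",
        "version_type": "external"
    }
}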
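
To avoid the timeout altogether, the reindex can be started as a background task and polled. The sketch below only builds the request body and summarizes a task response; the function names are hypothetical, and the commented usage assumes an elasticsearch-py 7.x client named `es`.

```python
def reindex_body(source, dest):
    """Request body for copying `source` into `dest`."""
    return {
        "source": {"index": source},
        "dest": {"index": dest, "version_type": "external"},
    }


def summarize_task(info):
    """Reduce a task API response for a reindex task to its key counters."""
    status = info["task"]["status"]
    # Skipped documents (noops, version conflicts) are not counted here.
    processed = status["created"] + status["updated"] + status["deleted"]
    return {
        "completed": info["completed"],
        "processed": processed,
        "total": status["total"],
    }


# Assumed usage:
#   task_id = es.reindex(body=reindex_body("pro_api_log", "pro_api_log2"),
#                        wait_for_completion=False)["task"]
#   print(summarize_task(es.tasks.get(task_id=task_id)))
```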

4. Check the reindex progress; watch the total, updated and created counters

GET _tasks?detailed=true&actions=*reindex

5. Delete the old index

DELETE /pro_api_log

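6. Recreate the index under the original name with the corrected field types; everything else is the same as in step 2. If documents are still being written while the old index is deleted, the first write auto-creates the index with a default mapping, and the new index ends up with the old field types. Make sure nothing writes to the index during this step.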

PUT /pro_api_log

7. Copy the data back

POST _reindex
{
"source": {
    "index": "pro_api_log2"
},
"dest": {
    "index": "pro_api_log",
    "version_type": "external"
}
}

Problems solved

1. Logs other than API logs (our own hand-written log lines) could not be inserted into Elasticsearch

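Hand-written log lines typically do not carry every field. Testing showed that Elasticsearch accepts documents with missing fields; in fact every field may be absent. However, Kibana displays entries sorted by the configured time field (now), so a document without that field does not show up there.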

POST /test_api_log/_doc/
{
"ip":"666.666.666.666",
"now":"2019-12-02T02:40:02.685213"
}
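
One way to guarantee that every log document is visible in Kibana is to stamp it before indexing. A minimal sketch, assuming the time field is called now as in the mapping above; the function name `with_timestamp` is hypothetical.

```python
from datetime import datetime, timezone


def with_timestamp(doc, field="now"):
    """Return a copy of `doc` that always carries the time field."""
    stamped = dict(doc)
    # Keep an existing timestamp; otherwise use the current UTC time
    # in ISO 8601 format.
    stamped.setdefault(field, datetime.now(timezone.utc).isoformat())
    return stamped
```

The original dictionary is left untouched, so the helper can be applied just before the document is sent to Elasticsearch.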

2. Purge logs periodically, deleting entries older than 30 days

POST /pro_api_log/_delete_by_query
{
    "query": {
        "range" : {
            "now" : {
                "lt" : "now-30d/d"
            }
        }
    }
}
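
For a scheduled job, the retention period can be made a parameter. This sketch builds the same request body as above; the function name `purge_body` is hypothetical.

```python
def purge_body(days, field="now"):
    """delete_by_query body that removes documents older than `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    # "/d" rounds the cutoff down to the start of the day.
    return {"query": {"range": {field: {"lt": f"now-{days}d/d"}}}}
```

With the client style used earlier, a cron job could call es.delete_by_query(index='pro_api_log', body=purge_body(30)).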

3. Diagnosing a search problem: check how the analyzer tokenizes the text

POST /pro_api_log/_analyze
{
"analyzer": "standard",
"text": """{"cod_charge":0,"product_id":[33005,33006,33007,33008,33009,33010],"COD":1}"""
}

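Regular expression query (the pattern is matched against individual indexed terms, not against the whole field value)

{
    "regexp": {
        "request": "[0-9|,]*8697[0-9|,]*"
    }
}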
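
The pattern can be tried out locally before it is sent to Elasticsearch. The sketch below uses Python's re module as a stand-in; Lucene's regular expression syntax differs from Python's in general, but this pattern only uses constructs common to both. The sample terms are made up for illustration.

```python
import re

PATTERN = re.compile(r"[0-9|,]*8697[0-9|,]*")


def term_matches(term):
    """True if the whole term matches, as a regexp query requires."""
    return PATTERN.fullmatch(term) is not None


samples = ["33005,8697,33006", "8697", "186970", "product_id"]
matches = [t for t in samples if term_matches(t)]
```

Note that "186970" also matches, because the character class accepts digits on either side of 8697; the | inside the brackets is a literal character rather than an alternation.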