Elasticsearch: partial update

Latest recommended article published on 2022-09-07 14:31:36

License: CC 4.0 BY-SA

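This article takes a close look at partial update in Elasticsearch: the plain update API, Groovy scripts, concurrency control, and the upsert operation for documents that do not exist yet. Concrete examples show how to update individual fields without replacing the whole document.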

Partial update with the plain API

# Create a document
PUT /test_index/test_type/10
{
  "test_field1": "test1",
  "test_field2": "test2"
}

POST /test_index/test_type/10/_update
{
  "doc": {
    "test_field2": "updated test2"
  }
}

# Run the same update with the Python client
from elasticsearch import Elasticsearch

# Replace "ip:port" with the address of your cluster
es = Elasticsearch(hosts="ip:port")

body = {
  "doc": {
    "test_field2": "updated test2"
  }
}
result = es.update(index="test_index", doc_type="test_type", id=10, body=body)
print(result)
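
To make the effect of the "doc" body concrete, here is a minimal sketch of the merge in plain Python. It does not talk to a cluster; merge_partial is a hypothetical helper written for this illustration, and it assumes the behavior described in the Elasticsearch update documentation: nested objects are merged key by key, while scalars and arrays are replaced.

```python
import copy


def merge_partial(source, doc):
    """Return a new dict with `doc` merged into `source`.

    Nested dicts are merged key by key; any other value (including
    lists) replaces the existing value outright.
    """
    merged = copy.deepcopy(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


existing = {"test_field1": "test1", "test_field2": "test2"}
updated = merge_partial(existing, {"test_field2": "updated test2"})
print(updated)
```

Only test_field2 changes; test_field1 is carried over untouched, which is the difference from a full replacement with PUT.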

Partial update with a Groovy script

# Add a document
PUT /test_index/test_type/11
{
  "num": 0,
  "tags": []
}

(1) Inline script

POST /test_index/test_type/11/_update
{
   "script" : "ctx._source.num+=1"
}

# Same update with the Python client
body = {
    "script": "ctx._source.num+=1"
}
result = es.update(index="test_index", doc_type="test_type", id=11, body=body)
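
The increment above is hard-coded in the script string. A common variation is to pass the amount as a script parameter. The sketch below only builds the request body; build_increment_body is a hypothetical helper, and the object form with "source", "lang" and "params" is an assumption that holds for Elasticsearch 5.6 and later with the Painless language (older versions used "inline" instead of "source").

```python
def build_increment_body(count):
    """Build an update body that adds `count` to the `num` field."""
    return {
        "script": {
            "lang": "painless",  # assumption: Painless is the script language
            "source": "ctx._source.num += params.count",
            "params": {"count": count},
        }
    }


body = build_increment_body(3)
print(body)
```

The resulting dict can be passed as body= to es.update in the same way as the string form.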

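(2) External script

This no longer seems to be supported (file scripts were removed in 6.0); see the breaking changes:
https://www.elastic.co/guide/en/elasticsearch/reference/6.0/breaking_60_scripting_changes.html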

(3) upsert

POST /test_index/test_type/11/_update
{
   "script" : "ctx._source.num+=1",
   "upsert": {
       "num": 0,
       "tags": []
   }
}
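
# Same request with the Python client; the body must include the upsert part
body = {
    "script": "ctx._source.num+=1",
    "upsert": {
        "num": 0,
        "tags": []
    }
}
result = es.update(index="test_index", doc_type="test_type", id=11, body=body)
"""
If the specified document does not exist, the initialization in upsert is executed; if it does exist, the partial update specified by doc or script is executed.
"""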
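
The two branches of upsert can be sketched without a cluster. In the toy model below a plain dict stands in for the index and a Python function stands in for the script; update_with_upsert and increment_num are hypothetical names used only for this illustration. It assumes the default behavior, in which the script is not run when the document is first created.

```python
import copy


def update_with_upsert(store, doc_id, script, upsert):
    """Apply `script` to an existing document, or insert `upsert` as-is."""
    if doc_id not in store:
        # Document missing: store the upsert content; the script is skipped.
        store[doc_id] = copy.deepcopy(upsert)
        return "created"
    # Document present: run the script against its source.
    script(store[doc_id])
    return "updated"


def increment_num(source):
    """Stand-in for the script ctx._source.num+=1."""
    source["num"] += 1


store = {}
first = update_with_upsert(store, 11, increment_num, {"num": 0, "tags": []})
second = update_with_upsert(store, 11, increment_num, {"num": 0, "tags": []})
print(first, second, store[11])
```

The first call creates the document with num at 0; only the second call increments it.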

Built-in optimistic-locking concurrency control in partial update

Two mechanisms are available:
(1) retry_on_conflict
(2) _version

POST /index/type/id/_update?version=6
or
POST /index/type/id/_update?retry_on_conflict=5

The two cannot be combined in one request; doing so fails with:
"""
"reason": "Validation Failed: 1: can't provide both retry_on_conflict and a specific version;"
"""

# Python client: retry up to 5 times on a version conflict
body = {
    "doc": {
        "num": 2
    }
}
result = es.update(index="test_index", doc_type="test_type", id=11, retry_on_conflict=5, body=body)
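
The following is a toy model of what retry_on_conflict does, not Elasticsearch code. Shard, VersionConflict, update and the interfere hook are all hypothetical names; the hook exists only so the example can simulate a second writer slipping in between the read and the write.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the expected one."""


class Shard:
    """Toy shard: maps doc_id to a (version, source) pair."""

    def __init__(self):
        self.docs = {}

    def get(self, doc_id):
        return self.docs[doc_id]

    def put_if_version(self, doc_id, source, expected_version):
        current_version, _ = self.docs[doc_id]
        if current_version != expected_version:
            raise VersionConflict(doc_id)
        self.docs[doc_id] = (current_version + 1, source)
        return current_version + 1


def update(shard, doc_id, change, retry_on_conflict=0, interfere=None):
    """Read, modify, and write back; on conflict, re-read and try again."""
    attempts = retry_on_conflict + 1
    for attempt in range(attempts):
        version, source = shard.get(doc_id)
        new_source = change(dict(source))
        if interfere is not None:
            interfere(attempt)  # test hook: simulate a concurrent writer
        try:
            return shard.put_if_version(doc_id, new_source, version)
        except VersionConflict:
            if attempt == attempts - 1:
                raise


def add_one(source):
    source["num"] += 1
    return source


shard = Shard()
shard.docs[11] = (1, {"num": 0})


def rival(attempt):
    # Another writer bumps the document during the first attempt only.
    if attempt == 0:
        version, source = shard.get(11)
        shard.put_if_version(11, {"num": source["num"] + 10}, version)


new_version = update(shard, 11, add_one, retry_on_conflict=5, interfere=rival)
print(new_version, shard.docs[11])
```

The first attempt loses to the rival writer, so the update re-reads the document and applies the change on top of the rival's value. With retry_on_conflict=0 the same scenario would raise VersionConflict instead.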

Tips

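Internally, ES executes a partial update in almost the same way as a traditional full replacement:
(1) Fetch the document internally
(2) Merge the fields passed in the request into the document's JSON
(3) Mark the old document as deleted
(4) Create the new, modified document
Advantages of partial update in ES
(1) The read, modify, and write-back steps all happen inside a single shard, so the document does not travel over the network between them (2 fewer network requests)
(2) The interval between the read and the write is shorter, which effectively reduces concurrency conflicts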
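
The four internal steps can be sketched with a small in-memory model. ToyShard is a hypothetical class for illustration only: it keeps an append-only list of document copies, and its merge is a shallow one, which is a simplification of what Elasticsearch does.

```python
class ToyShard:
    """In-memory stand-in for a shard: an append-only list of document copies."""

    def __init__(self):
        # Each entry: {"id", "version", "source", "deleted"}
        self.entries = []

    def _live(self, doc_id):
        """Return the current (not deleted) entry for doc_id, or None."""
        for entry in self.entries:
            if entry["id"] == doc_id and not entry["deleted"]:
                return entry
        return None

    def index(self, doc_id, source):
        """Full replacement: mark the old copy deleted, append a new one."""
        old = self._live(doc_id)
        version = 1
        if old is not None:
            old["deleted"] = True            # step (3)
            version = old["version"] + 1
        self.entries.append(                 # step (4)
            {"id": doc_id, "version": version,
             "source": dict(source), "deleted": False}
        )
        return version

    def partial_update(self, doc_id, doc):
        old = self._live(doc_id)             # step (1)
        if old is None:
            raise KeyError(doc_id)
        new_source = {**old["source"], **doc}  # step (2), shallow merge
        return self.index(doc_id, new_source)


toy = ToyShard()
toy.index(10, {"test_field1": "test1", "test_field2": "test2"})
version = toy.partial_update(10, {"test_field2": "updated test2"})
print(version, toy.entries)
```

After the update the shard holds two copies of document 10: the old one flagged as deleted and the new one at version 2. Everything happens inside the one object, which mirrors the point that no data crosses the network between the read and the write.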