Where does Elasticsearch store a persistent cluster settings update?

License: CC 4.0 BY-SA

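This article looks at how persistent cluster settings are stored in an Elasticsearch cluster: where they are saved, what format the file uses, and how the state is restored on restart, so you can see how a configuration change is stored and applied.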


Apply the following settings:

PUT _cluster/settings
{
    "persistent" : {
        "indices.store.throttle.type" : "merge",
        "indices.store.throttle.max_bytes_per_sec" : "20mb"
    }
}
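
The same request can be built from a script. The following is only a minimal sketch using the Python standard library: the helper name `build_persistent_update` and the node address `http://localhost:9200` are assumptions for illustration, and the request is built but not sent.

```python
import json
import urllib.request


def build_persistent_update(base_url, settings):
    """Build (but do not send) a PUT _cluster/settings request.

    Hypothetical helper: wraps the given settings in the "persistent" block.
    """
    body = json.dumps({"persistent": settings}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_persistent_update(
    "http://localhost:9200",  # assumed node address, adjust for your cluster
    {
        "indices.store.throttle.type": "merge",
        "indices.store.throttle.max_bytes_per_sec": "20mb",
    },
)

# To actually send it against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect the body before anything is changed on the cluster.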

This updates the cluster settings persistently. But if the update is persistent, where does Elasticsearch actually save the change?

Neither setting appears in elasticsearch.yml, yet after a restart the change is still in effect:

[2016-01-19 18:22:49,500][INFO ][indices.store            ] [elasticsearch_179] updating indices.store.throttle.type from [MERGE] to [merge]
[2016-01-19 18:22:49,500][INFO ][indices.store            ] [elasticsearch_179] updating indices.store.throttle.max_bytes_per_sec from [20mb] to [20mb], note, type is [merge]

As the log shows, the node reports the settings as updated when it restarts.

It turns out they are saved in this file:

{INSTALL_DIR}/data/{CLUSTER_NAME}/nodes/{N}/_state/global-{NNN}

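INSTALL_DIR is the Elasticsearch installation directory
CLUSTER_NAME is the cluster name
N is the node ordinal (0 when this machine runs only one node)
NNN is the version number of the cluster state file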
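
Because NNN changes with every new version of the state, a small script can help locate the current file. This is a rough sketch, assuming the file name starts with `global-` followed by the version number (any suffix after the number is ignored); the function name `find_latest_global_state` is hypothetical, and you pass it the `_state` directory described above.

```python
import re
from pathlib import Path

# Matches "global-7", "global-12", and also names with a suffix after the number.
GLOBAL_STATE_PATTERN = re.compile(r"^global-(\d+)")


def find_latest_global_state(state_dir):
    """Return the global-{NNN} file with the highest NNN, or None if there is none."""
    best_version, best_path = -1, None
    for path in Path(state_dir).iterdir():
        match = GLOBAL_STATE_PATTERN.match(path.name)
        if match and path.is_file():
            version = int(match.group(1))  # compare numerically, not as text
            if version > best_version:
                best_version, best_path = version, path
    return best_path
```

The version is compared as an integer so that `global-12` is treated as newer than `global-3`.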

By default the file is written in a binary format. To change that, add the following to elasticsearch.yml:

format: json

Elasticsearch will then save the file as JSON.

Let's look at what the file contains:

{
  "meta-data": {
    "version": 7,
    "uuid": "MxNs_3pRRTa3eU4ZMbcI1A",
    "settings": {
      "indices.store.throttle.type": "merge",
      "indices.store.throttle.max_bytes_per_sec": "20mb"
    },
    "templates": {
      "logstash": {
        "order": 0,
        "template": "logstash-*",
        "settings": {
          "index.refresh_interval": "5s"
        },
        "mappings": [
          {
            "_default_": {
              "_all": {
                "enabled": true,
                "omit_norms": true
              },
              "dynamic_templates": [
                {
                  "message_field": {
                    "match": "message",
                    "match_mapping_type": "string",
                    "mapping": {
                      "type": "string",
                      "index": "analyzed",
                      "omit_norms": true
                    }
                  }
                },
                {
                  "string_fields": {
                    "match": "*",
                    "match_mapping_type": "string",
                    "mapping": {
                      "type": "string",
                      "index": "analyzed",
                      "omit_norms": true,
                      "fields": {
                        "raw": {
                          "type": "string",
                          "index": "not_analyzed",
                          "ignore_above": 256
                        }
                      }
                    }
                  }
                }
              ],
              "properties": {
                "@version": {
                  "type": "string",
                  "index": "not_analyzed"
                },
                "geoip": {
                  "type": "object",
                  "dynamic": true,
                  "properties": {
                    "location": {
                      "type": "geo_point"
                    }
                  }
                }
              }
            }
          }
        ],
        "aliases": {}
      }
    }
  }
}

Sure enough, the two settings are there:

    "settings": {
      "indices.store.throttle.type": "merge",
      "indices.store.throttle.max_bytes_per_sec": "20mb"
    },
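
The same check can be done programmatically. Below is a minimal sketch that pulls the persistent settings out of the JSON document; the function name `read_persistent_settings` and the trimmed sample are made up for illustration. It works on the JSON text rather than opening the file directly, since the exact on-disk framing of the state file may differ between versions.

```python
import json


def read_persistent_settings(state_text):
    """Return the persistent cluster settings from a JSON global state document."""
    state = json.loads(state_text)
    # The settings sit under the top-level "meta-data" object.
    return state.get("meta-data", {}).get("settings", {})


# Trimmed-down sample with the same shape as the file contents shown in this article.
sample_state = """
{
  "meta-data": {
    "version": 7,
    "settings": {
      "indices.store.throttle.type": "merge",
      "indices.store.throttle.max_bytes_per_sec": "20mb"
    },
    "templates": {}
  }
}
"""

for key, value in sorted(read_persistent_settings(sample_state).items()):
    print(f"{key} = {value}")
```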

There is also a persisted template definition:

    "templates": {
      "logstash": {
        "order": 0,
        "template": "logstash-*",
        "settings": {
          "index.refresh_interval": "5s"
        },

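This is the template that Logstash adds to Elasticsearch. It contains the mapping for the message field, and every other string field gets a raw sub-field by default, which stores the original value of the field without analysis. Templates created with PUT _template/logstash are saved in this same file.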

Every time cluster state changes, all master-eligible nodes store the new version of the file, so during cluster restart the node that starts first and elects itself as a master will have the newest version of the cluster state. What you are describing could be possible if you updated the settings when one of your master-eligible nodes was not part of the cluster (and therefore couldn’t store the latest version with your settings) and after the restart this node became the cluster master and propagated its obsolete settings to all other nodes.
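
The stale-master scenario in the paragraph above can be illustrated with a toy model. This is not Elasticsearch's recovery code, only a sketch under the assumption that the cluster recovers the highest-versioned state among the master-eligible nodes present at recovery time; the `Node` class and `recover_cluster_settings` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A master-eligible node and the global state it last stored on disk."""
    name: str
    state_version: int
    settings: dict = field(default_factory=dict)


def recover_cluster_settings(nodes_present):
    """Toy model: recover the settings from the newest state that is present."""
    newest = max(nodes_present, key=lambda node: node.state_version)
    return dict(newest.settings)


# node_b was out of the cluster when the settings were updated (version 6 -> 7).
node_a = Node("node_a", 7, {"indices.store.throttle.type": "merge"})
node_b = Node("node_b", 6, {})

# If only the stale node is up when recovery happens, the update is lost.
print(recover_cluster_settings([node_b]))
# If an up-to-date node is also present, the newer state is used.
print(recover_cluster_settings([node_a, node_b]))
```

In this model the update disappears only when every node taking part in recovery holds an older version of the state, which matches the situation described in the quote.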