Index State Management (ISM)
1. ISM vs ILM: key differences 🎯
Let's start with how ISM (Index State Management) differs from ILM (Index Lifecycle Management).
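| Feature | ISM (OpenDistro) | ILM (Elasticsearch) |
|---|---|---|
| Origin | AWS OpenDistro (open source) | Elastic (X-Pack feature) |
| License | Apache 2.0 (free) | Elastic License (ILM is included in the free Basic tier) |
| Storage | .opendistro-ism-config index | Policies in cluster state; execution history in ilm-history-* indices |
| API endpoint | _opendistro/_ism/ | _ilm/ |
| 7.x versions | ✅ Included in OpenDistro | ⚠️ Included in the default distribution, but not in the Apache-2.0-only OSS build |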
2. Configuring an ISM policy correctly 📝
2.1 Create the ISM policy
DELETE _opendistro/_ism/policies/log-ism-policy
PUT _opendistro/_ism/policies/log-ism-policy
{
"policy": {
"description": "Policy for log indices: rolls over, then deletes",
"default_state": "hot",
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_doc_count": 10,
"min_size": "1mb",
"min_index_age": "10m"
}
}
],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_index_age": "1h"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
]
}
],
"ism_template": {
"index_patterns": ["test-*"],
"priority": 100
}
}
}
Check that the policy was created.
GET _opendistro/_ism/policies/

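Rollover is triggered as soon as any one of the following conditions is met:
- min_doc_count: 10: the index holds at least 10 documents (counted across primary shards).
- min_size: "1mb": the total size of the primary shards reaches 1 MB.
- min_index_age: "10m": the index is at least 10 minutes old.
🚀 Note: ISM checks conditions on a schedule, every 5 minutes by default, so a rollover can lag behind the moment a condition is first met. The current setting can be inspected with: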
GET _cluster/settings?include_defaults=true&filter_path=*.opendistro.index_state_management*
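The "any one condition" rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not ISM's actual implementation: the function names `parse_age`, `parse_size`, and `should_rollover` are made up for this example, and the parsers handle only a few simple unit suffixes.

```python
import re

# Hypothetical helpers for illustration; they handle only simple suffixes.
_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_BYTES = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_age(text):
    """Convert a duration such as "10m" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _SECONDS[match.group(2)]


def parse_size(text):
    """Convert a size such as "1mb" into bytes."""
    match = re.fullmatch(r"(\d+)(b|kb|mb|gb)", text.lower())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * _BYTES[match.group(2)]


def should_rollover(conditions, doc_count, size_bytes, age_seconds):
    """Return True if at least one configured condition is satisfied."""
    checks = []
    if "min_doc_count" in conditions:
        checks.append(doc_count >= conditions["min_doc_count"])
    if "min_size" in conditions:
        checks.append(size_bytes >= parse_size(conditions["min_size"]))
    if "min_index_age" in conditions:
        checks.append(age_seconds >= parse_age(conditions["min_index_age"]))
    return any(checks)


conditions = {"min_doc_count": 10, "min_size": "1mb", "min_index_age": "10m"}
print(should_rollover(conditions, doc_count=10, size_bytes=0, age_seconds=0))
```

With the policy above, 10 documents alone are enough to qualify, even if the index is tiny and only seconds old. The rollover itself still waits for the next scheduled ISM run.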

2.2 Create the index template (attach the ISM policy)
DELETE _index_template/log-template
PUT _index_template/log-template
{
"index_patterns": ["test-*"],
"priority": 1,
"template": {
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"refresh_interval": "15s",
"index.opendistro.index_state_management.policy_id": "log-ism-policy",
"index.opendistro.index_state_management.rollover_alias": "log-index"
},
"mappings": {
"properties": {
"timestamp": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||epoch_millis"
},
"log_level": {
"type": "keyword"
},
"message": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart",
"fielddata": true
},
"user_id": {
"type": "long"
},
"ip_address": {
"type": "ip"
},
"response_time": {
"type": "float"
},
"status_code": {
"type": "integer"
},
"is_success": {
"type": "boolean"
},
"tags": {
"type": "keyword"
},
"request_body": {
"type": "text",
"index": false
},
"geo_location": {
"type": "geo_point"
},
"request_count": {
"type": "integer",
"doc_values": true
},
"metadata": {
"type": "object",
"enabled": true
},
"create_time": {
"type": "date"
}
}
}
},
"_meta": {
"description": "New log index template for indices matching test-*",
"version": "1.0"
}
}
Check that the index template was created.
GET _index_template

2.3 Create the initial index
Create the first index and mark it as the write index for the alias log-index.
PUT /test-000001
{
"aliases": {
"log-index": {
"is_write_index": true
}
}
}

Check that the alias was created.
GET /_cat/aliases/log-index?v

2.4 Testing
2.4.1 Write data
Before any data is written, the index has a document count of 0.
GET /_cat/indices/log-index?v

Write 10 documents through the alias log-index.
POST /log-index/_bulk
{"index":{}}
{"timestamp":"2024-06-15 08:30:25","log_level":"INFO","message":"用户张三登录系统成功","user_id":1001,"ip_address":"192.168.1.100","response_time":45.2,"status_code":200,"is_success":true,"tags":["login","success"],"request_body":"username=zhangsan&password=***","geo_location":{"lat":39.9042,"lon":116.4074},"request_count":1,"metadata":{"browser":"Chrome","version":"120.0"},"create_time":"2024-06-15T08:30:25Z"}
{"index":{}}
{"timestamp":"2024-06-15 08:32:10","log_level":"ERROR","message":"数据库连接超时,请检查网络配置","user_id":1002,"ip_address":"10.0.0.55","response_time":1500.5,"status_code":500,"is_success":false,"tags":["database","timeout","critical"],"request_body":"query=SELECT * FROM users","geo_location":{"lat":31.2304,"lon":121.4737},"request_count":3,"metadata":{"service":"user-service","thread_id":"thread-25"},"create_time":"2024-06-15T08:32:10Z"}
{"index":{}}
{"timestamp":"2024-06-15 08:35:42","log_level":"WARN","message":"内存使用率超过80%阈值","user_id":null,"ip_address":"172.16.10.20","response_time":0.5,"status_code":200,"is_success":true,"tags":["monitoring","memory"],"request_body":"","geo_location":{"lat":23.1291,"lon":113.2644},"request_count":1,"metadata":{"hostname":"server-01","memory_usage":82},"create_time":"2024-06-15T08:35:42Z"}
{"index":{}}
{"timestamp":"2024-06-15 08:40:15","log_level":"INFO","message":"订单ID 20240615001 支付成功,金额299.00元","user_id":1003,"ip_address":"192.168.1.150","response_time":120.3,"status_code":200,"is_success":true,"tags":["order","payment"],"request_body":"order_id=20240615001&amount=299.00","geo_location":{"lat":30.2741,"lon":120.1551},"request_count":1,"metadata":{"payment_gateway":"alipay","transaction_id":"txn_123456"},"create_time":"2024-06-15T08:40:15Z"}
{"index":{}}
{"timestamp":"2024-06-15 08:45:30","log_level":"DEBUG","message":"开始处理API请求 /api/v1/users/list","user_id":1004,"ip_address":"203.0.113.10","response_time":15.7,"status_code":200,"is_success":true,"tags":["api","debug"],"request_body":"page=1&size=20","geo_location":{"lat":22.3193,"lon":114.1694},"request_count":5,"metadata":{"endpoint":"/api/v1/users/list","method":"GET"},"create_time":"2024-06-15T08:45:30Z"}
{"index":{}}
{"timestamp":"2024-06-15 08:50:22","log_level":"ERROR","message":"文件上传失败:文件大小超过限制","user_id":1005,"ip_address":"192.168.1.200","response_time":320.8,"status_code":413,"is_success":false,"tags":["upload","file_size","error"],"request_body":"file=report.pdf","geo_location":{"lat":34.3416,"lon":108.9398},"request_count":2,"metadata":{"file_name":"report.pdf","file_size":"15MB","max_size":"10MB"},"create_time":"2024-06-15T08:50:22Z"}
{"index":{}}
{"timestamp":"2024-06-15 08:55:47","log_level":"INFO","message":"缓存刷新完成,共清理 1250 个过期条目","user_id":null,"ip_address":"10.0.1.100","response_time":45.0,"status_code":200,"is_success":true,"tags":["cache","cleanup"],"request_body":"action=refresh_cache","geo_location":{"lat":29.4316,"lon":106.9123},"request_count":1,"metadata":{"cache_type":"redis","cleaned_count":1250},"create_time":"2024-06-15T08:55:47Z"}
{"index":{}}
{"timestamp":"2024-06-15 09:00:12","log_level":"WARN","message":"API响应时间较慢,当前平均响应时间 850ms","user_id":null,"ip_address":"172.17.20.30","response_time":850.0,"status_code":200,"is_success":true,"tags":["performance","slow"],"request_body":"","geo_location":{"lat":36.0611,"lon":120.3783},"request_count":100,"metadata":{"avg_response_time":850,"threshold":500},"create_time":"2024-06-15T09:00:12Z"}
{"index":{}}
{"timestamp":"2024-06-15 09:05:33","log_level":"INFO","message":"用户李四修改个人资料信息","user_id":1006,"ip_address":"203.0.113.45","response_time":78.9,"status_code":200,"is_success":true,"tags":["profile","update"],"request_body":"name=李四&email=lisi@example.com","geo_location":{"lat":38.0428,"lon":114.5149},"request_count":1,"metadata":{"updated_fields":["name","email"]},"create_time":"2024-06-15T09:05:33Z"}
{"index":{}}
{"timestamp":"2024-06-15 09:10:18","log_level":"ERROR","message":"第三方服务调用失败:支付网关无响应","user_id":1007,"ip_address":"192.168.2.100","response_time":3000.2,"status_code":503,"is_success":false,"tags":["external_service","payment","timeout"],"request_body":"payment_data=encrypted","geo_location":{"lat":45.758,"lon":126.642},"request_count":3,"metadata":{"service_name":"payment-gateway","timeout_ms":5000},"create_time":"2024-06-15T09:10:18Z"}
The 10 documents have now been written to the index test-000001.
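When the same request is sent from a script rather than from the console, the body has to be newline-delimited JSON: one action line, one document line, and a trailing newline at the end. A minimal sketch of building such a body follows; `build_bulk_body` is a hypothetical helper name, and the two documents are shortened stand-ins for the ones above.

```python
import json


def build_bulk_body(docs):
    """Build an NDJSON body for POST /<alias>/_bulk with auto-generated IDs."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))            # action line
        lines.append(json.dumps(doc, ensure_ascii=False))  # document line
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


sample_docs = [
    {"timestamp": "2024-06-15 08:30:25", "log_level": "INFO", "status_code": 200},
    {"timestamp": "2024-06-15 08:32:10", "log_level": "ERROR", "status_code": 500},
]
print(build_bulk_body(sample_docs), end="")
```

`ensure_ascii=False` keeps Chinese message text readable in the payload; the result should be sent as UTF-8 with the `application/x-ndjson` content type.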

2.4.2 Analyzing the intermediate states
View the ISM execution details and current state for the index test-000001.
GET _opendistro/_ism/explain/test-000001


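- Message: "successfully initialized policy: log-ism-policy"
- What happened:
  - After test-000001 was created, ISM detected that it matches the policy's index pattern test-*.
  - ISM attached the policy log-ism-policy to the index.
  - The index entered the default state hot.

- Message: "Successfully rolled over index [index=test-000001]"
- What happened:
  - The index met a rollover condition (doc count ≥ 10, or size ≥ 1 MB, or age ≥ 10 minutes).
  - ISM performed the rollover and created the new index test-000002.
  - The write index of the alias log-index moved from test-000001 to test-000002.

- Message: "Evaluating transition conditions [index=test-000001]"
- What happened:
  - ISM is checking whether test-000001 qualifies to move from the hot state to the delete state.
  - Under this policy the condition is min_index_age: "1h" (delete the index one hour after creation).
- State at this point:
  - test-000001 is still in the hot state.
  - ISM keeps re-checking the min_index_age: "1h" condition.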
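ISM re-evaluates the relevant conditions on every scheduled run.

Two status messages are easy to confuse:
- "Pending rollover of index [index=test-000033]"
  - Concern: the rollover action
  - Phase: waiting to create a new index
  - Operation: rollover
- "Evaluating transition conditions [index=test-000033]"
  - Concern: the state transition
  - Phase: checking whether to move to the next state
  - Operation: state-machine transition
| Aspect | Pending Rollover | Evaluating Transition |
|---|---|---|
| Operation type | Rollover action | State transition |
| Proceeds when | A rollover condition is met | A transition condition is met |
| Effect | Creates a new index and moves the write alias | Moves the index to a new state |
| Location in the policy | The actions array | The transitions array |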
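1️⃣ Scenario 1: the rollover condition is met first
State: hot
↓
Rollover condition met → "Pending rollover" → rollover runs → index stays in the hot state
↓
Transition condition met → "Evaluating transition" → index moves to the delete state
2️⃣ Scenario 2: the transition condition is met first
State: hot
↓
Rollover still pending → "Pending rollover" (the transition is not evaluated yet)
↓
Rollover completes → "Evaluating transition" → condition already met → index moves to the delete state
The two steps correspond to the actions and transitions sections of the policy. Per the ISM documentation, transitions are evaluated only after every action in the current state has completed, so within one state the rollover always comes before the transition check.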
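The relationship between actions and transitions can be modelled as a single evaluation step. The sketch below is a simplified model under stated assumptions, not ISM's code: it assumes all actions of a state must finish before transitions are checked, `next_step` and `age_seconds` are hypothetical names, and `completed_actions` stands in for the bookkeeping ISM does internally.

```python
import re

_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def age_seconds(text):
    """Parse durations such as "10m" or "1h" (only s/m/h/d are handled)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _SECONDS[match.group(2)]


def next_step(state, completed_actions, index_age_s):
    """Decide what one scheduled run would work on for the given state."""
    # 1. Any unfinished action comes first and blocks the transitions.
    for action in state.get("actions", []):
        name = next(iter(action))  # e.g. "rollover"
        if name not in completed_actions:
            return ("action", name)
    # 2. Only then are the transitions checked, in declaration order.
    for transition in state.get("transitions", []):
        min_age = transition.get("conditions", {}).get("min_index_age")
        if min_age is None or index_age_s >= age_seconds(min_age):
            return ("transition", transition["state_name"])
    return ("wait", None)


# The hot state from log-ism-policy.
HOT = {
    "name": "hot",
    "actions": [{"rollover": {"min_doc_count": 10, "min_size": "1mb",
                              "min_index_age": "10m"}}],
    "transitions": [{"state_name": "delete",
                     "conditions": {"min_index_age": "1h"}}],
}

print(next_step(HOT, completed_actions=set(), index_age_s=7200))
print(next_step(HOT, completed_actions={"rollover"}, index_age_s=7200))
```

In this model an index older than one hour still reports the rollover action first; the move to delete happens only once the rollover is recorded as done.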
2.4.3 Observing the result
Once a rollover condition fired, a new index test-000002 was created and became the write index.
GET /_cat/aliases/log-index?v
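The name test-000002 follows from the rollover naming rule: the index name must end in a hyphen and a number, and the new name is that number plus one, zero-padded to six digits. A small sketch of the rule follows; `next_rollover_name` is a hypothetical helper, not part of any client library.

```python
import re


def next_rollover_name(index_name):
    """Return the name a rollover would give the next index."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end in -<number>")
    prefix, number = match.groups()
    # The counter is incremented and zero-padded to six digits.
    return f"{prefix}{int(number) + 1:06d}"


print(next_rollover_name("test-000001"))
```

This is why the initial index was created as test-000001 rather than, say, test-logs: a name without the numeric suffix would make the rollover fail unless a target name is supplied explicitly.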

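State of test-000001:
- ✅ Policy attached
- ✅ Rollover complete (it is no longer the write index)
- ⏳ Waiting for deletion (ISM is still checking whether it has reached the 1-hour age)
State of test-000002:
- ✅ The new write index
- 🔄 Going through the same policy cycle
Full state flow:
Create index → attach policy → enter hot state → rollover condition met → new index created → evaluate transition conditions → condition met, delete
Now write another 20 documents through the alias log-index.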
GET /_cat/indices/log-index?v

After a while, view the ISM execution details and state for the index test-000002.
GET _opendistro/_ism/explain/test-000002
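The output shows that the ISM rollover action has completed successfully.

A new index test-000003 was created and became the write index.

Next, write 30 more documents to the index test-000003.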




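2.5 Troubleshooting
2.5.1 Alias conflict

The error is unambiguous: it is an alias conflict. When ISM attempted the rollover, it found that the alias log-index was being attached to more than one index, which was caused by the alias configuration in the index template.
The template contained: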
"aliases": {
"log-index": {}
}
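As a result, every newly created test-* index automatically received the log-index alias, whereas a rollover requires the alias to have exactly one write index and rejects a rollover alias that is also defined in a matching index template.
The fix is to update the template and remove the aliases block.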
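Before retrying, it can help to confirm that the alias really has a single write index. The sketch below inspects a dictionary shaped like the response of `GET /_alias/log-index`; that shape is reproduced here from memory as an assumption, and `write_indices` is a hypothetical helper. It also ignores the case where an alias points to exactly one index without an explicit `is_write_index` flag.

```python
def write_indices(alias_response, alias):
    """List the indices explicitly flagged as write index for the alias."""
    return sorted(
        index
        for index, body in alias_response.items()
        if body.get("aliases", {}).get(alias, {}).get("is_write_index") is True
    )


# Assumed response shape after a healthy rollover.
healthy = {
    "test-000001": {"aliases": {"log-index": {"is_write_index": False}}},
    "test-000002": {"aliases": {"log-index": {"is_write_index": True}}},
}
# An alias attached by the template carries no write flag at all.
conflicting = {
    "test-000001": {"aliases": {"log-index": {}}},
    "test-000002": {"aliases": {"log-index": {}}},
}

print(write_indices(healthy, "log-index"))
print(write_indices(conflicting, "log-index"))
```

Exactly one entry in the result is the state a rollover needs; an empty list or several entries points to a problem with how the alias was assigned.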
2.5.2 Retrying the policy
Manually retry the failed ISM action for a given index.
POST _opendistro/_ism/retry/test-000001

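2.5.3 Shortening the ISM check interval (for testing)
- Temporarily set it to 1 minute (the value is in minutes; remember to restore it after testing)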
PUT _cluster/settings
{
"transient": {
"opendistro.index_state_management.job_interval": "1"
}
}

2.5.4 Rolling over manually for critical operations

POST log-index/_rollover
{
"conditions": {
"max_docs": 10
}
}


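| ISM parameter | Manual rollover parameter | Meaning |
|---|---|---|
| min_doc_count: 10 | max_docs: 10 | Roll over when doc count ≥ 10 |
| min_size: "1mb" | max_size: "1mb" | Roll over when size ≥ 1 MB |
| min_index_age: "10m" | max_age: "10m" | Roll over when age ≥ 10 minutes |
Why do the names differ? ISM was developed separately from the Elasticsearch rollover API (whose max_* names ILM's rollover action also uses), and the difference probably persists for reasons such as:
- Backward compatibility: the rollover API is widely used, and renaming its parameters would break existing code.
- Different framing:
  - Rollover API: thinks in terms of limits (how large may this index get at most?).
  - ISM: thinks in terms of triggers (which condition starts the operation?).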
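Because the two naming schemes describe the same thresholds, ISM conditions can be translated mechanically into a manual rollover request. A minimal sketch, where `to_rollover_conditions` is a hypothetical helper covering only the three conditions used in this article:

```python
# Mapping taken from the table above.
ISM_TO_ROLLOVER = {
    "min_doc_count": "max_docs",
    "min_size": "max_size",
    "min_index_age": "max_age",
}


def to_rollover_conditions(ism_conditions):
    """Translate ISM rollover conditions into rollover API conditions."""
    unknown = set(ism_conditions) - set(ISM_TO_ROLLOVER)
    if unknown:
        raise KeyError(f"no rollover equivalent known for: {sorted(unknown)}")
    return {ISM_TO_ROLLOVER[key]: value for key, value in ism_conditions.items()}


ism = {"min_doc_count": 10, "min_size": "1mb", "min_index_age": "10m"}
print({"conditions": to_rollover_conditions(ism)})
```

The printed dictionary has the same shape as the body of the `POST log-index/_rollover` request shown earlier.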
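3. The ISM state machine 🔄
{
"policy": {
"default_state": "hot", // initial state
"states": [ // list of states
{
"name": "hot", // state name
"actions": [ // actions executed in this state
{
"rollover": { // rollover action
"min_index_age": "30d",
"min_size": "50gb"
}
}
],
"transitions": [ // state transitions
{
"state_name": "delete", // target state (must also be defined in "states"; omitted in this fragment)
"conditions": { // transition conditions
"min_index_age": "90d" // 90 days after index creation
}
}
]
}
]
}
}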
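Since a policy is a small state machine, a quick consistency check before uploading it can catch dangling references, such as the delete target in the fragment above that has no matching state definition. The following is a sketch only; `validate_policy` is a hypothetical helper and checks nothing beyond state names.

```python
def validate_policy(policy):
    """Return a list of problems found in the policy's state graph."""
    problems = []
    states = policy.get("states", [])
    names = {state["name"] for state in states}
    if policy.get("default_state") not in names:
        problems.append(f"default_state {policy.get('default_state')!r} is not defined")
    for state in states:
        for transition in state.get("transitions", []):
            target = transition["state_name"]
            if target not in names:
                problems.append(
                    f"state {state['name']!r} transitions to undefined state {target!r}"
                )
    return problems


# The fragment above, reduced to what the check needs.
fragment = {
    "default_state": "hot",
    "states": [
        {
            "name": "hot",
            "actions": [{"rollover": {"min_index_age": "30d", "min_size": "50gb"}}],
            "transitions": [
                {"state_name": "delete", "conditions": {"min_index_age": "90d"}}
            ],
        }
    ],
}

for problem in validate_policy(fragment):
    print(problem)
```

Adding a delete state with a delete action, as in the policy from section 2.1, makes the list of problems empty.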
4. A more complete ISM policy example 🎪
A policy covering the full lifecycle.
PUT _opendistro/_ism/policies/complete-log-policy
{
"policy": {
"description": "Full lifecycle management for log indices",
"default_state": "hot",
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_index_age": "1d",
"min_doc_count": 1000000,
"min_size": "10gb"
}
}
],
"transitions": [
{
"state_name": "warm",
"conditions": {
"min_index_age": "7d"
}
}
]
},
{
"name": "warm",
"actions": [
{
"replica_count": {
"number_of_replicas": 1
}
},
{
"force_merge": {
"max_num_segments": 1
}
}
],
"transitions": [
{
"state_name": "cold",
"conditions": {
"min_index_age": "30d"
}
}
]
},
{
"name": "cold",
"actions": [
{
"read_only": {}
},
{
"replica_count": {
"number_of_replicas": 0
}
}
],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_index_age": "90d"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
]
}
]
}
}
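To sanity-check the thresholds in a policy like this one, it can be useful to compute which state an index of a given age should have reached. The sketch below hard-codes the transitions of complete-log-policy and makes two simplifying assumptions: every action finishes immediately, and ages are given in whole days. `state_at_age` and `days` are hypothetical names.

```python
def days(text):
    """Parse a duration given in days, e.g. "7d" (other units are not handled)."""
    if not text.endswith("d"):
        raise ValueError(f"only day-based durations are supported: {text!r}")
    return int(text[:-1])


# state -> list of (target state, min_index_age), copied from the policy above
TRANSITIONS = {
    "hot": [("warm", "7d")],
    "warm": [("cold", "30d")],
    "cold": [("delete", "90d")],
    "delete": [],
}


def state_at_age(age_days, start="hot"):
    """Follow transitions from the start state until none applies."""
    current = start
    seen = set()
    while current not in seen:  # guard against cyclic policies
        seen.add(current)
        for target, min_age in TRANSITIONS[current]:
            if age_days >= days(min_age):
                current = target
                break
        else:
            return current  # no transition condition is met
    return current


for age in (0, 7, 45, 90):
    print(age, state_at_age(age))
```

In a real cluster the observed state lags behind this idealized timeline, because each step waits for a scheduled ISM run and for actions such as force_merge to finish.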
5. ISM management commands 🔍
- List all policies
GET _opendistro/_ism/policies/
- View one policy
GET _opendistro/_ism/policies/log-ism-policy
- View the ISM state of an index
GET _opendistro/_ism/explain/test-000001
- Retry a failed action on an index
POST _opendistro/_ism/retry/test-000001
- Delete a policy
DELETE _opendistro/_ism/policies/log-ism-policy
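6. Resolving policy conflicts ⚠️
If you get a version_conflict_engine_exception, a policy with that ID already exists:
Option 1️⃣: Use a new name
PUT _opendistro/_ism/policies/log-ism-policy-v2
{
"policy": {
// your policy configuration
}
}
Option 2️⃣: Delete, then recreate
DELETE _opendistro/_ism/policies/log-ism-policy
PUT _opendistro/_ism/policies/log-ism-policy
{
"policy": {
// your policy configuration
}
}
Option 3️⃣: Update the existing policy (requires the sequence number)
// First fetch the policy; the response includes _seq_no and _primary_term
GET _opendistro/_ism/policies/10min_rollover_policy
// Then update, passing the values returned above (the 1 and 1 here are placeholders)
PUT _opendistro/_ism/policies/10min_rollover_policy?if_seq_no=1&if_primary_term=1
{
"policy": {
// the updated configuration
}
}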
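For option 3, the two query parameters come straight from the GET response, so the update path can be assembled mechanically. A minimal sketch, assuming the response carries `_seq_no` and `_primary_term` at the top level; `update_path` is a hypothetical helper and the response values below are made up.

```python
def update_path(policy_id, get_response):
    """Build the PUT path for a conditional policy update."""
    return (
        f"_opendistro/_ism/policies/{policy_id}"
        f"?if_seq_no={get_response['_seq_no']}"
        f"&if_primary_term={get_response['_primary_term']}"
    )


# Made-up example of the relevant fields in a GET response.
response = {"_id": "10min_rollover_policy", "_seq_no": 7, "_primary_term": 2}
print(update_path("10min_rollover_policy", response))
```

If another client updates the policy between the GET and the PUT, the sequence number no longer matches and the update is rejected, which is the purpose of this check: fetch again and retry with the fresh values.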
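7. Key reminders 💡
- ISM is an OpenDistro component: make sure the OpenDistro ISM plugin is installed on your Elasticsearch cluster.
- Policy names are unique: each policy needs its own ID.
- Automatic attachment: attach policies automatically through ism_template or index settings.
- Monitoring: use _opendistro/_ism/explain to monitor policy execution.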




