Migrating a Self-Managed Elasticsearch Cluster to Alibaba Cloud Elasticsearch

📌 1. Core requirements from the official Alibaba Cloud documentation (all implemented)

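Requirement | How this plan implements it
✅ ECS and Alibaba Cloud ES are in the same VPC | Satisfied
✅ Full migration first, then incremental, so nothing written during the migration window is lost | The first incremental run starts from T0
✅ The incremental field must be of type date | lastActiveDate verified
✅ ES 8.x requires removing document_type | Removed from all configs
✅ docinfo => true is required | Configs preserve _index/_id
✅ Migrate index metadata first | Done by indiceCreate.py
✅ Verify the first incremental run manually before enabling schedule | schedule is commented out in the initial config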
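
The "lastActiveDate verified" check can also be automated. Below is a minimal sketch, assuming the input is a dict shaped like the response of `GET /bc_*/_mapping` and that `lastActiveDate` is a top-level field. The function names and the sample index `bc_order_v1` are hypothetical, used only for illustration.

```python
def field_type(mapping_response: dict, index: str, field: str):
    """Return the mapped type of a top-level field, or None if it is absent."""
    props = mapping_response[index]["mappings"].get("properties", {})
    return props.get(field, {}).get("type")


def non_date_indices(mapping_response: dict, field: str = "lastActiveDate") -> list:
    """List indices whose `field` is missing or not mapped as `date`."""
    return sorted(
        idx for idx in mapping_response
        if field_type(mapping_response, idx, field) != "date"
    )


# Hand-written sample shaped like a `GET /bc_*/_mapping` response (not real data).
sample = {
    "bc_user_v1": {"mappings": {"properties": {"lastActiveDate": {"type": "date"}}}},
    "bc_order_v1": {"mappings": {"properties": {"lastActiveDate": {"type": "keyword"}}}},
}
print(non_date_indices(sample))  # ['bc_order_v1']
```

An empty list means every index can be filtered with the range query used by the incremental configs.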

📁 2. All scripts and config files (5 in total)

1️⃣ Check script: check_bc_indices.sh

#!/bin/bash
SOURCE_HOST="https://1.1.1.1:9200"
SOURCE_USER="elastic"
SOURCE_PASS="wanyanzhenjiang"
TARGET_HOST="http://2.2.2.2:9200"
TARGET_USER="elastic"
TARGET_PASS="wanyanzhenjiang"

echo "[1/3] Fetching source cluster stats..."
curl -k -u "$SOURCE_USER:$SOURCE_PASS" -s "${SOURCE_HOST}/_cat/indices/bc_*?h=index,docs.count" | sort > /tmp/source.txt

echo "[2/3] Fetching target cluster stats..."
curl -u "$TARGET_USER:$TARGET_PASS" -s "${TARGET_HOST}/_cat/indices/bc_*?h=index,docs.count" | sort > /tmp/target.txt

echo "[3/3] Diff (no output = identical):"
diff /tmp/source.txt /tmp/target.txt

echo ""
echo "📊 Source document total: $(awk '{sum+=$2} END{print sum+0}' /tmp/source.txt)"
echo "📊 Target document total: $(awk '{sum+=$2} END{print sum+0}' /tmp/target.txt)"

Permissions: chmod +x check_bc_indices.sh
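
When many indices differ, raw diff output gets hard to read. The sketch below shows one way to turn the two `_cat/indices?h=index,docs.count` outputs into a per-index mismatch report. It assumes the two-column text format written to /tmp/source.txt and /tmp/target.txt by the script above; the function names and sample counts are made up for illustration.

```python
def parse_cat_indices(text: str) -> dict:
    """Parse `_cat/indices?h=index,docs.count` output into {index: count}."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and rows without a count column.
        if len(parts) == 2:
            counts[parts[0]] = int(parts[1])
    return counts


def count_mismatches(source: dict, target: dict) -> dict:
    """Return {index: (source_count, target_count)} for every index that differs.
    A count of None means the index is missing on that side."""
    return {
        idx: (source.get(idx), target.get(idx))
        for idx in sorted(set(source) | set(target))
        if source.get(idx) != target.get(idx)
    }


# Made-up sample output, not real cluster data.
src = parse_cat_indices("bc_order_v1 45\nbc_user_v1 120\n")
dst = parse_cat_indices("bc_order_v1 40\nbc_user_v1 120\n")
print(count_mismatches(src, dst))  # {'bc_order_v1': (45, 40)}
```

An empty dict corresponds to the "no output" case of the diff step.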


2️⃣ Cleanup script: clean_bc_indices.sh

#!/bin/bash
TARGET_HOST="http://2.2.2.2:9200"
TARGET_USER="elastic"
TARGET_PASS="wanyanzhenjiang"

indices=$(curl -u "$TARGET_USER:$TARGET_PASS" -s "${TARGET_HOST}/_cat/indices/bc_*?h=index" | tr '\n' ',' | sed 's/,$//')
if [ -z "$indices" ]; then echo "⚠️ No bc_* indices found"; exit 0; fi

curl -u "$TARGET_USER:$TARGET_PASS" -XPOST -s "${TARGET_HOST}/${indices}/_close" >/dev/null
resp=$(curl -u "$TARGET_USER:$TARGET_PASS" -XDELETE -s "${TARGET_HOST}/${indices}")

if [[ "$resp" == *"acknowledged\":true"* ]]; then
  echo "✅ Cleanup complete"
else
  echo "❌ Failed: $resp"; exit 1
fi

Permissions: chmod +x clean_bc_indices.sh


3️⃣ Index metadata migration script: indiceCreate.py

#!/usr/bin/env python3
import json, requests
from urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)

SOURCE_HOST = "https://1.1.1.1:9200"
SOURCE_USER = "elastic"
SOURCE_PASS = "wanyanzhenjiang"
TARGET_HOST = "http://2.2.2.2:9200"
TARGET_USER = "elastic"
TARGET_PASS = "wanyanzhenjiang"
DEFAULT_REPLICAS = 1

def get_indices():
    r = requests.get(f"{SOURCE_HOST}/_cat/indices/bc_*?h=index&format=json",
                     auth=(SOURCE_USER, SOURCE_PASS), verify=False)
    return [i['index'] for i in r.json()]

def get_meta(idx):
    settings = requests.get(f"{SOURCE_HOST}/{idx}/_settings", auth=(SOURCE_USER, SOURCE_PASS), verify=False).json()
    mapping = requests.get(f"{SOURCE_HOST}/{idx}/_mapping", auth=(SOURCE_USER, SOURCE_PASS), verify=False).json()
    return {
        "settings": {
            "number_of_shards": int(settings[idx]['settings']['index']['number_of_shards']),
            "number_of_replicas": DEFAULT_REPLICAS
        },
        "mappings": mapping[idx]['mappings']
    }

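def create_index(idx, body):
    # Create the index on the target with the source's shard count and mappings.
    r = requests.put(f"{TARGET_HOST}/{idx}", auth=(TARGET_USER, TARGET_PASS), json=body)
    ok = r.status_code in (200, 201)
    # On failure, print the response body so the cause (e.g. index already exists) is visible.
    print(f"{'✅' if ok else '❌'} {idx}" + ("" if ok else f": {r.text}"))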

if __name__ == "__main__":
    for idx in get_indices():
        create_index(idx, get_meta(idx))
    print("\n🎉 Index metadata sync complete!")

Dependency: pip3 install requests
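
The script above copies only the shard count and the mappings. If any source index defines custom analyzers that its mappings reference, creating the index without them would likely fail on the target. The sketch below is a variation of get_meta, not the author's script, that also carries over the `analysis` block when one exists. It assumes the two inputs are the parsed `_settings` and `_mapping` responses; the function name and sample data are hypothetical.

```python
def build_create_body(settings_resp: dict, mapping_resp: dict, idx: str, replicas: int = 1) -> dict:
    """Build a create-index request body from parsed `_settings` and `_mapping` responses."""
    index_settings = settings_resp[idx]["settings"]["index"]
    settings = {
        # The settings API returns numbers as strings, so convert explicitly.
        "number_of_shards": int(index_settings["number_of_shards"]),
        "number_of_replicas": replicas,
    }
    # Assumption: custom analyzers, if any, live under settings.index.analysis.
    if "analysis" in index_settings:
        settings["analysis"] = index_settings["analysis"]
    return {"settings": settings, "mappings": mapping_resp[idx]["mappings"]}


# Hand-written sample responses (not real cluster data).
settings_resp = {"bc_user_v1": {"settings": {"index": {
    "number_of_shards": "3",
    "analysis": {"analyzer": {"lower_kw": {"type": "custom", "tokenizer": "keyword",
                                           "filter": ["lowercase"]}}},
}}}}
mapping_resp = {"bc_user_v1": {"mappings": {"properties": {"lastActiveDate": {"type": "date"}}}}}

body = build_create_body(settings_resp, mapping_resp, "bc_user_v1")
print(body["settings"]["number_of_shards"])  # 3
```

Indices without custom analysis produce the same body as the original get_meta.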


4️⃣ T0 timestamp injection script: generate_incremental_config.sh

#!/bin/bash
T0_FILE="/tmp/T0.txt"
TEMPLATE_FILE="es2es_incremental_template.conf"
OUTPUT_FILE="es2es_incremental_initial.conf"

if [ ! -f "$T0_FILE" ]; then
  echo "❌ $T0_FILE does not exist! Expected format: 2025-12-17T01:42:14Z"
  exit 1
fi

T0=$(cat "$T0_FILE" | tr -d '[:space:]')
if [[ ! "$T0" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$ ]]; then
  echo "❌ Invalid T0 format! It must be UTC (e.g. 2025-12-17T01:42:14Z)"
  exit 1
fi

cat > "$TEMPLATE_FILE" <<EOF
input {
  elasticsearch {
    hosts => ["https://1.1.1.1:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "bc_*"
    query => '{"query": {"range": {"lastActiveDate": {"gte": "T0_PLACEHOLDER", "lte": "now"}}}}'
    # First run: keep the next line commented out
    # schedule => "*/5 * * * *"
    docinfo => true
    docinfo_target => "[@metadata]"
    size => 5000
    scroll => "5m"
    slices => 1
    ssl_verification_mode => "none"
  }
}

filter {
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}

output {
  elasticsearch {
    hosts => ["http://2.2.2.2:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
    ilm_enabled => false
    manage_template => false
    ssl_verification_mode => "none"
  }
}
EOF

sed "s/T0_PLACEHOLDER/$T0/g" "$TEMPLATE_FILE" > "$OUTPUT_FILE"
echo "✅ Generated initial incremental config: $OUTPUT_FILE"

Permissions: chmod +x generate_incremental_config.sh
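
The same validate-then-substitute logic can be sketched in Python. One difference from the shell regex: this version also rejects impossible timestamps (month 13, for example) by parsing the value. The placeholder name `T0_PLACEHOLDER` comes from the script above; the function names are hypothetical.

```python
import re
from datetime import datetime, timezone

T0_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$")
T0_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def utc_now_t0() -> str:
    """Format the current UTC time the way T0.txt expects it."""
    return datetime.now(timezone.utc).strftime(T0_FORMAT)


def render_config(template: str, t0: str) -> str:
    """Validate T0 and substitute it for every T0_PLACEHOLDER in the template."""
    t0 = t0.strip()
    if not T0_PATTERN.match(t0):
        raise ValueError(f"T0 must look like 2025-12-17T01:42:14Z, got {t0!r}")
    # strptime raises ValueError for values such as month 13 or hour 25.
    datetime.strptime(t0, T0_FORMAT)
    return template.replace("T0_PLACEHOLDER", t0)


line = '{"range": {"lastActiveDate": {"gte": "T0_PLACEHOLDER", "lte": "now"}}}'
print(render_config(line, "2025-12-17T01:42:14Z\n"))
```

Recording T0 with `utc_now_t0()` just before the full migration starts yields a value in the format the shell script checks for.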


5️⃣ Full migration config: es2es_full.conf

input {
  elasticsearch {
    hosts => ["https://1.1.1.1:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "bc_*"
    docinfo => true
    docinfo_target => "[@metadata]"
    size => 5000
    scroll => "5m"
    slices => 4
    ssl_verification_mode => "none"
  }
}

filter {
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}

output {
  elasticsearch {
    hosts => ["http://2.2.2.2:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
    ilm_enabled => false
    manage_template => false
    ssl_verification_mode => "none"
  }
}

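🔹 3. The two complete incremental configs (nothing omitted)

A. Initial incremental config (es2es_incremental_initial.conf)

Purpose: run once after the full migration completes, to cover data written during the migration window
T0: 2025-12-17T01:42:14Z
Note: schedule must stay commented out; run this config manually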

input {
  elasticsearch {
    hosts => ["https://1.1.1.1:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "bc_*"
    query => '{"query": {"range": {"lastActiveDate": {"gte": "2025-12-17T01:42:14Z", "lte": "now"}}}}'
    # schedule => "*/5 * * * *"
    docinfo => true
    docinfo_target => "[@metadata]"
    size => 5000
    scroll => "5m"
    slices => 1
    ssl_verification_mode => "none"
  }
}

filter {
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}

output {
  elasticsearch {
    hosts => ["http://2.2.2.2:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
    ilm_enabled => false
    manage_template => false
    ssl_verification_mode => "none"
  }
}

B. Routine incremental config (es2es_incremental_routine.conf)

Purpose: long-running background sync
Frequency: every 5 minutes

input {
  elasticsearch {
    hosts => ["https://1.1.1.1:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "bc_*"
    query => '{"query": {"range": {"lastActiveDate": {"gte": "now-5m", "lte": "now"}}}}'
    schedule => "*/5 * * * *"
    docinfo => true
    docinfo_target => "[@metadata]"
    size => 5000
    scroll => "5m"
    slices => 1
    ssl_verification_mode => "none"
  }
}

filter {
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}

output {
  elasticsearch {
    hosts => ["http://2.2.2.2:9200"]
    user => "elastic"
    password => "wanyanzhenjiang"
    index => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
    ilm_enabled => false
    manage_template => false
    ssl_verification_mode => "none"
  }
}
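
A note on the window in config B: a `now-5m` lookback on a 5-minute schedule covers the timeline without gaps only if every run starts exactly on time. The sketch below is an illustration, not a measurement of Logstash behavior. It assumes each run queries the window [run_time - lookback, run_time] and reports any stretch of time that no run covers. The run times are invented. Re-reading an overlapping window should be harmless with these configs, because the output writes each document under its original `_id`.

```python
from datetime import datetime, timedelta


def uncovered_gaps(run_times, lookback):
    """Return (gap_start, gap_end) pairs not covered by any window [run - lookback, run]."""
    gaps = []
    covered_until = None
    for run in sorted(run_times):
        start = run - lookback
        # A window that starts after the covered range ends leaves a hole.
        if covered_until is not None and start > covered_until:
            gaps.append((covered_until, start))
        covered_until = run if covered_until is None else max(covered_until, run)
    return gaps


# Invented run times: the second run starts 20 seconds late.
runs = [
    datetime(2025, 12, 17, 10, 0, 0),
    datetime(2025, 12, 17, 10, 5, 20),
    datetime(2025, 12, 17, 10, 10, 0),
]
print(len(uncovered_gaps(runs, timedelta(minutes=5))))   # 1
print(len(uncovered_gaps(runs, timedelta(minutes=10))))  # 0
```

Under these assumptions, a lookback somewhat longer than the schedule interval (for example `now-10m`) trades a little repeated work for tolerance of delayed runs.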

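🚀 4. Complete procedure (10 steps)

  1. Prepare T0

    echo "2025-12-17T01:42:14Z" > /tmp/T0.txt

  2. Clean the target

    ./clean_bc_indices.sh

  3. Create the indices

    python3 indiceCreate.py

  4. Verify the target is empty (the target document total should be 0)

    ./check_bc_indices.sh

  5. Full migration (foreground)

    cd /home/admin/packages/logstash
    bin/logstash -f config/es2es_full.conf

  6. Verify the full migration matches

    ./check_bc_indices.sh  # the diff step should print nothing

  7. Generate the initial incremental config

    ./generate_incremental_config.sh

  8. Run the initial incremental sync manually

    bin/logstash -f config/es2es_incremental_initial.conf

  9. Switch to the routine incremental config

    • Rename es2es_incremental_initial.conf to es2es_incremental_routine.conf
    • Replace the T0 query with the now-5m query (config B in section 3)
    • Uncomment schedule
  10. Start the routine incremental sync in the background

    nohup bin/logstash -f config/es2es_incremental_routine.conf > incremental.log 2>&1 &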
    

✅ 5. Final verification

  • Document counts match: the diff step of ./check_bc_indices.sh prints nothing
  • Content matches:
    curl -u elastic:wanyanzhenjiang 'http://es-cn-.../bc_user_v1/_search?q=id:337451'
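
Comparing search results by eye does not scale past a few documents. A minimal sketch of an automated spot check follows, assuming each input is one hit from a `_search` response on the source or target (a dict with `_id` and `_source`). Because the pipelines strip `@timestamp` and `@version`, the two `_source` bodies are expected to be identical. The function names and the `name` field are hypothetical.

```python
import hashlib
import json


def source_fingerprint(doc_source: dict) -> str:
    """Hash a document body in a way that does not depend on key order."""
    canonical = json.dumps(doc_source, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def same_document(source_hit: dict, target_hit: dict) -> bool:
    """True when both hits share the same _id and an identical _source."""
    return (
        source_hit["_id"] == target_hit["_id"]
        and source_fingerprint(source_hit["_source"]) == source_fingerprint(target_hit["_source"])
    )


# Hand-written hits (not real data); key order differs on purpose.
src_hit = {"_id": "337451", "_source": {"id": 337451, "name": "example"}}
dst_hit = {"_id": "337451", "_source": {"name": "example", "id": 337451}}
print(same_document(src_hit, dst_hit))  # True
```

Running this over a random sample of ids from each index gives more confidence than a single query.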
    

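🎯 This guide covers the steps, parameters, and caveats from the official Alibaba Cloud documentation and is adapted to an ES 8.17.4 environment, so it can be used directly for a production migration.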
