Archon Backup and Recovery: A Disaster Recovery Plan
[Free download link] Archon: Archon is an AI agent that is able to create other AI agents using an advanced agentic coding workflow and framework knowledge base to unlock a new frontier of automated agents.
Project URL: https://gitcode.com/GitHub_Trending/archon3/Archon
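🚨 Why do you need a disaster recovery plan?
In an AI-driven development environment, data loss can wipe out weeks of accumulated knowledge and project progress. As the command center for AI agents, Archon stores:
- Knowledge base data: crawled documents, uploaded PDFs, code examples
- Project information: PRD documents, feature specifications, task management
- Configuration settings: API keys, RAG strategy, crawler configuration
- Vector embeddings: 1536-dimensional vectors for semantic search
A single unexpected database failure, container crash, or configuration error can cause serious data loss. This article walks through a backup and recovery approach for Archon that covers the database, the containers, and the configuration.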
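📊 Archon data architecture overview

Key data tables
| Table | Data | Importance | Backup frequency |
|---|---|---|---|
| archon_settings | Configuration and API keys | 🔴 Critical | Real-time/daily |
| archon_crawled_pages | Document content and embeddings | 🔴 Critical | Daily |
| archon_code_examples | Code examples and summaries | 🔴 Critical | Daily |
| archon_projects | Project documents | 🟡 High | Weekly |
| archon_tasks | Task status | 🟡 High | Weekly |
| archon_sources | Knowledge source metadata | 🟢 Medium | Monthly |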
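The backup frequencies in the table can be encoded as data so that a scheduler decides which tables are due. The following is a minimal sketch, not part of Archon: `BACKUP_INTERVALS` and `tables_due` are hypothetical names, "real-time/daily" is treated as daily, and "monthly" is approximated as 30 days.

```python
from datetime import datetime, timedelta

# Intervals taken from the table above; "monthly" is approximated as 30 days
# and "real-time/daily" is treated as daily (both are assumptions of this sketch).
BACKUP_INTERVALS = {
    "archon_settings": timedelta(days=1),
    "archon_crawled_pages": timedelta(days=1),
    "archon_code_examples": timedelta(days=1),
    "archon_projects": timedelta(weeks=1),
    "archon_tasks": timedelta(weeks=1),
    "archon_sources": timedelta(days=30),
}


def tables_due(last_backup, now):
    """Return the tables whose last backup is at least one interval old.

    last_backup maps table name -> datetime of the last successful backup;
    a table with no entry is always considered due.
    """
    due = []
    for table, interval in BACKUP_INTERVALS.items():
        previous = last_backup.get(table)
        if previous is None or now - previous >= interval:
            due.append(table)
    return due
```

A cron job could call `tables_due` once per run and pass each returned name to `pg_dump -t`.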
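🛡️ Multi-layer backup strategy
1. Database-level backup
Full database export
# Full backup with pg_dump
# -F c selects the custom format, which is compressed and can be restored with pg_restore
pg_dump -h your-db-host.supabase.co -U postgres \
  -d postgres \
  -F c \
  -f archon_backup_$(date +%Y%m%d_%H%M%S).dump
# Back up only the Archon tables
# -t 'archon_*' matches every table whose name starts with archon_
pg_dump -h your-db-host.supabase.co -U postgres \
  -d postgres \
  -t 'archon_*' \
  -f archon_tables_$(date +%Y%m%d).sql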
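When the backup is driven from Python rather than a shell, the same two pg_dump invocations can be assembled as argument lists. This is an illustrative sketch: `build_pg_dump_command` and `timestamped_name` are hypothetical helpers, and they use only the pg_dump flags already shown above (`-h`, `-U`, `-d`, `-F c`, `-t`, `-f`).

```python
from datetime import datetime


def build_pg_dump_command(host, user, database, output_path, table_pattern=None):
    """Return the pg_dump argument list for a full or table-filtered backup.

    Without table_pattern the dump uses the compressed custom format (-F c);
    with it, only matching tables are written as plain SQL.
    """
    command = ["pg_dump", "-h", host, "-U", user, "-d", database]
    if table_pattern is None:
        command += ["-F", "c"]
    else:
        command += ["-t", table_pattern]
    command += ["-f", output_path]
    return command


def timestamped_name(prefix, suffix, now=None):
    """Build a file name such as archon_backup_20241201_120000.dump."""
    now = now or datetime.now()
    return f"{prefix}_{now:%Y%m%d_%H%M%S}{suffix}"


# Placeholder host and user, matching the shell example.
full_backup = build_pg_dump_command(
    "your-db-host.supabase.co", "postgres", "postgres",
    timestamped_name("archon_backup", ".dump"),
)
print(" ".join(full_backup))
```

Passing the list to `subprocess.run(full_backup, check=True)` runs the dump without a shell, so a pattern such as `archon_*` needs no extra quoting.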
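Automated backup script
#!/bin/bash
# archon_backup.sh
# Exit on errors, unset variables, and failed pipeline stages
set -euo pipefail
BACKUP_DIR="/opt/archon/backups"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
# Load DB_HOST, DB_USER, and DB_NAME from .env.backup
# (pg_dump reads the password from ~/.pgpass or an exported PGPASSWORD)
source /opt/archon/.env.backup
# Create the backup directory
mkdir -p "$BACKUP_DIR/$DATE"
# Full database backup
pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" \
  -F c -f "$BACKUP_DIR/$DATE/full_backup.dump"
# Archon tables only
pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" \
  -t 'archon_*' -f "$BACKUP_DIR/$DATE/archon_tables.sql"
# Separate backup of the critical settings table
pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" \
  -t 'archon_settings' -f "$BACKUP_DIR/$DATE/settings.sql"
# Compress the backup, then remove the uncompressed staging directory
tar -czf "$BACKUP_DIR/archon_backup_$DATE.tar.gz" -C "$BACKUP_DIR/$DATE" .
rm -rf "$BACKUP_DIR/$DATE"
# Delete archives older than the retention period
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +"$RETENTION_DAYS" -delete
echo "Backup completed: $BACKUP_DIR/archon_backup_$DATE.tar.gz"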
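The retention step (delete archives older than `RETENTION_DAYS`) can also be written in Python, which makes it easy to test before pointing it at real backups. This is a sketch under assumptions: `prune_old_backups` is a hypothetical helper, and it compares modification times in seconds, which approximates but does not exactly reproduce the whole-day rounding of `find -mtime`.

```python
import os
import time
from pathlib import Path


def prune_old_backups(backup_dir, retention_days, now=None):
    """Delete *.tar.gz files directly under backup_dir older than retention_days.

    Returns the sorted names of the files that were deleted. Unlike find,
    this does not descend into subdirectories.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    deleted = []
    for archive in Path(backup_dir).glob("*.tar.gz"):
        if archive.stat().st_mtime < cutoff:
            archive.unlink()
            deleted.append(archive.name)
    return sorted(deleted)
```

For example, `prune_old_backups("/opt/archon/backups", 30)` would apply the same 30-day policy as the shell script.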
2. Application-level backup
Docker container state backup
# Back up container configuration
docker inspect archon-server > server_config_$(date +%Y%m%d).json
docker inspect archon-mcp > mcp_config_$(date +%Y%m%d).json
docker inspect archon-ui > ui_config_$(date +%Y%m%d).json
# Back up environment variables (the output contains API keys in plain text; store it securely)
docker exec archon-server env > server_env_$(date +%Y%m%d).txt
docker exec archon-mcp env > mcp_env_$(date +%Y%m%d).txt
# Back up logs from the last 7 days (docker logs writes to both stdout and stderr)
docker logs --since 168h archon-server > server_logs_$(date +%Y%m%d).log 2>&1
docker logs --since 168h archon-mcp > mcp_logs_$(date +%Y%m%d).log 2>&1
Configuration file backup
# Back up key configuration files
cp .env .env.backup_$(date +%Y%m%d)
cp docker-compose.yml docker-compose.backup_$(date +%Y%m%d).yml
# Back up the Python source (if needed)
tar -czf src_backup_$(date +%Y%m%d).tar.gz python/src/
# Back up the frontend code
tar -czf ui_backup_$(date +%Y%m%d).tar.gz archon-ui-main/
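3. Cloud storage integration
Supabase automatic backups
-- Enable Supabase's automatic backups in the dashboard:
-- Settings -> Database -> Backups
-- Note: pg_dump is a client program, not an SQL function, so it cannot be
-- called from pg_cron, which only runs SQL inside the database.
-- For a custom schedule, run the backup script from system cron instead,
-- for example with this crontab entry (every day at 02:00):
-- 0 2 * * * /opt/archon/archon_backup.sh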
Cloud storage upload script
#!/bin/bash
# upload_to_cloud.sh
BACKUP_FILE="$1"
CLOUD_DIR="backups/archon/$(date +%Y/%m)"
# Upload to AWS S3
aws s3 cp "$BACKUP_FILE" "s3://your-bucket/$CLOUD_DIR/"
# Or use rclone, which supports many cloud providers
rclone copy "$BACKUP_FILE" "google-drive:ArchonBackups/$CLOUD_DIR/"
rclone copy "$BACKUP_FILE" "dropbox:Apps/Archon/Backups/$CLOUD_DIR/"
# Apply the retention policy: delete remote files older than 30 days
rclone delete google-drive:ArchonBackups/ --min-age 30d
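🔄 Recovery procedures in detail
Disaster recovery scenario matrix
| Scenario | Impact | Recovery time objective (RTO) | Recovery point objective (RPO) |
|---|---|---|---|
| Database crash | All data lost | < 1 hour | < 15 minutes |
| Container failure | Service outage | < 5 minutes | No data loss |
| Configuration error | Some features malfunction | < 30 minutes | < 1 minute |
| Storage corruption | Vector data lost | < 2 hours | < 1 hour |
Note that an RPO measured in minutes is only reachable with continuous WAL archiving or point-in-time recovery. With a daily pg_dump schedule, up to 24 hours of changes can be lost.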
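The RPO column can be checked against a backup schedule: with periodic backups, a failure just before the next run loses one full interval of changes. The sketch below makes that arithmetic explicit. `rpo_gaps` and `RPO_TARGETS` are hypothetical names, and container failure is left out because it does not touch stored data.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval):
    """With periodic backups, the worst-case loss equals the backup interval."""
    return backup_interval


def rpo_gaps(backup_interval, rpo_targets):
    """Return the scenarios whose RPO target is shorter than the worst-case loss."""
    loss = worst_case_data_loss(backup_interval)
    return [name for name, target in rpo_targets.items() if loss > target]


# RPO targets copied from the scenario matrix above.
RPO_TARGETS = {
    "database crash": timedelta(minutes=15),
    "configuration error": timedelta(minutes=1),
    "storage corruption": timedelta(hours=1),
}

# A daily dump misses every target in the matrix.
print(rpo_gaps(timedelta(days=1), RPO_TARGETS))
```

Running the same check with a shorter interval shows how frequent the backups (or WAL archive switches) would need to be for each scenario.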
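1. Database recovery procedure
Full database restore
# Stop all services
docker-compose down
# Restore the full backup
# --clean drops existing objects before recreating them
# --if-exists adds IF EXISTS to those DROP commands, avoiding errors for objects that do not exist
pg_restore -h your-db-host.supabase.co -U postgres \
  -d postgres \
  --clean \
  --if-exists \
  archon_backup_20241201_120000.dump
# Or restore from an SQL file
psql -h your-db-host.supabase.co -U postgres \
  -d postgres \
  -f archon_tables_20241201.sql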
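Selective table restore
-- Run these commands in psql. The .sql files are assumed to be data-only dumps
-- (pg_dump --data-only); a dump that also contains CREATE TABLE statements
-- reports errors because the tables already exist.
-- Restore only the settings table (emergency)
TRUNCATE TABLE archon_settings;
\i settings.sql
-- Restore knowledge base data
TRUNCATE TABLE archon_crawled_pages, archon_code_examples, archon_sources;
\i knowledge_base.sql
-- Restore project data
TRUNCATE TABLE archon_projects, archon_tasks, archon_project_sources;
\i projects.sql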
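2. Container recovery procedure
# Rebuild the containers (when configuration is lost)
docker-compose down
cp docker-compose.backup_20241201.yml docker-compose.yml
cp .env.backup_20241201 .env
docker-compose up -d --build
# Check service status (logs -f follows the log until interrupted with Ctrl+C)
docker-compose ps
docker-compose logs -f archon-server
# Verify the health checks (--max-time stops curl, because the SSE endpoint keeps the connection open)
curl http://localhost:8181/health
curl --max-time 5 http://localhost:8051/sse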
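Containers rarely answer the moment `docker-compose up` returns, so a single curl can report a false failure. One option is to poll until the endpoint responds or a deadline passes. The sketch below assumes the health URL shown above. `http_ok` and `wait_until_healthy` are hypothetical helpers built on the standard library, and proxies are bypassed because the checks target localhost.

```python
import time
import urllib.error
import urllib.request

# Ignore any configured HTTP proxy: the health checks go to localhost.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_ok(url, timeout=5.0):
    """Return True if url answers with HTTP 200 (only the headers are read)."""
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # refused, timed out, or non-2xx status


def wait_until_healthy(probe, timeout_seconds=60.0, interval_seconds=2.0):
    """Call probe() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)
```

A typical call would be `wait_until_healthy(lambda: http_ok("http://localhost:8181/health"))`. Keeping the probe separate from the retry loop lets the loop be tested without a running service.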
3. Data consistency validation
# restore_validation.py
import os

import psycopg2
import requests


def validate_restore():
    """Validate data consistency after a restore."""
    # Database connection; the password is read from the PGPASSWORD environment variable
    conn = psycopg2.connect(
        host="your-db-host.supabase.co",
        database="postgres",
        user="postgres",
        password=os.environ["PGPASSWORD"],
    )
    # Check row counts in the key tables (names come from this fixed list, not user input)
    tables_to_check = [
        'archon_settings', 'archon_crawled_pages',
        'archon_code_examples', 'archon_projects'
    ]
    with conn.cursor() as cur:
        for table in tables_to_check:
            cur.execute(f"SELECT COUNT(*) FROM {table}")
            count = cur.fetchone()[0]
            print(f"{table}: {count} records")
    conn.close()
    # Check service health
    services = {
        'server': 'http://localhost:8181/health',
        'mcp': 'http://localhost:8051/sse',
        'ui': 'http://localhost:3737'
    }
    for name, url in services.items():
        try:
            # stream=True returns once the headers arrive, so the SSE endpoint does not block
            with requests.get(url, timeout=10, stream=True) as response:
                print(f"{name}: {response.status_code}")
        except requests.RequestException as e:
            print(f"{name}: ERROR - {e}")


if __name__ == "__main__":
    validate_restore()
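Printing row counts shows that the tables are populated, but not that they match what was backed up. A stronger check compares the restored counts with counts recorded at backup time. The sketch below assumes such a record exists as a plain dictionary; `compare_row_counts` is a hypothetical helper and the numbers are made up for illustration.

```python
def compare_row_counts(expected, actual, tolerance=0):
    """Return a list of problems; an empty list means the counts match.

    expected and actual map table name -> row count. tolerance allows a small
    difference, for example when rows were written after the backup was taken.
    """
    problems = []
    for table, expected_count in expected.items():
        if table not in actual:
            problems.append(f"{table}: missing after restore")
        elif abs(actual[table] - expected_count) > tolerance:
            problems.append(
                f"{table}: expected {expected_count}, found {actual[table]}"
            )
    return problems


# Illustrative counts only.
recorded_at_backup = {"archon_settings": 42, "archon_projects": 7}
found_after_restore = {"archon_settings": 42, "archon_projects": 5}
for problem in compare_row_counts(recorded_at_backup, found_after_restore):
    print(problem)
```

The validation script could exit with a non-zero status when the returned list is not empty, so that automation can detect a failed restore.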
🚀 Automated disaster recovery system
Ansible-based recovery automation
# archon_recovery.yml
- name: Archon Disaster Recovery
  hosts: localhost
  vars:
    backup_date: "20241201"
    db_host: "your-db-host.supabase.co"
    db_user: "postgres"
  tasks:
    - name: Stop Archon services
      command: docker-compose down
    - name: Restore database
      command: >
        pg_restore -h {{ db_host }} -U {{ db_user }} -d postgres
        --clean --if-exists
        /backups/archon_backup_{{ backup_date }}.dump
    - name: Restore configuration
      copy:
        src: "/backups/.env.backup_{{ backup_date }}"
        dest: ".env"
        # The file holds API keys, so restrict it to the owner
        mode: '0600'
    - name: Restore docker-compose
      copy:
        src: "/backups/docker-compose.backup_{{ backup_date }}.yml"
        dest: "docker-compose.yml"
        mode: '0644'
    - name: Start services
      command: docker-compose up -d
    - name: Wait for services to be healthy
      wait_for:
        port: 8181
        host: localhost
        timeout: 300
    - name: Validate recovery
      command: python3 restore_validation.py
      register: validation_result
      # Continue to the notification even if validation fails
      ignore_errors: true
    - name: Send recovery notification
      # The uri module builds the JSON body, so quotes and newlines in stdout are escaped correctly
      uri:
        url: https://your-notification-service.com/alert
        method: POST
        body_format: json
        body:
          status: "{{ validation_result.rc }}"
          output: "{{ validation_result.stdout }}"
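Monitoring and alerting integration
#!/bin/bash
# health_monitor.sh
# Expects DB_HOST, DB_USER, and DB_NAME in the environment
# Placeholder alert function: replace the body with a call to your notification service
send_alert() {
  echo "ALERT [$1]: $2" >&2
}
# Check the database connection
DB_CHECK=$(psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1" 2>&1)
if [ $? -ne 0 ]; then
  echo "Database connection failed: $DB_CHECK"
  send_alert "DB_CONNECTION_FAILURE" "$DB_CHECK"
fi
# Check service health (--max-time stops curl on the long-lived SSE connection)
SERVICES=("http://localhost:8181/health" "http://localhost:8051/sse")
for service in "${SERVICES[@]}"; do
  response=$(curl -s -o /dev/null --max-time 5 -w "%{http_code}" "$service")
  if [ "$response" != "200" ]; then
    send_alert "SERVICE_DOWN" "$service returned $response"
  fi
done
# Check disk space
DISK_USAGE=$(df / | awk 'NR==2{print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 90 ]; then
  send_alert "DISK_SPACE_LOW" "Disk usage at ${DISK_USAGE}%"
fi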
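The disk space check can be expressed without parsing `df` output. The sketch below uses `shutil.disk_usage` from the Python standard library. `disk_usage_percent` and `disk_alert` are hypothetical helpers, the 90% threshold comes from the script above, and the percentage may differ slightly from the `df` figure because `df` accounts for reserved blocks differently.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return used space on the filesystem containing path, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def disk_alert(percent, threshold=90.0):
    """Return an alert message when usage exceeds the threshold, else None."""
    if percent > threshold:
        return f"DISK_SPACE_LOW: disk usage at {percent:.0f}%"
    return None


message = disk_alert(disk_usage_percent("/"))
if message:
    print(message)
```

Separating the measurement from the decision keeps the threshold logic testable with fixed numbers.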
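📋 Disaster recovery drill plan
Quarterly drill procedure

Drill checklist
| Phase | Check item | Status | Notes |
|---|---|---|---|
| Preparation | Backup file verification | ☐ | |
| | Recovery documentation updated | ☐ | |
| | Team notified | ☐ | |
| Execution | Services stopped | ☐ | |
| | Data restored | ☐ | |
| | Configuration restored | ☐ | |
| | Services started | ☐ | |
| Validation | Data consistency | ☐ | |
| | Functional testing | ☐ | |
| | Performance baseline | ☐ | |
| Review | RTO/RPO assessment | ☐ | |
| | Issues recorded | ☐ | |
| | Process improvements | ☐ | |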
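🔧 Advanced recovery techniques
1. Incremental backup strategy
# Use WAL (Write-Ahead Logging) archiving (self-hosted PostgreSQL only;
# managed services such as Supabase do not expose postgresql.conf)
# Configure in postgresql.conf:
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/wal_archive/%f && cp %p /var/lib/postgresql/wal_archive/%f'
# Point-in-time recovery (PostgreSQL 12+). pg_restore has no time-target option:
# restore a physical base backup taken with pg_basebackup, then set in postgresql.conf:
restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'
recovery_target_time = '2024-12-01 12:00:00'
# Finally, create an empty recovery.signal file in the data directory and start the server.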
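Point-in-time recovery replays archived WAL forward from a physical base backup, so the base backup must have been completed before the moment being recovered to. Choosing it is a small selection problem, sketched below. `pick_base_backup` is a hypothetical helper and the timestamps are illustrative completion times.

```python
from datetime import datetime


def pick_base_backup(base_backup_times, target_time):
    """Return the most recent base backup completed at or before target_time.

    WAL replay only moves forward, so a base backup taken after the target
    cannot be used to reach it.
    """
    candidates = [t for t in base_backup_times if t <= target_time]
    if not candidates:
        raise ValueError("no base backup precedes the recovery target")
    return max(candidates)


# Illustrative base backup completion times.
backups = [
    datetime(2024, 11, 24, 2, 0),
    datetime(2024, 12, 1, 2, 0),
    datetime(2024, 12, 8, 2, 0),
]
print(pick_base_backup(backups, datetime(2024, 12, 1, 12, 0)))
```

Using the newest eligible base backup keeps the amount of WAL to replay, and therefore the recovery time, as small as possible.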
2. Cross-region disaster recovery
# multi_region_recovery.yml
- name: Cross-Region Recovery
  hosts: dr-site
  vars:
    primary_region: "us-east-1"
    dr_region: "us-west-2"
  tasks:
    - name: Replicate database to DR region
      # shell is required here because the command module does not support pipes
      shell: >
        pg_dump -h primary-db.us-east-1.supabase.co -U postgres -d postgres |
        psql -h dr-db.us-west-2.supabase.co -U postgres -d postgres
    - name: Deploy services in DR region
      docker_compose:
        project_src: /opt/archon
        state: present
        env_file: .env.dr
    - name: Update DNS to DR region
      route53:
        state: present
        zone: "your-domain.com"
        record: "archon.your-domain.com"
        type: A
        value: "{{ dr_site_ip }}"
        ttl: 300
        overwrite: true
3. Encryption and security
# Encrypt the backup file
gpg --symmetric --cipher-algo AES256 \
  --output archon_backup_$(date +%Y%m%d).dump.gpg \
  archon_backup_$(date +%Y%m%d).dump
# Verify backup file integrity
sha256sum archon_backup_20241201.dump > backup.sha256
sha256sum -c backup.sha256
# Transfer securely (send the encrypted file, not the plaintext dump)
scp -i backup_key.pem \
  archon_backup_20241201.dump.gpg \
  backup-user@backup-server:/secure/backups/
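The integrity check can also run inside a Python restore tool, using `hashlib` to reproduce what `sha256sum -c` does. This is a sketch: `sha256_of_file` and `verify_checksum_file` are hypothetical helpers, and unlike `sha256sum -c` they resolve file names relative to the checksum file's directory rather than the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large dumps are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum_file(checksum_path):
    """Check every '<hex digest>  <file name>' line, as written by sha256sum.

    Returns a dict mapping file name -> True (match) or False (mismatch or missing).
    """
    checksum_path = Path(checksum_path)
    results = {}
    for line in checksum_path.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with a leading '*'
        target = checksum_path.parent / name
        results[name] = target.is_file() and sha256_of_file(target) == expected.lower()
    return results
```

Calling `verify_checksum_file("backup.sha256")` before a restore makes it possible to refuse a corrupted or truncated dump early.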
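🎯 Recovery success metrics
Key performance indicators (KPIs)
| Metric | Target | How it is measured |
|---|---|---|
| RTO (recovery time objective) | < 30 minutes | Time from failure to full recovery |
| RPO (recovery point objective) | < 5 minutes | Window of data loss |
| Recovery success rate | > 99.9% | Successful recoveries / total recovery attempts |
| Drill frequency | Quarterly | 4 full drills per year |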
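These indicators can be computed from the timestamps recorded during drills and real incidents. The following is a minimal sketch with hypothetical names (`RecoveryAttempt`, `summarize`). It takes the RTO as recovery time minus failure time and the RPO as failure time minus the newest data present after recovery, with the 30-minute and 5-minute targets from the table as defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryAttempt:
    failure_at: datetime    # when the failure occurred (or the drill started)
    recovered_at: datetime  # when the service was fully restored
    restored_to: datetime   # timestamp of the newest data present after recovery
    succeeded: bool


def summarize(attempts, rto_target=timedelta(minutes=30), rpo_target=timedelta(minutes=5)):
    """Compute the success rate and the worst achieved RTO and RPO.

    RTO and RPO are measured over successful attempts only.
    """
    successes = [a for a in attempts if a.succeeded]
    worst_rto = max((a.recovered_at - a.failure_at for a in successes), default=None)
    worst_rpo = max((a.failure_at - a.restored_to for a in successes), default=None)
    return {
        "success_rate": len(successes) / len(attempts) if attempts else 0.0,
        "worst_rto": worst_rto,
        "worst_rpo": worst_rpo,
        "rto_met": worst_rto is not None and worst_rto <= rto_target,
        "rpo_met": worst_rpo is not None and worst_rpo <= rpo_target,
    }
```

Reporting the worst case rather than the average is a deliberate choice here: a target is only met if every recovery stays within it.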
Continuous improvement loop

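📝 Summary and best practices
Core principles
- The 3-2-1 backup rule: 3 copies, on 2 different media, with 1 copy off-site
- Regular verification: check every month that backup files can actually be restored
- Automation first: reduce errors from manual intervention
- Complete documentation: record every step in detail
- Team training: make sure everyone can carry out a recovery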
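The 3-2-1 rule is easy to state and easy to drift away from, so it can help to check an inventory of backup copies automatically. The sketch below reads the last condition as at least one copy stored away from the primary site, which is the usual statement of the rule. `BackupCopy` and `satisfies_3_2_1` are hypothetical names and the inventory is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str  # where the copy lives, for example a host or bucket name
    medium: str    # storage type, for example "local-disk" or "object-storage"
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies):
    """Check for at least 3 copies, on at least 2 media, with at least 1 off-site."""
    return (
        len(copies) >= 3
        and len({copy.medium for copy in copies}) >= 2
        and any(copy.offsite for copy in copies)
    )


# Illustrative inventory.
inventory = [
    BackupCopy("/opt/archon/backups", "local-disk", False),
    BackupCopy("backup-server:/secure/backups", "local-disk", False),
    BackupCopy("s3://your-bucket/backups/archon", "object-storage", True),
]
print(satisfies_3_2_1(inventory))
```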
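Emergency contact list
| Role | Contact | Responsibility |
|---|---|---|
| Database administrator | DBA@company.com | Database recovery |
| DevOps engineer | DevOps@company.com | Container and configuration recovery |
| Team lead | Lead@company.com | Coordination and decisions |
| Cloud service support | Support@supabase.com | Cloud platform issues |
Final checklist
Remember: the best disaster recovery plan is one you hope never to use but must always have ready. With the steps in this article, your Archon deployment can recover from database, container, and configuration failures.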