Velero Installation and Deployment Guide: Building a Production Environment from Scratch

Project: velero (Backup and migrate Kubernetes applications and their persistent volumes). Project URL: https://gitcode.com/GitHub_Trending/ve/velero

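Overview

Velero (formerly Heptio Ark) is a backup and restore tool for Kubernetes that backs up and restores cluster resources and persistent volumes. Whether you run on a public cloud or on-premises, it can serve as the basis for disaster recovery and cluster migration.

This article covers:

  • The core architecture of Velero
  • Deployment practices for production environments
  • Configuration for multiple cloud providers
  • Performance tuning and security configuration
  • Troubleshooting

Velero Core Architecture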

[Figure: Velero core architecture diagram]

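Core Components

| Component | Description | Deployment type |
|---|---|---|
| Velero Server | Main controller; manages backup and restore operations | Deployment |
| Node Agent | Node-level data mover | DaemonSet |
| BackupStorageLocation | Backup storage location configuration | Custom Resource |
| VolumeSnapshotLocation | Volume snapshot location configuration | Custom Resource |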

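Environment Preparation and Requirements

System Requirements

# Check the Kubernetes client and server versions
# (the --short flag has been removed from recent kubectl releases)
kubectl version

# Verify cluster status
kubectl cluster-info
kubectl get nodes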

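Version Compatibility Matrix

| Velero version | Supported Kubernetes versions | Tested Kubernetes versions |
|---|---|---|
| 1.17.x | 1.18+ | 1.31.7, 1.32.3, 1.33.1 |
| 1.16.x | 1.18+ | 1.31.4, 1.32.3, 1.33.0 |
| 1.15.x | 1.18+ | 1.28.8, 1.29.8, 1.30.4, 1.31.1 |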
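
As a quick pre-flight check against the matrix above, the following sketch compares a Kubernetes version string with a minimum major.minor version. The helper name `k8s_version_at_least` is hypothetical, and the commented-out live-cluster lines assume `jq` is installed.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a Kubernetes version string (e.g. "v1.31.7") is at
# least the given major.minor. k8s_version_at_least is a hypothetical helper.
k8s_version_at_least() {
    local version="${1#v}"          # strip the leading "v"
    local min_major="$2" min_minor="$3"
    local major="${version%%.*}"    # text before the first dot
    local rest="${version#*.}"      # text after the first dot
    local minor="${rest%%.*}"       # text before the next dot
    minor="${minor%%[!0-9]*}"       # drop suffixes such as "+" on managed clusters
    if (( major > min_major )); then return 0; fi
    (( major == min_major && minor >= min_minor ))
}

# Example with a literal version string:
if k8s_version_at_least "v1.31.7" 1 18; then
    echo "Kubernetes version is within the supported range"
fi

# Against a live cluster (assumes jq is installed):
# server=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
# k8s_version_at_least "$server" 1 18 || echo "cluster too old" >&2
```

The check only covers the "supported" column; the "tested" column is still the safer guide when choosing a Velero release.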

Installing the Velero CLI

macOS (Homebrew)

brew install velero

Linux (binary installation)

# Download the latest release
VERSION=$(curl -s https://api.github.com/repos/vmware-tanzu/velero/releases/latest | grep tag_name | cut -d '"' -f 4)
wget https://github.com/vmware-tanzu/velero/releases/download/${VERSION}/velero-${VERSION}-linux-amd64.tar.gz

# Extract and install
tar -xvf velero-${VERSION}-linux-amd64.tar.gz
sudo mv velero-${VERSION}-linux-amd64/velero /usr/local/bin/
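
The download command above hardcodes `linux-amd64`. On ARM hosts the archive name differs, so a small mapping from `uname -m` output to the release architecture suffix can help. This is a sketch: `release_arch` is a hypothetical helper, it covers only the two most common architectures, and `<VERSION>` is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: map `uname -m` output to the architecture suffix used in
# Velero release archive names. release_arch is a hypothetical helper.
release_arch() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Print the archive name for the current machine (<VERSION> is a placeholder).
if arch=$(release_arch "$(uname -m)"); then
    echo "velero-<VERSION>-linux-${arch}.tar.gz"
fi
```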

Windows (Chocolatey)

choco install velero

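Production Deployment Guide

AWS Deployment Example

# Create the IAM policy file (attach the policy to the IAM user or role that Velero uses)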
cat > velero-policy.json << EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::your-bucket-name/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::your-bucket-name"
            ]
        }
    ]
}
EOF

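# Install Velero into the cluster.
# --bucket must name the bucket granted in the IAM policy above.
# Choose the plugin tag that matches your Velero version (see the plugin's compatibility table).
velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.7.0 \
    --bucket your-bucket-name \
    --backup-location-config region=us-west-2 \
    --snapshot-location-config region=us-west-2 \
    --secret-file ./credentials-velero \
    --use-node-agent \
    --default-volumes-to-fs-backup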
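
The install command above reads `./credentials-velero`, which the article does not show. A minimal sketch of creating that file in the AWS shared-credentials format follows; the two key values are placeholders to replace with the access key of the IAM user the policy is attached to.

```shell
#!/usr/bin/env bash
# Sketch: write the AWS credentials file that --secret-file points at.
# The key values are placeholders, not real credentials.
umask 077   # make the new file readable by the current user only
cat > credentials-velero << 'EOF'
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
EOF
```

Velero stores the file contents in a Kubernetes Secret at install time, so the local copy can be deleted afterwards.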

Azure Deployment Example

# Set environment variables
AZURE_BACKUP_RESOURCE_GROUP=VeleroBackups
AZURE_STORAGE_ACCOUNT_ID=velerobackups123
BLOB_CONTAINER=velero

# Install Velero
velero install \
    --provider azure \
    --plugins velero/velero-plugin-for-microsoft-azure:v1.7.0 \
    --bucket $BLOB_CONTAINER \
    --secret-file ./credentials-velero \
    --backup-location-config resourceGroup=$AZURE_BACKUP_RESOURCE_GROUP,storageAccount=$AZURE_STORAGE_ACCOUNT_ID \
    --snapshot-location-config apiTimeout=5m

GCP Deployment Example

# The bucket is passed with --bucket; the GCP plugin does not take a
# bucket key in --backup-location-config.
velero install \
    --provider gcp \
    --plugins velero/velero-plugin-for-gcp:v1.7.0 \
    --bucket your-backup-bucket \
    --secret-file ./gcp-service-account.json \
    --snapshot-location-config project=your-gcp-project

Custom Installation Options

Resource Requests and Limits

velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.7.0 \
    --bucket your-bucket \
    --secret-file ./credentials \
    --velero-pod-cpu-request 500m \
    --velero-pod-mem-request 512Mi \
    --velero-pod-cpu-limit 1000m \
    --velero-pod-mem-limit 1Gi \
    --node-agent-pod-cpu-request 250m \
    --node-agent-pod-mem-request 256Mi \
    --node-agent-pod-cpu-limit 500m \
    --node-agent-pod-mem-limit 512Mi

Priority Class Configuration

# priority-class.yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: velero-high-priority
value: 1000000
globalDefault: false
description: "High priority class for Velero components"

# Apply the PriorityClass, then reference it at install time
kubectl apply -f priority-class.yaml

velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.7.0 \
    --bucket your-bucket \
    --secret-file ./credentials \
    --server-priority-class-name velero-high-priority \
    --node-agent-priority-class-name velero-high-priority

Verifying the Installation

# Check the Velero deployment status
kubectl get deployments -n velero
kubectl get pods -n velero

# Verify the Velero configuration
velero backup-location get
velero snapshot-location get

# Test the backup function
velero backup create test-backup --include-namespaces default

# Check the backup status
velero backup describe test-backup
velero backup logs test-backup

Backup Storage Location Configuration

AWS S3 Storage Configuration

apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: aws-primary
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-velero-backups
    prefix: prod-cluster
  config:
    region: us-west-2
    s3ForcePathStyle: "false"
    s3Url: https://s3.us-west-2.amazonaws.com

Multiple Storage Locations

apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: aws-secondary
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-velero-backups-dr
    prefix: prod-cluster-dr
  config:
    region: us-east-1
  accessMode: ReadOnly

Volume Snapshot Location Configuration

AWS Volume Snapshot Configuration

apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: aws-us-west-2
  namespace: velero
spec:
  provider: aws
  config:
    region: us-west-2
    profile: velero-profile

Multi-Region Snapshot Configuration

apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: aws-us-east-1
  namespace: velero
spec:
  provider: aws
  config:
    region: us-east-1

Backup Policy Configuration

Scheduled Backups

# Create a daily backup schedule (02:00 every day, retained for 30 days)
velero schedule create daily-backup \
    --schedule="0 2 * * *" \
    --include-namespaces production \
    --ttl 720h

# Create a weekly full backup (03:00 every Sunday, retained for 90 days)
velero schedule create weekly-full-backup \
    --schedule="0 3 * * 0" \
    --include-namespaces '*' \
    --ttl 2160h
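
The `--ttl` values above are given in hours because the flag takes a Go-style duration, which has no day unit. A small sketch for converting a retention period in days into that form follows; `days_to_ttl` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: convert a retention period in days to the hour-based duration
# string that --ttl expects. days_to_ttl is a hypothetical helper.
days_to_ttl() {
    local days="$1"
    echo "$(( days * 24 ))h"
}

days_to_ttl 30   # prints 720h
days_to_ttl 90   # prints 2160h
```

It can be used inline, for example `--ttl "$(days_to_ttl 30)"`.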

Resource Filtering

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: selective-backup
  namespace: velero
spec:
  includedNamespaces:
  - production
  excludedNamespaces:
  - kube-system
  includedResources:
  - pods
  - services
  - deployments
  excludedResources:
  - events
  - endpoints
  ttl: 720h

Deploying a Test Application

apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.25
        name: nginx
        ports:
        - containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer

Backup and Restore Operations

Creating Backups

# Back up a specific namespace
velero backup create nginx-backup --include-namespaces nginx-example

# Back up the whole cluster except system namespaces
velero backup create full-cluster-backup --exclude-namespaces kube-system,kube-public

# Back up resources matching a label selector
velero backup create app-backup --selector app=nginx

Restore Operations

# List available backups
velero backup get

# Restore from a specific backup
velero restore create --from-backup nginx-backup

# Restore into a different namespace
velero restore create --from-backup nginx-backup --namespace-mappings nginx-example:nginx-restored

# Check the restore status
velero restore describe <RESTORE_NAME>

Monitoring and Logging

Prometheus Monitoring Configuration

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: velero-monitor
  namespace: velero
  labels:
    app: velero
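spec:
  # The selector and port name must match a Service that exposes Velero's
  # metrics port (8085 by default). `velero install` does not create such a
  # Service, so create one or adjust these values to your setup.
  selector:
    matchLabels:
      app: velero
  endpoints:
  - port: monitoring
    interval: 30s
    path: /metrics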

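Key Metrics

| Metric | Description | Alert threshold |
|---|---|---|
| velero_backup_attempt_total | Backup attempts | >5 failures/hour |
| velero_restore_attempt_total | Restore attempts | >3 failures/hour |
| velero_volume_snapshot_attempt_total | Volume snapshot attempts | >10 failures/hour |
| velero_backup_duration_seconds | Backup duration | >30 minutes |

The attempt counters count every run, so the failure thresholds are best evaluated against the corresponding failure counters, for example velero_backup_failure_total.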
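
To turn a failure threshold from the table into an alert, one option is a PrometheusRule for the Prometheus Operator. The sketch below writes such a rule to a file. The rule name, group name, severity label, and `for` duration are assumptions chosen for illustration; the expression uses the `velero_backup_failure_total` counter rather than the attempt counter.

```shell
#!/usr/bin/env bash
# Sketch: write a PrometheusRule that fires when more than 5 backups
# fail within an hour. Names and labels are illustrative assumptions.
cat > velero-alerts.yaml << 'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: velero-alerts
  namespace: velero
spec:
  groups:
  - name: velero.rules
    rules:
    - alert: VeleroBackupFailures
      expr: increase(velero_backup_failure_total[1h]) > 5
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "More than 5 Velero backup failures in the last hour"
EOF
# Apply with: kubectl apply -f velero-alerts.yaml
```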

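Troubleshooting Guide

Common Checks

# Check the Velero pod logs
kubectl logs -f deployment/velero -n velero

# Check the node agent logs
kubectl logs -f daemonset/node-agent -n velero

# List installed plugins and confirm the storage location is Available
velero plugin get
velero backup-location get

# Check that the Velero CRDs are installed
kubectl get crd | grep velero

# Verify network connectivity from inside the cluster.
# The Velero image ships without curl, so use a temporary pod instead of kubectl exec.
kubectl run velero-net-test -n velero --rm -it --restart=Never \
    --image=curlimages/curl --command -- curl -sSI https://s3.amazonaws.com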

Backup Failure Handling Flow

[Figure: backup failure handling flowchart]
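
One way to express a failure-handling flow in shell is to branch on the backup's phase and pick the next diagnostic command. The sketch below is illustrative only: `suggest_next_step` and its phase-to-command mapping are assumptions rather than part of Velero, and the commented-out line for reading the phase assumes `jq` is installed.

```shell
#!/usr/bin/env bash
# Sketch: map a backup phase to a suggested next troubleshooting command.
# suggest_next_step and the mapping are illustrative, not part of Velero.
suggest_next_step() {
    local name="$1" phase="$2"
    case "$phase" in
        Completed)
            echo "no action needed" ;;
        FailedValidation)
            echo "velero backup describe $name --details" ;;
        PartiallyFailed|Failed)
            echo "velero backup logs $name" ;;
        *)
            echo "kubectl logs deployment/velero -n velero" ;;
    esac
}

suggest_next_step nginx-backup PartiallyFailed

# On a live cluster the phase can be read with (assumes jq is installed):
# phase=$(velero backup get nginx-backup -o json | jq -r '.status.phase')
```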

Performance Tuning

Resource Tuning

# values.yaml (Helm installation)
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi

nodeAgent:
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi

configuration:
  backupStorageLocation:
    - name: default
      provider: aws
      bucket: my-velero-backups
      config:
        region: us-west-2
        s3Url: https://s3.us-west-2.amazonaws.com

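Concurrency Configuration

# velero install has no --node-agent-concurrency or --restic-parallelism flag.
# Node agent concurrency is set through a ConfigMap (selectable by name in Velero 1.15+).
echo '{"loadConcurrency": {"globalConfig": 4}}' > node-agent-config.json
kubectl create namespace velero
kubectl create configmap node-agent-config -n velero --from-file=node-agent-config.json

velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.7.0 \
    --bucket your-bucket \
    --secret-file ./credentials \
    --use-node-agent \
    --node-agent-configmap node-agent-config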

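Security Best Practices

RBAC Configuration

# Illustrative restricted role. By default `velero install` binds the velero
# ServiceAccount to cluster-admin, and restores need write access to the
# resources being restored, so extend these rules before relying on them.
apiVersion: rbac.authorization.k8s.io/v1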
kind: ClusterRole
metadata:
  name: velero-server
rules:
- apiGroups: [""]
  resources: ["namespaces", "pods", "secrets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["velero.io"]
  resources: ["*"]
  verbs: ["*"]

Network Policy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: velero-egress
  namespace: velero
spec:
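  # Must match the labels on your Velero pods
  # (check with: kubectl get pods -n velero --show-labels)
  podSelector:
    matchLabels:
      app: velero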
  policyTypes:
  - Egress
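  egress:
  # Object storage and the Kubernetes API (adjust if your API server uses another port)
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
    ports:
    - protocol: TCP
      port: 443
    - protocol: TCP
      port: 80
  # DNS lookups, which an egress policy otherwise blocks
  - ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53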

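Upgrade and Maintenance

Version Upgrade Procedure

# Back up the current configuration
velero backup-location get -o yaml > backup-locations.yaml
velero snapshot-location get -o yaml > snapshot-locations.yaml

# Download the new CLI version (set VERSION to the target release), then extract and install it as shown earlier
VERSION=v1.17.0
wget https://github.com/vmware-tanzu/velero/releases/download/${VERSION}/velero-${VERSION}-linux-amd64.tar.gz

# Update the CRDs with the new CLI, then update the server and node agent images.
# Also update the plugin image to the tag that matches the new Velero version.
velero install --crds-only --dry-run -o yaml | kubectl apply -f -
kubectl set image deployment/velero velero=velero/velero:${VERSION} -n velero
kubectl set image daemonset/node-agent node-agent=velero/velero:${VERSION} -n velero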

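Routine Maintenance

# Expired backups are removed automatically once their TTL elapses.
# velero backup delete has no --older-than flag, so delete by name when needed.
velero backup get
velero backup delete <BACKUP_NAME> --confirm

# Check that the storage locations are available
velero backup-location get
velero snapshot-location get

# Check resource usage
kubectl top pods -n velero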

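Summary

This guide has walked through a full Velero deployment for production. Key points to remember:

  1. Plan first: design the backup policy and storage architecture around business requirements
  2. Security first: apply the principle of least privilege to access control
  3. Monitor: set up monitoring and alerting
  4. Test regularly: verify that backups are usable and rehearse the restore procedure
  5. Document: maintain runbooks and emergency plans

Velero is a mature backup solution in the Kubernetes ecosystem and can provide dependable disaster recovery for production environments. Follow the official releases and apply security patches and performance improvements promptly.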

Disclosure: parts of this article were generated with AI assistance (AIGC) and are for reference only.
