Proportional Scale-Out Across Availability Zones with HPA and TopologySpreadConstraints

1. How it works

  • Set the HPA's minimum scale-up step (Pods added per scaling event) to the number of availability zones, so that Pods in every zone can scale out at the same time
  • Set a TopologySpreadConstraints rule on the availability-zone key with maxSkew: 1, so that Pods stay as evenly distributed across zones as possible (sketched below)
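A minimal sketch of the two pieces of configuration this combines (the complete manifests used in the experiment follow in section 2; the cluster here has 2 availability zones):

# On the workload: keep the zones within 1 Pod of each other
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway
  # (labelSelector omitted here; the full manifest in section 2.3 includes it)

# On the HPA: never scale up by fewer Pods than there are zones
behavior:
  scaleUp:
    policies:
    - type: Pods
      value: 2            # = number of availability zones
      periodSeconds: 15
    selectPolicy: Max     # take the larger step allowed by any policy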

2. Experiment

2.1. Prepare a Kind cluster

Prepare the following configuration file and save it as kind-cluster.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.24.0@sha256:0866296e693efe1fed79d5e6c7af8df71fc73ae45e3679af05342239cdc5bc8e
- role: worker
  image: kindest/node:v1.24.0@sha256:0866296e693efe1fed79d5e6c7af8df71fc73ae45e3679af05342239cdc5bc8e
  labels:
    topology.kubernetes.io/zone: "us-east-1a"
- role: worker
  image: kindest/node:v1.24.0@sha256:0866296e693efe1fed79d5e6c7af8df71fc73ae45e3679af05342239cdc5bc8e
  labels:
    topology.kubernetes.io/zone: "us-east-1c"

This configuration defines two worker nodes for the cluster and labels each with a different availability zone.
Run the following command to create the Kubernetes cluster:

$ kind create cluster --config kind-cluster.yaml
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.24.0) 🖼 
 ✓ Preparing nodes 📦 📦 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
 ✓ Joining worker nodes 🚜 
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂

Check that the cluster is running and the zone labels are in place:

$ kubectl get node --show-labels
NAME                 STATUS   ROLES           AGE    VERSION   LABELS
kind-control-plane   Ready    control-plane   161m   v1.24.0   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=kind-control-plane,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node.kubernetes.io/exclude-from-external-load-balancers=
kind-worker          Ready    <none>          160m   v1.24.0   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=kind-worker,kubernetes.io/os=linux,topology.kubernetes.io/zone=us-east-1a
kind-worker2         Ready    <none>          160m   v1.24.0   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=kind-worker2,kubernetes.io/os=linux,topology.kubernetes.io/zone=us-east-1c
2.2. Install the metrics-server component

HPA relies on metrics-server for resource metrics. Install it with:

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Tip: if your network cannot pull the registry.k8s.io/metrics-server/metrics-server:v0.6.4 image directly, you can substitute the equivalent mirror shidaqiu/metrics-server:v0.6.4. You also need to disable TLS verification for the kubelet connection, as in the sketch below.
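Both changes can be made directly in the downloaded components.yaml before applying it. A minimal sketch of the metrics-server container spec after the two changes (the mirror image name is the one from the tip above; --kubelet-insecure-tls is the standard metrics-server flag for skipping kubelet certificate verification and should only be used in test clusters):

      containers:
      - name: metrics-server
        # mirror image for networks that cannot pull from registry.k8s.io
        image: shidaqiu/metrics-server:v0.6.4
        args:
        # ... keep the original args from components.yaml, and append:
        - --kubelet-insecure-tls   # skip kubelet TLS verification (acceptable for a local Kind cluster)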

Check that the deployed metrics-server is working:

$ kubectl top node
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
kind-control-plane   238m         5%     667Mi           8%        
kind-worker          76m          1%     207Mi           2%        
kind-worker2         41m          1%     110Mi           1% 
2.3. Deploy the test workload

Prepare the following YAML and save it as hpa-php-demo.yaml:

Note: the Deployment's topologySpreadConstraints is configured to spread Pods across availability zones.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-web-demo
spec:
  selector:
    matchLabels:
      run: php-web-demo
  replicas: 1
  template:
    metadata:
      labels:
        run: php-web-demo
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                          # zones may differ by at most 1 Pod
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway   # soft constraint: prefer spreading, but never block scheduling
        labelSelector:
          matchLabels:
            run: php-web-demo
      containers:
      - name: php-web-demo
        image: shidaqiu/hpademo:latest
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-web-demo
  labels:
    run: php-web-demo
spec:
  ports:
  - port: 80
  selector:
    run: php-web-demo

Deploy the service:

kubectl apply -f hpa-php-demo.yaml
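Optionally, wait for the rollout to complete before moving on:

$ kubectl rollout status deployment/php-web-demo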
2.4. Deploy the HPA

Prepare the HPA configuration file and save it as hpa-demo.yaml:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-web-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-web-demo
  minReplicas: 2    # baseline of one Pod per availability zone
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes of stable metrics before scaling down
      policies:
      - type: Percent
        value: 50                       # allow removing up to 50% of current replicas per period
        periodSeconds: 15
      - type: Pods
        value: 2                        # or up to 2 Pods per period
        periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
      policies:
      - type: Percent
        value: 100                      # allow adding up to 100% of current replicas per period
        periodSeconds: 15
      - type: Pods
        value: 2                        # allow adding up to 2 Pods (one per zone) per period
        periodSeconds: 15
      selectPolicy: Max                 # take the larger step allowed by the two policies

Deploy the HPA:

$ kubectl apply -f hpa-demo.yaml

The behavior section above defines how the HPA scales: each scale-up adds 100% of the current replicas or 2 Pods, whichever is larger; each scale-down removes 50% of the current replicas or 2 Pods, whichever is larger.
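For example, with the policies above and selectPolicy: Max, the effective scale-up step works out as follows (illustrative numbers):

current replicas = 1: Percent(100%) allows +1, Pods allows +2  -> step = 2
current replicas = 2: Percent(100%) allows +2, Pods allows +2  -> step = 2 (one Pod per zone)
current replicas = 5: Percent(100%) allows +5, Pods allows +2  -> step = 5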

2.5. Verify scale-out

Before scaling, observe that the Pods are running in the two zones, one in each:

$ kubectl get pod -owide
NAME                           READY   STATUS    RESTARTS   AGE     IP           NODE           NOMINATED NODE   READINESS GATES
php-web-demo-d6d66c8d5-22tn6   1/1     Running   0          6m57s   10.244.2.3   kind-worker2   <none>           <none>
php-web-demo-d6d66c8d5-tz8m9   1/1     Running   0          76s     10.244.1.3   kind-worker    <none>           <none>

Generate load against the service:

$ kubectl run -it --rm load-generator --image=busybox -- /bin/sh
Once inside the container, run the following loop:
while true; do wget -q -O- http://php-web-demo; done

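While the load is running, you can watch the HPA and the Pod placement from another terminal (Pod names and node assignments will differ in your cluster):

$ kubectl get hpa php-web-demo -w
$ kubectl get pod -owide -w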
You can observe that when the Pods scale out, they do so in both availability zones at the same time, achieving synchronized scale-out across zones.
Stop the load, and you can observe that as the Pods scale back down they remain spread across the zones.

3. How do we keep Pods evenly spread across zones after scale-down?

Consider using descheduler's rebalancing capability; see https://github.com/kubernetes-sigs/descheduler?tab=readme-ov-file#removepodsviolatingtopologyspreadconstraint

Deploy descheduler:

$ git clone https://github.com/kubernetes-sigs/descheduler.git
$ cd descheduler/charts/descheduler
$ helm upgrade --install descheduler .

Before deploying, you can edit the values.yaml file to turn off the plugins you do not need:

kind: Deployment # run descheduler as a Deployment
...
replicas: 2 # two replicas for high availability
...
leaderElection: 
  enabled: true  # enable leader election (needed when running multiple replicas)
  leaseDuration: 15s
  renewDeadline: 10s
  retryPeriod: 2s
  resourceLock: "leases"
  resourceName: "descheduler"
  resourceNamescape: "kube-system"

...
deschedulerPolicy:
  strategies:
    RemoveDuplicates:
      enabled: false
    RemovePodsHavingTooManyRestarts:
      enabled: false
      params:
        podsHavingTooManyRestarts:
          podRestartThreshold: 100
          includingInitContainers: true
    RemovePodsViolatingNodeTaints:
      enabled: false
    RemovePodsViolatingNodeAffinity:
      enabled: false
      params:
        nodeAffinityType:
        - requiredDuringSchedulingIgnoredDuringExecution
    RemovePodsViolatingInterPodAntiAffinity:
      enabled: false
    RemovePodsViolatingTopologySpreadConstraint:
      enabled: true  # the only strategy enabled: rebalance Pods that violate topology spread constraints
      params:
        includeSoftConstraints: true # also act on soft (ScheduleAnyway) constraints
    LowNodeUtilization:
      enabled: false
      params:
        nodeResourceUtilizationThresholds:
          thresholds:
            cpu: 20
            memory: 20
            pods: 20
          targetThresholds:
            cpu: 50
            memory: 50
            pods: 50

...

Scale the workload up and down repeatedly, and observe that descheduler automatically rebalances Pods that have become unevenly distributed.
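To confirm that descheduler is doing the rebalancing, you can tail its logs and look for eviction messages (the Deployment name below assumes the Helm release name descheduler used above; adjust it if your release name differs):

$ kubectl logs deployment/descheduler -f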

Tip: since the smallest allowed maxSkew is 1, a skew of 1 Pod between zones is still considered balanced.
