Horizontal Pod Autoscaler (horizontal autoscaling of Pods)

h7ml

Article link: https://blog.youkuaiyun.com/ling_76539446/article/details/104236851

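The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization (or, where custom metrics are supported, on other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that cannot be scaled, such as DaemonSets.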

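The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment so that the observed average CPU utilization matches the target specified by the user.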

1. How the Horizontal Pod Autoscaler works

The Horizontal Pod Autoscaler is implemented as a control loop whose period is set by the --horizontal-pod-autoscaler-sync-period flag (15 seconds by default).

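In each period, the controller manager queries resource utilization against the metrics specified in each HorizontalPodAutoscaler definition. It obtains the metrics from the resource metrics API (for per-Pod resource metrics) or from the custom metrics API (for all other metrics).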

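For per-Pod resource metrics (such as CPU), the controller fetches the metrics from the resource metrics API. If a target utilization value is set, the controller computes the utilization as a percentage of the equivalent resource request on the containers in each Pod. If a target raw value is set, the raw metric values are used directly. The controller then takes the mean of the utilization or the raw value (depending on the type of target specified) across all targeted Pods and produces a ratio used to scale the desired number of replicas.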

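If some of a Pod's containers have no resource request set for the relevant resource, CPU utilization for that Pod is undefined and the autoscaler takes no action for that metric.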
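
The utilization step described above can be sketched roughly as follows. This is an illustrative simplification rather than the controller's actual code: the function name average_utilization and the (usage, request) tuple layout are assumptions made for this example, and the real controller may aggregate usage and requests differently.

```python
from typing import Optional, Sequence, Tuple

# Each Pod is modelled as (usage_millicores, request_millicores).
# A request of None stands for "some container has no CPU request set".
PodCpu = Tuple[int, Optional[int]]


def average_utilization(pods: Sequence[PodCpu]) -> Optional[float]:
    """Mean CPU utilization in percent of the request, or None if undefined."""
    if not pods:
        return None
    per_pod = []
    for usage, request in pods:
        if request is None or request <= 0:
            # Utilization is undefined without a request, so the sketch
            # reports "no value" and the caller takes no action.
            return None
        per_pod.append(100.0 * usage / request)
    return sum(per_pod) / len(per_pod)


if __name__ == "__main__":
    print(average_utilization([(100, 200), (300, 200)]))
    print(average_utilization([(100, 200), (50, None)]))
```

Returning None instead of raising keeps the "take no action for this metric" case explicit for the caller.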

2. Algorithm details

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

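In words: divide the current metric value by the desired metric value, multiply by the current number of replicas, and round the result up. That final value is the desired number of replicas.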

For example, if the current metric value is 200m and the desired value is 100m, the number of replicas is doubled, since 200.0 / 100.0 == 2.0

If the current value is 50m instead, then 50.0 / 100.0 == 0.5, so the number of replicas is halved

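If the ratio is sufficiently close to 1.0 (within a globally configurable tolerance, 0.1 by default), scaling is skipped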
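
As a minimal sketch of the formula and the skip rule, assuming a tolerance of 0.1 (the function name desired_replicas and its parameter names are chosen for this example only):

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     desired_metric: float,
                     tolerance: float = 0.1) -> int:
    """desiredReplicas = ceil[currentReplicas * (current / desired)]."""
    ratio = current_metric / desired_metric
    # Assumption: a ratio within `tolerance` of 1.0 means "close enough",
    # so the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    print(desired_replicas(2, 200, 100))  # ratio 2.0: doubled
    print(desired_replicas(4, 50, 100))   # ratio 0.5: halved
    print(desired_replicas(3, 105, 100))  # ratio 1.05: unchanged
```

Rounding up means a fractional result such as 4.5 becomes 5 replicas rather than 4.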

When targetAverageValue or targetAverageUtilization is specified, currentMetricValue is computed by averaging the given metric across all Pods in the HorizontalPodAutoscaler's scale target.

All failed Pods and all Pods marked for deletion are discarded, i.e. they do not take part in the metric calculation

When scaling on CPU utilization, any Pod that is not yet ready (i.e. still initializing) is set aside for later instead of being counted right away.
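
The Pod selection rules above can be sketched as a filter applied before averaging. The PodSample type and its fields are hypothetical, invented for this illustration; the sketch only covers which Pods enter the average, not how set-aside Pods are handled afterwards.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PodSample:
    name: str
    phase: str           # e.g. "Running" or "Failed"
    deleting: bool       # True if the Pod is marked for deletion
    ready: bool          # False while the Pod is still initializing
    cpu_millicores: int  # most recent CPU usage sample


def average_cpu(pods: List[PodSample]) -> Optional[float]:
    """Average CPU usage over the Pods that are counted."""
    counted = [
        p for p in pods
        if p.phase != "Failed"   # failed Pods are discarded
        and not p.deleting       # Pods marked for deletion are discarded
        and p.ready              # unready Pods are set aside
    ]
    if not counted:
        return None
    return sum(p.cpu_millicores for p in counted) / len(counted)


if __name__ == "__main__":
    pods = [
        PodSample("a", "Running", False, True, 200),
        PodSample("b", "Failed", False, False, 900),
        PodSample("c", "Running", True, True, 500),
        PodSample("d", "Running", False, False, 0),
        PodSample("e", "Running", False, True, 100),
    ]
    print(average_cpu(pods))
```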

kubectl also supports the Horizontal Pod Autoscaler

# List autoscalers
kubectl get hpa
# Show a detailed description
kubectl describe hpa
# Delete an autoscaler (replace <hpa-name> with the name of the autoscaler)
kubectl delete hpa <hpa-name>

# Example: the following command creates an autoscaler for the replica set foo, with a target CPU utilization of 80% and between 2 and 5 replicas
kubectl autoscale rs foo --min=2 --max=5 --cpu-percent=80

3. Demo

Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization.

Create a Dockerfile and build the image

FROM java:8
COPY ./hello-world-0.0.1-SNAPSHOT.jar hello-world.jar 
CMD java -jar hello-world.jar

hello-world.jar performs some CPU-intensive computation

Run the image and expose it as a service

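# `kubectl run --generator=run-pod/v1` creates a bare Pod, which the
# `kubectl autoscale deployment` step of this demo cannot target,
# so a Deployment is created here instead.
kubectl create deployment hello-world-example \
     --image=registry.cn-hangzhou.aliyuncs.com/chengjs/hello-world:2.0 \
     --port=80
# Set the CPU request (needed to compute utilization) and the CPU limit
kubectl set resources deployment hello-world-example \
     --requests=cpu=200m \
     --limits=cpu=500m
# Expose the Deployment as a Service on port 80
kubectl expose deployment hello-world-example --port=80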

Create the Horizontal Pod Autoscaler

The HPA increases and decreases the number of replicas to keep the average CPU utilization across all Pods at 50%

kubectl autoscale deployment hello-world-example --cpu-percent=50 --min=1 --max=10
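
To make the numbers in this demo concrete, here is a rough sketch of the arithmetic, assuming the 200m CPU request, the 50% target, and the 1 to 10 replica bounds used in the commands of this demo. The function name demo_desired_replicas is made up for this example, and the tolerance check is omitted for brevity.

```python
import math


def demo_desired_replicas(current_replicas: int,
                          avg_usage_millicores: float,
                          request_millicores: int = 200,
                          target_percent: int = 50,
                          min_replicas: int = 1,
                          max_replicas: int = 10) -> int:
    """Replica count for the demo settings, clamped to [min, max]."""
    # Utilization is usage expressed as a percentage of the CPU request.
    utilization = 100.0 * avg_usage_millicores / request_millicores
    raw = math.ceil(current_replicas * utilization / target_percent)
    # --min and --max bound whatever the formula proposes.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # One Pod pinned at its 500m limit: 250% of the 200m request.
    print(demo_desired_replicas(1, 500))
    # Four Pods at the limit would ask for 20 replicas, capped at 10.
    print(demo_desired_replicas(4, 500))
    # Three nearly idle Pods scale back down to the minimum.
    print(demo_desired_replicas(3, 20))
```

With a 200m request, the 50% target corresponds to an average of 100m of CPU per Pod.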

Check the current status of the autoscaler

kubectl get hpa

Increase the load

Next, use a load-testing tool to send requests continuously and increase the load, then check again

kubectl get deployment hello-world-example

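With the autoscaling/v2beta2 API version you can define additional metrics (the stable autoscaling/v2 API, available since Kubernetes 1.23, uses the same fields; autoscaling/v2beta2 is no longer served as of Kubernetes 1.26)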

First, get the YAML of the HorizontalPodAutoscaler in autoscaling/v2beta2 form

kubectl get hpa.v2beta2.autoscaling -o yaml > /tmp/hpa-v2.yaml

Open /tmp/hpa-v2.yaml in an editor and modify it as described next

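The first alternative metric type is Pod metrics. These metrics are averaged across Pods and compared with a target value to determine the replica count. For example: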

type: Pods
pods:
  metric:
    name: packets-per-second
  target:
    type: AverageValue
    averageValue: 1k
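
Values such as 1k here and 200m earlier use Kubernetes quantity notation, where the suffix m means thousandths and k means thousands. A small sketch of that conversion, handling only these two suffixes (the helper parse_quantity is written for this example and is not part of any Kubernetes client library):

```python
from decimal import Decimal

# Only the two suffixes that appear in this article are handled.
_SUFFIXES = {
    "m": Decimal("0.001"),  # milli: 200m == 0.2
    "k": Decimal(1000),     # kilo:  1k == 1000
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '200m' or '1k' to a Decimal."""
    text = text.strip()
    if text and text[-1] in _SUFFIXES:
        return Decimal(text[:-1]) * _SUFFIXES[text[-1]]
    return Decimal(text)


if __name__ == "__main__":
    for raw in ("1k", "200m", "10k", "50"):
        print(raw, "=", parse_quantity(raw))
```

Decimal is used instead of float so that 200m compares exactly equal to 0.2.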

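The second alternative metric type is object metrics. As the name suggests, these describe an object (here, an Ingress) rather than Pods. For example: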

type: Object
object:
  metric:
    name: requests-per-second
  describedObject:
    apiVersion: networking.k8s.io/v1beta1
    kind: Ingress
    name: main-route
  target:
    type: Value
    value: 2k

The complete /tmp/hpa-v2.yaml after modification looks like this:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world-example
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-world-example
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k
status:
  observedGeneration: 1
  lastScaleTime: <some-time>
  currentReplicas: 1
  desiredReplicas: 1
  currentMetrics:
  - type: Resource
    resource:
      name: cpu
      current:
        averageUtilization: 0
        averageValue: 0
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      current:
        value: 10k
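
When several metrics are listed, as in the manifest above, the Kubernetes documentation describes the autoscaler as computing a proposed replica count for each metric and taking the largest. A rough sketch under that assumption follows; the function names and the sample metric readings are hypothetical, chosen only to illustrate the three metric types.

```python
import math
from typing import Dict, Tuple


def proposal(current_replicas: int, current: float, target: float) -> int:
    """Replica count proposed by a single metric."""
    return math.ceil(current_replicas * current / target)


def combine(current_replicas: int,
            metrics: Dict[str, Tuple[float, float]],
            min_replicas: int,
            max_replicas: int) -> int:
    """Largest per-metric proposal, clamped to [min_replicas, max_replicas]."""
    proposals = [
        proposal(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # Hypothetical readings as (current, target) pairs.
    readings = {
        "cpu utilization (%)": (80, 50),
        "packets-per-second": (1500, 1000),
        "requests-per-second": (10000, 10000),
    }
    print(combine(2, readings, min_replicas=1, max_replicas=10))
```

Taking the maximum means the most demanding metric decides the replica count, so no single metric is starved by the others.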