References
- https://docs.amazonaws.cn/eks/latest/userguide/autoscaling.html
- https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
Deployment process
Prerequisites
- Create an EKS cluster
- Associate an IAM OIDC provider with the cluster
- Add the auto-discovery tags to the ASG:
  k8s.io/cluster-autoscaler/<cluster-name>: owned
  k8s.io/cluster-autoscaler/enabled: true
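The auto-discovery tags above can be added with the AWS CLI; a sketch, assuming the node group's ASG is named my-asg and the cluster is testca (substitute your own names):

```shell
# Tag the ASG so cluster-autoscaler auto-discovery can find it.
# ResourceId and the cluster name in the tag key are placeholders.
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/testca,Value=owned,PropagateAtLaunch=true" \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true"
```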
Create the required IAM policy. For convenience during testing, the condition keys were removed to broaden the policy's scope:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeAutoScalingGroups",
                "ec2:DescribeLaunchTemplateVersions",
                "autoscaling:DescribeTags",
                "autoscaling:DescribeLaunchConfigurations",
                "ec2:DescribeInstanceTypes"
            ],
            "Resource": "*"
        }
    ]
}
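With the JSON above saved to a file, the policy referenced by the eksctl command below can be created via the AWS CLI; a sketch (the file name is arbitrary, the policy name matches the ARN used later):

```shell
# Create the IAM policy for the cluster-autoscaler service account role
aws iam create-policy \
  --policy-name AmazonEKSClusterAutoscalerPolicy \
  --policy-document file://cluster-autoscaler-policy.json
```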
Create the service account and associate the IAM role:
eksctl create iamserviceaccount \
  --cluster=testca \
  --namespace=kube-system \
  --name=cluster-autoscaler \
  --attach-policy-arn=arn:aws-cn:iam::037047667284:policy/AmazonEKSClusterAutoscalerPolicy \
  --override-existing-serviceaccounts \
  --approve
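To confirm the IRSA association worked, check that the service account carries the role annotation; a quick check:

```shell
# The eks.amazonaws.com/role-arn annotation should point at the created role
kubectl -n kube-system get sa cluster-autoscaler \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
```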
Deploy the autoscaler:
curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
kubectl apply -f cluster-autoscaler-autodiscover.yaml
Manually import images that cannot be pulled directly:
# Export the image
# sudo nerdctl -n=k8s.io save -o temp.tar quay.io/prometheus/node-exporter:v1.5.0
# sudo ctr -n=k8s.io image export --platform=linux/amd64 temp.tar quay.io/prometheus/node-exporter:v1.5.0
export imagename=registry.k8s.io/autoscaling/cluster-autoscaler:v1.22.2
# docker pull $imagename
docker save -o temp.tar $imagename && aws s3 cp temp.tar s3://zhaojiew-test
# Import the image
aws s3 cp s3://zhaojiew-test/temp.tar . && sudo ctr -n=k8s.io image import temp.tar
# nerdctl -n=k8s.io load -i temp.tar
# docker load -i temp.tar
# List the images
ctr -n=k8s.io image ls
Submit a test workload:
$ cat echo-dep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-dep
spec:
  selector:
    matchLabels:
      app: http-echo
  replicas: 10
  template:
    metadata:
      labels:
        app: http-echo
    spec:
      containers:
      - name: http-echo
        image: hashicorp/http-echo:0.2.3
        args:
        - "-text=foo"
        ports:
        - containerPort: 5678
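One way to watch the resulting scale-up is to apply the deployment and follow the pending pods, the node list, and the autoscaler's decisions; a sketch:

```shell
kubectl apply -f echo-dep.yaml
# Pending pods trigger a scale-up once no existing node can fit them
kubectl get pods -l app=http-echo -o wide
# Watch new nodes join the cluster
kubectl get nodes -w
# Follow the autoscaler's decisions in its logs
kubectl -n kube-system logs -f deployment/cluster-autoscaler
```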
Because the current number of nodes is insufficient, new nodes are launched to satisfy the scheduling requirements. The decision to launch a new node is driven mainly by the pending pods: if a pod carries a nodeSelector, affinity rules, or taint tolerations, the autoscaler will only launch nodes that satisfy those constraints.
The autoscaler only changes the ASG's desired capacity; it never modifies the ASG's minimum or maximum.
Scale-down: after a node has been unneeded for the default duration (about 5 minutes by observation), it is terminated:
filter_out_schedulable.go:82] No schedulable pods
static_autoscaler.go:420] No unschedulable pods
static_autoscaler.go:467] Calculating unneeded nodes
scale_down.go:448] Node ip-192-168-20-239.cn-north-1.compute.internal - cpu utilization 0.122798
static_autoscaler.go:510] ip-192-168-20-239.cn-north-1.compute.internal is unneeded since 2023-02-20 06:26:06.892099794 +0000 UTC m=+1034.537371483 duration 0s
static_autoscaler.go:534] Starting scale down
scale_down.go:829] ip-192-168-20-239.cn-north-1.compute.internal was unneeded for 0s
scale_down.go:918] No candidates for scale down
delete.go:103] Successfully added DeletionCandidateTaint on node ip-192-168-20-239.cn-north-1.compute.internal
The full list of configurable parameters:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca
Nodes that will not be scaled down
https://blog.youkuaiyun.com/hello2mao/article/details/80418625
- Nodes running pods restricted by a PodDisruptionBudget.
- Nodes running pods in the kube-system namespace.
- Nodes running pods that were not created by a controller (Deployment, ReplicaSet, Job, StatefulSet, etc.).
- Nodes running pods that use local storage.
- Nodes whose pods would have nowhere to go after eviction, i.e. no other node can schedule them.
- Nodes annotated with "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true".
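As an illustration of the first item, a minimal PodDisruptionBudget for the test deployment above (the name is hypothetical): with replicas: 10 and minAvailable: 8, draining a node whose eviction would drop the pod count below 8 is blocked, so the autoscaler will not remove that node.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: http-echo-pdb
spec:
  minAvailable: 8          # evictions that would drop below 8 pods are refused
  selector:
    matchLabels:
      app: http-echo
```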
Scaling from 0
When scaling up from 0 nodes, the cluster autoscaler reads the ASG tags to learn the node specification, i.e. its labels and taints.
Since no node has started yet, the labels and taints cannot be read from inside the cluster, so they must come from the ASG tags. The autoscaler only uses these tags for its scheduling simulation; the actual labels and taints are applied to the node by the user data.
Key: k8s.io/cluster-autoscaler/node-template/resources/$RESOURCE_NAME
Value: 5
Key: k8s.io/cluster-autoscaler/node-template/label/$LABEL_KEY
Value: $LABEL_VALUE
Key: k8s.io/cluster-autoscaler/node-template/taint/$TAINT_KEY
Value: NoSchedule
Starting with 1.24 clusters, these tags no longer have to be added to the ASG manually; the process has been simplified, but the autoscaler role needs the DescribeNodegroup permission. Also, when an ASG tag value conflicts with the node group itself, the ASG tag value takes precedence.
In addition, ASG tags can override the autoscaler's global settings for that specific ASG:
k8s.io/cluster-autoscaler/node-template/autoscaling-options/scaledownutilizationthreshold: 0.5
(overrides --scale-down-utilization-threshold for that specific ASG)
k8s.io/cluster-autoscaler/node-template/autoscaling-options/scaledowngpuutilizationthreshold: 0.5
(overrides --scale-down-gpu-utilization-threshold for that specific ASG)
k8s.io/cluster-autoscaler/node-template/autoscaling-options/scaledownunneededtime: 10m0s
(overrides --scale-down-unneeded-time for that specific ASG)
k8s.io/cluster-autoscaler/node-template/autoscaling-options/scaledownunreadytime: 20m0s
(overrides --scale-down-unready-time for that specific ASG)
To speed up testing, set --scale-down-unneeded-time to 1m.
Add --skip-nodes-with-system-pods=false so that when the node group scales down to 0 the pods can be fully evicted; DaemonSet pods are not taken into account.
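Both flags can be appended to the deployed manifest in place; a sketch using a JSON patch against the command list of the cluster-autoscaler container (deployment and container layout as in the upstream example manifest):

```shell
# Append the test-friendly flags to the cluster-autoscaler command list
kubectl -n kube-system patch deployment cluster-autoscaler --type=json -p='[
  {"op":"add","path":"/spec/template/spec/containers/0/command/-","value":"--scale-down-unneeded-time=1m"},
  {"op":"add","path":"/spec/template/spec/containers/0/command/-","value":"--skip-nodes-with-system-pods=false"}
]'
```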
Errors and solutions
Starting with the 1.24 image, one more permission is required (a pitfall):
aws_cloud_provider.go:386] Failed to generate AWS EC2 Instance Types: UnauthorizedOperation: You are not authorized to perform this operation
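The recommended policy in the upstream cloudprovider/aws README gained extra actions around 1.24; a statement along these lines covers them (verify the exact action list against the README for your version):

```json
{
    "Effect": "Allow",
    "Action": [
        "ec2:DescribeImages",
        "ec2:GetInstanceTypesFromInstanceRequirements",
        "eks:DescribeNodegroup"
    ],
    "Resource": "*"
}
```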
Note: be sure to modify the --node-group-auto-discovery field, otherwise the autoscaler reports that the node group config cannot be found.
containers:
- command:
  - ./cluster-autoscaler
  - --v=4
  - --stderrthreshold=info
  - --cloud-provider=aws
  - --skip-nodes-with-local-storage=false
  - --expander=least-waste
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
  - --balance-similar-node-groups
  - --skip-nodes-with-system-pods=false