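Preface
I have spent the last few days wrestling with Kubernetes. I had always used CentOS, so the sudden switch to Ubuntu took some getting used to.
This post covers the pitfalls I hit with Kubernetes and how I installed it.
Official Kubernetes installation guide # now also available in Chinese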
OS | Hardware |
---|---|
Ubuntu 20.04 | 4 GB RAM, 2 CPU cores, 100 GB disk |
Ubuntu 20.04 | 4 GB RAM, 2 CPU cores, 100 GB disk |
Ubuntu 20.04 | 4 GB RAM, 2 CPU cores, 100 GB disk |
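First, download Docker from the Alibaba Cloud mirror repository.
Once it is installed,
configure a registry mirror (image accelerator).
# I use my own accelerator address here. To get your own, go to Alibaba Cloud Console -> Container Registry -> Image Accelerator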
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://bvx1i6ic.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl enable docker
Install Kubernetes
If apt-get update fails here, or importing the public key fails, see my previous post.
List the available versions # apt-cache madison kubeadm
**Install kubeadm, kubectl, and kubelet # apt install kubeadm=${version} kubectl=${version} kubelet=${version}**
Start and verify kubelet # systemctl start kubelet && systemctl enable kubelet && systemctl status kubelet
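As a rough sketch of the pinned install step above: the version string `1.23.5-00` is an assumption (use whatever `apt-cache madison kubeadm` actually lists on your machine), and `pkg_specs` is a hypothetical helper name, not something from the original setup.

```shell
#!/usr/bin/env bash
# Build "name=version" arguments so all three packages are pinned to the same version.
# pkg_specs is a hypothetical helper; the version you pass in is your own choice.
pkg_specs() {
  local version="$1"
  echo "kubeadm=${version} kubectl=${version} kubelet=${version}"
}

# Usage (needs root, so it is left commented out here):
#   apt install -y $(pkg_specs 1.23.5-00)
#   apt-mark hold kubeadm kubectl kubelet   # keep a later apt upgrade from moving the version
```

Holding the packages is optional, but it avoids the three components drifting apart after an unattended upgrade.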
List the images required for a specific Kubernetes version
kubeadm config images list --kubernetes-version v1.23.5
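I recommend pulling the images from Alibaba Cloud.
Write a docker pull script and scp it to the other nodes.
scp images-download.sh root@kubernetes2:/root # requires the hosts file to be configured and SSH keys to be distributed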
scp images-download.sh root@kubernetes3:/root
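With more than a couple of nodes, the copy can be looped. This is only a sketch: `copy_script` and the `DRY_RUN` switch are hypothetical names, and it assumes the node names resolve and key-based SSH login for root already works.

```shell
#!/usr/bin/env bash
# Copy images-download.sh to each node given as an argument.
# With DRY_RUN=1 (the default here) the scp commands are only printed.
copy_script() {
  local node
  for node in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp images-download.sh root@${node}:/root"
    else
      scp images-download.sh "root@${node}:/root"
    fi
  done
}

# Usage:
#   copy_script kubernetes2 kubernetes3             # print what would run
#   DRY_RUN=0 copy_script kubernetes2 kubernetes3   # actually copy
```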
cat images-download.sh
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.23.5
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.23.5
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.23.5
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.23.5
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.1-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.6
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.6
Note that all three hosts need to pull these images.
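Instead of maintaining the pull list by hand, the script could derive it from `kubeadm config images list`. The sketch below assumes the mirror keeps only the last path component of each image name (so `coredns/coredns` becomes `coredns`); check that against the mirror before relying on it. `mirror_name` and `pull_all` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Mirror registry taken from the pull script above.
MIRROR="registry.cn-hangzhou.aliyuncs.com/google_containers"

# Map an upstream image name to its assumed counterpart on the mirror
# by keeping only the part after the last "/".
mirror_name() {
  local upstream="$1"
  echo "${MIRROR}/${upstream##*/}"
}

# Pull every image kubeadm asks for, from the mirror instead of k8s.gcr.io.
pull_all() {
  local version="$1" image
  kubeadm config images list --kubernetes-version "$version" |
    while read -r image; do
      docker pull "$(mirror_name "$image")"
    done
}

# Usage on each node:
#   pull_all v1.23.5
```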
Before you start
Enable kubelet at boot
systemctl enable kubelet
First, disable swap
# Temporarily disable swap: swapoff -a
# Permanently disable it: vim /etc/fstab and comment out the swap line
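The fstab edit can also be scripted rather than done in vim. A minimal sketch, assuming swap entries are ordinary fstab lines with `swap` as a whitespace-separated field; `disable_swap_in` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Comment out every active swap entry in an fstab-style file.
# Pass /etc/fstab on a real node; a copy of the original is kept as <file>.bak.
disable_swap_in() {
  local fstab="$1"
  # Select uncommented lines that contain "swap" as its own field,
  # then prefix them with "#".
  sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Usage (needs root):
#   swapoff -a
#   disable_swap_in /etc/fstab
```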
Synchronize the time (here, set the same time zone on every node)
timedatectl set-timezone Asia/Shanghai
Configure the hosts file
10.0.0.101 master
10.0.0.102 kubernetes2
10.0.0.103 kubernetes3
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Enable IP forwarding
sudo /bin/su -c "echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf"
sudo /bin/su -c "echo 'net.bridge.bridge-nf-call-iptables = 1' >> /etc/sysctl.conf"
sysctl -p
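Two caveats with the commands above: appending with `>>` adds duplicate lines every time the setup is re-run, and the `net.bridge.*` keys only exist once the `br_netfilter` kernel module is loaded. A sketch of an idempotent variant follows; `ensure_line` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already there,
# so re-running the setup does not pile up duplicates.
ensure_line() {
  local line="$1" file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Usage (needs root):
#   modprobe br_netfilter
#   ensure_line 'net.ipv4.ip_forward = 1' /etc/sysctl.conf
#   ensure_line 'net.bridge.bridge-nf-call-iptables = 1' /etc/sysctl.conf
#   sysctl -p
```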
Now for the pitfalls
Pitfall 1
kubeadm init --image-repository=registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.23.5
[init] Using Kubernetes version: v1.23.5
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.23.5: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": context deadline exceeded
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.23.5: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-scheduler:v1.23.5: output: Error response from daemon: Get "https:/
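# When this error appeared I spent a long time troubleshooting. I assumed the image names were the problem, because the images had been downloaded from Alibaba Cloud, so I retagged them: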
docker tag 3fc1d62d6587 k8s.gcr.io/v2/kube-apiserver:v1.23.5
docker tag 3c53fa8541f9 k8s.gcr.io/v2/kube-proxy:v1.23.5
docker tag b0c9e5e4dbb1 k8s.gcr.io/v2/kube-controller-manager:v1.23.5
docker tag 884d49d6d8c9 k8s.gcr.io/v2/kube-scheduler:v1.23.5
docker tag 25f8c7f3da61 k8s.gcr.io/v2/etcd:3.5.1-0
docker tag a4ca41631cc7 k8s.gcr.io/v2/coredns:1.8.6
docker tag 6270bb605e12 k8s.gcr.io/v2/pause:3.6
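Solution
# It turns out the overseas registry (k8s.gcr.io) cannot be reached, so point kubeadm at the Alibaba Cloud repository with --image-repository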
kubeadm init --image-repository=registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.23.5
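The preflight log itself suggests running `kubeadm config images pull` beforehand. Since that command accepts the same `--image-repository` and `--kubernetes-version` flags as `kubeadm init`, one way to keep the two in sync is sketched below; `common_flags` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Values taken from the kubeadm init command above.
REPO="registry.aliyuncs.com/google_containers"
VERSION="v1.23.5"

# Flags shared by the pre-pull and the init, so both use the same registry and version.
common_flags() {
  echo "--image-repository=${REPO} --kubernetes-version=${VERSION}"
}

# Usage (needs root and a working container runtime):
#   kubeadm config images pull $(common_flags)
#   kubeadm init $(common_flags) --pod-network-cidr=10.244.0.0/16
```

If the pre-pull fails, the registry problem shows up before `kubeadm init` has changed anything on the node.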
Pitfall 2
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Solution
vim /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
Then reload systemd and restart Docker:
sudo systemctl daemon-reload
sudo systemctl restart docker
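An alternative to editing the unit file is to set the cgroup driver in /etc/docker/daemon.json, next to the registry mirror. Use one approach or the other: dockerd refuses to start when the same option is given both as a flag and in the config file. A sketch, with `write_daemon_json` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Write a daemon.json that sets both the registry mirror and the systemd cgroup driver.
# $1 = target path, $2 = registry mirror URL.
write_daemon_json() {
  local path="$1" mirror="$2"
  cat > "$path" <<EOF
{
  "registry-mirrors": ["${mirror}"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Usage (needs root):
#   write_daemon_json /etc/docker/daemon.json https://bvx1i6ic.mirror.aliyuncs.com
#   systemctl restart docker
#   docker info --format '{{.CgroupDriver}}'   # check which driver Docker reports
```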
Pitfall 3: after initialization completes
root@master:~# kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-9pb9w 0/1 Running 2 (109s ago) 25h
kube-system coredns-6d8c4cb4d-cspc7 0/1 Running 2 (109s ago) 25h
Solution
root@master:~# kubectl delete pod coredns-6d8c4cb4d-9pb9w -n kube-system
pod "coredns-6d8c4cb4d-9pb9w" deleted
root@master:~# kubectl delete pod coredns-6d8c4cb4d-cspc7 -n kube-system
pod "coredns-6d8c4cb4d-cspc7" deleted
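Rather than copying pod names by hand, the not-ready pods can be picked out of the listing, and the CoreDNS deployment restarted in one step. This is a sketch under the assumption that the input has the `--all-namespaces` column layout shown above; `not_ready_pods` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `kubectl get pod --all-namespaces` output on stdin and print
# namespace/name for every pod whose READY column is not n/n.
not_ready_pods() {
  awk 'NR > 1 { split($3, r, "/"); if (r[1] != r[2]) print $1 "/" $2 }'
}

# Usage (needs a working kubeconfig):
#   kubectl get pod --all-namespaces | not_ready_pods
#   kubectl -n kube-system rollout restart deployment coredns
```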
I hope the errors above are the ones you were looking for.