1 metrics not working in k8s blocks the GitLab rollout
The status reported in k8s:
status:
  conditions:
    - lastTransitionTime: '2020-07-16T04:10:58Z'
      message: >-
        failing or missing response from
        https://10.101.30.104:4443/apis/metrics.k8s.io/v1beta1: Get
        https://10.101.30.104:4443/apis/metrics.k8s.io/v1beta1: dial tcp
        10.101.30.104:4443: connect: no route to host
      reason: FailedDiscoveryCheck
      status: 'False'
      type: Available
The metrics-server log:
unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:worker3.sh-cluster.sh-k8s.com: unable to get CPU for container "copyright-v1" in pod spider/copyright-v1-dcd6f785b-4ztkn on node "10.101.30.106", discarding data: missing cpu usage metric
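The article "kubernetes metrics not working" says its solution is only suitable for test environments, not for production. Why is that?
According to the "Metrics server FAQ", there are two possible causes:
- The Kubernetes cluster was installed from binaries, but the kube-proxy component is not enabled on the master node;
- There is a firewall between the master node and the worker node running metrics-server, or the two are not in the same security group.
So how do I check whether kube-proxy is enabled on the master node?
Following "K8S resource metrics monitoring: deploying metrics-server", I ran kubectl get pods -n kube-system, and indeed no kube-proxy pod showed up on my server.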
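The pod listing can also be filtered mechanically. The sketch below is only an illustration: has_kube_proxy_pod is a made-up helper name, and it assumes kube-proxy pods carry the kube-proxy name prefix, as they do on kubeadm-style clusters. On a binary install, kube-proxy typically runs as a host service rather than a pod, so an empty result here does not by itself prove that kube-proxy is missing.

```shell
# Hypothetical helper: reads `kubectl get pods -n kube-system` output on stdin
# and succeeds (exit 0) if any pod name starts with "kube-proxy".
# Assumption: kube-proxy runs as a pod with that name prefix.
has_kube_proxy_pod() {
  awk 'NR > 1 && $1 ~ /^kube-proxy/ { found = 1 } END { exit (found ? 0 : 1) }'
}
```

It would be used as kubectl get pods -n kube-system | has_kube_proxy_pod. When it fails on a binary install, checking the host service on the master node directly is the more telling test.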
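After carefully rereading "Troubleshooting metrics-server data collection failures", I ran kubectl describe apiservice v1beta1.metrics.k8s.io to check the APIService status. My error is exactly the same as the one in that article, but its fix does not apply to me, because I had not added any extra configuration.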
[root@worker1 ~]# kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2020-07-16T03:33:42Z
Resource Version: 20595406
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
UID: 65ebc093-4ca4-4f70-a6df-8d3babfe7a1d
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: kube-system
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2020-07-16T04:10:58Z
Message: failing or missing response from https://10.101.30.104:4443/apis/metrics.k8s.io/v1beta1: Get https://10.101.30.104:4443/apis/metrics.k8s.io/v1beta1: dial tcp 10.101.30.104:4443: connect: connection timed out
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
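The address in the Message field is the endpoint the API server is failing to reach, so it is the one worth probing from the master node when checking the kube-proxy and firewall causes. A minimal sketch for pulling it out of the describe output follows; failing_endpoint is a made-up helper name, and it assumes the message contains a "dial tcp IP:port" fragment on a single line, as it does above.

```shell
# Hypothetical helper: reads APIService status text on stdin and prints the
# first "IP:port" that follows "dial tcp" in the failure message.
failing_endpoint() {
  sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\).*/\1/p' | head -n 1
}
```

It would be used as kubectl describe apiservice v1beta1.metrics.k8s.io | failing_endpoint.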
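I also went through "Problems encountered when deploying METRICS-SERVER" and followed "Service unavailable after deploying metrics-server"; neither had any effect.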
Following "[k8s error handling] unable to retrieve the complete list of server APIs", I ran kubectl get apiservice, and kube-system/metrics-server was indeed unavailable. Out of frustration I ran kubectl delete apiservice <service-name> and deleted v1beta1.metrics.k8s.io.
[root@master1 gitlab-runner]# kubectl get apiservice
NAME SERVICE AVAILABLE AGE
v1. Local True 88d
v1.admissionregistration.k8s.io Local True 88d
v1.apiextensions.k8s.io Local True 88d
v1.apps Local True 88d
v1.authentication.k8s.io Local True 88d
v1.authorization.k8s.io Local True 88d
v1.autoscaling Local True 88d
v1.batch Local True 88d
v1.coordination.k8s.io Local True 88d
v1.kuboard.cn Local True 50m
v1.networking.k8s.io Local True 88d
v1.rbac.authorization.k8s.io Local True 88d
v1.scheduling.k8s.io Local True 88d
v1.storage.k8s.io Local True 88d
v1beta1.admissionregistration.k8s.io Local True 88d
v1beta1.apiextensions.k8s.io Local True 88d
v1beta1.authentication.k8s.io Local True 88d
v1beta1.authorization.k8s.io Local True 88d
v1beta1.batch Local True 88d
v1beta1.certificates.k8s.io Local True 88d
v1beta1.coordination.k8s.io Local True 88d
v1beta1.events.k8s.io Local True 88d
v1beta1.extensions Local True 88d
v1beta1.metrics.k8s.io kube-system/metrics-server False (FailedDiscoveryCheck) 88d
v1beta1.networking.k8s.io Local True 88d
v1beta1.node.k8s.io Local True 88d
v1beta1.policy Local True 88d
v1beta1.rbac.authorization.k8s.io Local True 88d
v1beta1.scheduling.k8s.io Local True 88d
v1beta1.storage.k8s.io Local True 88d
v2alpha1.batch Local True 88d
v2beta1.autoscaling Local True 88d
v2beta2.autoscaling Local True 88d
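In a listing this long the one failing row is easy to miss. As a rough sketch, the output can be filtered down to the entries whose AVAILABLE column is not True; unavailable_apiservices is a made-up helper name, and it assumes the default four-column layout shown above.

```shell
# Hypothetical helper: reads `kubectl get apiservice` output on stdin and
# prints the NAME of every row whose AVAILABLE column (3rd field) is not "True".
unavailable_apiservices() {
  awk 'NR > 1 && $3 != "True" { print $1 }'
}
```

It would be used as kubectl get apiservice | unavailable_apiservices, which for the listing above prints only v1beta1.metrics.k8s.io.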
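After the deletion, running the command below no longer produced the error shown underneath it (the error is what the command printed before the deletion). I will reinstall metrics-server later when I have time.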
helm install --name-template gitlab-runner -f values.yaml . --namespace gitlab-runners
Error: Could not get apiVersions from Kubernetes:
unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2 Configuring a scrapy crawler in gitlab-ci.yml
ERROR: Job failed (system failure): prepare environment: image pull failed: Back-off pulling image "scrapy_env:0.01".
Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
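3 GitLab migration
3.1 Source server
# Check the GitLab version information
gitlab-rake gitlab:env:info
# Create the backup
gitlab-rake gitlab:backup:create
# Copy the backup archive to the target server
scp 1735885927_2025_01_03_16.0.3_gitlab_backup.tar root@10.101.10.3:/var/opt/gitlab/backups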
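3.2 Target server
The source server runs gitlab-ce-16.0.3-ce.0.el7, but that exact build is not available on the target server, so the el9 build of the same version is installed instead.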
curl https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.rpm.sh | sudo bash
yum install gitlab-ce-16.0.3-ce.0.el9 -y
# Change some of GitLab's default settings
vi /etc/gitlab/gitlab.rb
external_url 'http://10.101.10.3:8083'
nginx['listen_port'] = 8083
# Set the repository storage path used by GitLab Shell
git_data_dirs({
  "default" => {
    "path" => "/home/gitlab/gitlab-data/git-data",
    "gitaly_address" => "unix:/var/opt/gitlab/gitaly/gitaly.socket"
  }
})
# Make sure the gitaly address is a string
gitaly['address'] = "unix:/var/opt/gitlab/gitaly/gitaly.socket"
# Set the Redis data directory
redis['dir'] = "/home/gitlab/gitlab-data/redis"
redis['port'] = 6380
redis['unixsocket'] = "/var/opt/gitlab/redis/redis.socket"
# Set the PostgreSQL data directory
postgresql['data_dir'] = "/home/gitlab/gitlab-data/postgresql"
# If other components (such as the Registry) are installed, adjust their paths as well
registry['storage_path'] = "/home/gitlab/gitlab-data/registry"
# Make the git user the owner of the backup archive
chown git:git /var/opt/gitlab/backups/1735885927_2025_01_03_16.0.3_gitlab_backup.tar
# Apply the settings from gitlab.rb, then start GitLab
gitlab-ctl reconfigure
gitlab-ctl start
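# Begin the restore: stop the services that connect to the database
# (GitLab 16.x runs Puma; Unicorn was removed in GitLab 14.0)
gitlab-ctl stop puma
gitlab-ctl stop sidekiq
# The trailing _gitlab_backup.tar must not be included in BACKUP
gitlab-rake gitlab:backup:restore BACKUP=1735885927_2025_01_03_16.0.3
# Start everything again after the restore
gitlab-ctl start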
# Open the port in the firewall
firewall-cmd --zone=public --add-port=8083/tcp --permanent
firewall-cmd --reload
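The BACKUP value passed to the restore task is simply the archive file name without its _gitlab_backup.tar suffix. A small sketch of deriving it from the archive path, so that it does not have to be retyped by hand; backup_id is a made-up helper name, not part of GitLab.

```shell
# Hypothetical helper: derive the BACKUP value from a backup archive path by
# dropping the directory and the "_gitlab_backup.tar" suffix.
backup_id() {
  basename "$1" _gitlab_backup.tar
}
```

For the archive above, backup_id /var/opt/gitlab/backups/1735885927_2025_01_03_16.0.3_gitlab_backup.tar yields 1735885927_2025_01_03_16.0.3, which is the value used in the restore command.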
4 Integrating with Jenkins
4.1 Jenkins server
Connecting to git from Jenkins reports the following error.
The default value has been returned
An error occurred while download data
Command "git ls-remote -h git@10.101.10.3:8083:eayc/acc/micro/acc-assets.git" returned status code 128:
stdout:
stderr: remote:
remote: ========================================================================
remote:
remote: The namespace you were looking for could not be found.
remote:
remote: ========================================================================
remote:
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Please look at the Log
Please check the configuration
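Run the following on the Jenkins server to generate a key pair:
ssh-keygen -t rsa -b 4096 -C "root@jenkins"
Once the git server is configured, run the command below to verify. Note that this URL goes over SSH, not HTTP, so the HTTP port 8083 must not be included.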
git ls-remote -h git@10.101.10.3:/eayc/acc/micro/acc-voucher.git
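In the scp-style form user@host:path, everything after the first colon is treated as the repository path. The failing URL therefore asked GitLab for a path beginning with 8083:eayc, which matches the "namespace could not be found" message. A rough sketch of a check that flags this mistake before a URL is pasted into Jenkins; has_port_in_scp_url is a made-up helper name.

```shell
# Hypothetical helper: succeeds (exit 0) when an scp-style remote
# (user@host:path) has a numeric port wedged in as user@host:PORT:path.
has_port_in_scp_url() {
  printf '%s\n' "$1" | grep -Eq '^[^@/]+@[^:/]+:[0-9]+:'
}
```

If the SSH daemon really did listen on a non-default port, the ssh://git@host:port/path form would be the one to use, since the scp-style form has no place for a port.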
4.2 git server
In SSH Keys, paste the content of the Jenkins server's id_rsa.pub into the form on the right and add it. The new key then appears in the list as root@jenkins.