Monitoring a Kubernetes cluster with Prometheus running outside the cluster

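1. Configure access to the Kubernetes API

For an external Prometheus to use kubernetes_sd_configs for service discovery, it must be able to reach the Kubernetes API server and hold sufficient permissions.

1.1 Create a Kubernetes ServiceAccount and grant it permissions

First, create a ServiceAccount, a ClusterRole, and a ClusterRoleBinding in the cluster so that Prometheus can query the Kubernetes API for service discovery. The manifests below assume the prom namespace already exists.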

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
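- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - nodes/proxy
  - services
  - services/proxy   # needed to scrape Services through the API server proxy
  - endpoints
  - pods
  - pods/proxy       # needed to scrape Pods through the API server proxy
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  - networking.k8s.io   # extensions/v1beta1 Ingress was removed in Kubernetes 1.22
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]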
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: prom
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prom
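
Once the manifests are applied, it can help to confirm that the binding took effect before moving on. The following is a minimal sketch, not part of the original procedure: the helper names sa_user and sa_can_i are hypothetical, and the check relies on kubectl auth can-i with impersonation, so the account running kubectl must itself be allowed to impersonate.

```shell
# Build the user name that Kubernetes RBAC uses for a ServiceAccount.
# Arguments: namespace, ServiceAccount name.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether the prometheus ServiceAccount may perform an action.
# All arguments are passed through to `kubectl auth can-i`.
sa_can_i() {
  kubectl auth can-i "$@" --as="$(sa_user prom prometheus)"
}

# Examples (run against your cluster; each should print "yes" once the
# manifests above are applied):
#   sa_can_i list pods --all-namespaces
#   sa_can_i get nodes --subresource=proxy
```

A "no" answer usually means the ClusterRoleBinding names a different namespace or ServiceAccount than the one actually created.
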

1.2 Obtain credentials for the Kubernetes API server
  • Get the ServiceAccount token with kubectl:

Before Kubernetes 1.24

kubectl -n prom get secret $(kubectl -n prom get sa/prometheus -o jsonpath="{.secrets[0].name}") -o jsonpath="{.data.token}" | base64 --decode

Kubernetes 1.24 and later

Create a short-lived token (it expires):

kubectl create token prometheus -n prom
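
Because this token expires, it is worth checking how long it stays valid. ServiceAccount tokens are JWTs, so the expiry is readable from the payload. The sketch below is an illustration only: the helper name jwt_payload is hypothetical, and the 24h value passed to --duration is an arbitrary example (the API server may issue a shorter or longer lifetime than requested).

```shell
# Decode the payload (second dot-separated segment) of a JWT and print its JSON.
# Argument: the token string.
jwt_payload() {
  local seg
  # JWTs use URL-safe base64 without padding; map back to the standard alphabet.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the stripped '=' padding so that base64 accepts the input.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 --decode
}

# Example (run against your cluster); the "exp" claim is a Unix timestamp:
#   jwt_payload "$(kubectl create token prometheus -n prom --duration=24h)"
```

The payload is only decoded here, not verified, which is enough for reading the expiry of a token you just issued yourself.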

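Create a long-lived token

In Kubernetes 1.24 and later, tokens issued with kubectl create token are time-limited. To get a token that does not expire, create a Secret of type kubernetes.io/service-account-token bound to the ServiceAccount; Kubernetes then populates it with a token. The steps are as follows.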

1. Create the ServiceAccount

First, make sure a ServiceAccount exists for **prometheus**. If it does not exist yet, create it with:

kubectl create serviceaccount prometheus -n prom

2. Create a Secret bound to the ServiceAccount

Next, create a **Secret** for this ServiceAccount; this **Secret** will hold the long-lived token.

apiVersion: v1
kind: Secret
metadata:
  name: prometheus-token
  namespace: prom
  annotations:
    kubernetes.io/service-account.name: "prometheus"
type: kubernetes.io/service-account-token

Save the YAML above as **prometheus-token-secret.yaml**, then apply it:

kubectl apply -f prometheus-token-secret.yaml

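3. Retrieve the long-lived token

After you apply the manifest, Kubernetes populates the Secret with a non-expiring token for the **prometheus** ServiceAccount. Retrieve it with:

kubectl get secret prometheus-token -n prom -o go-template='{{.data.token | base64decode}}'

The command prints a long string, which is the token.

Store the token in a file

mkdir -p /etc/prometheus/token
kubectl get secret prometheus-token -n prom -o go-template='{{.data.token | base64decode}}' > /etc/prometheus/token/prometheus_bearer_token
chown prometheus:prometheus /etc/prometheus/token/prometheus_bearer_token
chmod 600 /etc/prometheus/token/prometheus_bearer_token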

  • Note the address of the Kubernetes API server:
kubectl cluster-info

Save the cluster CA certificate to a file

mkdir -p /etc/prometheus/certs
kubectl get configmap -n kube-system kube-root-ca.crt -o jsonpath='{.data.ca\.crt}' > /etc/prometheus/certs/ca.crt
chown prometheus:prometheus /etc/prometheus/certs/ca.crt
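
A shell redirect creates the output file even when the kubectl command fails, which leaves an empty token or CA file that Prometheus only complains about later. The following sketch checks both files up front. The helper name check_cred_file is hypothetical, and API_SERVER in the commented curl call is a placeholder for the address printed by kubectl cluster-info.

```shell
# Check that a credential file exists and is non-empty.
# With kind "ca", also require a PEM certificate header.
# Arguments: file path, kind ("token" or "ca").
check_cred_file() {
  local path="$1" kind="$2"
  [ -s "$path" ] || { echo "missing or empty: $path" >&2; return 1; }
  if [ "$kind" = "ca" ]; then
    grep -q -- '-----BEGIN CERTIFICATE-----' "$path" \
      || { echo "not a PEM certificate: $path" >&2; return 1; }
  fi
}

# Example (paths as used in this article):
#   check_cred_file /etc/prometheus/token/prometheus_bearer_token token
#   check_cred_file /etc/prometheus/certs/ca.crt ca
#
# End-to-end check against the API server (API_SERVER is a placeholder):
#   curl --cacert /etc/prometheus/certs/ca.crt \
#     -H "Authorization: Bearer $(cat /etc/prometheus/token/prometheus_bearer_token)" \
#     "$API_SERVER/api/v1/nodes"
```

If the curl call returns a node list without the -k flag, both the CA file and the token are usable from the Prometheus host.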

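2. Configure kubernetes_sd_configs in Prometheus

In the prometheus.yml configuration file, configure kubernetes_sd_configs with the credentials obtained above to scrape the metrics that exporters expose. Because Prometheus runs outside the cluster, it cannot easily reach in-cluster addresses. Exposing the exporters through a LoadBalancer or an Ingress is possible, but often unsuitable: both load-balance requests, whereas scraping must reach every instance individually. The configuration below therefore discovers targets through the API server and scrapes them through the API server's proxy endpoints.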
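
The proxy endpoints follow a fixed URL layout, which is what the metrics_path values and relabel rules in the configuration have to produce. As a rough sketch (the three function names are hypothetical helpers, not part of any tool), the paths can be built like this:

```shell
# Path for scraping a Service through the API server proxy.
# Arguments: namespace, Service name, port (number or port name).
service_proxy_path() {
  printf '/api/v1/namespaces/%s/services/%s:%s/proxy/metrics\n' "$1" "$2" "$3"
}

# Path for scraping one specific Pod through the API server proxy.
# Arguments: namespace, Pod name, port.
pod_proxy_path() {
  printf '/api/v1/namespaces/%s/pods/%s:%s/proxy/metrics\n' "$1" "$2" "$3"
}

# Path for the kubelet's cAdvisor metrics of one node.
# Argument: node name.
node_cadvisor_path() {
  printf '/api/v1/nodes/%s/proxy/metrics/cadvisor\n' "$1"
}

# The kube-state-metrics Service in the prom namespace, port 8080:
service_proxy_path prom kube-state-metrics 8080
```

The Service form still goes through the Service, so the API server forwards the request to one of its endpoints. When every replica must be scraped individually, the Pod form is the one to use.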

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  scrape_timeout: 10s
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# Alertmanager configuration
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - localhost:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
- rules/alert-rules-*.yml
- rules/record-rules-*.yml


# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]

  # kube-state-metrics: state of Kubernetes objects (nodes, pods, services, endpoints, ingresses, ...)
  - job_name: 'kube-state-metrics'
    scheme: https
    metrics_path: /api/v1/namespaces/prom/services/kube-state-metrics:8080/proxy/metrics
    #metrics_path: /api/v1/namespaces/prom/services/kube-state-metrics:http/proxy/metrics
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443'
      role: service
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    relabel_configs:
    # Keep only the kube-state-metrics Service in the prom namespace.
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name]
      action: keep
      regex: prom;kube-state-metrics
    # Send the scrape to the API server, which proxies it to the Service.
    # Annotation-driven rewrites of scheme, path, and port are left out on purpose:
    # they would override the proxy address and path that this job depends on.
    - target_label: __address__
      replacement: 139.196.12.198:6443
      action: replace



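  # Kubernetes API servers
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
    # Outside the cluster, api_server and its credentials must be set explicitly;
    # without them Prometheus attempts in-cluster configuration.
    - api_server: 'https://139.196.12.198:6443'
      role: endpoints
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    relabel_configs:
    # If the discovered endpoint addresses are not reachable from outside the cluster,
    # rewrite __address__ to the external API server address as the other jobs do.
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https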

  # kubelet cAdvisor
  - job_name: 'kubernetes-cadvisor'
    honor_timestamps: true  # use the timestamps returned by the scrape target
    metrics_path: /metrics
    scheme: https
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443'
      role: node
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 139.196.12.198:6443
      action: replace
    - source_labels: [__meta_kubernetes_node_name]
      separator: ;
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      action: replace

  # Pods
  - job_name: 'k8s-pods-metrics'
    scheme: https
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443'
      role: pod
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    relabel_configs:
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 139.196.12.198:6443
      action: replace
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_namespace,__meta_kubernetes_pod_name]
      action: replace
      target_label: __metrics_path__
      replacement: /api/v1/namespaces/${1}/pods/${2}/proxy/metrics
      regex: (.+);(.+)
    # Every target shares the API server address as its instance label, so add
    # namespace and pod labels to keep the series of different pods apart.
    - source_labels: [__meta_kubernetes_namespace]
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      target_label: pod


  # AlertManager
  - job_name: 'alertmanager'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    metrics_path: /metrics
    static_configs:
    - targets:
      - localhost:9093

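2.1 Optional: filter targets with relabel_configs

You can use relabel_configs to filter or rewrite the discovered targets further, for example to scrape only node-exporter. The rule below keeps only targets whose node carries the label name=node-exporter; pick the meta label that matches your discovery role and labels (for instance __meta_kubernetes_pod_label_<labelname> with role: pod).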

relabel_configs:
  - source_labels: [__meta_kubernetes_node_label_name]
    action: keep
    regex: node-exporter

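3. Verify the configuration

  • Make sure the Prometheus configuration file is syntactically valid, then restart the Prometheus service.
  • On the Prometheus web UI (http://<prometheus-server>/targets), check that the expected targets, such as node-exporter, were discovered and are up.
  • Optionally, check that the token is accepted by querying a kubelet directly (replace the token and the node address with your own):

curl -k -H "Authorization: Bearer xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" https://10.0.0.100:10250/metrics/cadvisor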
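
The first two checks can also be done from the command line. The sketch below is one possible approach and makes assumptions: promtool is installed alongside Prometheus, the configuration lives at /etc/prometheus/prometheus.yml, and Prometheus listens on localhost:9090. The helper name count_down_targets is hypothetical; it simply counts occurrences of "health":"down" in the JSON returned by the /api/v1/targets endpoint.

```shell
# Count targets reported as down in the JSON read from standard input
# (the response body of Prometheus's /api/v1/targets endpoint).
count_down_targets() {
  grep -o '"health":"down"' | wc -l | tr -d ' '
}

# Examples (run on the Prometheus host):
#   promtool check config /etc/prometheus/prometheus.yml
#   curl -s http://localhost:9090/api/v1/targets | count_down_targets
```

A non-zero count points to targets whose lastError field in the same response, or on the /targets page, explains the failure; a 403 there usually indicates missing RBAC permissions for the proxy subresources.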