MicroK8s Cluster Monitoring: Deploying Prometheus and Grafana and Configuring Alerts

Project repository: https://gitcode.com/gh_mirrors/mic/microk8s (MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.)

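MicroK8s is a lightweight Kubernetes distribution that provides full container orchestration. Cluster monitoring is essential to keeping services stable. This article walks through deploying Prometheus (a metrics collection system) and Grafana (a visualization dashboard) on MicroK8s and configuring key alert rules.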

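Environment Preparation and Dependency Checks

Before deploying, make sure the MicroK8s cluster is healthy. Run the following command to check the node status: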

microk8s status

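Once the cluster is confirmed to be running, enable the base dependencies. According to the monitoring component notes in README.md, the DNS service must be enabled first: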

microk8s enable dns

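Deploying the Monitoring Components

Enabling Prometheus and Grafana

MicroK8s manages optional components as addons. Grafana is not a separate core addon; it is installed as part of the monitoring addon, so a single enable command deploys the whole stack. The addon is named prometheus in older releases and observability in newer ones (run microk8s status to see which name your release offers):

microk8s enable prometheus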

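The command automatically deploys the following components:

Prometheus server (metrics collection)
Grafana visualization platform
Node Exporter (node metrics collection)
Default alert rules

The deployment scripts live in the ${SNAP}/actions/ directory (by default /snap/microk8s/current/actions/); see scripts/wrappers/microk8s-enable.wrapper for the implementation logic.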

Verifying the Deployment

Run the following command to check the Pods in the monitoring namespace:

microk8s kubectl get pods -n monitoring

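The output should include running Pods such as prometheus-server-xxxx and grafana-xxxx. Exact Pod names, and the namespace itself, depend on the addon version.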
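Scanning the Pod list by eye gets tedious on larger clusters. The following is a minimal sketch of a filter that prints only the Pods that are not healthy. The function name list_unhealthy_pods is made up for this example, and it assumes the default column layout of kubectl get pods --no-headers (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Print the names of Pods whose STATUS column is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output on stdin
# (assumed columns: NAME READY STATUS RESTARTS AGE).
list_unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Intended use against a live cluster:
#   microk8s kubectl get pods -n monitoring --no-headers | list_unhealthy_pods
```

An empty result means every Pod in the namespace reports Running or Completed.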

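Accessing the Grafana Dashboard

Retrieving the Credentials

The default Grafana administrator password is stored in a Secret. Extract it with the following command: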

microk8s kubectl get secret grafana-admin-credentials -n monitoring -o jsonpath='{.data.admin-password}' | base64 -d
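The trailing base64 -d is needed because Kubernetes stores Secret values base64-encoded under .data. The sketch below isolates that decoding step; the helper name decode_secret_value and the sample value are invented for illustration and do not come from a real cluster.

```shell
#!/bin/sh
# Decode a single base64-encoded Secret value passed as the first argument.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Hypothetical encoded value, as it would appear under .data in a Secret:
encoded='czNjcjN0'
decode_secret_value "$encoded"   # prints s3cr3t
```

Keep in mind that base64 is an encoding, not encryption: anyone who can read the Secret can recover the password.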

Port Forwarding

To reach the Grafana UI without exposing it outside the cluster, use port forwarding:

microk8s kubectl port-forward -n monitoring service/grafana 3000:80

Open http://localhost:3000 in a browser and log in with the username admin and the password retrieved above.

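Importing a Kubernetes Monitoring Dashboard

Grafana can import community dashboards from grafana.com by ID. The Kubernetes Cluster Monitoring dashboard (ID: 7249) is a recommended starting point:

After logging in, click "+" > "Import" in the left sidebar
Enter the dashboard ID and load it
Select the Prometheus data source (usually configured automatically)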

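Core Monitoring Metrics

Key Metrics Collected

Prometheus collects the following core metrics by default. Of these, only the network metrics are configured in the Calico network plugin's YAML file:

Node resource usage (CPU, memory, disk)
Pod network metrics from Calico (enabled by prometheusMetricsEnabled: true in upgrade-scripts/000-switch-to-calico/resources/calico.yaml)
Kubernetes component health

Example of the relevant settings (from the Calico configuration):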

prometheusMetricsEnabled: true
prometheusMetricsPort: 9091

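Collecting Custom Metrics

To add application-level metrics, follow these steps:

Add Prometheus annotations to the application's Pod template metadata. These annotations take effect only when the Prometheus scrape configuration discovers Pods by annotation; a Prometheus Operator deployment typically needs a ServiceMonitor or PodMonitor instead: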

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8080"

Restart the application so the change takes effect
Create a custom panel in Grafana
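The annotations from the first step belong on the Pod template rather than on the Deployment object itself, which is easy to get wrong. The sketch below writes a minimal Deployment manifest to a temporary file to show the placement. The name demo-app, the image reference, and port 8080 are placeholders, and annotation-based scraping is assumed to be enabled in the Prometheus configuration.

```shell
#!/bin/sh
# Write a minimal Deployment manifest whose Pod template carries the scrape annotations.
# "demo-app", the image, and port 8080 are placeholders for illustration only.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:            # on the Pod template, not on the Deployment metadata
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      containers:
      - name: demo-app
        image: registry.example.com/demo-app:latest
        ports:
        - containerPort: 8080
EOF
echo "manifest written to $manifest"
# Apply with: microk8s kubectl apply -f "$manifest"
```

Because the annotations sit in the Pod template, applying the manifest rolls the Pods, which covers the restart step as well.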

Alert Rule Configuration

Default Alert Rules

MicroK8s ships with basic alert rules, stored in the prometheus-server ConfigMap:

microk8s kubectl get configmap prometheus-server -n monitoring -o yaml

These include common alerts such as high node memory usage and Pods stuck in CrashLoopBackOff.

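Adding Custom Alerts

A bare Prometheus rule file that starts at groups: is not a Kubernetes object, so kubectl apply cannot load it. Assuming the addon runs the Prometheus Operator (which provides the PrometheusRule resource), wrap the rule group in a manifest named custom-alerts.yaml. Depending on the operator's rule selector, the resource may also need matching labels. The expression computes utilization as one minus the idle fraction:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-rules   # hypothetical name
  namespace: monitoring
spec:
  groups:
  - name: custom.rules
    rules:
    - alert: HighCpuUsage
      # 1 minus the average idle fraction across cores
      expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage on {{ $labels.instance }}"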

Apply the configuration:

microk8s kubectl apply -f custom-alerts.yaml -n monitoring

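Verify that the rule has loaded in the Prometheus UI (http://localhost:9090/rules). This URL assumes the Prometheus service has been port-forwarded to local port 9090.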
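It helps to know what a CPU threshold such as 0.8 actually measures. node_cpu_seconds_total counts seconds spent in each mode, so its per-second rate is a fraction of time, and one common way to express utilization is one minus the idle fraction. The sketch below reproduces that arithmetic on made-up counter samples; the function name cpu_utilization and the sample values are assumptions for illustration.

```shell
#!/bin/sh
# cpu_utilization IDLE_START IDLE_END INTERVAL_SECONDS
# IDLE_* are two readings of an idle-seconds counter (averaged across cores),
# taken INTERVAL_SECONDS apart. Utilization = 1 - (idle seconds / elapsed seconds).
cpu_utilization() {
  awk -v a="$1" -v b="$2" -v t="$3" 'BEGIN { printf "%.2f\n", 1 - (b - a) / t }'
}

# Hypothetical samples 300 s apart: the idle counter advanced by 45 s,
# so the node was busy for the remaining 255 s.
cpu_utilization 1000 1045 300   # prints 0.85, above a 0.8 threshold
```

With for: 5m, an alert fires only if the value stays above the threshold for five consecutive minutes, which filters out short spikes.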

Alert Notification Configuration

Configuring SMTP Notifications

Edit the Grafana configuration file (through its ConfigMap):

microk8s kubectl edit configmap grafana-config -n monitoring

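Add the SMTP settings. Note that skip_verify = true disables TLS certificate verification and should be limited to test environments: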

[smtp]
enabled = true
host = smtp.example.com:587
user = alert@example.com
password = your-password
skip_verify = true
from_address = alert@example.com

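Restart the Grafana Pod so the configuration takes effect.
Create a notification channel in the Grafana UI:
- Go to "Alerting" > "Notification channels" (called "Contact points" in Grafana versions that use unified alerting)
- Select the "Email" type and configure the recipients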

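High Availability and Performance Tuning

Persistent Storage

To avoid losing monitoring data, configure a persistent volume for Prometheus. The StatefulSet definition can be opened with:

microk8s kubectl edit statefulset prometheus-server -n monitoring

Note that Kubernetes does not allow volumeClaimTemplates to be changed on an existing StatefulSet, so the volumeClaimTemplates section (see tests/templates/pvc.yaml for an example claim) has to be included when the StatefulSet is created or recreated. If the Prometheus Operator manages the StatefulSet, set the storage section of the Prometheus custom resource instead.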
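As a rough sketch of what such a section could contain, the snippet below writes a volumeClaimTemplates fragment to a temporary file. The claim name prometheus-data and the 10Gi size are placeholders, and the microk8s-hostpath storage class assumes the MicroK8s hostpath storage addon is enabled. This is not the content of tests/templates/pvc.yaml.

```shell
#!/bin/sh
# Write an illustrative volumeClaimTemplates fragment for a StatefulSet spec.
# Claim name and size are placeholders; the storage class assumes the
# MicroK8s hostpath storage addon is enabled.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
volumeClaimTemplates:
- metadata:
    name: prometheus-data
  spec:
    accessModes: ["ReadWriteOnce"]
    storageClassName: microk8s-hostpath
    resources:
      requests:
        storage: 10Gi
EOF
cat "$fragment"
```

The Prometheus container also needs a volumeMounts entry that refers to the same claim name, otherwise the volume is provisioned but never used.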

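Metric Retention Policy

Change how long Prometheus keeps data to control disk usage. Retention is not a prometheus.yml setting; it is set with the --storage.tsdb.retention.time server flag (default 15d), so edit the workload that runs Prometheus rather than the ConfigMap:

microk8s kubectl edit statefulset prometheus-server -n monitoring

Set the flag in the Prometheus container arguments (with the Prometheus Operator, set retention in the Prometheus custom resource instead):

args:
  - --storage.tsdb.retention.time=7d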
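To judge how much disk a given retention period needs, the Prometheus storage documentation gives a rough formula: retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample. The sketch below applies it; the function name estimate_disk_gib and the ingestion rate of 10,000 samples per second are assumptions chosen for illustration.

```shell
#!/bin/sh
# estimate_disk_gib RETENTION_DAYS SAMPLES_PER_SECOND BYTES_PER_SAMPLE
# Rough capacity estimate: retention seconds * samples/s * bytes/sample, in GiB.
estimate_disk_gib() {
  awk -v d="$1" -v s="$2" -v b="$3" \
    'BEGIN { printf "%.1f\n", d * 86400 * s * b / (1024 * 1024 * 1024) }'
}

# Hypothetical ingestion rate of 10,000 samples/s at 2 bytes per sample:
estimate_disk_gib 15 10000 2   # 15-day retention, prints 24.1
estimate_disk_gib 7 10000 2    # 7-day retention, prints 11.3
```

The actual ingestion rate of a running server can be read from Prometheus itself, so treat these figures as an order-of-magnitude guide only.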

Troubleshooting

Missing Monitoring Data

Check the Prometheus Pod logs:

microk8s kubectl logs -n monitoring prometheus-server-xxxx

Verify that the network plugin Pods that expose metrics are running (Calico):
```
microk8s kubectl get pods -n kube-system | grep calico
```

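Grafana Login Problems

If the password is rejected, reset the administrator password. Deleting the Secret is unlikely to make Grafana generate a new password on restart; a more reliable approach is to run grafana-cli inside the Grafana Pod:

# Replace grafana-xxxx with the actual Pod name and choose a new password
microk8s kubectl exec -n monitoring grafana-xxxx -- grafana-cli admin reset-admin-password 'new-password'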

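Summary

The MicroK8s addon system makes it quick to deploy a full monitoring stack. Key files for reference:

Monitoring addon definition: scripts/wrappers/microk8s-enable.wrapper
Network metrics collection: upgrade-scripts/000-switch-to-calico/resources/calico.yaml
Test template: tests/templates/simple-deploy.yaml

Keep MicroK8s and its monitoring components current to maintain compatibility and security. MicroK8s has no update subcommand; since it is distributed as a snap, refresh it with sudo snap refresh microk8s.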

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.