1. Enable the Prometheus plugin
If you deployed the open-source edition of EMQX 4.4, enable the emqx_prometheus plugin through the EmqxPlugin CRD:
cat << "EOF" | kubectl apply -f -
apiVersion: apps.emqx.io/v1beta4
kind: EmqxPlugin
metadata:
  name: emqx-prometheus
spec:
  pluginName: emqx_prometheus
  selector:
    apps.emqx.io/instance: emqx
    apps.emqx.io/managed-by: emqx-operator
EOF
Obtain the AppID and AppSecret, then check that the plugin endpoint returns metrics:
curl -u "{AppID}:{AppSecret}" -X GET "localhost:18083/api/v4/emqx_prometheus?type=prometheus"
2. Configure a Kubernetes Service for EMQX
Make sure the EMQX API port is exposed through a Kubernetes Service.
kind: Service
apiVersion: v1
metadata:
  name: emqx-prometheus
  labels:
    app: emqx
spec:
  ports:
    - name: prometheus
      protocol: TCP
      port: 18083
      targetPort: 18083
  type: ClusterIP
  selector:
    apps.emqx.io/instance: emqx
    apps.emqx.io/managed-by: emqx-operator
3. Create a ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: emqx-monitor
spec:
  endpoints:
    - basicAuth:
        password:
          key: password
          name: emqx-basic-auth
        username:
          key: username
          name: emqx-basic-auth
      params:
        type:
          - prometheus
      path: /api/v4/emqx_prometheus
      port: prometheus
      scheme: http
  namespaceSelector:
    matchNames:
      - emqx
  selector:
    matchLabels:
      app: emqx
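4. Create the BasicAuth Secret
Create the Secret in the same namespace as the ServiceMonitor, because basicAuth only resolves Secrets from that namespace.
kubectl create secret generic emqx-basic-auth \
  --from-literal=username={AppID} \
  --from-literal=password={AppSecret}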
5. Verify that Prometheus can read the EMQX metrics
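One way to run this check from a workstation is sketched below. It is only an illustration: `has_metric` is a hypothetical helper (not part of EMQX, kubectl, or Prometheus), and the commented usage assumes the Service from step 2 lives in the `emqx` namespace, which the original manifests do not state explicitly.

```shell
#!/usr/bin/env bash

# has_metric NAME
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# succeeds if at least one sample line for NAME is present.
has_metric() {
  local name="$1"
  # A sample line starts with the metric name followed by a space or '{'.
  # "# HELP" and "# TYPE" comment lines start with '#', so they never match.
  grep -Eq "^${name}([{ ]|\$)"
}

# Usage (needs cluster access, so it is left commented out).
# Assumption: the Service "emqx-prometheus" is in namespace "emqx".
#   kubectl -n emqx port-forward svc/emqx-prometheus 18083:18083 &
#   curl -s -u "{AppID}:{AppSecret}" \
#     "localhost:18083/api/v4/emqx_prometheus?type=prometheus" \
#     | has_metric emqx_connections_count && echo "metric exposed"
```

If the endpoint answers but the target is still missing in Prometheus, the Targets page of the Prometheus UI is the next place to look, since it reports scrape errors such as failed authentication.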
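6. Configure Prometheus alerts
The rules below alert on three conditions:
EMQX client connection count exceeds 2000
EMQX message subscribe failures
EMQX message publish failures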
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: emqx-rules
  namespace: emqx
spec:
  groups:
    - name: emqx_alerts
      rules:
        - alert: HighEmqxConnectionCount
          annotations:
            description: '{{ $value }} connections detected on instance {{ $labels.instance }}. This exceeds the threshold of 2000.'
            summary: 'High number of connections detected (instance {{ $labels.instance }})'
          expr: emqx_connections_count > 2000
          for: 1s
          labels:
            severity: critical
        - alert: EmqxHighPacketsSubscribeError
          annotations:
            description: 'There are subscribe errors on instance {{ $labels.instance }}. The error count is {{ $value }}.'
            summary: 'Subscribe errors detected (instance {{ $labels.instance }})'
          expr: emqx_packets_subscribe_error > 0
          for: 1m
          labels:
            severity: critical
        - alert: EmqxHighpublishError
          annotations:
            description: 'There are publish errors on instance {{ $labels.instance }}. The error count is {{ $value }}.'
            summary: 'Publish errors detected (instance {{ $labels.instance }})'
          expr: emqx_packets_publish_error > 0
          for: 1m
          labels:
            severity: critical
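After applying the rule, it can be useful to confirm that the object stored in the cluster still contains all three alerts. The sketch below is one possible check, not a required step: `alert_names` is a hypothetical helper that only pattern-matches `- alert:` lines and does not validate the PromQL expressions.

```shell
#!/usr/bin/env bash

# alert_names
# Hypothetical helper: prints the name of every "- alert: NAME" entry
# found in the YAML read from stdin, one name per line.
alert_names() {
  sed -n 's/^[[:space:]]*-[[:space:]]*alert:[[:space:]]*//p'
}

# Usage (needs cluster access, so it is left commented out):
#   kubectl -n emqx get prometheusrule emqx-rules -o yaml | alert_names
# With the rule above applied, the three alert names should be listed.
```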
7. Display the metrics in Grafana
Reference documentation:
https://docs.emqx.com/en/emqx-operator/latest/tasks/configure-emqx-prometheus.html
GitHub:
https://github.com/emqx/emqx-exporter/tree/main/grafana-dashboard/template
We use the overview.json file from the repository above as the Grafana dashboard template.