Note: this article describes how to use redis_exporter together with Prometheus and Grafana to monitor Redis.
Docker deployment
1. Pull the image
# docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/oliver006/redis_exporter:v1.51.0
# docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/oliver006/redis_exporter:v1.51.0 redis_exporter:v1.51.0
2. Run redis_exporter
docker run -d \
--name redis_exporter \
--restart always \
--network host \
-e REDIS_ADDR="localhost:6379" \
-e REDIS_PASSWORD="123456" \
redis_exporter:v1.51.0
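The parts of this command mean the following:
-d: run the container in the background.
--name redis_exporter: give the container a name.
--restart always: always restart the container automatically when it exits.
--network host: use the host's network stack, so localhost inside the container refers to the host.
-e REDIS_ADDR="localhost:6379": set the REDIS_ADDR environment variable (the address of the Redis instance).
-e REDIS_PASSWORD="123456": set the REDIS_PASSWORD environment variable (the Redis password).
redis_exporter:v1.51.0: the image and tag to run.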
Alternatively, run it with docker-compose, using a docker-compose file like this:
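version: '3.8'
networks:
  base:
    driver: bridge
services:
  redis_exporter:
    image: redis_exporter:v1.51.0
    container_name: redis_exporter
    restart: always
    # Host networking exposes port 9121 directly, so no "ports" mapping is listed:
    # port mappings have no effect with network_mode: host, and some Compose
    # versions reject the combination.
    network_mode: host
    environment:
      REDIS_ADDR: "localhost:6379"
      REDIS_PASSWORD: "123456"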
# docker-compose up -d redis_exporter
Check that the service has started:
# docker ps
# netstat -nltp | grep 9121
You can also open it in a browser:
ip:9121
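Seeing port 9121 open only shows that the exporter process is listening. The following is a minimal sketch, assuming curl is available on the host, that also checks whether the exporter can reach Redis by looking at the redis_up metric. The helper name exporter_sees_redis is made up for this example.

```shell
# Hypothetical helper (not part of redis_exporter): reads Prometheus text-format
# metrics on stdin and succeeds only if the redis_up gauge has the value 1.
exporter_sees_redis() {
    # Match the sample line "redis_up 1"; "# HELP"/"# TYPE" comment lines and
    # other metrics that merely start with "redis_up" do not match.
    grep -Eq '^redis_up(\{[^}]*\})?[[:space:]]+1([[:space:]]|$)'
}

# Example use against a running exporter (assumes curl is installed):
#   curl -s http://localhost:9121/metrics | exporter_sees_redis && echo "Redis reachable"
```

If the function fails while the port is open, the exporter is running but cannot connect or authenticate to Redis, which usually points at REDIS_ADDR or REDIS_PASSWORD.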
3. Edit the Prometheus configuration file
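# Add under scrape_configs: in prometheus.yml
  - job_name: redis
    static_configs:
      - targets: ['192.168.1.201:9121']
        labels:
          instance: 'Test environment server 1'
          group: 'Project 1'
          environment: test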
4. Restart Prometheus
# docker restart prometheus
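Restarting with a broken prometheus.yml leaves Prometheus down, so it can help to validate the file first. The sketch below restarts only when a check command succeeds. The function name restart_if_valid is hypothetical, and the example in the comments assumes the official prom/prometheus image (which ships promtool) with the config at its default path inside the container.

```shell
# Hypothetical wrapper: run a check command and restart only if it succeeds.
# $1 = command that validates the config, $2 = command that restarts Prometheus.
restart_if_valid() {
    check_cmd=$1
    restart_cmd=$2
    if sh -c "$check_cmd"; then
        sh -c "$restart_cmd"
    else
        echo "config check failed; Prometheus not restarted" >&2
        return 1
    fi
}

# Example for the Docker setup (container name "prometheus" as used above;
# image and config path are assumptions):
#   restart_if_valid \
#     "docker exec prometheus promtool check config /etc/prometheus/prometheus.yml" \
#     "docker restart prometheus"
```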
5. Configure the Grafana dashboard
Recommended template ID: 763
Installing from the release package
1. Download and install redis_exporter
# wget https://github.com/oliver006/redis_exporter/releases/download/v1.51.0/redis_exporter-v1.51.0.linux-amd64.tar.gz
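# mkdir -p /usr/local/redis_exporter/
# --strip-components=1 drops the archive's top-level directory so that the binary ends up at /usr/local/redis_exporter/redis_exporter
# tar -xzvf redis_exporter-v1.51.0.linux-amd64.tar.gz -C /usr/local/redis_exporter/ --strip-components=1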
2. redis_exporter usage
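-redis.addr: address of the Redis instance to scrape; defaults to redis://localhost:6379. Releases before 1.0 accepted several comma-separated addresses, while 1.x releases such as v1.51.0 expect a single address.
-redis.password: password used to authenticate against Redis.
-redis.file: path to a file listing one or more Redis nodes, one per line; mutually exclusive with -redis.addr. This flag comes from the pre-1.0 releases, so confirm it with redis_exporter -help before relying on it in v1.51.0.
-web.listen-address: address and port to listen on; defaults to 0.0.0.0:9121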
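Before wrapping the binary in a service, it can be useful to start it once in the foreground with these flags. This is a small sketch only: the function name run_exporter is hypothetical, and its fallback values simply repeat the defaults listed above.

```shell
# Hypothetical launcher: starts the exporter in the foreground for a manual test.
# $1 = path to the redis_exporter binary
# $2 = Redis address (optional), $3 = listen address (optional)
run_exporter() {
    bin=$1
    addr=${2:-redis://localhost:6379}
    listen=${3:-0.0.0.0:9121}
    "$bin" -redis.addr "$addr" -web.listen-address "$listen"
}

# Example (path from the installation step; stop with Ctrl+C):
#   run_exporter /usr/local/redis_exporter/redis_exporter redis://127.0.0.1:6379
```

If Redis requires a password, -redis.password can be appended in the same way.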
3. Create the redis_exporter.service unit file
# vim /etc/systemd/system/redis_exporter.service
# Add the following content:
[Unit]
Description=redis_exporter
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/redis_exporter/redis_exporter -redis.addr 127.0.0.1:6379 -redis.password 123456
Restart=on-failure
[Install]
WantedBy=multi-user.target
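The unit above puts the Redis password on the command line, where it is visible in the unit file and in process listings. One possible alternative, sketched here under assumptions, is to keep the settings in a root-only environment file. The function name write_exporter_env and the path /etc/redis_exporter.env are made up for this example; REDIS_ADDR and REDIS_PASSWORD are the same variables used in the Docker section.

```shell
# Hypothetical helper: writes an environment file readable only by its owner.
# $1 = file path, $2 = Redis address, $3 = Redis password
write_exporter_env() {
    file=$1
    addr=$2
    password=$3
    (
        # Restrict permissions before the secret is written.
        umask 077
        printf 'REDIS_ADDR=%s\nREDIS_PASSWORD=%s\n' "$addr" "$password" > "$file"
    ) && chmod 600 "$file"   # also tightens a file that already existed
}

# Example (run as root):
#   write_exporter_env /etc/redis_exporter.env 127.0.0.1:6379 123456
```

With such a file in place, the [Service] section could reference it through systemd's EnvironmentFile= directive and drop the -redis.addr and -redis.password flags from ExecStart.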
4. Start the service
# systemctl daemon-reload
# systemctl start redis_exporter
# systemctl status redis_exporter
# systemctl enable redis_exporter
# netstat -nltp|grep 9121
5. Integrate with Prometheus
# Edit prometheus.yml
# Add the following under scrape_configs:
  - job_name: redis
    static_configs:
      - targets: ['localhost:9121']
# Restart Prometheus
# systemctl restart prometheus
Alerting rules
Note: adjust these to your own needs.
groups:
  - name: Redis
    rules:
      - alert: RedisDown
        expr: redis_up == 0
        for: 5m
        labels:
          severity: error
        annotations:
          summary: "Redis down (instance {{ $labels.instance }})"
          description: "Redis instance is down\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: MissingBackup
        expr: time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24
        for: 5m
        labels:
          severity: error
        annotations:
          summary: "Missing backup (instance {{ $labels.instance }})"
          description: "Redis has not been backed up for 24 hours\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: OutOfMemory
        expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Out of memory (instance {{ $labels.instance }})"
          description: "Redis is running out of memory (> 90%)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: ReplicationBroken
        # delta() over 1m is negative only briefly after a replica is lost;
        # consider a shorter "for" if this alert never fires.
        expr: delta(redis_connected_slaves[1m]) < 0
        for: 5m
        labels:
          severity: error
        annotations:
          summary: "Replication broken (instance {{ $labels.instance }})"
          description: "Redis instance lost a slave\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: TooManyConnections
        expr: redis_connected_clients > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Too many connections (instance {{ $labels.instance }})"
          description: "Redis instance has too many connections\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: NotEnoughConnections
        expr: redis_connected_clients < 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Not enough connections (instance {{ $labels.instance }})"
          description: "Redis instance should have more connections (> 5)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: RejectedConnections
        expr: increase(redis_rejected_connections_total[1m]) > 0
        for: 5m
        labels:
          severity: error
        annotations:
          summary: "Rejected connections (instance {{ $labels.instance }})"
          description: "Some connections to Redis have been rejected\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
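To get a feel for a threshold before tuning it, the OutOfMemory expression can be reproduced by hand from the exporter's raw output. The following is a rough sketch, assuming both metrics appear as plain, unlabelled sample lines; the function name redis_mem_pct is hypothetical.

```shell
# Hypothetical helper: reads exporter metrics on stdin and prints
# redis_memory_used_bytes / redis_total_system_memory_bytes * 100,
# i.e. the value the OutOfMemory rule compares against 90.
redis_mem_pct() {
    awk '
        $1 == "redis_memory_used_bytes"         { used = $2 }
        $1 == "redis_total_system_memory_bytes" { total = $2 }
        END {
            # Fail instead of dividing by zero when the total is missing.
            if (total + 0 > 0) printf "%.2f\n", used / total * 100
            else exit 1
        }
    '
}

# Example against a running exporter (assumes curl is installed):
#   curl -s http://localhost:9121/metrics | redis_mem_pct
```

The rule file itself can be validated with promtool check rules followed by the path of the rule file before Prometheus loads it.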