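Binary Installation of Prometheus
1. Installation overview
1.1 Components to install
- VMware virtual machine software
VMware Workstation is a desktop virtualization product that lets you run several different operating systems at the same time on a single desktop, which makes it well suited to developing, testing, and deploying new applications. It can emulate a complete network environment on one physical machine, and its virtual machines are portable. For enterprise IT developers and system administrators, features such as virtual networking, live snapshots, drag-and-drop shared folders, and PXE support make it a practical everyday tool.
- CentOS Linux operating system
- Prometheus
- Grafana
Chapters 1 through 3 cover deploying the virtual machine, so they are omitted here.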
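4. Binary installation of Prometheus
4.1 Get the installation package
Official site: https://www.prometheus.io/download/
# Change to the /home directory
cd /home
# Download the chosen Prometheus release from github.com with wget
wget https://github.com/prometheus/prometheus/releases/download/v2.45.5/prometheus-2.45.5.linux-amd64.tar.gz
# Extract the archive
tar xf prometheus-2.45.5.linux-amd64.tar.gz
# List the extracted contents
ll
# Create the Prometheus directory
mkdir /opt/prometheus -p
# Move the extracted directory into /opt/prometheus/ and rename it to prometheus
mv prometheus-2.45.5.linux-amd64/ /opt/prometheus/prometheus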
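Before extracting a downloaded tarball it can be worth checking that it arrived intact. The sketch below assumes a checksum list in the usual "<sha256>  <filename>" layout (for example a sha256sums.txt file saved from the release page into the same directory). The function name verify_tarball is made up for this illustration.

```shell
#!/usr/bin/env bash
# verify_tarball TARBALL SUMS_FILE
# Hypothetical helper: succeeds only if TARBALL matches its entry in SUMS_FILE.
# SUMS_FILE is assumed to use the usual "<sha256>  <filename>" layout.
verify_tarball() {
  local tarball="$1"
  local sums name dir
  sums="$(realpath "$2")" || return 1
  name="$(basename "$tarball")"
  dir="$(dirname "$tarball")"
  # Select only the line for this file, then let sha256sum recompute and compare.
  # An empty selection makes sha256sum fail, so a missing entry is also an error.
  ( cd "$dir" && awk -v n="$name" '$2 == n || $2 == "*" n' "$sums" | sha256sum -c - >/dev/null 2>&1 )
}

# Usage (the checksum file location is an assumption):
#   verify_tarball /home/prometheus-2.45.5.linux-amd64.tar.gz /home/sha256sums.txt && echo "checksum OK"
```

The check runs inside the tarball's directory because sha256sum resolves the file names in the list relative to the current directory.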
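4.2 Create a dedicated user
# Create a prometheus system user with no home directory and no login shell
useradd -M -s /usr/sbin/nologin prometheus
# Give the prometheus user ownership of the installation directory
chown prometheus:prometheus -R /opt/prometheus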
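After the recursive chown, a quick way to confirm that nothing was missed is to list any path still owned by someone else. This is only a sketch; list_foreign_files is a hypothetical helper name, and it relies on nothing beyond the standard find command.

```shell
#!/usr/bin/env bash
# list_foreign_files DIR USER
# Prints every path under DIR that is NOT owned by USER.
# No output means the whole tree belongs to USER.
list_foreign_files() {
  find "$1" ! -user "$2" -print
}

# Usage:
#   list_foreign_files /opt/prometheus prometheus
```

This is handy to rerun later, because files created as root under /opt/prometheus (for example with a heredoc) will show up in the output.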
4.3 Create the systemd service
cat > /etc/systemd/system/prometheus.service << "EOF"
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/opt/prometheus/prometheus/prometheus \
--config.file=/opt/prometheus/prometheus/prometheus.yml \
--storage.tsdb.path=/opt/prometheus/prometheus/data \
--storage.tsdb.retention.time=60d \
--web.enable-lifecycle
[Install]
WantedBy=multi-user.target
EOF
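# --config.file: path to the Prometheus configuration file
# --storage.tsdb.path: Prometheus data directory
# --storage.tsdb.retention.time: data retention period, raised from the default 15 days to 60 days
# --web.enable-lifecycle: enables reloading the configuration over HTTP (hot reload)
Start the service
systemctl daemon-reload
systemctl start prometheus
systemctl enable prometheus
# Check the service status
systemctl status prometheus
If the service fails to start, follow the logs to troubleshoot
journalctl -u prometheus.service -f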
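Right after systemctl start, the service may need a moment before it answers requests. The sketch below is a small generic retry helper (the name wait_until is hypothetical). The usage line pairs it with the /-/ready endpoint that Prometheus exposes and assumes curl is installed.

```shell
#!/usr/bin/env bash
# wait_until TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds or TRIES attempts are used up.
wait_until() {
  local tries="$1"
  local delay="$2"
  local i=1
  shift 2
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the readiness endpoint for up to about 30 seconds.
#   wait_until 30 1 curl -fsS -o /dev/null http://localhost:9090/-/ready && echo "Prometheus is ready"
```

If the helper gives up, the journalctl command above is the next place to look.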
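4.4 Access URLs
# Prometheus web UI (Prometheus listens on port 9090)
# If port 9090 is unreachable, first check that the prometheus service is running, then check whether the firewall is blocking the port (in a lab environment it can be turned off with systemctl stop firewalld)
http://192.168.28.100:9090/
# Metrics exposed by Prometheus itself
http://192.168.28.100:9090/metrics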
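Instead of stopping firewalld completely, one option is to open only the port that is needed. The sketch below assumes firewalld is the active firewall and only prints the firewall-cmd calls, so they can be reviewed before being run as root. The function name firewall_open_cmds is hypothetical.

```shell
#!/usr/bin/env bash
# firewall_open_cmds PORT...
# Prints (does not run) the firewall-cmd calls that would permanently open each TCP port.
firewall_open_cmds() {
  local port
  for port in "$@"; do
    printf 'firewall-cmd --permanent --add-port=%s/tcp\n' "$port"
  done
  # Permanent rules only take effect after a reload.
  printf 'firewall-cmd --reload\n'
}

# Usage: review the output first, then pipe it to a root shell.
#   firewall_open_cmds 9090
#   firewall_open_cmds 9090 | sh
```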
5. Installing alertmanager
5.1 Get the installation package
# Download the alertmanager binary tarball
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
# Extract the archive
tar xf alertmanager-0.27.0.linux-amd64.tar.gz
# List the extracted contents
ll
# Move the extracted directory into /opt/prometheus/ and rename it to alertmanager
mv alertmanager-0.27.0.linux-amd64 /opt/prometheus/alertmanager
5.2 Change ownership
chown prometheus:prometheus -R /opt/prometheus/alertmanager
5.3 Create the systemd service
cat >/etc/systemd/system/alertmanager.service << "EOF"
[Unit]
Description=Alert Manager
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/opt/prometheus/alertmanager/alertmanager \
--config.file=/opt/prometheus/alertmanager/alertmanager.yml \
--storage.path=/opt/prometheus/alertmanager/data
Restart=always
[Install]
WantedBy=multi-user.target
EOF
Start alertmanager
systemctl daemon-reload
systemctl start alertmanager.service
systemctl enable alertmanager.service
# Check the alertmanager service status
systemctl status alertmanager.service
5.4 Update the Prometheus configuration
Add alertmanager
# vi /opt/prometheus/prometheus/prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Set this to the actual alertmanager address
            - localhost:9093
rule_files:
  # Change this to match the actual rule file name
  - "alert.yml"
Add the alert rule file
cat > /opt/prometheus/prometheus/alert.yml <<"EOF"
groups:
- name: Prometheus alert
  rules:
  # Alert on any instance that has been unreachable for more than 30 seconds
  - alert: ServiceAlert
    expr: up == 0
    for: 30s
    labels:
      severity: critical
    annotations:
      instance: "Service abnormal, instance: {{ $labels.instance }}"
      description: "{{ $labels.job }} service is down"
EOF
Check the configuration
cd /opt/prometheus/prometheus/
./promtool check config prometheus.yml
Restart Prometheus or reload the configuration file
# Restart
systemctl restart prometheus
# Or reload the configuration; this requires the --web.enable-lifecycle flag (hot reload)
curl -X POST http://localhost:9090/-/reload
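The check and the reload can be chained so that a broken configuration is never pushed to the running server. This is a minimal sketch, not part of Prometheus itself. The variable names PROMTOOL, PROM_CONFIG and PROM_URL and the function safe_reload are made up here, with defaults that follow the paths used in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical settings; the defaults match this guide's layout and can be overridden.
PROMTOOL="${PROMTOOL:-/opt/prometheus/prometheus/promtool}"
PROM_CONFIG="${PROM_CONFIG:-/opt/prometheus/prometheus/prometheus.yml}"
PROM_URL="${PROM_URL:-http://localhost:9090}"

# safe_reload: reloads Prometheus only when promtool accepts the configuration.
safe_reload() {
  if ! "$PROMTOOL" check config "$PROM_CONFIG"; then
    echo "configuration check failed; reload skipped" >&2
    return 1
  fi
  # -f makes curl return non-zero when the server answers with an HTTP error.
  curl -fsS -X POST "$PROM_URL/-/reload"
}

# Usage:
#   safe_reload && echo "configuration reloaded"
```

The reload request only works when the server was started with --web.enable-lifecycle, as noted above.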
5.5 Access URL
http://192.168.28.100:9093/
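6. Installing Grafana
This course uses the offline package installation method, with Grafana version 9.3.16-1.
6.1 Upload the offline package
grafana-9.3.16-1.x86_64.rpm
# Change to the /home directory
cd /home
# Upload grafana-9.3.16-1.x86_64.rpm, then confirm it is there
ll
6.2 Install the offline package and enable start at boot
- Install the offline package
yum localinstall grafana-9.3.16-1.x86_64.rpm -y
- Start the service and enable it at boot
systemctl start grafana-server.service
systemctl enable grafana-server.service
# Confirm that port 3000 is held by the grafana process
ss -ntulp | grep 3000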
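The ss check confirms that something is listening locally. To test whether the port is also reachable over the network, for example from another machine, a TCP connect test can help. The sketch below assumes bash (for its /dev/tcp pseudo-device) and the timeout command from coreutils. The function name port_open is hypothetical.

```shell
#!/usr/bin/env bash
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 2 seconds.
# HOST and PORT are passed to the inner bash as $0 and $1.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage:
#   port_open 192.168.28.100 3000 && echo "port 3000 is reachable"
```

If the port is listening locally but this test fails from another host, the firewall is the usual suspect.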
6.3 Access the web UI
http://192.168.28.100:3000/
Default username/password: admin/admin

7. Installing node_exporter
7.1 Get the installation package
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.0/node_exporter-1.8.0.linux-amd64.tar.gz
# Extract the archive
tar xvf node_exporter-1.8.0.linux-amd64.tar.gz
# List the extracted contents
ll
# Move the extracted directory to its target location
mv node_exporter-1.8.0.linux-amd64 /opt/prometheus/node_exporter
7.2 Change ownership
chown prometheus:prometheus -R /opt/prometheus/node_exporter
7.3 Create the systemd service
cat > /etc/systemd/system/node_exporter.service <<"EOF"
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
User=prometheus
Group=prometheus
ExecStart=/opt/prometheus/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
Start the service
systemctl daemon-reload
systemctl start node_exporter.service
systemctl enable node_exporter.service
# Check the service status
systemctl status node_exporter.service
7.4 Access URL
http://192.168.28.100:9100/metrics
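The /metrics page is plain text, one sample per line, so a single value can be pulled out on the command line without opening a browser. The sketch below handles only samples without labels (such as node_load1). The function name metric_value is hypothetical.

```shell
#!/usr/bin/env bash
# metric_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of the
# unlabelled sample called NAME. Returns non-zero if no such sample exists.
metric_value() {
  awk -v name="$1" '
    $1 == name { print $2; found = 1; exit }   # first field is the name, second the value
    END { exit !found }
  '
}

# Usage:
#   curl -s http://192.168.28.100:9100/metrics | metric_value node_load1
```

Comment lines beginning with "# HELP" or "# TYPE" are skipped automatically, because their first field is "#" rather than a metric name.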
7.5 Configure Prometheus
# vi /opt/prometheus/prometheus/prometheus.yml
# node-exporter job (add it under scrape_configs)
  - job_name: "node-exporter"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9100"]
        labels:
          instance: Prometheus server
Reload the Prometheus configuration
curl -X POST http://localhost:9090/-/reload
Check in the Prometheus web UI
http://192.168.28.100:9090/
Check the Status menu (Status > Targets lists every scrape target and its state)
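The target state shown in the web UI can also be read from the command line through the /api/v1/targets endpoint of the Prometheus HTTP API. The sketch below counts targets by health without needing a JSON parser. It assumes the response is compact JSON, with no spaces around the colon, and the function name count_targets is hypothetical.

```shell
#!/usr/bin/env bash
# count_targets STATE
# Reads the JSON body of /api/v1/targets on stdin and counts the targets whose
# "health" field equals STATE (up, down, or unknown).
count_targets() {
  grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

# Usage:
#   curl -s http://localhost:9090/api/v1/targets | count_targets up
#   curl -s http://localhost:9090/api/v1/targets | count_targets down
```

A non-zero count for down means at least one scrape target is failing and is worth inspecting in the UI.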