I. Deploying Prometheus
Environment preparation
hostnamectl set-hostname prometheus
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
vim /etc/resolv.conf
nameserver 114.114.114.114
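ntpdate ntp1.aliyun.com # time synchronization is mandatory; skipping it causes problems
Unpack and start the service
# Upload the package to the host, then extract it into the target directory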
tar zxvf prometheus-2.27.1.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
cd prometheus-2.27.1.linux-amd64/
./prometheus
Open another terminal and check whether the port is listening
[root@prometheus ~]# netstat -antp | grep 9090
tcp6 0 0 :::9090 :::* LISTEN 2463/./prometheus
tcp6 0 0 ::1:9090 ::1:53170 ESTABLISHED 2463/./prometheus
tcp6 0 0 ::1:53170 ::1:9090 ESTABLISHED 2463/./prometheus
Open the web UI at 192.168.74.135:9090 (the expression browser)
Open 192.168.74.135:9090/metrics to view the built-in metrics that Prometheus exposes about itself
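The same check can be made from the command line instead of a browser. The sketch below is only an illustration: the helper name count_samples is hypothetical. It counts the sample lines in text read from standard input, skipping the `#` metadata lines and blank lines of the exposition format.

```shell
#!/usr/bin/env bash
# Sketch: count sample lines in Prometheus text exposition format (stdin).
# count_samples is a hypothetical helper name, not part of Prometheus.
count_samples() {
    # Drop "# HELP"/"# TYPE" metadata and blank lines, then count what is left.
    grep -c -v -e '^#' -e '^[[:space:]]*$'
}
```

Assuming curl is installed, `curl -s http://192.168.74.135:9090/metrics | count_samples` prints how many samples the server currently exposes, and `curl -s http://192.168.74.135:9090/-/healthy` queries the built-in health endpoint.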
II. Deploying monitoring for the other nodes
Hostname | Address | Required package |
---|---|---|
prometheus | 192.168.74.135 | prometheus-2.27.1.linux-amd64.tar.gz |
server1 | 192.168.74.122 | node_exporter-1.1.2.linux-amd64.tar.gz |
server2 | 192.168.74.128 | node_exporter-1.1.2.linux-amd64.tar.gz |
server3 | 192.168.74.131 | node_exporter-1.1.2.linux-amd64.tar.gz |
The main server was already set up above, so it is not configured again here.
1. Main configuration file walkthrough
cd prometheus-2.27.1.linux-amd64/
vim prometheus.yml
# my global config
global:  # global settings
  scrape_interval: 15s  # Set the scrape interval to every 15 seconds. Default is every 1 minute. (How often metrics are scraped.)
  evaluation_interval: 15s  # Evaluate rules every 15 seconds. The default is every 1 minute. (Evaluation period for the alerting rules.)
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration: the Alertmanager (a separate alerting component) that Prometheus connects to
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:  # alerting rules, written in YAML files
  - "first_rules.yml"
  - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:  # data collection section
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # In other words, job_name identifies the source of the scraped metrics.
  - job_name: 'prometheus'  # attached to the metrics as a label and usable in PromQL queries, e.g. up{job='prometheus'}
    # metrics_path defaults to '/metrics'  (the path the data is collected from)
    # scheme defaults to 'http'.  (scraping uses http by default)
    static_configs:  # static list of where to collect data from; Prometheus itself listens on port 9090 by default
      - targets: ['localhost:9090']
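The file above only scrapes Prometheus itself, so the three server nodes from the table still need a scrape job of their own. A minimal sketch of generating that job follows. It assumes node_exporter's default port 9100; the job name 'nodes' and the helper name node_job_yaml are illustrative, not taken from the original setup.

```shell
#!/usr/bin/env bash
# Sketch: print a scrape job for node_exporter targets, indented so that it
# can sit directly under "scrape_configs:" in prometheus.yml.
# Assumptions: job name 'nodes' and helper name node_job_yaml are made up;
# 9100 is node_exporter's default listening port.
node_job_yaml() {
    printf "  - job_name: 'nodes'\n"
    printf "    static_configs:\n"
    printf "      - targets:\n"
    local host
    for host in "$@"; do
        # One list entry per host passed on the command line.
        printf "          - '%s:9100'\n" "$host"
    done
}
```

Because scrape_configs is the last section of the file shown here, the output could be appended with `node_job_yaml 192.168.74.122 192.168.74.128 192.168.74.131 >> prometheus.yml`, checked with `./promtool check config prometheus.yml` (promtool ships in the same tarball), and picked up by restarting Prometheus.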
2. Server node configuration
Upload the archive and install node_exporter
tar zxvf node_exporter-1.1.2.linux-amd64.tar.gz
cd node_exporter-1.1.2.linux-amd64/
cp node_exporter /usr/local/bin/
Start the service
./node_exporter
netstat -antp | grep 9100
./node_exporter --help # lists the available command-line options
Managing the service with a systemd unit file
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/bin/node_exporter \
  --collector.ntp \
  --collector.mountstats \
  --collector.systemd \
  --collector.tcpstat
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always

[Install]
WantedBy=multi-user.target
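The unit file still has to be registered with systemd, and the account named in User= must exist. The following sketch makes two assumptions that the original does not state: the unit is saved as /etc/systemd/system/node_exporter.service, and the prometheus account has not been created yet. The function name and the RUN dry-run switch are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: register and start node_exporter under systemd.
# Assumptions (not from the original): the unit was saved as
# /etc/systemd/system/node_exporter.service and the "prometheus" user
# referenced by User= does not exist yet.
# Set RUN=echo to print the commands instead of executing them.
RUN=${RUN:-}

enable_node_exporter() {
    # System account without a login shell for the User= directive.
    $RUN useradd -r -s /sbin/nologin prometheus
    # Make systemd re-read unit files, then start now and at every boot.
    $RUN systemctl daemon-reload
    $RUN systemctl enable --now node_exporter
    # Show whether the service came up.
    $RUN systemctl status node_exporter --no-pager
}
```

Run as root, `enable_node_exporter` performs the four steps; `RUN=echo enable_node_exporter` only lists them, which is a safe way to review the commands first.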
- Visit the monitored server nodes to view the scraped content
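To read a single value without scrolling through the whole page, the exposition text can be filtered on the command line. This is a small sketch under two assumptions: the helper name metric_value is hypothetical, and it only handles metrics without labels, such as node_exporter's node_load1.

```shell
#!/usr/bin/env bash
# Sketch: print the value of an unlabelled metric from exposition-format
# text on stdin. metric_value is a hypothetical helper name.
metric_value() {
    # Field 1 is the metric name, field 2 its value; "#" metadata lines
    # never match because their first field is "#".
    awk -v name="$1" '$1 == name { print $2 }'
}
```

For example, assuming curl is installed, `curl -s http://192.168.74.122:9100/metrics | metric_value node_load1` prints the one-minute load average reported by server1.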