windows-exporter
https://github.com/prometheus-community/windows_exporter
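Installing and deploying windows-exporter
1. Download the installer
windows_exporter-0.22.0-amd64.msi
2. Double-click the .msi file to run the installer
After installation, open Task Manager; the windows_exporter service is listed there
3. Open http://127.0.0.1:9182/metrics to view the exported metrics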
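To check the endpoint from a script instead of a browser, the returned text can be scanned for metric names. The following is a minimal sketch, assuming the standard Prometheus text exposition format. The helper name metric_names is hypothetical and the sample payload is made up for illustration; its values are not real exporter output.

```python
def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "name{labels} value" or "name value".
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


# Made-up sample in the text exposition format (illustrative values only).
SAMPLE = """\
# HELP windows_cpu_time_total Time that processor spent in different modes
# TYPE windows_cpu_time_total counter
windows_cpu_time_total{core="0,0",mode="idle"} 12345.6
windows_os_physical_memory_free_bytes 2.147483648e+09
"""

found = metric_names(SAMPLE)
print(sorted(found))
```

Against a live host, the text could be fetched with urllib.request.urlopen("http://127.0.0.1:9182/metrics") and passed to the same function.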
Configure Prometheus to scrape the Windows metrics
Add a new job to Prometheus.yaml in the ConfigMap
- job_name: windows-exporter
  static_configs:
  - targets:
    - 192.168.1.1:9182
    - 192.168.1.2:9182
    - 192.168.1.3:9182
    - x.x.x.x:9182
Add the IP address of every Windows host to the Prometheus configuration file
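When the host list is long, the job entry can be generated rather than typed by hand. This is a small sketch only; the function name render_static_job is hypothetical, and the output is plain text that still has to be pasted into the scrape_configs section of the Prometheus configuration.

```python
def render_static_job(job_name, hosts, port):
    """Render a static_configs scrape job as YAML text for the given hosts."""
    lines = [
        f"- job_name: {job_name}",
        "  static_configs:",
        "  - targets:",
    ]
    for host in hosts:
        # One "host:port" target per line, indented under "targets".
        lines.append(f"    - {host}:{port}")
    return "\n".join(lines)


job_yaml = render_static_job(
    "windows-exporter", ["192.168.1.1", "192.168.1.2", "192.168.1.3"], 9182
)
print(job_yaml)
```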
Note
By default only the cpu, cs, logical_disk, net, os, service, system, and textfile collectors are enabled. To enable others, set --collectors.enabled. The steps below do this through a configuration file.
1. Create the file C:\Program Files\windows_exporter\config.yml
collectors:
  enabled: "[defaults],process,cpu_info,memory,remote_fx,tcp"
collector:
  service:
    services-where: "Name='windows_exporter'"
log:
  level: warn
2. Change the startup parameters of the Windows service
sc config windows_exporter binPath= "\"C:\Program Files\windows_exporter\windows_exporter.exe\" --config.file=\"C:\Program Files\windows_exporter\config.yml\" "
3. Restart the windows_exporter service
sc stop windows_exporter
sc start windows_exporter
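The enabled value in config.yml is a comma-separated list that starts with the [defaults] placeholder and then names each additional collector. As an illustration, the string can be assembled like this; the helper enabled_value is a hypothetical name, not part of windows_exporter.

```python
def enabled_value(extra_collectors):
    """Join "[defaults]" with extra collector names, dropping duplicates."""
    ordered = []
    for name in ["[defaults]", *extra_collectors]:
        if name not in ordered:
            ordered.append(name)
    return ",".join(ordered)


value = enabled_value(["process", "cpu_info", "memory", "remote_fx", "tcp"])
# Print the line as it would appear under "collectors:" in config.yml.
print(f'  enabled: "{value}"')
```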
II. One-step installation of windows_exporter with a PowerShell script (windows_exporter.ps1)
Set-Location -Path $Env:TEMP
# Download the installer into the current (TEMP) directory
(New-Object Net.WebClient).DownloadFile('http://foreman.chinamcloud.com:8080/source/windows_exporter-0.23.1-amd64.msi', $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath('./windows_exporter-0.23.1-amd64.msi'))
# Run the installer and wait for it to finish, so that the install directory exists before config.yml is written
Start-Process ./windows_exporter-0.23.1-amd64.msi -Wait
# Write config.yml (same content as in the manual steps)
Add-Content "C:\Program Files\windows_exporter\config.yml" "collectors:"
Add-Content "C:\Program Files\windows_exporter\config.yml" "  enabled: ""[defaults],process,cpu_info,memory,remote_fx,tcp"" "
Add-Content "C:\Program Files\windows_exporter\config.yml" "collector:"
Add-Content "C:\Program Files\windows_exporter\config.yml" "  service:"
Add-Content "C:\Program Files\windows_exporter\config.yml" "    services-where: ""Name='windows_exporter'"""
Add-Content "C:\Program Files\windows_exporter\config.yml" "log:"
Add-Content "C:\Program Files\windows_exporter\config.yml" "  level: warn"
sleep 30
# Point the service at the configuration file, then restart it
sc.exe config "windows_exporter" binpath= """""""C:\Program Files\windows_exporter\windows_exporter.exe"""""" --config.file=""""""C:\Program Files\windows_exporter\config.yml"""""""
sc.exe stop windows_exporter
sc.exe start windows_exporter
sleep 10
Common metrics
Host CPU usage (%)
100 - avg(irate(windows_cpu_time_total{job=~"$job",instance=~"$instance",mode="idle"}[5m]))*100
Host memory usage (%)
100 - 100 * (windows_os_physical_memory_free_bytes{job=~"$job"} / windows_cs_physical_memory_bytes{job=~"$job"})
Host disk usage (%)
100-(windows_logical_disk_free_bytes/windows_logical_disk_size_bytes)*100
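All three expressions follow the same pattern: 100 minus the free (or idle) share, expressed in percent. A worked example of that arithmetic in Python follows. The sample numbers are invented to keep the calculation easy to follow and are not measurements.

```python
def usage_percent(free, total):
    """Usage in percent, computed as 100 - free / total * 100."""
    return 100 - (free / total) * 100


# Hypothetical sample values (not taken from a real host).
idle_fraction = 0.75                  # average idle rate across cores
cpu_usage = 100 - idle_fraction * 100
mem_usage = usage_percent(free=4 * 1024**3, total=16 * 1024**3)
disk_usage = usage_percent(free=30 * 1024**3, total=120 * 1024**3)

print(cpu_usage, mem_usage, disk_usage)
```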
Handle count of MPC-related processes on the host
windows_process_handles{environment=~"$environment",job="windows-exporter",process=~'MPC.*|cloudia.*|APPBaseInTT.*|AppBaseInTT.*'}
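Prometheus regex matchers such as process=~'...' are anchored, so the pattern must match the whole label value, and matching is case-sensitive. The sketch below approximates this with Python's re.fullmatch. Python's regex engine is not the RE2 engine Prometheus uses, but for this simple pattern the behaviour is the same. The process names are hypothetical.

```python
import re

# The same alternation as in the process label matcher.
PATTERN = r"MPC.*|cloudia.*|APPBaseInTT.*|AppBaseInTT.*"


def selected(process_name):
    """True if the whole process name matches the pattern."""
    return re.fullmatch(PATTERN, process_name) is not None


# Hypothetical process names, for illustration only.
for name in ["MPCServer", "cloudia_agent", "AppBaseInTT64", "myMPC", "mpcserver"]:
    print(name, selected(name))
```

Names that merely contain "MPC" in the middle, or that differ in case, are not selected; a leading .* or an extra alternative would be needed for those.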
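Installing and deploying node-exporter
Monitoring Linux machines that are not part of the k8s cluster
https://github.com/prometheus/node_exporter
node_exporter: a metrics collector for monitoring Linux systems.
Common metrics:
• CPU
• Memory
• Disk
• Network traffic
• File descriptors
• System load
• System services
Metrics endpoint: http://IP:9100/metrics
Option 1: install on the Linux host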
# Download the client (x86_64 hosts)
wget --no-check-certificate http://foreman.chinamcloud.com:8080/source/node_exporter-1.6.1.linux-amd64.tar.gz
# Create a dedicated service user without a login shell
useradd prometheus -s /sbin/nologin
# Unpack the archive (x86_64 hosts)
tar zxvf node_exporter-1.6.1.linux-amd64.tar.gz -C /tmp/
# Move it to the install directory (x86_64 hosts)
mv /tmp/node_exporter-1.6.1.linux-amd64 /usr/local/node_exporter
# Give the service user ownership of the directory
chown prometheus:prometheus -R /usr/local/node_exporter
# Create the systemd service
tee /etc/systemd/system/node-exporter.service <<-'EOF'
[Unit]
Description=Prometheus Node Exporter
After=network.target
[Service]
ExecStart=/usr/local/node_exporter/node_exporter
User=prometheus
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable node-exporter
systemctl start node-exporter
To verify monitoring, open http://ip:9100/metrics
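Option 2: run as a Docker container
The --path.* flags tell node_exporter to read the host's /proc, /sys, and root filesystem from the mounted paths instead of the container's own.
docker run -d --name node-exporter --restart=always -p 9100:9100 -v "/proc:/host/proc:ro" -v "/sys:/host/sys:ro" -v "/:/rootfs:ro" prom/node-exporter --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/rootfs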
Configure Prometheus to scrape the Linux metrics
Add a new job to Prometheus.yaml in the ConfigMap
- job_name: linux-exporter
  static_configs:
  - targets:
    - 192.168.1.1:9100
    - 192.168.1.2:9100
    - 192.168.1.3:9100
    - x.x.x.x:9100
Add the IP address of every Linux host to the configuration file
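Common metrics
In the expressions below, replace "project-tag" with the environment label value of your project.
Host memory usage > 90%
(1 - (node_memory_MemAvailable_bytes{environment=~"project-tag"} / (node_memory_MemTotal_bytes{environment=~"project-tag"})))* 100 >90
Host CPU usage > 90%
(1 - avg(rate(node_cpu_seconds_total{environment=~"project-tag",mode="idle"}[5m])) by (instance))*100 >90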
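rate() needs a range selector such as [5m] because it computes a per-second increase between counter samples inside that window. For the idle counter, which counts seconds, that rate is the fraction of time the CPU was idle. The sketch below reproduces the idea in simplified form: it ignores counter resets and the extrapolation Prometheus applies at window edges, and the sample values are hypothetical.

```python
def idle_rate(samples):
    """Per-second increase between the first and last (timestamp, value) pair."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def cpu_usage_percent(per_cpu_samples):
    """(1 - average idle rate over all CPUs) * 100."""
    rates = [idle_rate(s) for s in per_cpu_samples]
    avg_idle = sum(rates) / len(rates)
    return (1 - avg_idle) * 100


# Hypothetical idle counters (seconds) for two CPUs over a 300 s window.
samples = [
    [(0, 1000.0), (300, 1075.0)],   # idle for 75 of 300 s  -> rate 0.25
    [(0, 2000.0), (300, 2225.0)],   # idle for 225 of 300 s -> rate 0.75
]
usage = cpu_usage_percent(samples)
print(usage)
```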
Host NTP time offset [5m] > 3s
abs(node_timex_offset_seconds{environment=~"project-tag"}) > 3
Host disk usage > 80%
(node_filesystem_size_bytes{environment=~"project-tag",fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{environment=~"project-tag",fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes{environment=~"project-tag",fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}+(node_filesystem_size_bytes{environment=~"project-tag",fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{environment=~"project-tag",fstype=~"ext.*|xfs",mountpoint !~".*pod.*"})) >80
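The disk expression divides used space by (available + used) rather than by total size. Space reserved for root counts as free but not as available, so this form gives a higher percentage than used / size, in line with what df reports. A worked example with invented numbers:

```python
def disk_usage_percent(size, free, avail):
    """used * 100 / (avail + used), where used = size - free."""
    used = size - free
    return used * 100 / (avail + used)


# Hypothetical filesystem: 100 units in total, 20 free, of which only 15 are
# available to unprivileged users (5 are reserved for root).
pct = disk_usage_percent(size=100, free=20, avail=15)
naive = (100 - 20) * 100 / 100   # plain used / size, for comparison

print(round(pct, 2), naive)
```

With these numbers the plain ratio sits exactly at 80 and would not trigger a "> 80" rule, while the expression used in this document does.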
Host inode usage [5m] > 80%
(1-node_filesystem_files_free{environment=~"project-tag",fstype=~"ext.?|xfs"} / node_filesystem_files{environment=~"project-tag",fstype=~"ext.?|xfs"})*100 >80
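Installing and deploying blackbox-exporter
https://github.com/prometheus/blackbox_exporter
Blackbox Exporter is the official black-box monitoring solution provided by the Prometheus community. It lets users probe endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.
Use cases
- HTTP tests
  - Define request headers; check the HTTP status, HTTP response headers, and HTTP body content
- TCP tests
  - Monitor the port status of service components; define and monitor application-layer protocols
- ICMP tests
  - Host liveness checks
- POST tests
  - Interface connectivity
- SSL certificate expiry time
To run Blackbox Exporter, the user must provide the probe configuration. This may include custom HTTP headers, TLS settings needed for probing, or the probe's own validation behavior. In Blackbox Exporter each probe configuration is called a module and is supplied in a YAML configuration file. Each module mainly contains the probe type (prober), the timeout for the check (timeout), and the settings specific to that probe type: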
# Probe type: http, tcp, dns, icmp.
prober: <prober_string>
# Timeout
[ timeout: <duration> ]
# Detailed probe settings; at most one of the following may be configured
[ http: <http_probe> ]
[ tcp: <tcp_probe> ]
[ dns: <dns_probe> ]
[ icmp: <icmp_probe> ]
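The two rules in this schema (the prober must be one of the listed types, and at most one probe section may be present) can be checked mechanically before a configuration is loaded. The following is an illustrative check written against the schema as described here. It is not Blackbox Exporter's own validation, and check_module is a hypothetical helper name.

```python
# Probe types as listed in the module schema in this document.
PROBER_KEYS = ("http", "tcp", "dns", "icmp")


def check_module(module):
    """Return a list of problems found in one module definition (a dict)."""
    problems = []
    prober = module.get("prober")
    if prober not in PROBER_KEYS:
        problems.append(f"unknown prober: {prober!r}")
    # Collect every probe-specific section that is present in the module.
    configured = [key for key in PROBER_KEYS if key in module]
    if len(configured) > 1:
        problems.append(f"more than one probe section: {configured}")
    return problems


ok = check_module({"prober": "http", "timeout": "5s", "http": {"method": "GET"}})
bad_prober = check_module({"prober": "https", "timeout": "5s"})
two_sections = check_module({"prober": "tcp", "tcp": {}, "http": {}})
print(ok, bad_prober, two_sections)
```

Note that "https" is not a probe type: HTTPS targets are probed with the http prober and an https:// target URL.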
Deploying blackbox-exporter
1. In Rancher, import the following blackbox-exporter.yml file into Prometheus
apiVersion: apps/v1
kind: Deployment
metadata:
  generation: 1
  labels:
    cattle.io/creator: norman
    workload.user.cattle.io/workloadselector: deployment-prometheus-blackbox-exporter
  name: blackbox-exporter
  namespace: prometheus
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: deployment-prometheus-blackbox-exporter
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        field.cattle.io/ports: '[[{"containerPort":9115,"dnsName":"blackbox-exporter","hostPort":0,"kind":"ClusterIP","name":"blackbox-port","protocol":"TCP","sourcePort":0}]]'
      creationTimestamp: null
      labels:
        app.kubernetes.io/name: blackbox
        workload.user.cattle.io/workloadselector: deployment-prometheus-blackbox-exporter
    spec:
      containers:
      - args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        - --log.level=info
        - --web.listen-address=:9115
        image: prom/blackbox-exporter:v0.16.0
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 2
          successThreshold: 1
          tcpSocket:
            port: 9115
          timeoutSeconds: 2
        name: blackbox-exporter
        ports:
        - containerPort: 9115
          name: blackbox-port
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 2
          successThreshold: 2
          tcpSocket:
            port: 9115
          timeoutSeconds: 2
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: {}
          privileged: false
          readOnlyRootFilesystem: false
          runAsNonRoot: false
        stdin: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        tty: true
        volumeMounts:
        - mountPath: /etc/blackbox_exporter
          name: vol1
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: blackbox-exporter
          optional: false
        name: vol1
2. In Rancher, import the following YAML to add the blackbox-exporter Service (service discovery entry)
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app.kubernetes.io/name: blackbox
  name: blackbox-exporter
  namespace: prometheus
  selfLink: /api/v1/namespaces/prometheus/services/blackbox-exporter
spec:
  ports:
  - name: blackbox
    port: 9115
    protocol: TCP
    targetPort: 9115
  selector:
    app.kubernetes.io/name: blackbox
  sessionAffinity: None
  type: ClusterIP
Under Prometheus, add a ConfigMap named blackbox-exporter; it is mounted at /etc/blackbox_exporter
blackbox.yml
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [200,301,302,403,405]
      method: GET
      preferred_ip_protocol: "ipv4"
  https_2xx:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [200,301,302,403,405]
      method: POST
      preferred_ip_protocol: "ipv4"
  tcp_connect:
    prober: tcp
    timeout: 2s
  http_403:
    prober: http
    timeout: 2s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [403,405]
      method: GET
      preferred_ip_protocol: "ipv4"
  https_403:
    prober: http
    timeout: 2s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [403,405]
      method: GET
      preferred_ip_protocol: "ipv4"
  icmp:
    prober: icmp
Add jobs to the Prometheus ConfigMap
To monitor certificates, edit prometheus.yml in the Prometheus ConfigMap
- job_name: "blackbox_ssl"
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
  - targets:
    - https://login.demo.chinamcloud.cn
  relabel_configs:
  - source_labels: [__address__]
    target_label: instance
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox-exporter:9115
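The three relabel rules turn each listed target into a query parameter and redirect the scrape itself to the exporter. The sketch below walks through that transformation in Python to show the URL Prometheus ends up requesting. It is a simplified model of relabeling for this one job, assuming the default http scheme; probe_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def probe_url(target, module, exporter="blackbox-exporter:9115"):
    """Apply the three relabel rules and return (scrape URL, instance label)."""
    labels = {"__address__": target}
    # Rule 1: copy the original target into the "instance" label.
    labels["instance"] = labels["__address__"]
    # Rule 2: pass the original target as the "target" URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 3: scrape the exporter instead of the target itself.
    labels["__address__"] = exporter
    query = urlencode({"module": module, "target": labels["__param_target"]})
    return f"http://{labels['__address__']}/probe?{query}", labels["instance"]


url, instance = probe_url("https://login.demo.chinamcloud.cn", "http_2xx")
print(url)
print(instance)
```

Requesting the same URL by hand from inside the cluster is a quick way to debug a module before wiring it into a job.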
To check TCP ports, edit prometheus.yml in the Prometheus ConfigMap
- job_name: "tcp-port-check"
  scrape_interval: 30s
  metrics_path: /probe
  params:
    module: [tcp_connect]
  static_configs:
  - targets:
    - mpc.server:8088
    - mysql.server:3306
    - redis.server:6379
    - kafka.kafka:9092
    labels:
      server: 'tcp-port-check'
  relabel_configs:
  - source_labels: [__address__]
    target_label: instance
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox-exporter:9115
POST test: probe a business API endpoint to determine whether the interface is online. The module http_post_2xx_query referenced here must be defined in blackbox.yml; it is not among the modules listed in this document.
- job_name: 'blackbox_http_2xx_post'
  scrape_interval: 10s
  metrics_path: /probe
  params:
    module: [http_post_2xx_query]
  static_configs:
  - targets:
    - https://xx.xxx.com/api/xx/xx/xx/query.action
    labels:
      group: 'Interface monitoring'
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox-exporter:9115
Common metrics
probe_success == 0 # connectivity check failed
probe_success == 1 # connectivity check succeeded
Certificate expires in fewer than 30 days
(probe_ssl_earliest_cert_expiry-time()) / 86400 <30
HTTP status code
probe_http_status_code
Probe duration
probe_duration_seconds{job=~"blackbox_ssl"}
HTTP request duration
probe_http_duration_seconds
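The certificate expression works because probe_ssl_earliest_cert_expiry is a Unix timestamp: subtracting the current time gives the remaining seconds, and dividing by 86400 converts them to days. The same arithmetic in Python, with hypothetical timestamps chosen for illustration:

```python
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(expiry_unix_ts, now=None):
    """Remaining certificate lifetime in days (negative once expired)."""
    if now is None:
        now = time.time()
    return (expiry_unix_ts - now) / SECONDS_PER_DAY


def expiring_soon(expiry_unix_ts, now=None, threshold_days=30):
    """Mirror of the '< 30' comparison in the expression."""
    return days_until_expiry(expiry_unix_ts, now) < threshold_days


# Hypothetical timestamps: "now" is fixed so the example is reproducible.
now = 1_700_000_000
expiry = now + 20 * SECONDS_PER_DAY

print(days_until_expiry(expiry, now), expiring_soon(expiry, now))
```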