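Overview
An eye-catching demo is shown on the official site.
Project GitHub repository: https://github.com/sqshq/sampler
In short, Sampler runs the commands listed in a configuration file at a specified rate, renders their output as charts, and lets you define alert conditions.
In day-to-day work, operations such as releases and changes often call for watching a few key metrics for a while. For that scenario Sampler sits between two alternatives. Compared with a full monitoring system, it needs no extra packages or database, and because it runs the same commands you would run by hand, there is no mismatch in how a metric is defined. Compared with running the commands directly, it is more visual and more flexible.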
Installation
On Linux, installation is very simple: download the binary and make it executable:
sudo wget https://github.com/sqshq/sampler/releases/download/v1.1.0/sampler-1.1.0-linux-amd64 -O /usr/local/bin/sampler
sudo chmod +x /usr/local/bin/sampler
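Note: Sampler's alerting feature (specifically, playing the alert sound) requires the libasound2-dev package.
Usage
Usage is just as simple: prepare a configuration file in YAML format (for example, config.yml), then run sampler -c config.yml. Press q to exit.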
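To make the workflow concrete, here is a minimal sketch of a config.yml for the post-release observation scenario described in the overview. The health-check URL and the log path are hypothetical placeholders, not part of Sampler or of the official examples; substitute your own service's values.

```yaml
# config.yml: a minimal sketch for watching a service after a release.
# ASSUMPTIONS: http://localhost:8080/health and /var/log/myapp/app.log are
# hypothetical; replace them with your own endpoint and log file.
runcharts:
  - title: Health endpoint response time (s)
    rate-ms: 1000   # run the sample command once per second
    scale: 3        # show three digits after the decimal point
    items:
      - label: health
        sample: curl -o /dev/null -s -w '%{time_total}' http://localhost:8080/health
textboxes:
  - title: Latest errors
    rate-ms: 2000
    sample: tail -n 200 /var/log/myapp/app.log | grep ERROR | tail -n 5
```

Save it as config.yml and start it with sampler -c config.yml. A chart item is expected to print a single number on each run, while a textbox simply shows whatever text its command prints.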
The official site gives example configurations for several chart types:
Runchart
Example: measuring network latency
runcharts:
  - title: Search engine response time
    rate-ms: 500        # sampling rate, default = 1000
    scale: 2            # number of digits after sample decimal point, default = 1
    legend:
      enabled: true     # enables item labels, default = true
      details: false    # enables item statistics: cur/min/max/dlt values, default = true
    items:
      - label: GOOGLE
        sample: curl -o /dev/null -s -w '%{time_total}' https://www.google.com
        color: 178      # 8-bit color number, default one is chosen from a pre-defined palette
      - label: YAHOO
        sample: curl -o /dev/null -s -w '%{time_total}' https://search.yahoo.com
      - label: BING
        sample: curl -o /dev/null -s -w '%{time_total}' https://www.bing.com
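The alert conditions mentioned in the overview are configured in Sampler as triggers attached to a component. The sketch below is adapted from the triggers example in the project README as best I recall it, so the key names are worth verifying against the current README. The 0.3 second threshold and the log file path are illustrative assumptions.

```yaml
runcharts:
  - title: Search engine response time
    rate-ms: 500
    scale: 2
    items:
      - label: GOOGLE
        sample: curl -o /dev/null -s -w '%{time_total}' https://www.google.com
    triggers:
      - title: Latency threshold exceeded
        # Fires when the value crosses 0.3 s upward; the condition prints 1 for TRUE.
        condition: echo "$prev < 0.3 && $cur > 0.3" | bc -l
        actions:
          terminal-bell: true   # standard terminal bell
          sound: true           # alert sound
          visual: true          # on-screen notification over the component
          # Arbitrary script; /tmp/sampler-alerts.log is a hypothetical path.
          script: echo "$label latency is $cur s" >> /tmp/sampler-alerts.log
```

Comparing both $prev and $cur against the threshold makes the trigger fire once on the crossing instead of on every sample that stays above it.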
Sparkline
Example: displaying CPU and memory usage
sparklines:
  - title: CPU usage
    rate-ms: 200
    scale: 0
    sample: ps -A -o %cpu | awk '{s+=$1} END {print s}'
  - title: Free memory pages
    rate-ms: 200
    scale: 0
    sample: memory_pressure | grep 'Pages free' | awk '{print $3}'
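Note that memory_pressure is a macOS command, so the second sparkline above will not work on Linux as written. A possible Linux variant, assuming a kernel that exposes MemAvailable in /proc/meminfo (3.14 or later), might look like this:

```yaml
sparklines:
  - title: CPU usage
    rate-ms: 1000
    scale: 0
    sample: ps -A -o %cpu | awk '{s+=$1} END {print s}'
  - title: Available memory (kB)
    rate-ms: 1000
    scale: 0
    # ASSUMPTION: Linux with a MemAvailable line in /proc/meminfo.
    sample: awk '/^MemAvailable:/ {print $2}' /proc/meminfo
```

The sampling rate is relaxed to one second here only to keep the overhead of spawning ps low; that is a choice for this sketch, not a Sampler requirement.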
Barchart
Example: displaying TCP and UDP traffic
barcharts:
  - title: Local network activity
    rate-ms: 500        # sampling rate, default = 1000
    scale: 0            # number of digits after sample decimal point, default = 1
    items:
      - label: UDP bytes in
        sample: nettop -J bytes_in -l 1 -m udp | awk '{sum += $4} END {print sum}'
      - label: UDP bytes out
        sample: nettop -J bytes_out -l 1 -m udp | awk '{sum += $4} END {print sum}'
      - label: TCP bytes in
        sample: nettop -J bytes_in -l 1 -m tcp | awk '{sum += $4} END {print sum}'
      - label: TCP bytes out
        sample: nettop -J bytes_out -l 1 -m tcp | awk '{sum += $4} END {print sum}'
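nettop is also macOS-only. On Linux there is no drop-in equivalent that splits byte counts by protocol, so the sketch below swaps in a different metric: socket counts taken from ss. It is an illustration of the barchart layout on Linux, not a replacement for the traffic figures above, and it assumes an iproute2 version whose ss supports the -H (no header) option.

```yaml
barcharts:
  - title: Local socket counts
    rate-ms: 1000
    scale: 0
    items:
      - label: TCP established
        sample: ss -H -t state established | wc -l
      - label: TCP listening
        sample: ss -H -t -l | wc -l
      - label: UDP sockets
        sample: ss -H -u -a | wc -l
```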
Gauge
Example: displaying time progress
gauges:
  - title: Minute progress
    rate-ms: 500        # sampling rate, default = 1000
    scale: 2            # number of digits after sample decimal point, default = 1
    percent-only: false # toggle display of the current value, default = false
    color: 178          # 8-bit color number, default one is chosen from a pre-defined palette
    cur:
      sample: date +%S  # sample script for current value
    max:
      sample: echo 60   # sample script for max value
    min:
      sample: echo 0    # sample script for min value
  - title: Year progress
    cur:
      sample: date +%j
    max:
      sample: echo 365
    min:
      sample: echo 0
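The same cur/max/min structure fits operational metrics that have a natural upper bound. As a sketch, a gauge for root filesystem usage could look like the following. It assumes GNU coreutils, since the --output option of df is a GNU extension.

```yaml
gauges:
  - title: Root filesystem usage (%)
    rate-ms: 5000
    scale: 0
    cur:
      # ASSUMPTION: GNU df; keeps only the digits of the Use% column.
      sample: df --output=pcent / | tail -n 1 | tr -dc '0-9'
    max:
      sample: echo 100
    min:
      sample: echo 0
```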
Textbox
Example: displaying local weather and Docker container stats
textboxes:
  - title: Local weather
    rate-ms: 10000      # sampling rate, default = 1000
    sample: curl wttr.in?0ATQF
    border: false       # border around the item, default = true
    color: 178          # 8-bit color number, default is white
  - title: Docker containers stats
    rate-ms: 500
    sample: docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.PIDs}}"
Asciibox
Example: displaying the time
asciiboxes:
  - title: UTC time
    rate-ms: 500        # sampling rate, default = 1000
    font: 3d            # font type, default = 2d
    border: false       # border around the item, default = true
    color: 43           # 8-bit color number, default is white
    sample: env TZ=UTC date +%r
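More configuration options are described on the project home page.
Other applications
Besides plain shell commands, the project home page also provides configuration examples for several other applications.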
MySQL
# prerequisite: installed mysql shell
variables:
  mysql_connection: mysql -u root -s --database mysql --skip-column-names
sparklines:
  - title: MySQL (random number example)
    pty: true
    init: $mysql_connection
    sample: select rand();
PostgreSQL
# prerequisite: installed psql shell
variables:
  PGPASSWORD: pwd
  postgres_connection: psql -h localhost -U postgres --no-align --tuples-only
sparklines:
  - title: PostgreSQL (random number example)
    init: $postgres_connection
    sample: select random();
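The random-number query only demonstrates the mechanism: as I understand the README, init is executed once to open the interactive session, and sample is then sent to that session at every tick. For something closer to a real observation task, the query can be swapped for a metric. The sketch below counts active sessions via the standard pg_stat_activity view. It reuses the connection settings from the example above, and the choice of metric is my own illustration.

```yaml
# prerequisite: installed psql shell
variables:
  PGPASSWORD: pwd
  postgres_connection: psql -h localhost -U postgres --no-align --tuples-only
sparklines:
  - title: PostgreSQL active sessions
    rate-ms: 1000
    scale: 0
    init: $postgres_connection
    sample: select count(*) from pg_stat_activity where state = 'active';
```

The --no-align and --tuples-only options matter here because they strip headers and table borders, so each query returns a bare number that the chart can parse.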
Kafka
variables:
  kafka_connection: $KAFKA_HOME/bin/kafka-consumer-groups --bootstrap-server localhost:9092
runcharts:
  - title: Kafka lag per consumer group
    rate-ms: 5000
    scale: 0
    items:
      - label: A->B
        sample: $kafka_connection --group group_a --describe | awk 'NR>1 {sum += $5} END {print sum}'
      - label: B->C
        sample: $kafka_connection --group group_b --describe | awk 'NR>1 {sum += $5} END {print sum}'
      - label: C->D
        sample: $kafka_connection --group group_c --describe | awk 'NR>1 {sum += $5} END {print sum}'
SSH
Besides local commands, Sampler can also collect information from remote hosts over SSH:
variables:
  sshconnection: ssh -i ~/my-key-pair.pem ec2-user@1.2.3.4
textboxes:
  - title: SSH
    pty: true
    init: $sshconnection
    sample: top
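The textbox above streams an interactive top session. To chart a numeric metric from a remote host, one simple option is to run a one-shot SSH command per sample, reusing the variable the same way the Kafka example reuses $kafka_connection. This is a sketch under assumptions: the remote host is Linux (it reads /proc/loadavg), key-based login works without prompts, and the cost of opening a new connection for every sample is acceptable.

```yaml
variables:
  sshconnection: ssh -i ~/my-key-pair.pem ec2-user@1.2.3.4
runcharts:
  - title: Remote load average
    rate-ms: 2000   # relaxed rate, since each sample opens a new SSH connection
    scale: 2
    items:
      - label: 1 min
        # awk runs locally on the text returned by the remote cat.
        sample: $sshconnection cat /proc/loadavg | awk '{print $1}'
```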