Because host resources in this environment are tight and the VMs compete heavily for them, file systems often become read-only.
- On the zabbix-agent host:
mkdir -p /usr/local/zabbix/scripts
cd /usr/local/zabbix/scripts
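cat > disk_health_check.py <<'EOF'
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Read-only filesystem check: prints 0 when healthy, 1 on failure.
# It only tests the filesystem that holds LOG_PATH.
import time

LOG_PATH = '/usr/local/zabbix/scripts/disk_health_check.log'

try:
    old = str(time.time())
    # Write a fresh timestamp; this fails on a read-only filesystem.
    fileDisk = open(LOG_PATH, 'w')
    fileDisk.write(old)
    fileDisk.close()  # flush the write before reading back
    # Read the value back and compare it with what was written.
    fileDisk = open(LOG_PATH)
    new = fileDisk.read()
    fileDisk.close()
    if old == new:
        print('0')
    else:
        print('1')
except Exception:
    print('1')
EOF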
chmod +x disk_health_check.py
cd /etc/zabbix/zabbix_agentd.d/
echo "UserParameter=disk.health.check,/usr/bin/python /usr/local/zabbix/scripts/disk_health_check.py" > disk.conf
chown -R zabbix:zabbix /usr/local/zabbix/
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
setenforce 0
systemctl restart zabbix-agent
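The write-then-read-back idea used by the script can also be expressed as a reusable function. The following is only a sketch, not part of the original setup: the function name `check_writable` and the use of a temporary probe file with `os.fsync` are assumptions made for illustration.

```python
import os
import tempfile
import time


def check_writable(directory):
    """Return 0 if a probe file can be written and read back in directory, else 1.

    Hypothetical helper: same 0/1 convention as disk_health_check.py.
    """
    token = str(time.time())
    try:
        # Create the probe file inside the directory under test.
        fd, path = tempfile.mkstemp(prefix="disk_health_", dir=directory)
        try:
            with os.fdopen(fd, "w") as probe:
                probe.write(token)
                probe.flush()
                os.fsync(probe.fileno())  # ask the OS to push the data to the device
            with open(path) as probe:
                ok = probe.read() == token
        finally:
            os.remove(path)
        return 0 if ok else 1
    except (IOError, OSError):
        # A read-only or missing directory ends up here.
        return 1


if __name__ == "__main__":
    print(check_writable("/usr/local/zabbix/scripts"))
```

Passing the directory as an argument would make it possible to probe other mount points with the same logic. That would need one UserParameter per path, which the setup above does not define.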
- Import the following template into Zabbix (Configuration > Templates > Import):
<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
<version>4.2</version>
<date>2019-05-21T02:13:55Z</date>
<groups>
<group>
<name>Template-Hardware</name>
</group>
</groups>
<templates>
<template>
<template>DiskHealth-Check</template>
<name>DiskHealth-Check</name>
<description>Disk write check output: 0 means healthy, 1 means failure</description>
<groups>
<group>
<name>Template-Hardware</name>
</group>
</groups>
<applications>
<application>
<name>diskHealth</name>
</application>
</applications>
<items>
<item>
<name>File system read-only check</name>
<type>0</type>
<snmp_community/>
<snmp_oid/>
<key>disk.health.check</key>
<delay>120</delay>
<history>90d</history>
<trends>365d</trends>
<status>0</status>
<value_type>3</value_type>
<allowed_hosts/>
<units/>
<snmpv3_contextname/>
<snmpv3_securityname/>
<snmpv3_securitylevel>0</snmpv3_securitylevel>
<snmpv3_authprotocol>0</snmpv3_authprotocol>
<snmpv3_authpassphrase/>
<snmpv3_privprotocol>0</snmpv3_privprotocol>
<snmpv3_privpassphrase/>
<params/>
<ipmi_sensor/>
<authtype>0</authtype>
<username/>
<password/>
<publickey/>
<privatekey/>
<port/>
<description/>
<inventory_link>0</inventory_link>
<applications>
<application>
<name>diskHealth</name>
</application>
</applications>
<valuemap/>
<logtimefmt/>
<preprocessing/>
<jmx_endpoint/>
<timeout>3s</timeout>
<url/>
<query_fields/>
<posts/>
<status_codes>200</status_codes>
<follow_redirects>1</follow_redirects>
<post_type>0</post_type>
<http_proxy/>
<headers/>
<retrieve_mode>0</retrieve_mode>
<request_method>0</request_method>
<output_format>0</output_format>
<allow_traps>0</allow_traps>
<ssl_cert_file/>
<ssl_key_file/>
<ssl_key_password/>
<verify_peer>0</verify_peer>
<verify_host>0</verify_host>
<master_item/>
</item>
</items>
<discovery_rules/>
<httptests/>
<macros/>
<templates/>
<screens/>
<tags/>
</template>
</templates>
<triggers>
<trigger>
<expression>{DiskHealth-Check:disk.health.check.count(#2,1,"eq")}>1</expression>
<recovery_mode>0</recovery_mode>
<recovery_expression/>
<name>File system read-only</name>
<correlation_mode>0</correlation_mode>
<correlation_tag/>
<url/>
<status>0</status>
<priority>5</priority>
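<description>Alerts when, of the last 2 item values, more than 1 (that is, both) equal 1, so an alert is raised only after two consecutive failed checks</description>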
<type>0</type>
<manual_close>0</manual_close>
<dependencies/>
<tags/>
</trigger>
</triggers>
<graphs>
<graph>
<name>File system read-only check</name>
<width>900</width>
<height>200</height>
<yaxismin>0.0000</yaxismin>
<yaxismax>100.0000</yaxismax>
<show_work_period>1</show_work_period>
<show_triggers>1</show_triggers>
<type>0</type>
<show_legend>1</show_legend>
<show_3d>0</show_3d>
<percent_left>0.0000</percent_left>
<percent_right>0.0000</percent_right>
<ymin_type_1>0</ymin_type_1>
<ymax_type_1>0</ymax_type_1>
<ymin_item_1>0</ymin_item_1>
<ymax_item_1>0</ymax_item_1>
<graph_items>
<graph_item>
<sortorder>0</sortorder>
<drawtype>3</drawtype>
<color>00C800</color>
<yaxisside>0</yaxisside>
<calc_fnc>7</calc_fnc>
<type>0</type>
<item>
<host>DiskHealth-Check</host>
<key>disk.health.check</key>
</item>
</graph_item>
</graph_items>
</graph>
</graphs>
</zabbix_export>
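The trigger expression in the template, `count(#2,1,"eq")>1`, fires only when both of the last two collected values equal 1. With the item polled every 120 seconds, a single failed check does not alert. The sketch below mimics that rule in plain Python to make the behaviour easier to reason about. The function name `trigger_fires` and its parameters are made up for illustration and are not a Zabbix API.

```python
def trigger_fires(values, window=2, bad_value=1, threshold=1):
    """Mimic count(#2,1,"eq") > 1 on a history of item values (oldest first).

    Hypothetical helper: window, bad_value and threshold mirror the
    numbers in the template's trigger expression.
    """
    recent = values[-window:]  # the last `window` collected values
    matches = sum(1 for v in recent if v == bad_value)
    return matches > threshold


if __name__ == "__main__":
    for history in ([0, 0], [0, 1], [1, 0], [1, 1]):
        print(history, trigger_fires(history))
```

Under these assumptions only the history `[1, 1]` fires, which matches the "two consecutive failures" intent stated in the trigger description.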
- You can test it from the zabbix-server first:
[root@0f3c27f24c08 alertscripts]# zabbix_get -s 172.16.6.24 -k disk.health.check
0
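When testing by hand it helps to separate "the disk is unhealthy" from "the item itself is not working". The sketch below maps the text returned for `disk.health.check` to a short status. The function name `interpret_check_output` and the status strings are assumptions made for illustration. It treats anything other than `0` or `1` (for example an empty reply, or the `ZBX_NOTSUPPORTED` message that zabbix_get typically prints for an unknown key) as a broken check.

```python
def interpret_check_output(raw):
    """Map the text returned for disk.health.check to a short status string.

    Hypothetical helper; the status wording is illustrative only.
    """
    value = raw.strip()
    if value == "0":
        return "ok"
    if value == "1":
        return "read-only or unwritable"
    # Empty output, ZBX_NOTSUPPORTED, connection errors and so on.
    return "check not working: " + (value or "empty output")


if __name__ == "__main__":
    print(interpret_check_output("0\n"))
```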
- Link the hosts to the newly created template.