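Configuring Nagios on Redhat 5.2
-
Installing system components
-
Installing apache and gcc
-
Configuring apache
-
Installing the Nagios components
-
Installing nagios
-
Installing nagios-plugins
-
Installing nrpe
-
Configuring monitored hosts
-
Configuring a Linux monitored host (client)
-
Configuring a switch as a monitored host (client)
-
Configuring a Windows monitored host (client)
-
Configuring mail notification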
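Installing components
Apache, gcc, glibc, glibc-common, gd-devel, openssl, openssl-devel (the openssl installation is not shown below)
net-snmp-libs net-snmp-devel net-snmp net-snmp-utils
Installing apache and gcc
(Resolving rpm dependencies by hand is tedious; once yum is configured this becomes much easier. See the separate notes on setting up yum.)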
| [root@localhost ~]# yum install httpd …… Downloading Packages: (1/4): apr-util-1.2.7-7.e 100% |=========================| 76 kB 00:00 (2/4): apr-1.2.7-11.i386. 100% |=========================| 123 kB 00:00 (3/4): httpd-2.2.3-11.el5 100% |=========================| 1.1 MB 00:01 (4/4): postgresql-libs-8. 100% |=========================| 196 kB 00:00 warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID 37017186 Importing GPG key 0x37017186 "Red Hat, Inc. (release key) <security@redhat.com>" from http://10.155.2.75/Server/RPM-GPG-KEY-redhat-release Is this ok [y/N]: y Running rpm_check_debug Running Transaction Test Finished Transaction Test Transaction Test Succeeded Running Transaction Installing: apr ######################### [1/4] Installing: postgresql-libs ######################### [2/4] Installing: apr-util ######################### [3/4] Installing: httpd ######################### [4/4] …… |
| [root@localhost ~]# yum install gcc …… Downloading Packages: (1/5): libgomp-4.1.2-42.e 100% |=========================| 82 kB 00:00 (2/5): glibc-headers-2.5- 100% |=========================| 610 kB 00:00 (3/5): glibc-devel-2.5-24 100% |=========================| 2.0 MB 00:02 (4/5): gcc-4.1.2-42.el5.i 100% |=========================| 5.2 MB 00:07 (5/5): kernel-headers-2.6 100% |=========================| 843 kB 00:01 Running rpm_check_debug Running Transaction Test Finished Transaction Test Transaction Test Succeeded Running Transaction Installing: libgomp ######################### [1/5] Installing: kernel-headers ######################### [2/5] Installing: glibc-headers ######################### [3/5] Installing: glibc-devel ######################### [4/5] Installing: gcc ######################### [5/5]
Installed: gcc.i386 0:4.1.2-42.el5 Dependency Installed: glibc-devel.i386 0:2.5-24 glibc-headers.i386 0:2.5-24 kernel-headers.i386 0:2.6.18-92.el5 libgomp.i386 0:4.1.2-42.el5 Complete! |
| [root@localhost ~]# yum install glibc glibc-common gd-devel …… Running Transaction Installing: zlib-devel ####################### [ 1/12] Installing: freetype-devel ####################### [ 2/12] Installing: fontconfig-devel ####################### [ 3/12] Installing: libpng-devel ####################### [ 4/12] Installing: libXau-devel ####################### [ 5/12] Installing: libjpeg-devel ####################### [ 6/12] Installing: xorg-x11-proto-devel ####################### [ 7/12] Installing: libX11-devel ####################### [ 8/12] Installing: libXpm-devel ####################### [ 9/12] Installing: libXdmcp-devel ####################### [10/12] Installing: mesa-libGL-devel ####################### [11/12] Installing: gd-devel ####################### [12/12]
Installed: gd-devel.i386 0:2.0.33-9.4.el5_1.1 Dependency Installed: fontconfig-devel.i386 0:2.4.1-7.el5 freetype-devel.i386 0:2.2.1-19.el5 libX11-devel.i386 0:1.0.3-9.el5 libXau-devel.i386 0:1.0.1-3.1 libXdmcp-devel.i386 0:1.0.1-2.1 libXpm-devel.i386 0:3.5.5-3 libjpeg-devel.i386 0:6b-37 libpng-devel.i386 2:1.2.10-7.1.el5_0.1 mesa-libGL-devel.i386 0:6.5.1-7.5.el5 xorg-x11-proto-devel.i386 0:7.1-9.fc6 zlib-devel.i386 0:1.2.3-3 Complete! |
Configuring apache
Note: add the apache user to the nagcmd group so that commands issued through the web interface have sufficient permissions to operate Nagios.
| [root@localhost ~]# useradd nagios && passwd nagios //chinahr123$ Changing password for user nagios. New UNIX password: BAD PASSWORD: it is based on a dictionary word Retype new UNIX password: Sorry, passwords do not match. New UNIX password: Retype new UNIX password: passwd: all authentication tokens updated successfully. [root@localhost ~]# groupadd nagcmd [root@localhost ~]# usermod -G nagcmd nagios [root@localhost ~]# usermod -G nagcmd apache |
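One caveat with the commands above: usermod -G replaces a user's supplementary group list, whereas -a -G (where your usermod supports -a) appends to it, which is safer for accounts that already belong to other groups. A minimal sketch for checking the result afterwards; the helper name in_group is my own and not part of Nagios.

```shell
#!/bin/bash
# Hypothetical helper: succeed if user $1 is a member of group $2.
in_group() {
    local user="$1" group="$2"
    # id -nG prints all group names of the user, separated by spaces.
    id -nG "$user" | tr ' ' '\n' | grep -qxF -- "$group"
}

# Example (run as root on the Nagios server):
#   usermod -a -G nagcmd apache
#   in_group apache nagcmd && echo "apache is in nagcmd"
```

Note that a running httpd only picks up the new group membership after it is restarted.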
Installing the Nagios components
Installing nagios
http://cdnetworks-kr-2.dl.sourceforge.net/project/nagios/nagios-3.x/nagios-3.2.1/nagios-3.2.1.tar.gz
http://cdnetworks-kr-2.dl.sourceforge.net/project/nagiosplug/nagiosplug/1.4.15/nagios-plugins-1.4.15.tar.gz
| [root@localhost src]# tar zxvf nagios-3.2.1.tar.gz [root@localhost src]# cd nagios-3.2.1 [root@localhost nagios-3.2.1]#./configure --with-command-group=nagcmd --prefix=/usr/local/nagios [root@localhost nagios-3.2.1]# make all [root@localhost nagios-3.2.1]# make install [root@localhost nagios-3.2.1]# make install-init /usr/bin/install -c -m 755 -d -o root -g root /etc/rc.d/init.d /usr/bin/install -c -m 755 -o root -g root daemon-init /etc/rc.d/init.d/nagios [root@localhost nagios-3.2.1]# make install-config …… *** Config files installed *** Remember, these are *SAMPLE* config files. You'll need to read the documentation for more information on how to actually define services, hosts, etc. to fit your particular needs. [root@localhost nagios-3.2.1]# make install-commandmode /usr/bin/install -c -m 775 -o nagios -g nagcmd -d /usr/local/nagios/var/rw chmod g+s /usr/local/nagios/var/rw
*** External command directory configured *** |
Verify that the installation created the directories bin, etc, sbin, share and var under /usr/local/nagios.
| [root@localhost ~]# cd /usr/local/nagios [root@localhost nagios]# ls bin etc sbin share var
|
Installing nagios-plugins
| [root@localhost src]# tar zxvf nagios-plugins-1.4.15.tar.gz
[root@localhost src]# cd nagios-plugins-1.4.15
[root@localhost nagios-plugins-1.4.15]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios
[root@localhost nagios-plugins-1.4.15]# make && make install |
Verify that the installation succeeded by checking that the following files were created.
| [root@localhost nagios-plugins-1.4.15]# ls /usr/local/nagios/libexec check_apt check_disk_smb check_ide_smart check_mrtg check_nwstat check_sensors check_users check_breeze check_dns check_ifoperstatus check_mrtgtraf check_oracle check_smtp check_wave check_by_ssh check_dummy check_ifstatus check_nagios check_overcr check_ssh negate check_clamd check_file_age check_imap check_nntp check_ping check_swap urlize check_cluster check_flexlm check_ircd check_nt check_pop check_tcp utils.pm check_dhcp check_ftp check_load check_ntp check_procs check_time utils.sh check_dig check_http check_log check_ntp_peer check_real check_udp check_disk check_icmp check_mailq check_ntp_time check_rpc check_ups |
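Configure the Nagios web interface by creating the nagiosadmin user with htpasswd. The same command can be used later to change the password; drop -c once the file exists, because -c recreates the file.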
| [root@localhost etc]# /usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd nagiosadmin New password: Re-type new password: Adding password for user nagiosadmin [root@localhost etc]# service httpd start Starting httpd: [ OK ] |
Edit httpd.conf, append the following at the end, save, and restart apache with service httpd restart.
| ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>

Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory> |
Enable the nagios service at boot
| [root@localhost nagios-plugins-1.4.15]# chkconfig --add nagios [root@localhost nagios-plugins-1.4.15]# chkconfig nagios on |
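Verify the Nagios configuration. This command will be used often from here on: zero warnings and zero errors means the configuration is valid (failures are discussed later). Once it passes, the nagios service can be started.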
| [root@localhost nagios-plugins-1.4.15]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg …… Checking misc settings...
Total Warnings: 0 Total Errors: 0
[root@localhost nagios-plugins-1.4.15]# service nagios start Starting nagios: done. |
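Because this check is run so often, it can be handy to script it. The sketch below assumes the verification output contains the "Total Warnings:" and "Total Errors:" lines shown above; the function name nagios_config_ok is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read the output of "nagios -v <config>" on stdin and
# succeed only if both the warning total and the error total are 0.
nagios_config_ok() {
    awk '
        /^Total Warnings:/ { w = $3; seen_w = 1 }
        /^Total Errors:/   { e = $3; seen_e = 1 }
        END { exit !(seen_w && seen_e && w == 0 && e == 0) }
    '
}

# Example, using the paths from this article:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg \
#       | nagios_config_ok && service nagios restart
```

Restarting only when the check passes avoids taking a working Nagios down with a broken configuration.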
Installing nrpe
http://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.8b1/nrpe-2.8b1.tar.gz/download
Before installing, make sure gcc, openssl and openssl-devel are installed (this matters especially on the Linux monitored hosts, i.e. the clients). After installation, start nrpe with /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
| tar -zxvf nrpe-2.8b1.tar.gz
cd nrpe-2.8b1
[root@localhost nrpe-2.8b1]# ./configure
[root@localhost nrpe-2.8b1]# make all
[root@localhost nrpe-2.8b1]# make install-plugin
[root@localhost nrpe-2.8b1]# make install-daemon
[root@localhost nrpe-2.8b1]# make install-daemon-config
/usr/bin/install -c -m 775 -o nagios -g nagios -d /usr/local/nagios/etc
/usr/bin/install -c -m 644 -o nagios -g nagios sample-config/nrpe.cfg /usr/local/nagios/etc
[root@localhost nrpe-2.8b1]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d |
Verify that nrpe is working with the following commands; port 5666 should be open.
| [root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H localhost NRPE v2.8b1 [root@localhost etc]# [root@localhost etc]# netstat -atulnp | grep 'nrpe' tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 24823/nrpe |
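The port check above can be turned into a small reusable test. This is only a sketch: it assumes netstat's usual column layout (local address in column 4, state in column 6), and port_is_listening is a name I made up.

```shell
#!/bin/bash
# Hypothetical helper: succeed if any netstat line on stdin shows a TCP
# socket in LISTEN state on the given port.
port_is_listening() {
    local port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") && $6 == "LISTEN" { found = 1 }
        END { exit !found }
    '
}

# Example:
#   netstat -tln | port_is_listening 5666 && echo "nrpe is listening"
```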
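This completes nrpe on the server (the controlling side), where it is used to monitor the server itself. Next comes the server's Nagios monitoring configuration. In nagios.cfg, remove the leading "#" from the entries below. Each entry points to a configuration file in the objects directory, as described in the table that follows. host.cfg and service.cfg do not exist by default but can be created; any cfg file can be created by hand as long as a matching entry is added to nagios.cfg. For now only the server itself is monitored, which needs just commands.cfg and localhost.cfg, with no changes.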
[root@localhost objects]# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg ;
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
cfg_file=/usr/local/nagios/etc/objects/switch.cfg
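| Monitoring commands | command | Which command Nagios runs to perform a given check; these are user-defined |
| Contacts | contact | Who receives alerts, normally the system administrator |
| Monitoring time periods | timeperiod | 24x7 around the clock, Monday to Friday, or any other custom period |
| Monitored switches | switch | e.g. whether the host is alive, whether port 80 is open, disk usage, or custom services |
| Monitored host | localhost | The monitoring server itself |
| Monitored hosts | host | The servers to be monitored, which may include the monitoring server itself |
| Monitored services | service | e.g. whether the host is alive, whether port 80 is open, disk usage, or custom services |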
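After saving nagios.cfg, restart the nagios service with service nagios restart. If that fails, run /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg, which reports where the problem is. Troubleshooting is covered later; since only the server itself is monitored at this point, there should be no problems.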
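A common cause of a failed restart is a cfg_file entry that points at a file which has not been created yet. The sketch below lists such entries; it assumes one cfg_file=... entry per line as in the listing above, and the function name missing_cfg_files is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every active cfg_file= target in the given
# nagios.cfg that does not exist on disk. Commented-out entries are ignored.
missing_cfg_files() {
    local main_cfg="$1" path
    sed -n 's/^[[:space:]]*cfg_file=\([^;[:space:]]*\).*/\1/p' "$main_cfg" |
    while read -r path; do
        [ -f "$path" ] || echo "$path"
    done
}

# Example:
#   missing_cfg_files /usr/local/nagios/etc/nagios.cfg
```

An empty result means every referenced object file is present.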
Open http://IP/nagios in IE and log in as nagiosadmin with its password (Figures 1 and 2).
![Figure 1]()
![Figure 2]()
Configuring monitored hosts
Configuring a Linux monitored host (client)
gcc, openssl and openssl-devel must be installed first.
Start by adding a user.
| [root@localhost ~]# useradd nagios [root@localhost ~]# passwd nagios //123456 Changing password for user nagios. New UNIX password: BAD PASSWORD: it is too simplistic/systematic Retype new UNIX password: passwd: all authentication tokens updated successfully. |
Install nagios-plugins-1.4.15.tar.gz
| [root@localhost ~]# tar -zxvf nagios-plugins-1.4.15.tar.gz [root@localhost ~]# cd nagios-plugins-1.4.15 [root@localhost nagios-plugins-1.4.15]# ./configure --prefix=/usr/local/nagios [root@localhost nagios-plugins-1.4.15]# make [root@localhost nagios-plugins-1.4.15]# make install [root@localhost nagios-plugins-1.4.15]# chown nagios.nagios /usr/local/nagios [root@localhost nagios-plugins-1.4.15]# chown -R nagios.nagios /usr/local/nagios/libexec |
Installing nrpe
| [root@localhost ~]# tar -zxvf nrpe-2.8b1.tar.gz
[root@localhost ~]# cd nrpe-2.8b1
[root@localhost nrpe-2.8b1]# ./configure
[root@localhost nrpe-2.8b1]# make all
[root@localhost nrpe-2.8b1]# make install-plugin
[root@localhost nrpe-2.8b1]# make install-daemon
[root@localhost nrpe-2.8b1]# make install-daemon-config |
Edit nrpe.cfg
| [root@localhost ~]# vi /usr/local/nagios/etc/nrpe.cfg
Find the following line and add the IP address of the Nagios server:
allowed_hosts=127.0.0.1,10.155.2.65 |
After saving, start nrpe and add it to the boot sequence with the following commands.
| /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
echo '/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d &> /dev/null' >> /etc/rc.local |
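Running the echo ... >> /etc/rc.local command more than once leaves duplicate start lines in rc.local. A minimal sketch of an idempotent alternative; append_line_once is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
append_line_once() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (run as root on the client):
#   append_line_once /etc/rc.local \
#       '/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d &> /dev/null'
```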
Verification
| [root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H localhost NRPE v2.8b1 [root@localhost etc]# netstat -atulnp | grep 'nrpe' tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 24823/nrpe |
If this does not work, see item (1) under Problems and solutions at the end of the article.
The client is now configured, and monitoring could already be set up in commands.cfg on the Nagios server. Here, however, another approach is used: the check commands are defined in the client's nrpe.cfg, and the Nagios server invokes them by name through nrpe.
| vi /usr/local/nagios/etc/nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
# The following line was added by hand
command[check_ping81]=/usr/local/nagios/libexec/check_ping -H 10.155.0.1 -w 100.0,20% -c 500.0,60%
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hda1
# command[<command name>]=<command definition> |
See each plugin's help output for usage details.
| [root@localhost nagios]# ls bin etc include libexec share [root@localhost nagios]# pwd /usr/local/nagios [root@localhost nagios]# libexec/check_ping -h check_ping v1.4.15 (nagios-plugins 1.4.15) Copyright (c) 1999 Ethan Galstad <nagios@nagios.org> Copyright (c) 2000-2007 Nagios Plugin Development Team <nagiosplug-devel@lists.sourceforge.net>
Use ping to check connection statistics for a remote host.
Usage: check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>% [-p packets] [-t timeout] [-4|-6]
Options: -h, --help Print detailed help screen -V, --version Print version information -4, --use-ipv4 Use IPv4 connection -6, --use-ipv6 Use IPv6 connection -H, --hostname=HOST host to ping -w, --warning=THRESHOLD warning threshold pair -c, --critical=THRESHOLD critical threshold pair -p, --packets=INTEGER number of ICMP ECHO packets to send (Default: 5) -L, --link show HTML in the plugin output (obsoleted by urlize) -t, --timeout=INTEGER Seconds before connection times out (default: 10) |
After changing nrpe.cfg, nrpe must be restarted (kill the process, then start it again); otherwise the changes do not take effect.
| [root@localhost ~]# ps aux|grep nrpe nagios 3327 0.0 0.0 4880 924 ? Ss Sep02 0:03 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d root 22293 0.0 0.0 3908 648 pts/0 R+ 11:07 0:00 grep nrpe [root@localhost ~]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d |
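The kill-then-start sequence can be sketched as a pair of functions. This assumes the ps aux column layout shown above (PID in column 2, command in column 11) and the install paths used in this article; nrpe_pids and restart_nrpe are hypothetical names.

```shell
#!/bin/bash
# Hypothetical helper: read "ps aux" output on stdin and print the PIDs of
# processes whose command is the nrpe binary (the "grep nrpe" line is skipped
# because its command column is "grep").
nrpe_pids() {
    awk '$11 ~ /\/nrpe$/ { print $2 }'
}

# Hypothetical helper: stop any running nrpe daemon, then start it again.
restart_nrpe() {
    local pid
    for pid in $(ps aux | nrpe_pids); do
        kill "$pid"
    done
    sleep 1
    /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
}

# Example (run as root on the client):
#   restart_nrpe
```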
Server side
Test the connection to the client; the following output means it is working.
| [root@localhost nrpe-2.8b1]# /usr/local/nagios/libexec/check_nrpe -H 10.155.2.81 NRPE v2.8b1 |
The following output indicates a problem; see item (2) under Problems and solutions.
| [root@localhost nrpe-2.8b1]# /usr/local/nagios/libexec/check_nrpe -H 10.155.2.81 Connection refused by host |
Edit nagios.cfg and add one entry
| [root@localhost ~]# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/mylinux.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg ; already set earlier |
Edit commands.cfg and append the check_nrpe command
| vi /usr/local/nagios/etc/objects/commands.cfg
# Append the following at the end
# check nrpe
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        } |
mylinux.cfg does not exist in the directory and has to be created.
| [root@localhost ~]# touch /usr/local/nagios/etc/objects/mylinux.cfg
# ----- Host definition -----
define host{
        use             linux-server
        host_name       10.155.2.81-cacti
        alias           mylinux
        address         10.155.2.81
        }
# ----- Service definitions, i.e. the items to monitor -----
define service{
        use                     generic-service
        host_name               10.155.2.81-cacti
        service_description     Swap Usage
        check_command           check_nrpe!check_swap
        }
define service{
        use                     generic-service
        host_name               10.155.2.81-cacti
        service_description     Current Load
        check_command           check_nrpe!check_load
        }
define service{
        use                     generic-service
        host_name               10.155.2.81-cacti
        service_description     Partition Usage
        check_command           check_nrpe!check_hda1
        }
define service{
        use                     generic-service
        host_name               10.155.2.81-cacti
        service_description     Current Users
        check_command           check_nrpe!check_users
        }
define service{
        use                     generic-service
        host_name               10.155.2.81-cacti
        service_description     Total Processes
        check_command           check_nrpe!check_total_procs
        }
define service{
        use                     generic-service
        host_name               10.155.2.81-cacti
        service_description     PING
        check_command           check_nrpe!check_ping81
        }
|
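The service blocks for an nrpe client differ only in their description and in the nrpe command name after check_nrpe!, so they can be generated rather than typed. A rough sketch, assuming the generic-service template and the check_nrpe command defined in commands.cfg; the function name nrpe_service is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print one Nagios service definition that runs the
# named nrpe command on the given host.
nrpe_service() {
    local host="$1" description="$2" nrpe_command="$3"
    cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     $description
        check_command           check_nrpe!$nrpe_command
        }
EOF
}

# Example: append a swap check for the client used in this article.
#   nrpe_service 10.155.2.81-cacti "Swap Usage" check_swap \
#       >> /usr/local/nagios/etc/objects/mylinux.cfg
```

The third argument must match a command[...] name in the client's nrpe.cfg, otherwise the check reports that the command is not defined.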
Restart nagios and view the result in IE (Figure 3).
![Figure 3]()
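Configuring a switch as a monitored host (client)
switch.cfg is a monitoring template for switches. I kept that file unchanged and copied it to switch31.cfg for editing.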
| [root@localhost objects]# vi /usr/local/nagios/etc/nagios.cfg # Definitions for monitoring a router/switch #cfg_file=/usr/local/nagios/etc/objects/switch.cfg cfg_file=/usr/local/nagios/etc/objects/switch31.cfg |
Almost nothing needs changing: update host_name and address. The last service, MRTG, is not used on my network, so I commented it out.
| [root@localhost objects]# vi switch31.cfg
define host{ use generic-switch ; Inherit default values from a template host_name g13a-dell5424-31 ; The name we're giving to this switch alias Linksys SRW224P Switch ; A longer name associated with the switch address 10.155.0.31 ; IP address of the switch hostgroups switches ; Host groups this switch is associated with }
define hostgroup{ hostgroup_name switches ; The name of the hostgroup alias Network Switches ; Long name of the group } define service{ use generic-service ; Inherit values from a template host_name g13a-dell5424-31 ; The name of the host the service is associated with service_description PING ; The service description check_command check_ping!200.0,20%!600.0,60% ; The command used to monitor the service normal_check_interval 5 ; Check the service every 5 minutes under normal conditions retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined }
# Monitor uptime via SNMP
define service{ use generic-service ; Inherit values from a template host_name g13a-dell5424-31 service_description Uptime check_command check_snmp!-C chrswitch -o sysUpTime.0 }
# Monitor Port 1 status via SNMP
define service{ use generic-service ; Inherit values from a template host_name g13a-dell5424-31 service_description Port 1 Link Status check_command check_snmp!-C chrswitch -o ifOperStatus.1 -r 1 -m RFC1213-MIB }
# Monitor bandwidth via MRTG logs
#define service{ # use generic-service ; Inherit values from a template # host_name linksys-srw224p # service_description Port 1 Bandwidth Usage # check_command check_local_mrtgtraf!/var/lib/mrtg/192.168.1.253_1.log!AVG!1000000,1000000!5000000,5000000!10 # } |
The switch is now configured. For the entries shown in red, see problem 3 (Figure 4).
![Figure 4]()
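Configuring a Windows monitored host (client)
Monitoring Windows requires installing the NSClient++ agent on the Windows machine: http://nsclient.org/nscp/downloads
After downloading, extract it to the C: drive.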
| C:/NSClient>nsclient++ /install Service NSClientpp installed... l NSClient++.cpp(227) Service installed! |
Edit nsc.ini with Notepad and remove the comment marker ";" from every module except CheckWMI.dll and RemoteConfiguration.dll.
| [modules] FileLogger.dll CheckSystem.dll CheckDisk.dll NSClientListener.dll NRPEListener.dll SysTray.dll CheckEventLog.dll CheckHelpers.dll ;CheckWMI.dll CheckExternalScripts.dll NSCAAgent.dll LUAScript.dll ;RemoteConfiguration.dll NRPEClient.dll CheckTaskSched.dll |
| [Settings] allowed_hosts=10.155.2.65/32 |
| [NSClient] port=12489 |
| C:/NSClient>NSClient++ -start Starting NSClientpp
C:/NSClient>netstat -an | more Active Connections
Proto Local Address Foreign Address State TCP 0.0.0.0:5666 0.0.0.0:0 LISTENING TCP 0.0.0.0:12489 0.0.0.0:0 LISTENING |
Server side
| [root@localhost ~]# vi /usr/local/nagios/etc/nagios.cfg # Definitions for monitoring a Windows machine cfg_file=/usr/local/nagios/etc/objects/windows.cfg
[root@localhost ~]# vi /usr/local/nagios/etc/objects/windows.cfg |
windows.cfg needs hardly any changes: just update the host name and IP address (Figure 5).
![Figure 5]()
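Mail notification
Mail notification could be handled by configuring sendmail, but I know nothing about sendmail, so the mail command is used here instead. First, nagios.cfg: the entry below was already set earlier, so just confirm it is there.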
| [root@localhost objects]# vi /usr/local/nagios/etc/nagios.cfg cfg_file=/usr/local/nagios/etc/objects/contacts.cfg |
Note that intervals, periods and similar settings made here apply to the whole of Nagios, so every client follows them; to override them individually, set them in the respective cfg files, as shown further down. There is little else to say about this file: contact_name and email are the fields to fill in, and contactgroup_name must be spelled identically everywhere it is used. members can list several contacts separated by ",".
| [root@localhost objects]# vi contacts.cfg
define contact{
        contact_name                    user1
        alias                           Nagios Admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           xxxx@163.com
        pager                           13800138000
        }
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 user1
        }
|
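notify-by-email and host-notify-by-email are defined in commands.cfg (see problem 5). The names must match the command definitions in your own commands.cfg exactly; the parameter notes further down use notify-service-by-email and notify-host-by-email.
The following sets these options individually on a service; no further explanation is needed.
| define service{
        use                     generic-service         ; Name of service template to use
        host_name               test_nrpe
        service_description     apache
        is_volatile             0                       ; volatile-service handling off
        check_period            24x7                    ; check around the clock
        max_check_attempts      1                       ; maximum number of check attempts
        normal_check_interval   1                       ; normal check interval: 1 minute
        retry_check_interval    1                       ; retry interval
        contact_groups          admins                  ; contact group
        notification_options    w,u,c,r                 ; notify on these four states
        notification_interval   960                     ; interval between repeat notifications
        notification_period     24x7                    ; time period during which notifications are sent
        check_command           check_http!100.0,20%!500.0,60%
        } |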
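Parameter notes
The time period during which notifications are sent when a service has a problem; the period is one of those defined in timeperiods.cfg.
service_notification_period 24x7
The time period during which notifications are sent when a host has a problem; also defined in timeperiods.cfg.
host_notification_period 24x7
Notify the contact when a service enters one of four states: w (warning), u (unknown), c (critical), r (recovery from a problem state).
service_notification_options w,u,c,r
Notify the contact when a host enters one of three states: d (down), u (unreachable), r (recovery from a problem state).
host_notification_options d,u,r
The command used for service notifications, notify-service-by-email, is defined in commands.cfg and sends mail to the contact. The name may differ in Nagios 2.x, so check your own commands.cfg. SMS notification can also be configured here, provided you have a script that sends the notification and add the command that calls it to commands.cfg.
service_notification_commands notify-service-by-email
Likewise, host problems are reported to the contact by mail.
host_notification_commands notify-host-by-email
The contact's email address.
email yaozhan189@163.com
The contact's mobile number. This requires SMS notification support; SMS alerts are not enabled here.
pager 13800138000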
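Since notifications here rely on the mail command, it is worth confirming that the server can send mail at all before depending on alerts. A minimal sketch; send_test_alert and the MAILER override are hypothetical names of my own, and the recipient is whatever address you configured for the contact.

```shell
#!/bin/bash
# Hypothetical helper: send a one-line test message to the given address
# using the mail command (or the program named in $MAILER, if set).
send_test_alert() {
    local recipient="$1"
    printf 'Nagios notification test from %s\n' "$(uname -n)" |
        "${MAILER:-mail}" -s "Nagios test alert" "$recipient"
}

# Example (run on the Nagios server):
#   send_test_alert xxxx@163.com
```

If the message never arrives, the problem lies with mail delivery on the server rather than with the Nagios contact definitions.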
Problems and solutions
1.
| [root@localhost nrpe-2.8b1]# /usr/local/nagios/libexec/check_nrpe -H localhost Connection refused by host |
If even the local check returns the result above, nrpe is not running; start it with the start command.
2.
| [root@localhost nrpe-2.8b1]# /usr/local/nagios/libexec/check_nrpe -H 10.155.2.81 Connection refused by host |
The client's nrpe.cfg may be missing the server's IP address: allowed_hosts=127.0.0.1,10.155.2.65
The client's firewall may be blocking the connection.
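For the firewall case, the client needs to accept TCP connections on port 5666 from the Nagios server. A sketch using iptables, assuming the server address 10.155.2.65 from this article; allow_nrpe_from is a hypothetical name, and the IPTABLES override exists only so the command can be previewed without applying it.

```shell
#!/bin/bash
# Hypothetical helper: insert a rule accepting nrpe connections (TCP 5666)
# from the given server address.
allow_nrpe_from() {
    local server_ip="$1"
    "${IPTABLES:-iptables}" -I INPUT -p tcp -s "$server_ip" --dport 5666 -j ACCEPT
}

# Preview the rule without applying it:
#   IPTABLES=echo allow_nrpe_from 10.155.2.65
# Apply it (run as root on the client), then persist it:
#   allow_nrpe_from 10.155.2.65
#   service iptables save
```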
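3. This happens because the Nagios server has no check_snmp command: net-snmp and net-snmp-utils were not installed before Nagios was installed. Some posts suggest installing those packages and then reinstalling nagios-plugins-1.4.15; that did not work for me. (Figure 6)
![Figure 6]()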
| [root@localhost nagios]# libexec/check_snmp -h -bash: libexec/check_snmp: No such file or directory |
4. Some articles recommend installing the following packages before installing Nagios.
| yum -y install gcc gcc-c++ autoconf libjpeg libjpeg-devel libpng libpng-devel freetype freetype-devel libxml2 libxml2-devel zlib zlib-devel glibc glibc-devel glib2 glib2-devel bzip2 bzip2-devel ncurses ncurses-devel curl curl-devel e2fsprogs e2fsprogs-devel krb5 krb5-devel libidn libidn-devel openssl openssl-devel openldap openldap-devel nss_ldap openldap-clients openldap-servers perl gd gd-devel jpeg jpeg-devel libpng libpng-devel Net-snmp zlib freetype libart_lgpl cairo-devel pango-devel lrzsz* |
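This article walks through installing and configuring the Nagios monitoring system on Redhat 5.2: installing the required system components, Nagios and its plugins, configuring the monitored hosts, and setting up web-interface monitoring and mail alerts.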