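Operations engineers use Memcached a great deal, typically in two ways:
1. Behind a load balancer, to keep a client's session data from being lost as its requests move between backends, we use Memcached as a distributed session store.
2. As a MySQL cache: we often cache MySQL query results in Memcached, which reduces the round trips between the PHP application and MySQL and greatly relieves the load on the database.
Given all that, Memcached is important enough that it deserves constant monitoring. On to the main topic.
Nagios can monitor whether the Memcached service is running, what proportion of its memory is in use, whether it responds, and so on.
For monitoring Memcached with Cacti, see my earlier post: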
http://467754239.blog.51cto.com/4878013/1433409
I. Monitoring through a PHP page
Download the memcache.php page, publish it on a web server, then open it in a browser.
Download links:
http://livebookmark.net/memcachephp/memcachephp.zip
http://blogimg.chinaunix.net/blog/upfile2/081230231118.zip
1. Edit a few parameters in memcache.php
2. Open the page in a browser
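Note: the page may look empty here, probably because Memcached holds no data yet!
A friend's blog, by contrast, shows the same page with data in it.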
II. Checking with Memcached's own commands
[root@lnmp ~]# telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
stats
STAT pid 63992
STAT uptime 6204
STAT time 1412164192
STAT version 1.4.17
STAT libevent 2.0.21-stable
STAT pointer_size 64
STAT rusage_user 0.225965
STAT rusage_system 0.309952
STAT curr_connections 10
STAT total_connections 243
STAT connection_structures 12
STAT reserved_fds 20
STAT cmd_get 108
STAT cmd_set 43
STAT cmd_flush 0
STAT cmd_touch 0
STAT get_hits 68
STAT get_misses 40
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT touch_hits 0
STAT touch_misses 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 7326
STAT bytes_written 125489
STAT limit_maxbytes 67108864
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 0
STAT hash_power_level 16
STAT hash_bytes 524288
STAT hash_is_expanding 0
STAT malloc_fails 0
STAT bytes 205
STAT curr_items 2
STAT total_items 43
STAT expired_unfetched 0
STAT evicted_unfetched 0
STAT evictions 0
STAT reclaimed 0
END
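The two numbers most worth watching in that output are the hit rate (get_hits against get_hits + get_misses) and the memory usage (bytes against limit_maxbytes). As a rough sketch, the function below derives both from `stats` text on stdin. The name `memcached_summary` and the output format are my own illustration, and the usage line assumes `nc` is installed; neither comes from Memcached itself.

```shell
# memcached_summary: hypothetical helper, reads "stats" output on stdin
# and prints the hit rate and memory usage as percentages.
memcached_summary() {
    awk '
        { sub(/\r$/, "") }                  # memcached ends each line with \r\n
        $1 == "STAT" { stat[$2] = $3 }      # remember every counter by name
        END {
            lookups = stat["get_hits"] + stat["get_misses"]
            hit = (lookups > 0) ? 100 * stat["get_hits"] / lookups : 0
            mem = (stat["limit_maxbytes"] > 0) ? 100 * stat["bytes"] / stat["limit_maxbytes"] : 0
            printf "hit_rate=%.2f%% mem_used=%.4f%%\n", hit, mem
        }'
}

# Usage against a live server (assumes nc is available):
#   printf 'stats\r\nquit\r\n' | nc 127.0.0.1 11211 | memcached_summary
```

With the counters shown above (68 hits, 40 misses, 205 bytes used out of 67108864), the hit rate works out to about 63% and the memory usage is close to zero.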
Alternatively, check it with Nagios's check_tcp plugin, as follows:
[root@NagiosServer libexec]# ./check_tcp -H 192.168.0.12 -p 11211 -t 5 -E -s 'stats\r\nquit\r\n' -e 'uptime' -M crit
TCP OK - 0.001 second response time on 192.168.0.12 port 11211 [STAT pid 63992
STAT uptime 6295
STAT time 1412164283
STAT version 1.4.17
STAT libevent 2.0.21-stable
STAT pointer_size 64
STAT rusage_user 0.225965
STAT rusage_system 0.317951
STAT curr_connections 11
STAT total_connections 249
STAT connection_structures 12
STAT reserved_fds 20
STAT cmd_get 108
STAT cmd_set 43
STAT cmd_flush 0
STAT cmd_touch 0
STAT get_hits 68
STAT get_misses 40
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT touch_hits 0
STAT touch_misses 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 7374
STAT bytes_written 131873
STAT limit_maxbytes 67108864
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 0
STAT hash_power_level 16
STAT hash_bytes 524288
STAT hash_is_expanding 0
STAT malloc_fails 0
STAT bytes 205
STAT curr_items 2
STAT total_items 43
STAT expired_unfetched 0
STAT evicted_unfetched 0
S]|time=0.001293s;;;0.000000;5.000000
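In that command, -E enables the \r and \n escapes, -s sends the `stats` and `quit` commands, -e names the string the reply must contain, and -M crit turns a missing string into a CRITICAL result. The sketch below imitates only that last decision so the logic is easy to see: it reads a reply on stdin and returns the Nagios exit code. `check_expect` is a made-up name and its messages are not check_tcp's real output.

```shell
# check_expect EXPECTED_STRING
# Hypothetical helper: reads a server reply on stdin and returns
# 0 (OK) when it contains EXPECTED_STRING, 2 (CRITICAL) otherwise.
check_expect() {
    expect=$1
    response=$(cat)                 # slurp the whole reply
    case $response in
        *"$expect"*)
            echo "OK - reply contains '$expect'"
            return 0
            ;;
        *)
            echo "CRITICAL - reply does not contain '$expect'"
            return 2
            ;;
    esac
}

# Usage against a live server (assumes nc is available):
#   printf 'stats\r\nquit\r\n' | nc 192.168.0.12 11211 | check_expect uptime
```

Nagios only looks at the exit code and the first line of output, so any script that follows this convention can serve as a check.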
III. The Nagios check_memcached plugin
Download link:
http://www.filewatcher.com/m/Nagios-Plugins-Memcached-0.02.tar.gz.72-0.html
The plugin is a Perl script, so the Perl environment it depends on has to be installed first.
[root@NagiosServer ~]# yum -y install perl*
or
[root@NagiosServer ~]# yum install perl-Carp-Clan perl-Cache-Memcached perl-Nagios-Plugin
[root@NagiosServer ~]# tar xf Nagios-Plugins-Memcached-0.02.tar.gz
[root@NagiosServer ~]# cd Nagios-Plugins-Memcached-0.02
[root@NagiosServer Nagios-Plugins-Memcached-0.02]# perl Makefile.PL
[root@NagiosServer Nagios-Plugins-Memcached-0.02]# make && make install
Put the plugin where Nagios expects it
# Default install path of check_memcached
[root@NagiosServer ~]# find / -name check_memcached
/usr/local/bin/check_memcached
# Copy it into the Nagios plugin directory
[root@NagiosServer ~]# cp /usr/local/bin/check_memcached /usr/local/nagios/libexec/
IV. Configuring Nagios to monitor Memcached
1. Define the commands
[root@NagiosServer objects]# vim commands.cfg
# Memcached memory usage ratio
define command {
    command_name    check_memcached_11211
    command_line    $USER1$/check_memcached -H 192.168.0.12:11211 --size-warning 80 --size-critical 90
}
# Memcached response time
define command {
    command_name    memcached_response_11211
    command_line    $USER1$/check_memcached -H 192.168.0.12 -w 300 -c 500
}
# Memcached hit rate
define command {
    command_name    check_memcached_hit
    command_line    $USER1$/check_memcached -H 192.168.0.12 --hit-warning 10 --hit-critical 5
}
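The thresholds in these commands point in opposite directions: memory usage is bad when it rises (warn at 80, critical at 90), while a hit rate is bad when it falls (warn at 10, critical at 5). The following sketch shows that comparison with integer percentages. The function name `threshold_state` is hypothetical, and whether the real plugin treats the boundary value itself as a breach is an assumption on my part.

```shell
# threshold_state VALUE WARN CRIT MODE
# Hypothetical helper, integers only.
#   MODE=high : larger values are worse (e.g. memory usage %)
#   MODE=low  : smaller values are worse (e.g. hit rate %)
# Prints the Nagios state and returns the matching exit code.
threshold_state() {
    value=$1 warn=$2 crit=$3 mode=$4
    if [ "$mode" = "low" ]; then
        if [ "$value" -le "$crit" ]; then
            echo CRITICAL; return 2
        elif [ "$value" -le "$warn" ]; then
            echo WARNING; return 1
        fi
    else
        if [ "$value" -ge "$crit" ]; then
            echo CRITICAL; return 2
        elif [ "$value" -ge "$warn" ]; then
            echo WARNING; return 1
        fi
    fi
    echo OK
    return 0
}

# Examples:
#   threshold_state 85 80 90 high   -> WARNING  (memory at 85%)
#   threshold_state 3 10 5 low      -> CRITICAL (hit rate at 3%)
```

Note that a freshly started or lightly used cache can have a low hit rate for a while, so a hit-rate check may alert until the cache has warmed up.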
2. Define the host and its services
define host{
    use                     linux-server
    host_name               Linux Server 02
    alias                   My Linux 02
    address                 192.168.0.12
}
#define hostgroup{
#    hostgroup_name         admins
#    alias                  Nagios Administrators
#    members                Linux Server 02
#}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Current Users
    check_command           check_local_users!20!50
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Swap Usage
    check_command           check_local_swap!20!10
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     users
    check_command           check_nrpe!check_users
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     load
    check_command           check_nrpe!check_load
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     disk sda1
    check_command           check_nrpe!check_sda1
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     zombie_proces
    check_command           check_nrpe!check_zombie_procs
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Total_proces
    check_command           check_nrpe!check_total_procs
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Memcached Response
    check_command           memcached_response_11211
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Memcached Size
    check_command           check_memcached_11211
}
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Memcached Hit
    check_command           check_memcached_hit
}
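Every service block follows the same four-line pattern, so it is easy to pair a description with the wrong command when writing them by hand. One option, sketched here, is to generate the blocks from a short list. `emit_service` is a hypothetical helper of mine and not part of Nagios; the host and command names are the ones used in this post.

```shell
# emit_service HOST DESCRIPTION CHECK_COMMAND
# Hypothetical helper: prints one Nagios service definition.
emit_service() {
    printf 'define service{\n'
    printf '    use                     generic-service\n'
    printf '    host_name               %s\n' "$1"
    printf '    service_description     %s\n' "$2"
    printf '    check_command           %s\n' "$3"
    printf '}\n'
}

host='Linux Server 02'
emit_service "$host" 'Memcached Response' memcached_response_11211
emit_service "$host" 'Memcached Size'     check_memcached_11211
emit_service "$host" 'Memcached Hit'      check_memcached_hit
```

The output can be redirected into the host's .cfg file and checked before Nagios is restarted.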
3. Enable the host configuration file
[root@NagiosServer etc]# vim nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts/linux02.cfg
4. Restart the Nagios service
[root@NagiosServer ~]# service nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.
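V. Viewing the results on the monitoring page
That's all for now! More will follow in the next post.
Monitoring the Nginx Service with Nagios
I. The monitoring script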
[root@NagiosServer libexec]# cat check_nginx.sh
#!/bin/bash
PROGNAME=$(basename "$0")
VERSION="Version 1.0"
AUTHOR="2010.11.18-www.nginxs.com"

# Nagios plugin exit codes: OK, WARNING, CRITICAL, UNKNOWN
ST_OK=0
ST_WR=1
ST_CR=2
ST_UK=3

print_version() {
    echo "$VERSION $AUTHOR"
}

print_help() {
    print_version $PROGNAME $VERSION
    echo "$PROGNAME is a Nagios plugin to monitor nginx status"
    echo "Use of wget nginxstatus page"
    echo "When using optional warning/critical thresholds all values except"
    echo "Usage parameters:"
    echo ""
    echo "$PROGNAME [-u|--url] [-p|--path] [-w/--warning] [-c/--critical]"
    echo ""
    echo "Options:"
    echo " --url|-u)"
    echo " Sets nginx status url"
    echo ""
    echo " --path|-p)"
    echo " Sets nginx status url path"
    echo ""
    echo " --warning|-w)"
    echo " Sets a warning level for nginx Active connections. Default is: off"
    echo ""
    echo " --critical|-c)"
    echo " Sets a critical level for nginx Active connections. Default is: off"
    echo ""
    echo "Example:"
    echo "http://www.nginxs.com/status"
    echo "./check_nginx.sh -u www.nginxs.com -p /status -w 10000 -c 15000"
    exit $ST_UK
}

# Parse command-line options
while test -n "$1"; do
    case "$1" in
        --help|-h)
            print_help
            exit $ST_UK
            ;;
        --url|-u)
            url=$2
            shift
            ;;
        --path|-p)
            path=$2
            shift
            ;;
        --warning|-w)
            warn=$2
            shift
            ;;
        --critical|-c)
            crit=$2
            shift
            ;;
        *)
            echo "Unknown argument: $1"
            print_help
            exit $ST_UK
            ;;
    esac
    shift
done

# Both the URL and the path are mandatory
if [ -z "$url" ]; then
    echo "Must Sets --url|-u) Parameters"
    exit $ST_UK
elif [ -z "$path" ]; then
    echo "Must sets --path|-p) Parameters"
    echo "Please look help"
    echo "$PROGNAME --help"
    exit $ST_UK
fi

if [ -n "$warn" -a -n "$crit" ]; then
    if [ $warn -ge $crit ]; then
        echo "Please adjust your warning/critical thresholds. The warning must be lower than the critical level!"
        exit $ST_UK
    fi
fi

# Fetch the status page and pick the counters out of it
do_status() {
    wget -qNO /tmp/nginx.html "${url}${path}"
    ActiveConn=$(awk -F: '{print $2}' /tmp/nginx.html | head -1)
    serveraccepts=$(awk '{print $1}' /tmp/nginx.html | tail -n 2 | head -1)
    handled=$(awk '{print $2}' /tmp/nginx.html | tail -n 2 | head -1)
    requests=$(awk '{print $3}' /tmp/nginx.html | tail -n 2 | head -1)
    reading=$(tail -n 1 /tmp/nginx.html | awk '{print $2}')
    writing=$(tail -n 1 /tmp/nginx.html | awk '{print $4}')
    waiting=$(tail -n 1 /tmp/nginx.html | awk '{print $6}')
}

do_output() {
    output="ActiveConn:${ActiveConn},serveraccepts:${serveraccepts},handled:${handled},requests:${requests},reading:${reading},writing:${writing},waiting:${waiting}"
}

do_perfdata() {
    perfdata="'ActiveConn'=${ActiveConn},'serveraccepts'=${serveraccepts},'handled'=${handled},'requests'=${requests},'reading'=${reading},'writing'=${writing},'waiting'=${waiting}"
}

do_status
do_output
do_perfdata

# Compare active connections with the thresholds, if both were given
if [ -n "$warn" -a -n "$crit" ]; then
    if [ $ActiveConn -ge $warn -a $ActiveConn -lt $crit ]; then
        echo "WARNING - $output |$perfdata"
        exit $ST_WR
    elif [ $ActiveConn -ge $crit ]; then
        echo "CRITICAL - $output|$perfdata"
        exit $ST_CR
    else
        echo "OK - $output|$perfdata"
        exit $ST_OK
    fi
else
    echo "OK - $output|$perfdata"
    exit $ST_OK
fi
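For reference, the page the script downloads is a short four-line text document, and the head/tail/awk pipeline above pulls one number out of it at a time. As an alternative sketch, the same fields can be read in a single awk pass over stdin with no temporary file. `parse_nginx_status` and its output field names are my own illustration, and the sample in the comment assumes the usual stub_status layout.

```shell
# parse_nginx_status: hypothetical helper, reads a stub_status page on stdin.
# Expected layout (standard stub_status output):
#   Active connections: 1
#   server accepts handled requests
#    47 47 47
#   Reading: 0 Writing: 1 Waiting: 0
parse_nginx_status() {
    awk '
        /^Active connections:/          { active = $3 }
        /^ *[0-9]+ +[0-9]+ +[0-9]+/     { accepts = $1; handled = $2; requests = $3 }
        /^Reading:/                     { reading = $2; writing = $4; waiting = $6 }
        END {
            printf "active=%s accepts=%s handled=%s requests=%s reading=%s writing=%s waiting=%s\n",
                   active, accepts, handled, requests, reading, writing, waiting
        }'
}

# Usage against a live server:
#   wget -qO- http://192.168.0.12/nginx_status | parse_nginx_status
```

Because awk splits on whitespace, the values come out without the stray spaces that the colon-separated parsing leaves around the active connection count.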
Help output
[root@NagiosServer libexec]# ./check_nginx.sh -h
Version 1.0 2010.11.18-www.nginxs.com
check_nginx.sh is a Nagios plugin to monitor nginx status
Use of wget nginxstatus page
When using optional warning/critical thresholds all values except
Usage parameters:

check_nginx.sh [-u|--url] [-p|--path] [-w/--warning] [-c/--critical]

Options:
 --url|-u)
 Sets nginx status url

 --path|-p)
 Sets nginx status url path

 --warning|-w)
 Sets a warning level for nginx Active connections. Default is: off

 --critical|-c)
 Sets a critical level for nginx Active connections. Default is: off

Example:
http://www.nginxs.com/status
./check_nginx.sh -u www.nginxs.com -p /status -w 10000 -c 15000
Testing the script
[root@NagiosServer libexec]# ./check_nginx.sh -u 192.168.0.12 -p /nginx_status -w 10000 -c 15000
OK - ActiveConn: 1 ,serveraccepts:47,handled:47,requests:47,reading:0,writing:1,waiting:0|'ActiveConn'= 1 ,'serveraccepts'=47,'handled'=47,'requests'=47,'reading'=0,'writing'=1,'waiting'=0
Define the command and the service, then restart the Nagios service
[root@NagiosServer libexec]# vim ../etc/objects/commands.cfg
define command {
    command_name    check_nginx
    command_line    $USER1$/check_nginx.sh -u 192.168.0.12 -p /nginx_status -w 10000 -c 15000
}
[root@NagiosServer libexec]# vim ../etc/objects/hosts/linux02.cfg
define service{
    use                     generic-service
    host_name               Linux Server 02
    service_description     Nginx Status
    check_command           check_nginx
}
View the monitoring page
Then load-test a page with ab and watch whether the monitoring data changes
[root@NagiosServer libexec]# ab -c 1000 -n 1000 http://192.168.0.12/index.php
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.0.12 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests

Server Software:        nginx/1.4.7
Server Hostname:        192.168.0.12
Server Port:            80

Document Path:          /index.php
Document Length:        192 bytes

Concurrency Level:      1000
Time taken for tests:   1.579 seconds
Complete requests:      1000
Failed requests:        142
   (Connect: 0, Receive: 0, Length: 142, Exceptions: 0)
Write errors:           0
Non-2xx responses:      858
Total transferred:      7851142 bytes
HTML transferred:       7692130 bytes
Requests per second:    633.33 [#/sec] (mean)
Time per request:       1578.963 [ms] (mean)
Time per request:       1.579 [ms] (mean, across all concurrent requests)
Transfer rate:          4855.80 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      122  135  10.4    135     163
Processing:   120  253 149.8    218    1294
Waiting:      107  250 147.8    218    1292
Total:        265  388 144.8    351    1416

Percentage of the requests served within a certain time (ms)
  50%    351
  66%    377
  75%    390
  80%    402
  90%    624
  95%    754
  98%    822
  99%    842
 100%   1416 (longest request)
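When reading an ab report it helps to look at the error counters next to the throughput: in the run above, 858 of the 1000 responses were non-2xx, so most requests did not return a normal page. A small sketch for pulling those headline numbers out of a saved report follows. `ab_summary` is a hypothetical helper name and the output format is mine.

```shell
# ab_summary: hypothetical helper, reads an ApacheBench report on stdin
# and prints the request counters and the mean requests per second.
ab_summary() {
    awk '
        /^Complete requests:/   { complete = $3 }
        /^Failed requests:/     { failed = $3 }
        /^Non-2xx responses:/   { non2xx = $3 }
        /^Requests per second:/ { rps = $4 }
        END {
            # ab omits the Non-2xx line when every response was 2xx
            if (non2xx == "") non2xx = 0
            printf "complete=%s failed=%s non2xx=%s rps=%s\n", complete, failed, non2xx, rps
        }'
}

# Usage:
#   ab -c 1000 -n 1000 http://192.168.0.12/index.php | ab_summary
```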
After waiting a few minutes, the monitoring page shows the final result.
Memcached reference posts
http://storysky.blog.51cto.com/628458/244962
http://linuxjcq.blog.51cto.com/3042600/718180
Nginx reference posts
Reposted from: https://blog.51cto.com/467754239/1560294