Monitoring Flume with Ganglia

License: CC 4.0 BY-SA

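This article walks through installing and deploying the Ganglia monitoring system: installing httpd, PHP and the dependent packages, then configuring and starting the Ganglia service components. It also shows how to hook Flume up to Ganglia so that the running state of a Flume agent can be monitored in real time.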

1. Installing and deploying Ganglia

(1) Install httpd and PHP

[luomk@hadoop102 flume]$ sudo yum -y install httpd php

(2) Install the other dependencies

[luomk@hadoop102 flume]$ sudo yum -y install rrdtool perl-rrdtool rrdtool-devel

[luomk@hadoop102 flume]$ sudo yum -y install apr-devel

(3) Install Ganglia

[luomk@hadoop102 flume]$ sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

[luomk@hadoop102 flume]$ sudo yum -y install ganglia-gmetad

[luomk@hadoop102 flume]$ sudo yum -y install ganglia-web

[luomk@hadoop102 flume]$ sudo yum install -y ganglia-gmond

(4) Edit the configuration file ganglia.conf

[luomk@hadoop102 flume]$ sudo vim /etc/httpd/conf.d/ganglia.conf

Edit the file to read as follows; the key change is the line Allow from all:

# Ganglia monitoring system php web frontend

Alias /ganglia /usr/share/ganglia

<Location /ganglia>

  Order deny,allow

  Deny from all

  Allow from all

  # Allow from 127.0.0.1

  # Allow from ::1

  # Allow from .example.com

</Location>

(5) Edit the configuration file gmetad.conf

[luomk@hadoop102 flume]$ sudo vim /etc/ganglia/gmetad.conf

Change the data_source line to:

data_source "hadoop102" 10.211.55.102
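Configuration lines copied from a web page often pick up typographic quotes, which the gmetad parser is unlikely to treat as string delimiters. The sketch below is one way to catch them before starting the service; has_curly_quotes is a hypothetical helper name, not part of Ganglia.

```shell
# Hypothetical helper: succeed if the file contains typographic (curly) double quotes.
has_curly_quotes() {
  # Two literal patterns are used instead of a bracket expression so that
  # matching does not depend on the locale's handling of multibyte characters.
  grep -q -e '“' -e '”' "$1"
}

# Check the real file only if it exists on this machine.
conf=/etc/ganglia/gmetad.conf
if [ -f "$conf" ] && has_curly_quotes "$conf"; then
  echo "Warning: $conf contains curly quotes; replace them with plain ASCII quotes"
fi
```

The same check can be pointed at gmond.conf or any other file edited by copy and paste.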

(6) Edit the configuration file gmond.conf

[luomk@hadoop102 flume]$ sudo vim /etc/ganglia/gmond.conf

Change the following sections to:

cluster {

  name = "hadoop102"

  owner = "unspecified"

  latlong = "unspecified"

  url = "unspecified"

}



/* The host section describes attributes of the host, like the location */

host {

  location = "unspecified"

}



/* Feel free to specify as many udp_send_channels as you like.  Gmond

   used to only support having a single channel */

udp_send_channel {

  #bind_hostname = yes # Highly recommended, soon to be default.

                       # This option tells gmond to use a source address

                       # that resolves to the machine's hostname.  Without

                       # this, the metrics may appear to come from any

                       # interface and the DNS names associated with

                       # those IPs will be used to create the RRDs.

# mcast_join = 239.2.11.71

  host = 10.211.55.102

  port = 8649

  ttl = 1

}



/* You can specify as many udp_recv_channels as you like as well. */

udp_recv_channel {

# mcast_join = 239.2.11.71

  port = 8649

  bind = 10.211.55.102

  retry_bind = true

  # Size of the UDP buffer. If you are handling lots of metrics you really

  # should bump it up to e.g. 10MB or even higher.

  # buffer = 10485760

}
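On the single-node setup above, gmond sends its metrics to the same address it listens on, so the host in udp_send_channel and the bind in udp_recv_channel should agree. The following is a minimal sketch for checking that; gmond_block_value is a hypothetical helper, and it assumes the simple "key = value" layout shown in the listing (one setting per line, block name and opening brace on the same line).

```shell
# Hypothetical helper: print the value of `KEY = value` inside the first
# top-level BLOCK of a gmond.conf-style file.
gmond_block_value() {  # usage: gmond_block_value FILE BLOCK KEY
  awk -v block="$2" -v key="$3" '
    $1 == block && $2 == "{"         { inside = 1; next }
    inside && $1 == "}"              { inside = 0 }
    inside && $1 == key && $2 == "=" { print $3; exit }
  ' "$1"
}

# Compare the two addresses, but only if the file exists on this machine.
conf=/etc/ganglia/gmond.conf
if [ -f "$conf" ]; then
  send_host=$(gmond_block_value "$conf" udp_send_channel host)
  recv_bind=$(gmond_block_value "$conf" udp_recv_channel bind)
  if [ "$send_host" = "$recv_bind" ]; then
    echo "OK: metrics are sent to $send_host, where gmond also listens"
  else
    echo "Mismatch: send host=$send_host, receive bind=$recv_bind"
  fi
fi
```

Commented-out settings such as "# mcast_join = ..." are skipped because their first field is not the key name.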

(7) Edit the SELinux configuration file config

[luomk@hadoop102 flume]$ sudo vim /etc/selinux/config

Change it to:

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

#     enforcing - SELinux security policy is enforced.

#     permissive - SELinux prints warnings instead of enforcing.

#     disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

#     targeted - Targeted processes are protected,

#     mls - Multi Level Security protection.

SELINUXTYPE=targeted

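Note: disabling SELinux through this file only takes effect after a reboot. If you do not want to reboot right now, you can switch SELinux to permissive mode for the current session instead: [luomk@hadoop102 flume]$ sudo setenforce 0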
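Because the file setting and the running mode can differ until the next reboot, it can help to look at both. This is a small sketch under the assumption that the config file uses the standard SELINUX=value line shown above; selinux_configured_mode is a hypothetical helper name.

```shell
# Hypothetical helper: print the mode set by the uncommented SELINUX= line.
# The pattern does not match SELINUXTYPE= or the commented explanation lines.
selinux_configured_mode() {
  sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "${1:-/etc/selinux/config}" | tail -n 1
}

# Mode that will apply after the next reboot (only if the file exists here).
if [ -f /etc/selinux/config ]; then
  echo "configured: $(selinux_configured_mode)"
fi

# Mode in effect right now; getenforce ships with the SELinux user-space tools.
if command -v getenforce >/dev/null 2>&1; then
  echo "current: $(getenforce)"
fi
```

After sudo setenforce 0 the current mode should read Permissive, while the configured mode stays at whatever the file says.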

(8) Start Ganglia

[luomk@hadoop102 flume]$ sudo service httpd start

[luomk@hadoop102 flume]$ sudo service gmetad start

[luomk@hadoop102 flume]$ sudo service gmond start

(9) Open the Ganglia page in a web browser

http://10.211.55.102/ganglia

Note: if you still get a permission error after completing the steps above, change the permissions of the /var/lib/ganglia directory:

[luomk@hadoop102 flume]$ sudo chmod -R 777 /var/lib/ganglia

2. Running Flume to test the monitoring

(1) Edit flume-env.sh in the /opt/module/flume/conf directory and set:

JAVA_OPTS="-Dflume.monitoring.type=ganglia -Dflume.monitoring.hosts=10.211.55.102:8649 -Xms100m -Xmx200m"

(2) Start the Flume agent. The address given in flume.monitoring.hosts must be the one gmond listens on, which is 10.211.55.102:8649 in the gmond.conf above.

[luomk@hadoop102 flume]$ bin/flume-ng agent \
--conf conf/ \
--name a1 \
--conf-file job/flume-telnet-logger.conf \
-Dflume.root.logger=INFO,console \
-Dflume.monitoring.type=ganglia \
-Dflume.monitoring.hosts=10.211.55.102:8649
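The Ganglia address appears both in flume-env.sh and on the command line, so it is easy for the two to drift apart. One way to avoid that is to define the address once and build the flags from it. This is only a sketch: GANGLIA_HOST, GANGLIA_PORT and FLUME_MONITORING_OPTS are illustrative variable names, and the address is assumed to be the one gmond's udp_recv_channel binds to in section 1.

```shell
# Assumption: gmond's udp_recv_channel listens on this address and port.
GANGLIA_HOST="10.211.55.102"
GANGLIA_PORT="8649"

# Build the monitoring flags once so every place that needs them agrees.
FLUME_MONITORING_OPTS="-Dflume.monitoring.type=ganglia -Dflume.monitoring.hosts=${GANGLIA_HOST}:${GANGLIA_PORT}"

# In flume-env.sh the JAVA_OPTS line could then become:
JAVA_OPTS="${FLUME_MONITORING_OPTS} -Xms100m -Xmx200m"
echo "$JAVA_OPTS"
```

With the flags set through JAVA_OPTS in flume-env.sh, repeating them on the flume-ng command line should not be necessary.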

(3) Send some data and watch the Ganglia graphs

[luomk@hadoop102 flume]$ telnet localhost 44444
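Typing lines into telnet by hand produces only a few events, which barely moves the graphs. A small generator can produce a larger batch instead. This sketch assumes that job/flume-telnet-logger.conf defines a line-oriented netcat source on localhost:44444 and that nc is installed; gen_events is a hypothetical helper.

```shell
# Hypothetical helper: print N numbered test lines, one event per line.
gen_events() {  # usage: gen_events N
  i=1
  while [ "$i" -le "$1" ]; do
    echo "test event $i"
    i=$((i + 1))
  done
}

# Usage, with the agent running (not executed here):
#   gen_events 100 | nc localhost 44444
```

Each line becomes one Flume event, so the put and take counters should rise by roughly the number of lines sent.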

[Figure: Ganglia graphs for the Flume agent]

Meaning of the graphs:

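Field (graph name)	Meaning
EventPutAttemptCount	Total number of events the source has attempted to put into the channel
EventPutSuccessCount	Total number of events successfully put into the channel and committed
EventTakeAttemptCount	Total number of times the sink has attempted to take events from the channel. Not every attempt returns an event, because the channel may be empty when the sink polls it.
EventTakeSuccessCount	Total number of events the sink has successfully taken
StartTime	Time at which the channel started (milliseconds)
StopTime	Time at which the channel stopped (milliseconds)
ChannelSize	Number of events currently in the channel
ChannelFillPercentage	Percentage of the channel's capacity currently in use
ChannelCapacity	Capacity of the channel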
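The last three fields are related: the fill percentage is the current channel size divided by the capacity, times 100. The sketch below works through that arithmetic; channel_fill_percentage is a hypothetical helper and the numbers are made up for illustration, not taken from a real agent.

```shell
# Hypothetical helper: fill percentage from ChannelSize and ChannelCapacity.
channel_fill_percentage() {  # usage: channel_fill_percentage SIZE CAPACITY
  awk -v size="$1" -v cap="$2" 'BEGIN {
    if (cap == 0) { print "n/a"; exit }   # avoid dividing by zero
    printf "%.2f\n", size / cap * 100
  }'
}

# Illustrative values: 250 events waiting in a channel that holds 1000.
channel_fill_percentage 250 1000   # prints 25.00
```

A value that keeps climbing toward 100 suggests the sink is draining the channel more slowly than the source fills it.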