Flume

Flume tackles the practical problems of log transport: high latency, fault tolerance, load balancing, compression, and so on.
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
Configuration file
A) Configure the Source
B) Configure the Channel
C) Configure the Sink
D) Wire the three components together

a1: agent name
r1: source name
k1: sink name
c1: channel name

# Name the components on this agent
a1.sources=r1
a1.sinks=k1
a1.channels=c1

# Describe/configure the source (required)
a1.sources.r1.type=netcat
a1.sources.r1.bind=hadoop000
a1.sources.r1.port=44444

# Logger sink: write events to the agent's log
a1.sinks.k1.type=logger

# Memory channel: buffer events in memory
a1.channels.c1.type=memory

# Bind the source and the sink to the channel
# (a source can fan out to multiple channels; a sink reads from exactly one channel)
a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1

Start the agent
flume-ng agent \
  --name a1 \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/example.conf \
  -Dflume.root.logger=INFO,console
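With the agent running, a quick smoke test (a sketch, assuming telnet is installed and the netcat source is listening on hadoop000:44444 as configured above) is to connect and type a line; it shows up in the agent's console log as an Event like the one below:

telnet hadoop000 44444
hello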

Event: { headers:{} body: 68 65 6C 6C 6F 0D hello.}
An Event is the basic unit of data transfer in Flume.
Event = optional headers + byte-array body

To ship logs from server A to server B, chain two agents (a configuration sketch follows below):
on server A: exec source + memory channel + avro sink
on server B: avro source + memory channel + logger sink
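A minimal sketch of the two agents, assuming server A tails /var/log/app.log (a hypothetical path) and server B is hadoop000 listening for Avro on port 44444 (both are placeholders, not from the original):

# Agent on server A: exec source -> memory channel -> avro sink
exec-memory-avro.sources=exec-source
exec-memory-avro.sinks=avro-sink
exec-memory-avro.channels=memory-channel

exec-memory-avro.sources.exec-source.type=exec
exec-memory-avro.sources.exec-source.command=tail -F /var/log/app.log

exec-memory-avro.sinks.avro-sink.type=avro
exec-memory-avro.sinks.avro-sink.hostname=hadoop000
exec-memory-avro.sinks.avro-sink.port=44444

exec-memory-avro.channels.memory-channel.type=memory

exec-memory-avro.sources.exec-source.channels=memory-channel
exec-memory-avro.sinks.avro-sink.channel=memory-channel

# Agent on server B: avro source -> memory channel -> logger sink
avro-memory-logger.sources=avro-source
avro-memory-logger.sinks=logger-sink
avro-memory-logger.channels=memory-channel

avro-memory-logger.sources.avro-source.type=avro
avro-memory-logger.sources.avro-source.bind=hadoop000
avro-memory-logger.sources.avro-source.port=44444

avro-memory-logger.sinks.logger-sink.type=logger

avro-memory-logger.channels.memory-channel.type=memory

avro-memory-logger.sources.avro-source.channels=memory-channel
avro-memory-logger.sinks.logger-sink.channel=memory-channel

Start the avro-memory-logger agent on server B first, so the avro sink on server A has something to connect to.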

Flume push approach: the Flume agent pushes events to the Spark Streaming application, which runs an Avro-compatible receiver (FlumeUtils.createStream); the agent's avro sink points at the host and port where that receiver listens.
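A minimal push-mode agent sketch; the agent name, the netcat source, and the receiver address hadoop001:41414 are assumptions reused from the pull example below, and must match wherever the Spark receiver actually listens:

# Push mode: the avro sink points at the Spark Streaming receiver
push-agent.sources=netcat-source
push-agent.sinks=avro-sink
push-agent.channels=memory-channel

push-agent.sources.netcat-source.type=netcat
push-agent.sources.netcat-source.bind=hadoop001
push-agent.sources.netcat-source.port=44444

push-agent.sinks.avro-sink.type=avro
push-agent.sinks.avro-sink.hostname=hadoop001
push-agent.sinks.avro-sink.port=41414

push-agent.channels.memory-channel.type=memory

push-agent.sources.netcat-source.channels=memory-channel
push-agent.sinks.avro-sink.channel=memory-channel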

Flume pull approach: the agent writes events into a custom SparkSink, which buffers them until Spark Streaming pulls them with a reliable receiver (FlumeUtils.createPollingStream); this gives stronger reliability than push. The agent below implements it:

# Name the components
simple-agent.sources=netcat-source
simple-agent.sinks=spark-sink
simple-agent.channels=memory-channel

# Netcat source
simple-agent.sources.netcat-source.type=netcat
simple-agent.sources.netcat-source.bind=hadoop001
simple-agent.sources.netcat-source.port=44444

# Spark sink: buffers events for Spark Streaming to pull
simple-agent.sinks.spark-sink.type=org.apache.spark.streaming.flume.sink.SparkSink
simple-agent.sinks.spark-sink.hostname=hadoop001
simple-agent.sinks.spark-sink.port=41414

# Memory channel
simple-agent.channels.memory-channel.type=memory

# Bind source and sink to the channel
simple-agent.sources.netcat-source.channels=memory-channel
simple-agent.sinks.spark-sink.channel=memory-channel
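To launch the pull-mode agent (a sketch; the file name simple-agent.conf is assumed, and the spark-streaming-flume-sink jar and its dependencies need to be on Flume's classpath, e.g. dropped into $FLUME_HOME/lib, so the SparkSink class can load):

flume-ng agent \
  --name simple-agent \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/simple-agent.conf \
  -Dflume.root.logger=INFO,console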