Flume: Common Sources, Channels, and Sinks

Flume Core Concepts


Event

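  1. An Event is the smallest unit of data that flows through a Flume agent. An Event (an implementation of the Event interface) flows from a source to a channel, and then to a sink.
  2. An Event carries a payload (a byte array) and an optional set of headers (string attributes).
  3. A Flume agent is a JVM process that controls the flow of Events from an external origin to an external destination.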
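
Headers are usually filled in by the source or by interceptors. As a minimal sketch, assuming an agent named a1 with a source r1 (both names are illustrative), the built-in timestamp and static interceptors attach string headers to every event and leave the payload untouched. The header key and value in the static interceptor are made up for the example.

```properties
# Attach two interceptors to source r1 (agent and component names are illustrative).
a1.sources.r1.interceptors = i1 i2
# i1 adds a "timestamp" header holding the processing time in milliseconds.
a1.sources.r1.interceptors.i1.type = timestamp
# i2 adds a fixed header; the key and value here are hypothetical.
a1.sources.r1.interceptors.i2.type = static
a1.sources.r1.interceptors.i2.key = datacenter
a1.sources.r1.interceptors.i2.value = dc1
```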

Source

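  1. The purpose of a source is to receive data from external clients and store it in the configured channels.
  2. It passes the received data, in Flume's event format, to one or more channels.
  3. Flume provides several ways to receive data, such as Avro and Thrift.
  4. A source must be associated with at least one channel. Source types include:
    Sources that integrate with external systems: Syslog, Netcat, spooling directory
    Sources that generate events automatically: Exec
    IPC sources for agent-to-agent communication: Avro, Thrift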
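
The "one or more channels" point can be sketched in configuration. The fragment below assumes a hypothetical agent a1 whose Netcat source writes to two memory channels; the agent name, component names, and port are illustrative.

```properties
# Hypothetical agent a1: one Netcat source feeding two channels.
a1.sources = r1
a1.channels = c1 c2
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# A source lists every channel it writes to, separated by spaces.
a1.sources.r1.channels = c1 c2
# "replicating" (the default selector) copies each event to all listed channels.
a1.sources.r1.selector.type = replicating
a1.channels.c1.type = memory
a1.channels.c2.type = memory
```

Sinks are omitted here, so this is a fragment rather than a complete agent definition.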

Channel

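  1. A channel is a temporary store that buffers the event-format data received from a source until sinks consume it. It acts as the bridge between source and sink.
  2. A channel is transactional, which guarantees consistency when data is put in and taken out. It can be connected to any number of sources and sinks.
  3. Supported types include:
    Memory channel: volatile (events are lost if the agent process stops)
    File channel: built on a WAL (write-ahead log)
    JDBC channel: built on an embedded database
  4. When several sinks read from the same channel, each event is delivered only once: if sink1 delivers an event, sink2 does not, and if sink2 delivers it, sink1 does not. Together, the output of sink1 and sink2 equals the data in the channel.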
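
A minimal configuration sketch of these points, assuming a hypothetical agent a1. The capacities and directory paths are placeholders, and the fragment omits sources.

```properties
# Hypothetical agent a1 with two channel flavors.
a1.channels = c1 c2
# Memory channel: fast, but buffered events are lost if the agent process dies.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 100
# File channel: events are persisted to disk; the directories are placeholders.
a1.channels.c2.type = file
a1.channels.c2.checkpointDir = /var/flume/checkpoint
a1.channels.c2.dataDirs = /var/flume/data
# Two sinks draining the same channel: each event in c1 is taken by k1 or k2, not both.
a1.sinks = k1 k2
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
a1.sinks.k2.type = logger
a1.sinks.k2.channel = c1
```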

Sink

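  1. A sink stores data in a centralized store such as HBase or HDFS.
  2. It consumes data (events) from a channel and delivers it to a destination. The destination may be the source of another Flume agent (the next hop), or a store such as HDFS or HBase.
  3. Sink types include:
    Terminal sinks that store events at their final destination: HDFS, HBase
    Sinks that simply consume events: null sink
    IPC sinks for agent-to-agent communication: Avro
  4. A sink must be attached to exactly one channel.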
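
As a rough sketch of sink configuration, assuming a hypothetical agent a1 whose channels c1 and c2 are defined elsewhere. The HDFS path is a placeholder.

```properties
# Hypothetical agent a1: a terminal HDFS sink and a null sink.
a1.sinks = k1 k2
a1.sinks.k1.type = hdfs
# Note the singular "channel": a sink reads from exactly one channel.
a1.sinks.k1.channel = c1
# Placeholder path; replace with a real HDFS location.
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events
# Write raw event bodies instead of the default SequenceFile format.
a1.sinks.k1.hdfs.fileType = DataStream
# Null sink: discards every event it takes from its channel.
a1.sinks.k2.type = null
a1.sinks.k2.channel = c2
```

Sources use the plural key `channels`, while sinks use the singular key `channel`, which reflects the one-channel-per-sink rule.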

Flume Sources

Avro Source

Listens on Avro port and receives events from external Avro client streams. When paired with the built-in Avro Sink on another (previous hop) Flume agent, it can create tiered collection topologies. The required properties are channels, type, bind, and port.

In short, the Avro source listens on a specified address and port of the server and receives events from external Avro clients.

| Property Name | Default | Description |
| --- | --- | --- |
| channels | | |
| type | | The component type name, needs to be avro |
| bind | | hostname or IP address to listen on |
| port | | Port # to bind to |
| threads | | Maximum number of worker threads to spawn |
| selector.type | | |
| selector.* | | |
| interceptors | | Space-separated list of interceptors |
| interceptors.* | | |
| compression-type | none | This can be “none” or “deflate”. The compression-type must match the compression-type of matching AvroSource |
| ssl | false | Set this to true to enable SSL encryption. You must also specify a “keystore” and a “keystore-password”. |
| keystore | | This is the path to a Java keystore file. Required for SSL. |
| keystore-password | | The password for the Java keystore. Required for SSL. |
| keystore-type | JKS | The type of the Java keystore. This can be “JKS” or “PKCS12”. |
| exclude-protocols | SSLv3 | Space-separated list of SSL/TLS protocols to exclude. SSLv3 will always be excluded in addition to the protocols specified. |
| ipFilter | false | Set this to true to enable ipFiltering for netty |
| ipFilterRules | | Define N netty ipFilter pattern rules with this config. |

Example:

a1.sources = r1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
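
To illustrate the tiered topology mentioned above, here is a sketch of an upstream agent whose Avro sink forwards events to this Avro source. The agent name a2, the Netcat source, and the host name collector-host are assumptions made for the example.

```properties
# Hypothetical upstream agent a2 that forwards events to the Avro source of a1.
a2.sources = r1
a2.channels = c1
a2.sinks = k1
a2.sources.r1.type = netcat
a2.sources.r1.bind = localhost
a2.sources.r1.port = 44444
a2.sources.r1.channels = c1
a2.channels.c1.type = memory
a2.sinks.k1.type = avro
a2.sinks.k1.channel = c1
# Placeholder host name of the machine running agent a1.
a2.sinks.k1.hostname = collector-host
# Must match a1.sources.r1.port.
a2.sinks.k1.port = 4141
```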

Exec Source

Exec source runs a given Unix command on start-up and expects that process to continuously produce data on standard out (stderr is simply discarded, unless property logStdErr is set to true). If the process exits for any reason, the source also exits and will produce no further data. This means configurations such as cat [named pipe] or tail -F [file] are going to produce the desired results whereas date will probably not: the former two commands produce streams of data, whereas the latter produces a single event and exits.

In short, the Exec source runs a Linux command and turns its output into events.
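
A minimal sketch of an agent built around an Exec source, assuming a hypothetical agent a1 that tails a log file and prints each line through a logger sink. The file path is a placeholder.

```properties
# Hypothetical agent a1: tail a log file and log each line as an event.
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = exec
# Placeholder file path; tail -F keeps running and follows the file across rotation.
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```

A long-running command such as tail -F fits the description above, because the source stops producing data as soon as the command exits.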

Property Name Default Description