Flume NG Performance Tuning

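This article examines how the transaction batch size affects performance in Flume NG. It describes how to choose suitable batch sizes for ExecSource and AvroSink so that data is collected and transmitted in real time. For the collector's HBaseSink, a batch size of 100 is suggested, to be adjusted according to actual test results.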


Transactions are a critical concept in Flume NG.

Sources put events into a channel in transactions of one batch each.

Sinks take events from a channel in transactions of one batch each.

This makes the batch size the key parameter for performance tuning.

 

For the agent:

ExecSource is used to collect events, and its default batch size is 20. To collect events in near real time, it is better to set it to 5 or 10; the right value depends on how fast the log itself is written.

AvroSink is used to transmit events to the collectors, so its batch size should be N times the ExecSource batch size.
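As an illustration of the agent settings above, here is a minimal configuration sketch. The agent name `agent`, the component names, the log path, the collector host and port, and the choice of N = 5 are all placeholder assumptions, not values from the original post. The property names (`batchSize` for the exec source, `batch-size` for the Avro sink, `transactionCapacity` for the channel) follow the Flume User Guide, but check them against the Flume version you run.

```properties
# Hypothetical agent-side configuration; all names, paths, hosts, and ports are placeholders.
agent.sources = tail
agent.channels = fc
agent.sinks = toCollector

# ExecSource: small batches so events leave the source quickly.
agent.sources.tail.type = exec
agent.sources.tail.command = tail -F /var/log/app/app.log
agent.sources.tail.batchSize = 10
agent.sources.tail.channels = fc

# File channel: transactionCapacity must be at least as large as the biggest batch size used.
agent.channels.fc.type = file
agent.channels.fc.transactionCapacity = 1000

# AvroSink: batch size is N times the source batch size (assumed N = 5 here).
agent.sinks.toCollector.type = avro
agent.sinks.toCollector.hostname = collector.example.com
agent.sinks.toCollector.port = 4545
agent.sinks.toCollector.batch-size = 50
agent.sinks.toCollector.channel = fc
```

The value of N is a tuning knob to settle by testing. A larger N means fewer Avro calls, but events wait longer in the channel before being sent.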

 

For the collector:

A batch size of 100 is suggested for HBaseSink.

We can then adjust the batch size based on our own test results.
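A collector along these lines might be configured as in the sketch below. The agent name `collector`, the component names, the port, the HBase table, and the column family are hypothetical placeholders. The `batchSize` property of the HBase sink is taken from the Flume User Guide and should be verified for your version.

```properties
# Hypothetical collector-side configuration; all names and the port are placeholders.
collector.sources = fromAgents
collector.channels = fc
collector.sinks = toHBase

# Avro source receiving events from the agents' AvroSinks.
collector.sources.fromAgents.type = avro
collector.sources.fromAgents.bind = 0.0.0.0
collector.sources.fromAgents.port = 4545
collector.sources.fromAgents.channels = fc

# File channel: transactionCapacity must be at least as large as the biggest batch size used.
collector.channels.fc.type = file
collector.channels.fc.transactionCapacity = 1000

# HBaseSink: start from the suggested batch size of 100, then tune from test results.
collector.sinks.toHBase.type = hbase
collector.sinks.toHBase.table = events
collector.sinks.toHBase.columnFamily = d
collector.sinks.toHBase.batchSize = 100
collector.sinks.toHBase.channel = fc
```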

 

 

The larger the batch size, the more efficiently the file channel operates, but latency increases.

The smaller the batch size, the sooner events are transmitted, but more CPU and time are spent on disk syncs.

 

Please refer to the blog posts below for more details.

http://blog.cloudera.com/blog/2013/01/how-to-do-apache-flume-performance-tuning-part-1/

http://blog.cloudera.com/blog/2012/09/about-apache-flume-filechannel/
