Flume
Movle
this is the way
Articles in this column
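
Flume - Overview
I. Flume overview
1. Overview: Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple, flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications.
2. Big data architecture: data collection (crawlers \ log data \ Flume), data storage (HDFS/Hive/HBase (NoSQL)), data computation (MapReduce/Hive/SparkSQL/sparkStrea…
Original · 2020-05-09 17:43:26 · 232 reads · 0 comments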
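
Flume - Installation and Configuration
1. Download the archive.
2. Upload it to Linux: /opt/software
3. Extract it:
cd /opt/software
tar -zxvf apache-flume-1.6.0-bin.tar.gz -C /opt/module
4. Rename the environment template:
cd /opt/module/flume/conf
mv flume-env.sh.template flume-env.sh
5. Edit the configuration:
vi flume-env.sh
Change the following:
export JAVA_HOME=/opt/module/jdk1.8.0_1…
Original · 2020-05-09 17:45:06 · 130 reads · 0 comments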
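
Flume - Interceptors (more transformation, less computation, lightweight)
0. Interceptor diagram
1. Common interceptors
(1) Timestamp interceptor
(2) Host interceptor
(3) UUID interceptor & search-and-replace interceptor
(4) Regex interceptors
2. Custom interceptors
(1) Write the custom interceptor program on top of Flume's interceptor package
(2) Package it
(3) Upload it to Linux
(4) Edit the flume.conf file
(5) Run...
Original · 2020-05-09 17:46:04 · 194 reads · 0 comments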
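
Using Flume: Listening on a Port
1. Create the configuration file flumejob_telnet.conf
# sample.conf: A single-node Flume configuration
# Name the components on this agent (names defined here are referenced below; the plural "s" means there can be several of each role)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source (describe the source role and customize it)
# This is a TCP source; the type must be netc…
Original · 2020-05-09 17:48:42 · 761 reads · 0 comments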
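
The excerpt above is cut off before the source, sink, and channel are fully defined. As a minimal sketch of a complete single-node agent of this kind, the following follows the netcat example in the Flume user guide; the bind address, port, logger sink, and channel sizes are assumptions and are not taken from the article.

```properties
# Sketch only: bind address, port, and logger sink are assumptions.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Netcat source: listens on a TCP port and turns each line of text into an event
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Logger sink: writes each event to the agent log
a1.sinks.k1.type = logger

# Memory channel: buffers events between the source and the sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Wire the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

An agent like this is typically started with `bin/flume-ng agent --conf conf --conf-file flumejob_telnet.conf --name a1 -Dflume.root.logger=INFO,console` and exercised with `telnet localhost 44444`; the location of the configuration file is an assumption.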
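
Using Flume: Monitoring a Local Linux File and Collecting It into HDFS
1. Create the configuration file flumejob_hdfs.conf, then upload it
# Name the components on this agent (agent alias settings)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source (configure the data source to monitor a local file)
# exec reads the file by running a command; tail -F follows it in real time
a1.sources.r1.type = exec
# The command to execute: tail -…
Original · 2020-05-09 17:49:32 · 556 reads · 0 comments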
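
The excerpt stops before the sink is configured. A rough sketch of how an exec source can feed an HDFS sink is given below; the log path, NameNode address, and HDFS directory are hypothetical placeholders, and the article's actual values are not visible in the excerpt.

```properties
# Sketch only: log path, NameNode address, and HDFS directory are hypothetical.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Exec source: runs a command and turns each output line into an event
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/example.log

# HDFS sink: %Y-%m-%d in the path is filled in from a timestamp
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y-%m-%d
# Write plain text instead of the default SequenceFile format
a1.sinks.k1.hdfs.fileType = DataStream
# Use the agent's local time for the path escapes, since no timestamp header is set here
a1.sinks.k1.hdfs.useLocalTimeStamp = true

a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```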
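
Using Flume: Monitoring a Directory
1. Create the configuration file flumejob_dir.conf
# Define aliases
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = spooldir
# The directory to monitor
a1.sources.r1.spoolDir = /opt/testdir
# Suffix shown once a file has been uploaded successfully
a1.sources.r1.fileSuffix = .COMPLETED
# In any case, add the absolute-path file…
Original · 2020-05-09 17:50:13 · 1012 reads · 0 comments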
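
For orientation, a spooling-directory agent can be completed along the following lines. The directory and suffix come from the excerpt; the `fileHeader` setting, the logger sink, and the memory channel are assumptions. `fileHeader` is one documented option for recording a file's absolute path, and whether the article uses it cannot be seen in the truncated text.

```properties
# Sketch only: fileHeader, the logger sink, and the memory channel are assumptions.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Spooling directory source: ingests files dropped into spoolDir
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /opt/testdir
# Fully ingested files are renamed with this suffix
a1.sources.r1.fileSuffix = .COMPLETED
# Store the absolute path of the originating file in an event header
a1.sources.r1.fileHeader = true

a1.sinks.k1.type = logger
a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Files placed in the spooling directory should be complete and should not be modified afterwards, since the source expects immutable files.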
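
Using Flume: Fan-Out, One Source with Multiple Channels and Sinks: Monitoring a File and Collecting It into HDFS and the Local Filesystem
0. Process diagram:
1. Edit the configuration files flumejob_1.conf, flumejob_2.conf, flumejob_3.conf
(1) flumejob_1.conf
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# Replicate the data flow to multiple channels
a1.sources.r1.selector.type = replicating
# 2. source
a1.sources.r1.type = exec
a1.sources.r1.com…
Original · 2020-05-09 17:51:35 · 359 reads · 0 comments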
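
To make the fan-out wiring concrete, here is a single-agent sketch in which a replicating selector copies every event to two channels, each drained by its own sink. The log path, output directory, and the choice of a logger sink and a `file_roll` sink are assumptions; the article itself spreads the flow over three configuration files whose contents are not visible in the excerpt.

```properties
# Sketch only: log path, output directory, and sink types are assumptions.
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/example.log
# Replicating selector: every event is copied to all channels listed for the source
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# Each sink drains exactly one channel
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# file_roll writes events to files in a local directory, which must already exist
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume-out
a1.sinks.k2.channel = c2
```

Note that a source lists several channels (`channels`), while a sink is bound to exactly one (`channel`).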

Using Flume: Fan-In, Passing Data Between Flume Agents and Aggregating Multiple Agents into One
1. Diagram
2. Create the configuration files:
(1) flume-1.conf
# 1 agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# 2 source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop2
a1.sources.r1.port = 55555
# 3 sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop4
a1.sink…
Original · 2020-05-09 17:52:45 · 524 reads · 0 comments
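
The excerpt shows only the start of an upstream agent. The receiving side of a fan-in setup might look roughly like the sketch below: an avro source that accepts events from the avro sinks of the upstream agents. The agent name `a3`, the port `4141`, and the logger sink are hypothetical; the article's actual collector configuration is not shown in the excerpt.

```properties
# Sketch of the collecting agent only; agent name a3 and port 4141 are hypothetical.
a3.sources = r1
a3.sinks = k1
a3.channels = c1

# Avro source: receives events sent by the avro sinks of upstream agents
a3.sources.r1.type = avro
a3.sources.r1.bind = 0.0.0.0
a3.sources.r1.port = 4141

a3.sinks.k1.type = logger
a3.channels.c1.type = memory

a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1
```

Each upstream avro sink then needs its `hostname` and `port` to point at the host and port on which this avro source listens.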
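
Flume Interceptors - Timestamp Interceptor
0. Purpose: puts a timestamp into the event header (Map<key,value>)
1. Timestamp.conf
# 1. Define the agent name and the names of the source, channel, and sink
a4.sources = r1
a4.channels = c1
a4.sinks = k1
# 2. Define the source in detail
a4.sources.r1.type = spooldir
a4.sources.r1.spoolDir = /opt/module/flume-1.8.0/upload
# Define the interce…
Original · 2020-05-09 17:53:56 · 1523 reads · 0 comments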
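
The interceptor lines themselves are cut off in the excerpt. As a sketch based on Flume's documented `timestamp` interceptor, attaching it to the source could look like this; the logger sink and memory channel are assumptions added only to make the agent complete.

```properties
# Sketch only: the logger sink and memory channel are assumptions.
a4.sources = r1
a4.channels = c1
a4.sinks = k1

a4.sources.r1.type = spooldir
a4.sources.r1.spoolDir = /opt/module/flume-1.8.0/upload

# Timestamp interceptor: adds a "timestamp" header holding the time in milliseconds
a4.sources.r1.interceptors = i1
a4.sources.r1.interceptors.i1.type = timestamp

a4.sinks.k1.type = logger
a4.channels.c1.type = memory

a4.sources.r1.channels = c1
a4.sinks.k1.channel = c1
```

With the logger sink, the added header appears in the logged event headers, which is a quick way to check that the interceptor is active.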

Flume Interceptors - Host Interceptor
0. Purpose: puts the hostname or IP address of the agent's host into the event header (Map<key,value>)
1. Host.conf
# 1. Define the agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# 2. Define the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /opt/plus
# Interceptor
a1.sources.r1.intercepto…
Original · 2020-05-09 17:54:40 · 469 reads · 0 comments
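
Since the interceptor settings are truncated above, the following is a sketch using Flume's documented `host` interceptor. The `useIP` and `hostHeader` values, the logger sink, and the memory channel are illustrative assumptions.

```properties
# Sketch only: useIP, hostHeader, sink, and channel settings are assumptions.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/plus

# Host interceptor: adds the agent host's name or IP address as an event header
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
# false stores the hostname; the default (true) stores the IP address
a1.sources.r1.interceptors.i1.useIP = false
# Header key to use; the default key is "host"
a1.sources.r1.interceptors.i1.hostHeader = hostname

a1.sinks.k1.type = logger
a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```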

Flume Interceptors - UUID Interceptor
1. uuid.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /opt/plus
a1.sources.r1.interceptors = i1
# The type value cannot be written as uuid; it must be the full class name, otherwise the class cannot be found
a1.sources.r1.interceptors.i1.type = or…
Original · 2020-05-10 08:11:29 · 332 reads · 0 comments

Flume Interceptors - Search-and-Replace Interceptor
1. search.conf
# 1 agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# 2 source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /opt/plus
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = search_rep…
Original · 2020-05-10 08:11:52 · 436 reads · 0 comments
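
The pattern and replacement used in the article are not visible in the excerpt. As a sketch of Flume's documented `search_replace` interceptor, the configuration below replaces every run of digits in the event body with the letter N; the pattern, the replacement, the logger sink, and the memory channel are all assumptions chosen for illustration.

```properties
# Sketch only: pattern, replacement, sink, and channel are assumptions.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/plus

# Search-and-replace interceptor: regex-based replacement on the event body
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = search_replace
# Java regular expression to search for
a1.sources.r1.interceptors.i1.searchPattern = [0-9]+
# Text that replaces each match
a1.sources.r1.interceptors.i1.replaceString = N

a1.sinks.k1.type = logger
a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

The pattern deliberately avoids backslashes, which must be doubled in a properties file.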

Flume Interceptors - Regex Filtering Interceptor
1. filter.conf
# 1 agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# 2 source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /opt/plus
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_filt…
Original · 2020-05-10 08:12:11 · 971 reads · 0 comments
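
The filter expression from the article is cut off. A sketch of Flume's documented `regex_filter` interceptor is given below; it drops every event whose body starts with a digit. The regular expression, the `excludeEvents` value, the logger sink, and the memory channel are assumptions.

```properties
# Sketch only: regex, excludeEvents, sink, and channel are assumptions.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/plus

# Regex filtering interceptor: keeps or drops events by matching the body
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_filter
a1.sources.r1.interceptors.i1.regex = ^[0-9].*
# true drops matching events; false (the default) keeps only matching events
a1.sources.r1.interceptors.i1.excludeEvents = true

a1.sinks.k1.type = logger
a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```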

Flume Interceptors - Regex Extractor Interceptor
1. extractor.conf
# 1 agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# 2 source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /opt/plus
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_e…
Original · 2020-05-10 08:12:40 · 476 reads · 0 comments
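
The extraction rule from the article is truncated. As a sketch of Flume's documented `regex_extractor` interceptor, the configuration below copies two colon-separated numbers from the event body into headers. The regular expression, the header names `one` and `two`, the logger sink, and the memory channel are assumptions.

```properties
# Sketch only: regex, header names, sink, and channel are assumptions.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/plus

# Regex extractor interceptor: copies regex match groups into event headers
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
a1.sources.r1.interceptors.i1.regex = ([0-9]+):([0-9]+)
# One serializer per match group, in order
a1.sources.r1.interceptors.i1.serializers = s1 s2
a1.sources.r1.interceptors.i1.serializers.s1.name = one
a1.sources.r1.interceptors.i1.serializers.s2.name = two

a1.sinks.k1.type = logger
a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

The event body itself is left unchanged; only headers are added for the groups that match.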
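
Flume Interceptors - Custom Interceptor
1. Write an interceptor that converts lowercase letters to uppercase;
2. Write the code:
(1) pom.xml
<dependencies>
<!-- Flume core dependency -->
<dependency>
<groupId>org.apache.flume</groupId>
<artifactId>flume-ng-core</artifactId>…
Original · 2020-05-10 08:13:02 · 254 reads · 0 comments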