When using Flume to consume data from Kafka and sink it into HDFS, configuration fails with an unsupported compression codec error.
The error reported is:
error during configuration
java.lang.IllegalArgumentException: Unsupported compression codec Lzop. Please choose from: [None, BZip2Codec, DefaultCodec, DeflateCodec, GzipCodec, Lz4Codec, SnappyCodec]
at org.apache.flume.sink.hdfs.HDFSEventSink.getCodec(HDFSEventSink.java:334)
at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:237)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:748)
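For context, the exception is thrown while Flume parses the HDFS sink configuration. A sink definition along the following lines will trigger it (the agent name a1 and sink name k1 are hypothetical; the hdfs.codeC line is what matters):

# Flume agent config excerpt (names are examples)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = lzop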
The fix takes two steps.
Step 1: edit core-site.xml on the Hadoop cluster and add the following properties:
<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
  </property>
  <property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
These properties tell the Hadoop libraries on Flume's classpath which classes implement the LZO codecs, so the HDFS sink can resolve the lzop codec by name.
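One caveat: if the Flume agent runs on a node other than the one you just edited, the updated core-site.xml must also be visible on that node's Hadoop classpath. A minimal sketch, assuming the standard Hadoop layout and a hypothetical node named node2:

# push the updated config to the node running Flume (host name is an example)
scp $HADOOP_HOME/etc/hadoop/core-site.xml node2:$HADOOP_HOME/etc/hadoop/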
Step 2: locate the hadoop-lzo jar on the cluster:
find / -name "hadoop-lzo-*"
/share/hadoop/common/hadoop-lzo-0.X.X-SNAPSHOT.jar
A jar like this should turn up somewhere.
Then copy that jar into Flume's lib directory:
cp /opt/software/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar /opt/software/flume/flume-1.9.0-bin/lib/
Restart the Flume agent and the error is gone.
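A typical restart command looks like this; the config file name and agent name are examples from a hypothetical setup, adjust them to yours:

bin/flume-ng agent --conf conf --conf-file conf/kafka-to-hdfs.conf --name a1 -Dflume.root.logger=INFO,console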
In short: when Flume consumes data from Kafka and writes it to HDFS, configuration can fail because Flume's Hadoop libraries do not recognize the Lzop compression codec. The fix has two steps: first, register the LZO and LZOP codec classes in the Hadoop cluster's core-site.xml; second, copy the hadoop-lzo jar into Flume's lib directory. After these steps, restart Flume and the problem is resolved.