National Vocational College Skills Competition - Big Data Application Event - Data Collection

Latest recommended article published on 2025-06-10 21:56:58

Author: 「已注销」 (account deregistered)

Tags: big data

Article link: https://blog.youkuaiyun.com/Eternity_04/article/details/140296695

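1. On the master node, use Flume to collect data from the real-time log files under the /data_log directory and store it in Kafka topics (named ChangeRecord, ProduceRecord, and EnvironmentData, each with 4 partitions). Paste a screenshot of the Flume configuration that collects data for the ChangeRecord topic under the corresponding task number in 【Release\任务D提交结果.docx】 on the client desktop.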

This task uses Flume to watch the real-time log files for changes and deliver each new record to the corresponding Kafka topic.
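As a rough picture of what such an agent could look like for the ChangeRecord topic, here is a minimal sketch. It is not the configuration from the original post. The agent and component names (a1, r1, c1, k1), the file name pattern under /data_log, the position file path, and the channel sizes are all assumptions made for illustration; the broker address reuses the your.hostname:9092 placeholder from the Kafka settings in this article.

```properties
# Hypothetical agent "a1" with one source, one channel, and one sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# TAILDIR source: follows files and picks up lines as they are appended.
a1.sources.r1.type = TAILDIR
a1.sources.r1.filegroups = f1
# Assumption: the log file name contains "ChangeRecord"; adjust the regex to the real file name.
a1.sources.r1.filegroups.f1 = /data_log/.*ChangeRecord.*
# Assumption: any writable path works; Flume records its read offsets in this file.
a1.sources.r1.positionFile = /tmp/flume/taildir_position.json
a1.sources.r1.channels = c1

# In-memory channel buffering events between source and sink (sizes are illustrative).
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Kafka sink: publishes each event to the ChangeRecord topic.
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = your.hostname:9092
a1.sinks.k1.kafka.topic = ChangeRecord
a1.sinks.k1.channel = c1
```

Under the same assumptions, the ProduceRecord and EnvironmentData topics would follow the same pattern, each with its own file pattern and kafka.topic value.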

First, modify the Kafka configuration so that the broker can be reached remotely. The configuration file is located at $KAFKA_HOME/config/server.properties.

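key	value	Meaning
listeners	PLAINTEXT://0.0.0.0:9092	Binds the network interfaces the broker listens on.
advertised.listeners	PLAINTEXT://your.hostname:9092	Publishes the externally reachable address; this address is registered in ZooKeeper.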

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
# Bind port 9092 on all local network interfaces
listeners=PLAINTEXT://0.0.0.0:9092
 
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
# The externally reachable address; change this to the IP that clients connect to directly
advertised.listeners=PLAINTEXT://your.hostname:9092
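
The task also asks for 4 partitions per topic, which the listener settings do not cover. One possible way to get there, sketched here as an assumption rather than as the original author's method, is to raise the broker's default partition count in the same server.properties file. This only affects topics that are created without an explicit partition count, for example topics auto-created on first write while auto.create.topics.enable is left at its default; topics that already exist keep their current partition count.

```properties
# Assumption: the three topics do not exist yet and will be created without an
# explicit partition count, so the broker default applies to them.
num.partitions=4
```

If the topics are created explicitly beforehand, the partition count given at creation time takes precedence and this setting is not needed.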