Flink / Scala - 8.DataSet 应用 Broadcast Variables

本文介绍了Flink中的Broadcast Variables,用于提供全局共享数据。通过`withBroadcastSet`生成广播变量,`getRuntimeContext().getBroadcastVariable`获取。在DataSet的map操作中使用广播变量,展示了创建、广播、获取的步骤,并通过一个示例解释了广播变量的优势,适用于所有Task共享的小型数据集,避免每个Task重复初始化大型数据集。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

一.引言

除了操作的常规输入之外,广播变量 Broadcast Value 允许使一个数据集对操作的所有并行实例可用,即适合 task 都需要公用的变量,就像是 spark 中各个 executor 都需要访问的公共变量一样。这对于辅助数据集或依赖于数据的参数化非常有用。然后,该数据集将作为一个集合在操作员处进行访问。在 Flink 中,广播变量通过下述方法生成和获取 :

生成 : withBroadcastSet(DataSet, String) 前者为广播的数据集,后者为该数据集对应的名字标识

获取 : getRuntimeContext().getbroadcastvariable(String) 方法获取,参数为生成对应的名字标识

二.Boardcast 使用 demo

本例将在 DataSet 的 map 操作中调用 broadcast value 并打印其结果。

    val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
    import org.apache.flink.api.scala._

    // 1. The DataSet to be broadcast 
    // A.创建
    val toBroadcastV1 = env.fromElements(1, 2, 3)
    val toBroadca
networks:  net:    external: trueservices:  jobmanager1:    restart: always    image: apache/flink:1.16.3    container_name: jobmanager1    hostname: jobmanager1    ports:     - '8081:8081'    volumes:      - /etc/localtime:/etc/localtime      - /home/sumengnan/apache/flink/timezone:/etc/timezone      - /home/sumengnan/apache/flink/conf/flink-conf-jobmanager1.yaml:/opt/flink/conf/flink-conf.yaml      - /home/sumengnan/apache/flink/conf/log4j.properties:/opt/flink/conf/log4j.properties      - /home/sumengnan/apache/flink/conf/logback.xml:/opt/flink/conf/logback.xml      - /home/sumengnan/apache/flink/conf/log4j-console.properties:/opt/flink/conf/log4j-console.properties      - /home/sumengnan/apache/flink/conf/logback-console.xml:/opt/flink/conf/logback-console.xml      - /home/sumengnan/apache/flink/data:/opt/flink/data      - /home/sumengnan/apache/flink/log:/opt/flink/log      - /home/sumengnan/apache/flink/tmp:/opt/flink/tmp    networks:      - net  jobmanager2:    restart: always    image: apache/flink:1.16.3    container_name: jobmanager2    hostname: jobmanager2    ports:     - '8082:8081'    volumes:      - /etc/localtime:/etc/localtime      - /home/sumengnan/apache/flink/timezone:/etc/timezone      - /home/sumengnan/apache/flink/conf/flink-conf-jobmanager2.yaml:/opt/flink/conf/flink-conf.yaml      - /home/sumengnan/apache/flink/conf/log4j.properties:/opt/flink/conf/log4j.properties      - /home/sumengnan/apache/flink/conf/logback.xml:/opt/flink/conf/logback.xml      - /home/sumengnan/apache/flink/conf/log4j-console.properties:/opt/flink/conf/log4j-console.properties      - /home/sumengnan/apache/flink/conf/logback-console.xml:/opt/flink/conf/logback-console.xml      - /home/sumengnan/apache/flink/data:/opt/flink/data      - /home/sumengnan/apache/flink/log:/opt/flink/log      - /home/sumengnan/apache/flink/tmp:/opt/flink/tmp    networks:      - net    depenes_on:      - jobmanager1  taskmanager1:    restart: always    image: apache/flink:1.16.3    container_name: taskmanager1    hostname: taskmanager1    command: taskmanager    volumes:      - /etc/localtime:/etc/localtime      - /home/sumengnan/apache/flink/timezone:/etc/timezone      - /home/sumengnan/apache/flink/conf/flink-conf-taskmanager1.yaml:/opt/flink/conf/flink-conf.yaml      - /home/sumengnan/apache/flink/conf/log4j.properties:/opt/flink/conf/log4j.properties      - /home/sumengnan/apache/flink/conf/logback.xml:/opt/flink/conf/logback.xml      - /home/sumengnan/apache/flink/conf/log4j-console.properties:/opt/flink/conf/log4j-console.properties      - /home/sumengnan/apache/flink/conf/logback-console.xml:/opt/flink/conf/logback-console.xml      - /home/sumengnan/apache/flink/data:/opt/flink/data      - /home/sumengnan/apache/flink/log:/opt/flink/log      - /home/sumengnan/apache/flink/tmp:/opt/flink/tmp    networks:      - net    depenes_on:      - jobmanager1      - jobmanager2  taskmanager2:    restart: always    image: apache/flink:1.16.3    container_name: taskmanager2    hostname: taskmanager2    command: taskmanager    volumes:      - /etc/localtime:/etc/localtime      - /home/sumengnan/apache/flink/timezone:/etc/timezone      - /home/sumengnan/apache/flink/conf/flink-conf-taskmanager2.yaml:/opt/flink/conf/flink-conf.yaml      - /home/sumengnan/apache/flink/conf/log4j.properties:/opt/flink/conf/log4j.properties      - /home/sumengnan/apache/flink/conf/logback.xml:/opt/flink/conf/logback.xml      - /home/sumengnan/apache/flink/conf/log4j-console.properties:/opt/flink/conf/log4j-console.properties      - /home/sumengnan/apache/flink/conf/logback-console.xml:/opt/flink/conf/logback-console.xml      - /home/sumengnan/apache/flink/data:/opt/flink/data      - /home/sumengnan/apache/flink/log:/opt/flink/log      - /home/sumengnan/apache/flink/tmp:/opt/flink/tmp    networks:      - net    depenes_on:      - jobmanager1      - jobmanager2  taskmanager3:    restart: always    image: apache/flink:1.16.3    container_name: taskmanager3    hostname: taskmanager3    command: taskmanager    volumes:      - /etc/localtime:/etc/localtime      - /home/sumengnan/apache/flink/timezone:/etc/timezone      - /home/sumengnan/apache/flink/conf/flink-conf-taskmanager3.yaml:/opt/flink/conf/flink-conf.yaml      - /home/sumengnan/apache/flink/conf/log4j.properties:/opt/flink/conf/log4j.properties      - /home/sumengnan/apache/flink/conf/logback.xml:/opt/flink/conf/logback.xml      - /home/sumengnan/apache/flink/conf/log4j-console.properties:/opt/flink/conf/log4j-console.properties      - /home/sumengnan/apache/flink/conf/logback-console.xml:/opt/flink/conf/logback-console.xml      - /home/sumengnan/apache/flink/data:/opt/flink/data      - /home/sumengnan/apache/flink/log:/opt/flink/log      - /home/sumengnan/apache/flink/tmp:/opt/flink/tmp    networks:      - net    depenes_on:      - jobmanager1      - jobmanager2
最新发布
04-04
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

BIT_666

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值