flink系列(7)-streamGraph

本文深入探讨了Flink中StreamGraph的概念,详细介绍了其在四层执行图中的位置及作用,包括如何通过不同的Transformation构建数据流,如Map、Reduce、KeyBy等操作,并解释了DataStream与StreamOperator的关系。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

StreamGraph是flink四层执行图中的第一层图,代码在org.apache.flink.streaming.api.graph包中,第一层graph主要做的事情是将所有的stransformation添加到DAG中,并设置并行度,设置slot槽位

具体涉及到的transformation大概有11个,继承图如下

1682974504c0f8ee8bbc861db5921c70e77.jpg

首先我们来看一下如何获取StreamTransformation

public <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName, TypeInformation<OUT> typeInfo) {

		if (typeInfo == null) {
			if (function instanceof ResultTypeQueryable) {
				typeInfo = ((ResultTypeQueryable<OUT>) function).getProducedType();
			} else {
				try {
					typeInfo = TypeExtractor.createTypeInfo(
							SourceFunction.class,
							function.getClass(), 0, null, null);
				} catch (final InvalidTypesException e) {
					typeInfo = (TypeInformation<OUT>) new MissingTypeInfo(sourceName, e);
				}
			}
		}

		boolean isParallel = function instanceof ParallelSourceFunction;

		clean(function);
		StreamSource<OUT, ?> sourceOperator;
		if (function instanceof StoppableFunction) {
			sourceOperator = new StoppableStreamSource<>(cast2StoppableSourceFunction(function));
		} else {
			sourceOperator = new StreamSource<>(function);
		}

		return new DataStreamSource<>(this, typeInfo, sourceOperator, isParallel, sourceName);
	}

最终返回的是DataStreamSource,内部封装了SourceTransformation,下面看一下DataStream的类图结构

3f6af9ee732815f74e3ec4d3ca9ffd0543c.jpg

可以看到DataStreamSource是DataStream的子类

DataStreamSource是DataStream的数据流抽象,StreamSource是StreamOperator的抽象,在 flink 中一个 DataStream 封装了一次数据流转换,一个 StreamOperator 封装了一个函数接口,比如 map、reduce、keyBy等。下面我们在看一下StreamOperator的类图关系

可以看到StreamMap/StreamFlatMap都是operator的子类,下面来看一段具体生成operator和transformation的代码

/**
	 * Applies a Map transformation on a {@link DataStream}. The transformation
	 * calls a {@link MapFunction} for each element of the DataStream. Each
	 * MapFunction call returns exactly one element. The user can also extend
	 * {@link RichMapFunction} to gain access to other features provided by the
	 * {@link org.apache.flink.api.common.functions.RichFunction} interface.
	 *
	 * @param mapper
	 *            The MapFunction that is called for each element of the
	 *            DataStream.
	 * @param <R>
	 *            output type
	 * @return The transformed {@link DataStream}.
	 */
	public <R> SingleOutputStreamOperator<R> map(MapFunction<T, R> mapper) {

		TypeInformation<R> outType = TypeExtractor.getMapReturnTypes(clean(mapper), getType(),
				Utils.getCallLocationName(), true);

		return transform("Map", outType, new StreamMap<>(clean(mapper)));
	}

/**
	 * Method for passing user defined operators along with the type
	 * information that will transform the DataStream.
	 *
	 * @param operatorName
	 *            name of the operator, for logging purposes
	 * @param outTypeInfo
	 *            the output type of the operator
	 * @param operator
	 *            the object containing the transformation logic
	 * @param <R>
	 *            type of the return stream
	 * @return the data stream constructed
	 */
	@PublicEvolving
	public <R> SingleOutputStreamOperator<R> transform(String operatorName, TypeInformation<R> outTypeInfo, OneInputStreamOperator<T, R> operator) {

		// read the output type of the input Transform to coax out errors about MissingTypeInfo
		transformation.getOutputType();

		OneInputTransformation<T, R> resultTransform = new OneInputTransformation<>(
				this.transformation,
				operatorName,
				operator,
				outTypeInfo,
				environment.getParallelism());

		@SuppressWarnings({ "unchecked", "rawtypes" })
		SingleOutputStreamOperator<R> returnStream = new SingleOutputStreamOperator(environment, resultTransform);

		getExecutionEnvironment().addOperator(resultTransform);

		return returnStream;
	}

到这里基本说完了DataStream和StreamOperator,包含transformation的产生,DataStream的操作等,下一篇我们在来说一下transformation的转换

转载于:https://my.oschina.net/u/1262062/blog/3038019

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值