Exception in thread "main" java.lang.IllegalArgumentException: The parallelism of non parallel operator must be 1.
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:139)
at org.apache.flink.api.common.operators.util.OperatorValidationUtils.validateParallelism(OperatorValidationUtils.java:38)
at org.apache.flink.streaming.api.datastream.DataStreamSource.setParallelism(DataStreamSource.java:85)
at org.apache.flink.streaming.api.datastream.DataStreamSource.setParallelism(DataStreamSource.java:36)
at org.apache.flink.streaming.api.scala.DataStream.setParallelism(DataStream.scala:131)
at com.xk.bigdata.flink.datastream.datasource.buildin.CollectionDataSource$.main(CollectionDataSource.scala:16)
at com.xk.bigdata.flink.datastream.datasource.buildin.CollectionDataSource.main(CollectionDataSource.scala)
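This exception comes from calling setParallelism with a value other than 1 on a non-parallel source. A minimal sketch that reproduces it (the object name and the sample data are illustrative, not the original CollectionDataSource):

import org.apache.flink.streaming.api.scala._

// Illustrative object name, not the original CollectionDataSource.
object NonParallelSourceDemo {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // fromCollection builds a non-parallel source, so its parallelism must stay 1.
    val dataStream = env.fromCollection(List("spark", "hadoop"))
    // Throws IllegalArgumentException right here, before env.execute() is reached.
    dataStream.setParallelism(2)
    env.execute("NonParallelSourceDemo")
  }
}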
Calling the fromCollection API
val dataStream2 = env.fromCollection(List("spark,hadoop", 1L))
println(dataStream2.parallelism)
val mapStream2 = dataStream2.map(x => x)
println(mapStream2.parallelism)
Run result
1
4

The source's parallelism is fixed at 1, while the downstream map operator picks up the environment's default parallelism (4 on this machine).
Checking the fromCollection API source code
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Unlike fromCollection, the fromParallelCollection API creates a source whose parallelism can be set. Checking its source code:

/**
 * Creates a DataStream from the given [[SplittableIterator]].
 */
def fromParallelCollection[T: TypeInformation](data: SplittableIterator[T]):
    DataStream[T] = {
  val typeInfo = implicitly[TypeInformation[T]]
  asScalaStream(javaEnv.fromParallelCollection(data, typeInfo))
}

Calling the fromParallelCollection API
val dataStream3 = env.fromParallelCollection(new LongValueSequenceIterator(1L, 10L))
println(dataStream3.parallelism)
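A complete, runnable sketch of the parallel case (the object name, the setParallelism(2) call, and the map/print sink are my additions for illustration, assuming the same LongValueSequenceIterator range as above):

import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.LongValueSequenceIterator

// Illustrative object name, not the original CollectionDataSource.
object ParallelCollectionDemo {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // fromParallelCollection wraps a SplittableIterator, so the source is parallel
    // and setParallelism(2) is accepted instead of throwing.
    val dataStream3 = env
      .fromParallelCollection(new LongValueSequenceIterator(1L, 10L))
      .setParallelism(2)
    println(dataStream3.parallelism) // prints 2

    // Unwrap the LongValue elements and print them.
    dataStream3.map(_.getValue).print()
    env.execute("ParallelCollectionDemo")
  }
}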