Source:
```scala
def aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U): U
```

- `[U: ClassTag]`: the generic result type, which may differ from the RDD's element type `T`
- `zeroValue`: the initial value of the accumulator
- `seqOp`: the function folded over the elements within each partition
- `combOp`: the function that merges the per-partition results

Example:
```scala
import org.apache.spark.{SparkConf, SparkContext}

object Aggregate {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Aggregate").setMaster("local")
    val sc = new SparkContext(conf)
    // 10 elements spread across 3 partitions
    val arr = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), 3)
    // zeroValue = 8, seqOp = Math.max, combOp = sum
    val aggregate: Int = arr.aggregate(8)(Math.max(_, _), _ + _)
    println(aggregate)
    sc.stop()
  }
}
```

Result: 34

Execution walkthrough:

1) The RDD has 3 partitions: part0(1,2,3), part1(4,5,6), part2(7,8,9,10).
2) Within each partition, `Math.max(_, _)` is folded in, starting from the zeroValue: `Math.max(zeroValue, value) => result => Math.max(result, value) => ...`
   - part0(1,2,3): Math.max(8,1) => 8 => Math.max(8,2) => 8 => Math.max(8,3) => 8
   - part1(4,5,6): Math.max(8,4) => 8 => Math.max(8,5) => 8 => Math.max(8,6) => 8
   - part2(7,8,9,10): Math.max(8,7) => 8 => Math.max(8,8) => 8 => Math.max(8,9) => 9 => Math.max(9,10) => 10
3) The per-partition results are combined with `_ + _`, again starting from the zeroValue: zeroValue + part0.result + part1.result + part2.result = 8 + 8 + 8 + 10 = 34.
4) The final result is 34.
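Because `aggregate` lets the result type `U` differ from the element type `T`, a common use is computing an average by carrying a (sum, count) pair through the fold. The sketch below is illustrative and not from the original post; the object name `AggregateAverage` is made up, and it deliberately uses a neutral zeroValue of (0, 0), since the walkthrough above shows that zeroValue is folded into every partition and into combOp once more.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: aggregate with U = (Int, Int) over an RDD[Int],
// accumulating (sum, count) to compute an average.
object AggregateAverage {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("AggregateAverage").setMaster("local")
    val sc = new SparkContext(conf)

    val nums = sc.parallelize(1 to 10, 3)

    // zeroValue: (0, 0), a neutral (sum, count) accumulator.
    // seqOp: fold one Int element into the partition's (sum, count).
    // combOp: merge two (sum, count) accumulators from different partitions.
    val (sum, count) = nums.aggregate((0, 0))(
      (acc, n) => (acc._1 + n, acc._2 + 1),
      (a, b) => (a._1 + b._1, a._2 + b._2)
    )

    println(s"average = ${sum.toDouble / count}") // prints: average = 5.5
    sc.stop()
  }
}
```

A non-neutral zeroValue is counted numPartitions + 1 times (once per partition in seqOp and once more in combOp), which is exactly why the max/sum example above yields 34 rather than 26.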