Spark action operators (Scala version)

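This post uses examples to show how to call Spark's action operators from Scala, including reduce, collect, take, count, countByKey and saveAsTextFile, and comments on how Scala's functional style applies to Spark and shortens the code.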


I'll just paste the code here; the actions themselves are introduced in the Java version of this post.
package cn.spark.study.core

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object ActionOperation {
  def main(args: Array[String]): Unit = {
    // Uncomment the test you want to run.
    // reduceTest()
    // collectTest()
    // takeTest()
    // countTest()
    // saveAsTextFileTest()
    countByKeyTest()
  }

  // reduce: aggregates all elements of the RDD with the given function.
  def reduceTest(): Unit = {
    val conf = new SparkConf()
      .setAppName("reduce")
      .setMaster("local")
    val sc = new SparkContext(conf)

    val list = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

    val numbersRDD = sc.parallelize(list, 1)
    val sum = numbersRDD.reduce(_ + _)
    println(sum)

    sc.stop()
  }

  // collect: pulls every element of the RDD back to the driver as an Array.
  def collectTest(): Unit = {
    val conf = new SparkConf()
      .setAppName("collect")
      .setMaster("local")
    val sc = new SparkContext(conf)

    val list = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

    val numbersRDD = sc.parallelize(list, 1)
    val doubleNumber = numbersRDD.map { num => num * 2 }
    val doubleList = doubleNumber.collect()
    for (num <- doubleList) {
      println(num)
    }

    sc.stop()
  }

  // take(n): returns the first n elements of the RDD (not the n largest).
  def takeTest(): Unit = {
    val conf = new SparkConf()
      .setAppName("take")
      .setMaster("local")
    val sc = new SparkContext(conf)
    val list = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

    val numbersRDD = sc.parallelize(list, 1)

    val top3 = numbersRDD.take(3)
    for (num <- top3) {
      println(num)
    }

    sc.stop()
  }

  // count: returns the number of elements in the RDD as a Long.
  def countTest(): Unit = {
    val conf = new SparkConf()
      .setAppName("count")
      .setMaster("local")
    val sc = new SparkContext(conf)
    val list = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

    val numbersRDD = sc.parallelize(list, 1)
    val count = numbersRDD.count()
    println(count)

    sc.stop()
  }

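  // saveAsTextFile: writes the RDD out as text files under the given path.
  def saveAsTextFileTest(): Unit = {
    val conf = new SparkConf()
      .setAppName("saveAsTextFile")
      .setMaster("local")
    val sc = new SparkContext(conf)
    // The input path is left empty in the original; set it before running.
    val linesRDD = sc.textFile("", 1)

    linesRDD.saveAsTextFile("hdfs://spark1:9000/spark.txt")

    sc.stop()
  }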

  // countByKey: counts the elements for each key and returns a Map on the driver.
  def countByKeyTest(): Unit = {
    val conf = new SparkConf()
      .setAppName("countByKey")
      .setMaster("local")
    val sc = new SparkContext(conf)

    val studentList = Array(("class1", "leo"),
                            ("class2", "jack"),
                            ("class1", "marry"),
                            ("class2", "ksc"),
                            ("class2", "my"))
    val studentsRDD = sc.parallelize(studentList, 1)
    val studentsCount = studentsRDD.countByKey()
    for ((k, v) <- studentsCount) {
      println(k + ":" + v)
    }

    sc.stop()
  }
}
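
Each test above builds its own SparkContext with the same boilerplate. As a rough sketch (not the code from the original listing), the same actions could share a single local context that is always stopped at the end. The names `ActionOperationShared`, `ActionResults`, `withLocalContext` and `runActions` are hypothetical and introduced only for this illustration. The sketch assumes Spark 1.3 or later, where `countByKey` on a pair RDD needs no extra implicit imports.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical names, introduced only for this sketch.
case class ActionResults(sum: Int,
                         doubled: Array[Int],
                         firstThree: Array[Int],
                         count: Long,
                         perClass: Map[String, Long])

object ActionOperationShared {

  // Creates a local SparkContext, runs the body, and always stops the context.
  def withLocalContext[T](appName: String)(body: SparkContext => T): T = {
    val conf = new SparkConf().setAppName(appName).setMaster("local")
    val sc = new SparkContext(conf)
    try body(sc) finally sc.stop()
  }

  // Runs the same actions as the listing and returns the results to the caller.
  def runActions(sc: SparkContext): ActionResults = {
    val numbersRDD = sc.parallelize(1 to 10, 1)
    val studentsRDD = sc.parallelize(Seq(
      ("class1", "leo"), ("class2", "jack"), ("class1", "marry"),
      ("class2", "ksc"), ("class2", "my")), 1)

    ActionResults(
      sum = numbersRDD.reduce(_ + _),                // 1 + 2 + ... + 10
      doubled = numbersRDD.map(_ * 2).collect(),     // every element, doubled
      firstThree = numbersRDD.take(3),               // first three elements
      count = numbersRDD.count(),                    // number of elements
      perClass = studentsRDD.countByKey().toMap      // students per class
    )
  }

  def main(args: Array[String]): Unit = {
    val r = withLocalContext("actions")(sc => runActions(sc))
    println(s"sum = ${r.sum}")                             // sum = 55
    println(s"doubled = ${r.doubled.mkString(", ")}")
    println(s"firstThree = ${r.firstThree.mkString(", ")}") // firstThree = 1, 2, 3
    println(s"count = ${r.count}")                         // count = 10
    r.perClass.foreach { case (k, v) => println(s"$k:$v") }
  }
}
```

Returning the results instead of printing them inside each test makes the values easy to check, and the `try`/`finally` in `withLocalContext` stops the context even if an action throws.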

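I have to say that Scala's functional programming and other language features make Scala programs much more concise than the Java equivalents, but for a beginner like me, those features are really troublesome to learn.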
