Spark 2.2.0 action operations (Python version)

This article uses worked examples to demonstrate several RDD actions in Apache Spark: the basic reduce, collect, count, take, and saveAsTextFile, plus countByKey and foreach.

import os
import sys

# Point PySpark at the Spark installation; adjust both paths to your setup.
os.environ['SPARK_HOME'] = '/opt/spark'
sys.path.append("/opt/spark/python")

from pyspark import SparkContext

def reducetest():
    # reduce: combine all elements with a binary function (here, their sum).
    sc = SparkContext("spark://node0:7077", "reduce")
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    listRdd = sc.parallelize(data)
    total = listRdd.reduce(lambda x, y: x + y)
    print(total)
    sc.stop()

def collecttest():
    # collect: return every element of the RDD to the driver as a list.
    sc = SparkContext("spark://node0:7077", "collect")
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    listRdd = sc.parallelize(data)
    collected = listRdd.collect()
    print(collected)
    sc.stop()

def counttest():
    # count: return the number of elements in the RDD.
    sc = SparkContext("spark://node0:7077", "count")
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    listRdd = sc.parallelize(data)
    count = listRdd.count()
    print(count)
    sc.stop()

def taketest():
    # take(n): return the first n elements of the RDD as a list.
    sc = SparkContext("spark://node0:7077", "take")
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    listRdd = sc.parallelize(data)
    three = listRdd.take(3)
    print(three)
    sc.stop()
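
def saveAstextFiletest():
    # saveAsTextFile: write the RDD as text, one part file per partition,
    # into the given directory. The job fails if the directory already exists.
    sc = SparkContext("spark://node0:7077", "saveAstextFile")
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    listRdd = sc.parallelize(data)
    listRdd.saveAsTextFile("/count")
    sc.stop()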
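
def countByKeytest():
    # countByKey: count the elements per key of a (key, value) RDD and
    # return the counts to the driver as a dict-like object.
    sc = SparkContext("spark://node0:7077", "countByKey")
    listtest = [("class1", "elo"), ("class2", "jave"), ("class1", "tom"), ("class2", "smi")]
    listRDD = sc.parallelize(listtest)
    count = listRDD.countByKey()
    print(count)
    sc.stop()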
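
def f(x):
    print(x)

def foreachtest():
    # foreach: run f on every element for its side effect; returns nothing.
    # On a cluster the printed output goes to the executors' stdout, not here.
    sc = SparkContext("local", "foreach")
    sc.parallelize([1, 2, 3, 4, 5]).foreach(f)
    sc.stop()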


if __name__ == '__main__':
    #reducetest()
    #collecttest()
    #counttest()
    #taketest()
    #saveAstextFiletest()
    #countByKeytest()
    foreachtest()
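
For readers without a cluster at hand, the values these actions return for the inputs above can be checked with a plain-Python sketch. This is only an analogue: it does not use Spark, and the helper name `local_action_results` is hypothetical and not part of PySpark. It assumes Python 3 and covers only the actions that return a value to the driver, so saveAsTextFile and foreach are left out.

```python
# Plain-Python analogues of the RDD actions shown in the script.
# These mimic only the returned values; they are NOT the Spark API.
from collections import Counter
from functools import reduce

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pairs = [("class1", "elo"), ("class2", "jave"), ("class1", "tom"), ("class2", "smi")]


def local_action_results(data, pairs):
    """Hypothetical helper: compute locally what each action would return."""
    return {
        "reduce": reduce(lambda x, y: x + y, data),         # like rdd.reduce(...)
        "collect": list(data),                              # like rdd.collect()
        "count": len(data),                                 # like rdd.count()
        "take(3)": data[:3],                                # like rdd.take(3)
        "countByKey": dict(Counter(k for k, _ in pairs)),   # like rdd.countByKey()
    }


results = local_action_results(data, pairs)
for name in ["reduce", "collect", "count", "take(3)", "countByKey"]:
    print(name, "->", results[name])
```

The sum is 55, the count is 10, take(3) yields [1, 2, 3], and each class key appears twice. The function passed to reduce must be commutative and associative, because Spark applies it within partitions before merging the partial results.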
