
pyspark
我是京城小白
pyspark: converting between RDD and DataFrame
from pyspark.conf import SparkConf
from pyspark.context import SparkContext
from pyspark.sql import SparkSession

sparkConf = SparkConf()
# Set the memory for the driver process
sparkConf.set('spark.driver.memory', '8G')
...
Original · 2020-04-27 17:40:50 · 486 views · 0 comments
pyspark join
1. Join on a single column
from pyspark.sql import Row
rdd1 = sc.parallelize([Row(name='Alice', age=5, height=80), \
                       Row(name='Alice', age=10, height=80), \
                       Row(nam...
Original · 2020-04-27 16:55:12 · 1002 views · 0 comments