Quickstart: DataFrame
This is a short introduction and quickstart for the PySpark DataFrame API. PySpark DataFrames are implemented on top of RDDs and are lazily evaluated: when you apply a transformation, Spark does not compute it immediately but records a plan for how to compute it later. The computation starts only when an action such as collect() is explicitly called.
This notebook shows the basic usages of th




