Creating a SparkSession
Spark SQL can read data from different sources, such as JDBC, JSON, CSV, and Parquet.
To read, call sparkSession.read followed by the format (e.g. spark.read.json(...)); to write, call .write on the resulting DataFrame, not on the SparkSession (e.g. df.write.parquet(...)).
First, create a SparkSession:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("load data")
  .master("local[4]")
  .getOrCreate()
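With a session in hand, every source follows the same read/write pattern. A minimal sketch, assuming a local JSON input file and an output directory (the paths people.json and people.parquet are hypothetical placeholders):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object LoadDataExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("load data")
      .master("local[4]")
      .getOrCreate()

    // Read a JSON file into a DataFrame (path is hypothetical)
    val df = spark.read.json("people.json")

    // Equivalent generic form: name the format explicitly
    // val df2 = spark.read.format("json").load("people.json")

    // Writes go through the DataFrame, not the SparkSession
    df.write.mode(SaveMode.Overwrite).parquet("people.parquet")

    spark.stop()
  }
}
```

The same shape works for the other sources: swap `json` for `csv`, `parquet`, or `jdbc` (the JDBC variant additionally needs a URL, table name, and connection properties).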
If you enter through spark-shell, a SparkSession is created automatically and bound to the variable spark:
[xxx@hadoop bin]$ ./spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2020-06-15 16:17:00,429 [main] WARN org.apache.spark.util.Utils - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2020-06-15 16:17:00,430 [main] WARN org.apache.spark.util.Utils - Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
2020-06-15 16:17:00,430 [main] WARN org.apache.spark.util.Utils - Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
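The port warnings above are harmless: other Spark applications already hold the UI ports 4040 onward, so this shell simply probes the next free port. Once the prompt appears, spark (and the SparkContext as sc) are ready to use directly, for example (the input path is a hypothetical placeholder):

```scala
// spark and sc are pre-created by spark-shell; no builder call is needed
scala> spark.version
scala> spark.read.json("people.json").show()
```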