Studying the Official SparkSQLExample.scala Example

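This article uses the Spark shell to demonstrate how to read a JSON file with Spark SQL, work with DataFrames, run SQL queries, transform data, and handle basic data types. It covers creating a DataFrame, filtering, aggregation, and global temporary views, to build an understanding of what Spark SQL offers.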

$ bin/spark-shell --master local[4]

scala> spark.

baseRelationToDataFrame   conf              emptyDataFrame   implicits         range        sparkContext   stop      time      

catalog                   createDataFrame   emptyDataset     listenerManager   read         sql            streams   udf       

close                     createDataset     experimental     newSession        readStream   sqlContext     table     version

scala> spark.conf

18/03/19 15:22:48 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException

res0: org.apache.spark.sql.RuntimeConfig = org.apache.spark.sql.RuntimeConfig@4138af7

// global_temp is the database that holds global temporary views, which can be used across sessions.

scala> spark.read.json("examples/src/main/resources/people.json")

org.apache.spark.sql.AnalysisException: 

Path does not exist: hdfs://xxxxxxx1:8020/user/YYYYYYYYY/examples/src/main/resources/people.json;

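// A path without a scheme is resolved on the default filesystem (HDFS here), so Spark looks for examples/ in the user's home directory, the same directory that contains .sparkStaging.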
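
The failure above comes from how a scheme-less path gets qualified. The following is a minimal sketch for inspecting that resolution, assuming the Hadoop client classes bundled with the Spark distribution are on the classpath. The `local[*]` master and the app name `path-check` are placeholders that only matter when the snippet runs outside spark-shell; inside the shell, `getOrCreate()` should simply return the existing session.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

// Placeholder master/appName; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("path-check").getOrCreate()

val hadoopConf = spark.sparkContext.hadoopConfiguration
val fs = FileSystem.get(hadoopConf)

// The default filesystem: an hdfs:// URI on a cluster, file:/// on a plain local setup.
println(hadoopConf.get("fs.defaultFS"))

// A relative path is qualified against the filesystem's working directory.
// On HDFS this is normally the user's home directory (/user/<name>).
val relative = new Path("examples/src/main/resources/people.json")
val qualified = fs.makeQualified(relative)
println(qualified)
```

Printing `qualified` shows the location Spark will actually try to read, which makes it easier to see why the relative path ended up under the HDFS home directory.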

scala> spark.read.json("file:///examples/src/main/resources/people.json")

org.apache.spark.sql.AnalysisException: Path does not exist: file:/examples/src/main/resources/people.json;

......

// file:///examples/... points at /examples under the local filesystem root, not at a path under $SPARK_HOME.

scala> spark.read.json("file:///$SPARK_HOME/examples/src/main/resources/people.json")

org.apache.spark.sql.AnalysisException: Path does not exist: file:/$SPARK_HOME/examples/src/main/resources/people.json;

......

// $SPARK_HOME is not expanded: a Scala string literal is not a Unix shell command line. A silly attempt!
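
One way to avoid hard-coding the installation directory is to read the environment variable from Scala explicitly. This is a small sketch, not part of the original session; the fallback `/opt/spark` is a hypothetical value used only so the snippet runs when `SPARK_HOME` is unset.

```scala
// Scala does not expand shell variables inside an ordinary string literal,
// so the environment variable has to be read explicitly.
// "/opt/spark" is a hypothetical fallback used only for illustration.
val sparkHome: String = sys.env.getOrElse("SPARK_HOME", "/opt/spark")

// The s-interpolator inserts the Scala value `sparkHome` into the string.
val peoplePath: String = s"file://$sparkHome/examples/src/main/resources/people.json"

println(peoplePath)
```

The resulting string can then be passed to `spark.read.json(peoplePath)`, provided `SPARK_HOME` is an absolute path in the environment that launched spark-shell.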

scala> spark.read.json("file:////opt/bigdata/nfs/spark-2.1.2-bin-hadoop2.7/examples/src/main/resources/people.json")

res4: org.apache.spark.sql.DataFrame = [age: bigint, name: string]

// Finally correct. json() returns a DataFrame.

scala> res4.show

+----+-------+

| age|   name|

+----+-------+

|null|Michael|

|  30|   Andy|

|  19| Justin|

+----+-------+

 

scala> val df = res4

df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]

 

scala> df.printSchema

root

 |-- age: long (nullable = true)

 |-- name: string (nullable = true)
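
The schema printed above was inferred from the JSON data. When the column order or types need to be fixed in advance, a schema can be supplied instead. The sketch below is an illustration under assumptions: it writes three records modeled on the table shown earlier to a temporary file rather than reading the author's path, and the `local[*]` master and app name `schema-sketch` are placeholders.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

// Placeholder master/appName; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("schema-sketch").getOrCreate()

// Sample records modeled on the output shown earlier (one JSON object per line).
val records = Seq(
  """{"name":"Michael"}""",
  """{"name":"Andy", "age":30}""",
  """{"name":"Justin", "age":19}"""
).mkString("\n")
val tmp = Files.createTempFile("people", ".json")
Files.write(tmp, records.getBytes(StandardCharsets.UTF_8))

// Explicit schema: columns appear in this order instead of the inferred one.
val schema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("age", LongType, nullable = true)
))

// toUri yields a file:// URI, so the local filesystem is used regardless of fs.defaultFS.
val people = spark.read.schema(schema).json(tmp.toUri.toString)
people.printSchema()
```

Supplying the schema also skips the extra pass over the data that inference needs, which can matter for large inputs.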

 

scala> df.select("name").show

+-------+

|   name|

+-------+

|Michael|

|   Andy|

| Justin|

+-------+

// Usage: df.select("<column_name>")

scala> results.select("name", "age").show

+-------+---+

|   name|age|

+-------+---+

|Michael| 29|

|   Andy| 30|

| Justin| 19|

+-------+---+

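// Several columns can be selected at once. (`results` is a different DataFrame from `df`; its definition is not shown in this transcript.)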

scala> df.select($"name", $"age" + 100).show

+-------+-----------+

|   name|(age + 100)|

+-------+-----------+
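
The `$"age" + 100` form builds a column expression rather than selecting a column by name. As a hedged sketch, the snippet below shows three equivalent ways to write it. It uses a small in-memory stand-in for the people data instead of the JSON file, and the alias `age_plus_100`, the `local[*]` master, and the app name `select-sketch` are names chosen here for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr}

// Placeholder master/appName; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("select-sketch").getOrCreate()
import spark.implicits._ // enables the $"..." syntax and toDF

// In-memory stand-in for the people data; Option.empty makes Michael's age null.
val people = Seq(
  ("Michael", Option.empty[Long]),
  ("Andy", Option(30L)),
  ("Justin", Option(19L))
).toDF("name", "age")

// Three equivalent spellings of the same column expression.
val viaDollar = people.select($"name", ($"age" + 100).as("age_plus_100"))
val viaCol = people.select(col("name"), (col("age") + 100).as("age_plus_100"))
val viaExpr = people.select(col("name"), expr("age + 100").as("age_plus_100"))

// Michael's value stays null, because null + 100 evaluates to null in Spark SQL.
viaDollar.show()
```

The `$` syntax is available automatically in spark-shell, while a standalone program needs the `import spark.implicits._` line; `col` and `expr` work without it.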
