The test code is as follows:
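The original snippet did not survive formatting; below is a minimal reconstruction inferred from the stack trace (`ReadFileTest.scala`, `SparkSession.sql` at line 20) and the log output. The CSV file path is an assumption.

```scala
// Minimal reconstruction of the failing test; the file path is assumed.
import org.apache.spark.sql.SparkSession

object ReadFileTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ReadFileTest")
      .master("local[*]")
      .getOrCreate()

    // Read the CSV with a header row. As described below, the first header
    // cell carried an invisible character, so the actual first column name
    // was not exactly "bid".
    val df = spark.read
      .option("header", "true")
      .csv("/path/to/test.csv") // assumed path

    df.createOrReplaceTempView("test")
    spark.sql("select bid from test").show() // AnalysisException is thrown here
    spark.stop()
  }
}
```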
The data source is as follows:
bid,name,iphone
1,a,118
2,b,111
3,c,125
The error is as follows:
20/01/09 17:42:52 INFO SparkSqlParser: Parsing command: select bid from test
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`bid`' given input columns: [bid, name, iphone]; line 1 pos 7;
'Project ['bid]
+- SubqueryAlias test, `test`
+- Relation[bid#0,name#1,iphone#2] csv
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:77)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:74)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:310)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:310)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:309)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionUp$1(QueryPlan.scala:282)
at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2(QueryPlan.scala:292)
at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2$1.apply(QueryPlan.scala:296)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.immutable.List.foreach(List.scala:383)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.immutable.List.map(List.scala:286)
at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2(QueryPlan.scala:296)
at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$7.apply(QueryPlan.scala:301)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:301)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:74)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:57)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
at com.news.spark.readfile.ReadFileTest$.main(ReadFileTest.scala:20)
at com.news.spark.readfile.ReadFileTest.main(ReadFileTest.scala)
Solution:
1. The OP had converted an Excel file to CSV, and a stray invisible character was introduced at the start of the header row during the conversion. Look closely at the error message: there is an extra character before `bid` in `'bid'`. (It does not survive copy-pasting the exception, which makes this easy to miss; some editors render it as `·`.) This is typically a UTF-8 byte-order mark (BOM).
2. It also appears to be a Spark 2.1 issue: changing the Spark version in the pom to 2.2 made the query work, presumably because Spark 2.2 handles the character-set conversion or tolerates the character. The OP did not track down the exact root cause.
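The first cause can be demonstrated without Spark: a BOM (U+FEFF) at the start of the header makes the parsed column name `\uFEFFbid` rather than `bid`, so the SQL analyzer cannot resolve the plain name even though the two look identical when printed. A stdlib-only sketch (the object and helper names are mine, for illustration):

```scala
// Demonstrates how a BOM (U+FEFF) in a CSV header breaks column lookup.
object BomDemo {
  // Strip a leading byte-order mark, if present.
  def stripBom(s: String): String =
    if (s.nonEmpty && s.charAt(0) == '\uFEFF') s.substring(1) else s

  def main(args: Array[String]): Unit = {
    val header = "\uFEFFbid,name,iphone" // header as exported from Excel
    val first  = header.split(",")(0)

    println(first == "bid")           // false: the BOM is invisible but present
    println(first.length)             // 4, not 3
    println(stripBom(first) == "bid") // true once the BOM is removed
  }
}
```

Stripping the BOM from the source file (or from the header before registering the view) resolves the error without changing Spark versions.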