Spark: Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI

Running one of the Spark 2.0 ML examples on Windows fails with:

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:F:/program/MyPrograms/spark-warehouse

Location of the error:

val spark = SparkSession
      .builder
      .appName("EstimatorTransformerParamExample")
      .getOrCreate()

After some searching, it turns out that Spark SQL needs the warehouse directory (the location where it stores its database files) to be specified.

But Spark SQL is not used here directly. The likely cause is that the example uses SparkSession, which replaces the old SQLContext and HiveContext and therefore resolves the warehouse directory when it starts up; on Windows the default value ends up as the malformed URI shown above.
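For reference, the parse failure can be reproduced outside Spark. Hadoop's Path splits a string such as file:F:/program/MyPrograms/spark-warehouse into a scheme and a path and rebuilds a java.net.URI from the parts, and a URI that carries a scheme must have a path beginning with "/". The snippet below is only a rough illustration of that rule, not the actual code in org.apache.hadoop.fs.Path.initialize:

import java.net.URI

// "file:F:/..." splits into scheme = "file" and path = "F:/...".
// Rebuilding a URI from those parts fails, because an absolute
// (scheme-qualified) URI requires a path that starts with "/".
try {
  new URI("file", null, "F:/program/MyPrograms/spark-warehouse", null, null)
} catch {
  case e: java.net.URISyntaxException =>
    // prints roughly: Relative path in absolute URI: file:F:/program/MyPrograms/spark-warehouse
    println(e.getMessage)
}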

Change the code to:

val spark = SparkSession
      .builder
      .appName("EstimatorTransformerParamExample")
      .config("spark.sql.warehouse.dir", "F:/program/MyPrograms/spark-warehouse")
      .master("local")
      .getOrCreate()

and the error goes away.
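For completeness, here is a minimal, self-contained sketch (not the original EstimatorTransformerParamExample, just a trivial DataFrame check) showing the session starting cleanly once spark.sql.warehouse.dir is set explicitly; the path is only this post's example and can be any writable local directory:

import org.apache.spark.sql.SparkSession

object WarehouseDirCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("EstimatorTransformerParamExample")
      // any writable local directory works here
      .config("spark.sql.warehouse.dir", "F:/program/MyPrograms/spark-warehouse")
      .master("local")
      .getOrCreate()

    // a trivial DataFrame operation to confirm the session is usable
    import spark.implicits._
    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")
    df.show()

    spark.stop()
  }
}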

Alternatively:

.config("spark.sql.warehouse.dir", "file:///F:/program/MyPrograms/spark-warehouse")

The directory can be chosen freely, and writing it with the file:/// prefix also works.
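If you would rather not hand-write the file:/// form, java.nio can turn a local path into a well-formed file URI for you. A small sketch, assuming a Windows machine where the drive-letter path is valid:

import java.nio.file.Paths
import org.apache.spark.sql.SparkSession

// Paths.get(...).toUri yields a proper URI,
// e.g. file:///F:/program/MyPrograms/spark-warehouse
val warehouseUri = Paths.get("F:/program/MyPrograms/spark-warehouse").toUri.toString

val spark = SparkSession
  .builder
  .appName("EstimatorTransformerParamExample")
  .config("spark.sql.warehouse.dir", warehouseUri)
  .master("local")
  .getOrCreate()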
