spark-submit error: java.lang.ClassNotFoundException: WordCount

This post walks through a simple Spark WordCount program, written in Scala and built in Eclipse. It shows how to resolve the ClassNotFoundException raised by the spark-submit command and gives the correct submission format.


After fiddling with it all morning, I finally got this program running on Spark.


I wrote a simple Scala program in Eclipse; the code is as follows:

package spark.wordcount

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object WordCount {
  def main(args: Array[String]): Unit = {
    val infile = "/input" // should be some file (or directory) on your system
    val conf = new SparkConf().setAppName("word count")
    val sc = new SparkContext(conf)
    val indata = sc.textFile(infile, 2).cache()
    // Split each line into words, pair each word with 1, then sum the counts per word.
    val words = indata.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    words.saveAsTextFile("/output") // fails if /output already exists
    println("All words are counted!")
    sc.stop()
  }
}

Run it with spark-submit:

[root@sparkmaster bin]# ./spark-submit --class WordCount /opt/spark-wordcount-in-scala.jar 
java.lang.ClassNotFoundException: WordCount
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.spark.util.Utils$.classForName(Utils.scala:174)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


I puzzled over this for quite a while and searched Bing repeatedly without finding an answer. A round of Googling finally turned up the hint that it might be related to the package name, so I tried the following submission:

[root@sparkmaster bin]# ./spark-submit --class spark.wordcount.WordCount  /opt/spark-wordcount-in-scala.jar 

It finally ran!


So the argument after --class must be the fully qualified name, in the form packageName.objectName.
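When in doubt, the fully qualified name can be read straight out of the jar: each `.class` entry's path, with `/` replaced by `.` and the `.class` suffix dropped, is exactly what `--class` expects. A minimal sketch, assuming the jar path from above (substitute your own):

```shell
# List the jar's entries and locate the compiled object.
entry=$(jar tf /opt/spark-wordcount-in-scala.jar | grep 'WordCount\.class$')

# Convert the entry path into a fully qualified class name:
# e.g. spark/wordcount/WordCount.class -> spark.wordcount.WordCount
cls="${entry%.class}"   # strip the .class suffix
cls="${cls//\//.}"      # turn path separators into dots
echo "$cls"

./spark-submit --class "$cls" /opt/spark-wordcount-in-scala.jar
```

This makes the mapping explicit: the package declaration in the source determines the directory layout inside the jar, and hence the name spark-submit must be given.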

