org/apache/hadoop/util/ShutdownHookManager$2

本文分析了在尝试关闭Tomcat时遇到的一个特定错误,详细解释了错误信息中提到的NoClassDefFoundError和ClassNotFoundException的原因,并探讨了可能的解决方案。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

关闭tomcat出现以下错误:
Exception in thread "Thread-15" java.lang.NoClassDefFoundError: org/apache/hadoop/util/ShutdownHookManager$2
 at org.apache.hadoop.util.ShutdownHookManager.getShutdownHooksInOrder(ShutdownHookManager.java:183)
 at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:64)
Caused by: java.lang.ClassNotFoundException: Illegal access: this web application instance has been stopped already. Could not load [org.apache.hadoop.util.ShutdownHookManager$2]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
 at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1285)
 at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1142)
 at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1104)
 ... 2 more
Caused by: java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [org.apache.hadoop.util.ShutdownHookManager$2]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
 at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading(WebappClassLoaderBase.java:1295)
 at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1283)
 ... 4 more
2025-05-12 15:46:34,855 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT] 2025-05-12 15:46:34,950 INFO namenode.NameNode: createNameNode [-format] 2025-05-12 15:46:35,138 ERROR conf.Configuration: error parsing conf yarn-site.xml com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:634) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:504) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:488) at com.ctc.wstx.sr.BasicStreamReader.skipCommentOrCData(BasicStreamReader.java:3579) at com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3430) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2071) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1179) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3336) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3130) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3023) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2984) at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2862) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2844) at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1254) at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1532) at org.apache.hadoop.security.Groups.<init>(Groups.java:113) at org.apache.hadoop.security.Groups.<init>(Groups.java:102) at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:451) at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:337) at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:304) at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:392) at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:386) at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1184) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1680) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1790) 2025-05-12 15:46:35,142 ERROR namenode.NameNode: Failed to start namenode. java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3040) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2984) at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2862) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2844) at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1254) at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1532) at org.apache.hadoop.security.Groups.<init>(Groups.java:113) at org.apache.hadoop.security.Groups.<init>(Groups.java:102) at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:451) at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:337) at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:304) at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:392) at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:386) at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1184) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1680) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1790) Caused by: com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:634) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:504) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:488) at com.ctc.wstx.sr.BasicStreamReader.skipCommentOrCData(BasicStreamReader.java:3579) at com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3430) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2071) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1179) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3336) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3130) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3023) ... 16 more 2025-05-12 15:46:35,145 INFO util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] 2025-05-12 15:46:35,147 INFO namenode.NameNode: SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down NameNode at master/192.168.164.123 ************************************************************/ 2025-05-12 15:46:35,168 ERROR conf.Configuration: error parsing conf yarn-site.xml com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:634) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:504) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:488) at com.ctc.wstx.sr.BasicStreamReader.skipCommentOrCData(BasicStreamReader.java:3579) at com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3430) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2071) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1179) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3336) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3130) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3023) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2984) at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2862) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2844) at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) Exception in thread "Thread-1" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3040) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2984) at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2862) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2844) at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) Caused by: com.ctc.wstx.exc.WstxParsingException: String '--' not allowed in comment (missing '>'?) at [row,col,system-id]: [9,5,"file:/opt/module/hadoop-3.2.4/etc/hadoop/yarn-site.xml"] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:634) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:504) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:488) at com.ctc.wstx.sr.BasicStreamReader.skipCommentOrCData(BasicStreamReader.java:3579) at com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3430) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2071) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1179) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3336) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3130) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3023) ... 10 more
05-13
import os from pyspark.sql import SparkSession from pyspark.ml.classification import LogisticRegression from pyspark.ml.evaluation import MulticlassClassificationEvaluator from T2 import spark # 设置环境变量 spark_files = 'D:\spark实训\spark_files' os.environ['PYSPARK_PYTHON'] = os.path.join(spark_files, 'spark_virenvs\python.exe') os.environ['SPARK_HOME'] = os.path.join(spark_files, 'spark-3.4.4-bin-hadoop3') os.environ['JAVA_HOME'] = os.path.join(spark_files, 'Java\jdk-18.0.1') # 加载数据集 train_data = spark.read.csv("D:\spark大数据快速运算大作业\数据集\题目二\data_train.txt", inferSchema=True, header=False) test_data = spark.read.csv("D:\spark大数据快速运算大作业\数据集\题目二\data_test.txt", inferSchema=True, header=False) # 数据预处理 from pyspark.ml.feature import VectorAssembler assembler = VectorAssembler( inputCols=[f"_{i}" for i in range(1, 55)], # 前54列为特征 outputCol="features" ) train = assembler.transform(train_data).withColumnRenamed("_55", "label") test = assembler.transform(test_data).withColumnRenamed("_55", "label") from pyspark.ml.classification import RandomForestClassifier rf = RandomForestClassifier( numTrees=100, maxDepth=10, seed=42, labelCol="label", featuresCol="features" ) rf_model = rf.fit(train) rf_predictions = rf_model.transform(test) lr = LogisticRegression( maxIter=100, regParam=0.01, family="multinomial", # 多分类设置 labelCol="label", featuresCol="features" ) lr_model = lr.fit(train) lr_predictions = lr_model.transform(test) evaluator = MulticlassClassificationEvaluator( labelCol="label", predictionCol="prediction", metricName="accuracy" ) rf_accuracy = evaluator.evaluate(rf_predictions) lr_accuracy = evaluator.evaluate(lr_predictions) # F1分数评估 evaluator.setMetricName("f1") rf_f1 = evaluator.evaluate(rf_predictions) lr_f1 = evaluator.evaluate(lr_predictions) # 模型训练与评估完整流程 def train_evaluate(model, train_data, test_data): model = model.fit(train_data) predictions = model.transform(test_data) # 计算评估指标 accuracy = evaluator.setMetricName("accuracy").evaluate(predictions) f1 = evaluator.setMetricName("f1").evaluate(predictions) # 特征重要性(随机森林特有) if isinstance(model, RandomForestClassificationModel): importances = model.featureImportances print("Top 5 features:", importances.values.argsort()[-5:][::-1]) return accuracy, f1 # 执行比较 rf_acc, rf_f1 = train_evaluate(rf, train, test) lr_acc, lr_f1 = train_evaluate(lr, train, test)D:\spark实训\spark_files\spark_virenvs\python.exe D:\spark大数据快速运算大作业\22.py 25/06/12 17:24:08 WARN Shell: Did not find winutils.exe: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 25/06/12 17:24:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 正在加载数据... 训练随机森林模型... 随机森林训练时间: 21.42秒 随机森林评估结果: 准确率: 0.6507 F1分数: 0.6208 训练逻辑回归模型... 25/06/12 17:24:38 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS 逻辑回归训练时间: 14.64秒 逻辑回归评估结果: 准确率: 0.6712 F1分数: 0.6558 ===== 模型比较 ===== 随机森林训练时间: 21.42秒 vs 逻辑回归训练时间: 14.64秒 随机森林准确率: 0.6507 vs 逻辑回归准确率: 0.6712 随机森林F1分数: 0.6208 vs 逻辑回归F1分数: 0.6558 随机森林特征重要性(前10): 特征 0: 0.3683 特征 13: 0.2188 特征 35: 0.0529 特征 23: 0.0500 特征 25: 0.0368 特征 51: 0.0336 特征 52: 0.0281 特征 17: 0.0279 特征 5: 0.0244 特征 36: 0.0236 25/06/12 17:24:49 ERROR Instrumentation: java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:735) at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:270) at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:286) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:978) at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:660) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:700) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:788) at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:356) at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131) at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:265) at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:188) at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:79) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopDataset$1(PairRDDFunctions.scala:1091) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1089) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$4(PairRDDFunctions.scala:1062) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1027) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$3(PairRDDFunctions.scala:1009) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1008) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$2(PairRDDFunctions.scala:965) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:963) at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$2(RDD.scala:1593) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1593) at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$1(RDD.scala:1579) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1579) at org.apache.spark.ml.util.DefaultParamsWriter$.saveMetadata(ReadWrite.scala:413) at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$saveImpl$1(Pipeline.scala:250) at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$saveImpl$1$adapted(Pipeline.scala:247) at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191) at scala.util.Try$.apply(Try.scala:213) at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191) at org.apache.spark.ml.Pipeline$SharedReadWrite$.saveImpl(Pipeline.scala:247) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.saveImpl(Pipeline.scala:346) at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:168) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.super$save(Pipeline.scala:344) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$4(Pipeline.scala:344) at org.apache.spark.ml.MLEvents.withSaveInstanceEvent(events.scala:174) at org.apache.spark.ml.MLEvents.withSaveInstanceEvent$(events.scala:169) at org.apache.spark.ml.util.Instrumentation.withSaveInstanceEvent(Instrumentation.scala:42) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$3(Pipeline.scala:344) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$3$adapted(Pipeline.scala:344) at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191) at scala.util.Try$.apply(Try.scala:213) at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.save(Pipeline.scala:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) at py4j.ClientServerConnection.run(ClientServerConnection.java:106) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:547) at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:568) at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:591) at org.apache.hadoop.util.Shell.<clinit>(Shell.java:688) at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79) at org.apache.hadoop.conf.Configuration.getTimeDurationHelper(Configuration.java:1907) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840) at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) at org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(ShutdownHookManager.java:207) at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:304) at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:181) at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50) at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48) at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153) at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58) at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala) at org.apache.spark.util.Utils$.createTempDir(Utils.scala:341) at org.apache.spark.util.Utils$.createTempDir(Utils.scala:331) at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:370) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:467) at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:438) at org.apache.hadoop.util.Shell.<clinit>(Shell.java:515) ... 23 more 25/06/12 17:24:49 ERROR Instrumentation: java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:735) at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:270) at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:286) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:978) at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:660) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:700) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:788) at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:356) at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131) at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:265) at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:188) at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:79) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopDataset$1(PairRDDFunctions.scala:1091) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1089) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$4(PairRDDFunctions.scala:1062) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1027) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$3(PairRDDFunctions.scala:1009) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1008) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$2(PairRDDFunctions.scala:965) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:963) at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$2(RDD.scala:1593) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1593) at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$1(RDD.scala:1579) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1579) at org.apache.spark.ml.util.DefaultParamsWriter$.saveMetadata(ReadWrite.scala:413) at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$saveImpl$1(Pipeline.scala:250) at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$saveImpl$1$adapted(Pipeline.scala:247) at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191) at scala.util.Try$.apply(Try.scala:213) at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191) at org.apache.spark.ml.Pipeline$SharedReadWrite$.saveImpl(Pipeline.scala:247) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.saveImpl(Pipeline.scala:346) at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:168) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.super$save(Pipeline.scala:344) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$4(Pipeline.scala:344) at org.apache.spark.ml.MLEvents.withSaveInstanceEvent(events.scala:174) at org.apache.spark.ml.MLEvents.withSaveInstanceEvent$(events.scala:169) at org.apache.spark.ml.util.Instrumentation.withSaveInstanceEvent(Instrumentation.scala:42) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$3(Pipeline.scala:344) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$3$adapted(Pipeline.scala:344) at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191) at scala.util.Try$.apply(Try.scala:213) at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.save(Pipeline.scala:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) at py4j.ClientServerConnection.run(ClientServerConnection.java:106) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:547) at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:568) at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:591) at org.apache.hadoop.util.Shell.<clinit>(Shell.java:688) at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79) at org.apache.hadoop.conf.Configuration.getTimeDurationHelper(Configuration.java:1907) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840) at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) at org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(ShutdownHookManager.java:207) at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:304) at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:181) at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50) at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48) at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153) at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58) at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala) at org.apache.spark.util.Utils$.createTempDir(Utils.scala:341) at org.apache.spark.util.Utils$.createTempDir(Utils.scala:331) at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:370) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:467) at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:438) at org.apache.hadoop.util.Shell.<clinit>(Shell.java:515) ... 23 more Traceback (most recent call last): File "D:\spark大数据快速运算大作业\22.py", line 6, in <module> from T2 import spark File "D:\spark大数据快速运算大作业\T2.py", line 113, in <module> rf_model.write().overwrite().save("models/random_forest_model") File "D:\spark实训\spark_files\spark_virenvs\lib\site-packages\pyspark\ml\util.py", line 197, in save self._jwrite.save(path) File "D:\spark实训\spark_files\spark_virenvs\lib\site-packages\py4j\java_gateway.py", line 1322, in __call__ return_value = get_return_value( File "D:\spark实训\spark_files\spark_virenvs\lib\site-packages\pyspark\errors\exceptions\captured.py", line 169, in deco return f(*a, **kw) File "D:\spark实训\spark_files\spark_virenvs\lib\site-packages\py4j\protocol.py", line 326, in get_return_value raise Py4JJavaError( py4j.protocol.Py4JJavaError: An error occurred while calling o963.save. : java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:735) at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:270) at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:286) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:978) at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:660) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:700) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:699) at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672) at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:788) at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:356) at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131) at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:265) at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:188) at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:79) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopDataset$1(PairRDDFunctions.scala:1091) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1089) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$4(PairRDDFunctions.scala:1062) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1027) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$3(PairRDDFunctions.scala:1009) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1008) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$2(PairRDDFunctions.scala:965) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:963) at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$2(RDD.scala:1593) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1593) at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$1(RDD.scala:1579) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:405) at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1579) at org.apache.spark.ml.util.DefaultParamsWriter$.saveMetadata(ReadWrite.scala:413) at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$saveImpl$1(Pipeline.scala:250) at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$saveImpl$1$adapted(Pipeline.scala:247) at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191) at scala.util.Try$.apply(Try.scala:213) at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191) at org.apache.spark.ml.Pipeline$SharedReadWrite$.saveImpl(Pipeline.scala:247) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.saveImpl(Pipeline.scala:346) at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:168) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.super$save(Pipeline.scala:344) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$4(Pipeline.scala:344) at org.apache.spark.ml.MLEvents.withSaveInstanceEvent(events.scala:174) at org.apache.spark.ml.MLEvents.withSaveInstanceEvent$(events.scala:169) at org.apache.spark.ml.util.Instrumentation.withSaveInstanceEvent(Instrumentation.scala:42) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$3(Pipeline.scala:344) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.$anonfun$save$3$adapted(Pipeline.scala:344) at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191) at scala.util.Try$.apply(Try.scala:213) at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191) at org.apache.spark.ml.PipelineModel$PipelineModelWriter.save(Pipeline.scala:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) at py4j.ClientServerConnection.run(ClientServerConnection.java:106) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:547) at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:568) at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:591) at org.apache.hadoop.util.Shell.<clinit>(Shell.java:688) at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79) at org.apache.hadoop.conf.Configuration.getTimeDurationHelper(Configuration.java:1907) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840) at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) at org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(ShutdownHookManager.java:207) at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:304) at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:181) at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50) at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48) at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153) at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58) at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala) at org.apache.spark.util.Utils$.createTempDir(Utils.scala:341) at org.apache.spark.util.Utils$.createTempDir(Utils.scala:331) at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:370) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:467) at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:438) at org.apache.hadoop.util.Shell.<clinit>(Shell.java:515) ... 23 more Process finished with exit code 1
最新发布
06-18
2025-06-12 06:45:06,534 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[Thread-1,5,main] threw an Throwable, but we are shutting down, so ignoring this java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?). at [row,col,system-id]: [19,2,"file:/app/hadoop3.1/etc/hadoop/yarn-site.xml"] at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3024) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?). at [row,col,system-id]: [19,2,"file:/app/hadoop3.1/etc/hadoop/yarn-site.xml"] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:475) at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2242) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2156) at com.ctc.wstx.sr.BasicStreamReader.closeContentTree(BasicStreamReader.java:2991) at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2734) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3320) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3114) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3007) ... 9 more
06-13
/home/hadoopmaster/jdk1.8.0_161/bin/java -javaagent:/home/hadoopmaster/idea-IC-221.6008.13/lib/idea_rt.jar=34515:/home/hadoopmaster/idea-IC-221.6008.13/bin -Dfile.encoding=UTF-8 -classpath /home/hadoopmaster/jdk1.8.0_161/jre/lib/charsets.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/deploy.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/cldrdata.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/dnsns.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/jaccess.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/jfxrt.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/localedata.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/nashorn.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/sunec.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/sunjce_provider.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/sunpkcs11.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/zipfs.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/javaws.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jce.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jfr.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jfxswt.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jsse.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/management-agent.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/plugin.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/resources.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/rt.jar:/root/IdeaProjects/kkk/out/production/kkk:/home/hadoopmaster/scala-2.12.15/lib/scala-reflect.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-xml_2.12-1.0.6.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-parser-combinators_2.12-1.0.7.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-swing_2.12-2.0.3.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-library.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xz-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jta-1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jpam-1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/ST4-4.0.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/guice-3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/ivy-2.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/oro-2.0.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/blas-2.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/core-1.1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/gson-2.2.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/tink-1.6.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/avro-1.10.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jsp-api-2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/okio-1.14.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/opencsv-2.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/shims-0.9.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xmlenc-0.52.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arpack-2.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/guava-14.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jetty-6.1.26.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jline-2.14.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jsr305-3.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/lapack-2.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/log4j-1.2.17.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/minlog-1.3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/stream-2.9.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/velocity-1.5.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/generex-1.0.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hk2-api-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/janino-3.0.16.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jdo-api-3.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/objenesis-2.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/paranamer-2.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/py4j-0.10.9.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/pyrolite-4.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/HikariCP-2.5.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-io-2.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-cli-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javax.inject-1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/libfb303-0.9.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/lz4-java-1.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/okhttp-3.12.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/snakeyaml-1.27.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/stax-api-1.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/JTransforms-3.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/aopalliance-1.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/avro-ipc-1.10.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/breeze_2.12-1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-cli-1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-net-3.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/derby-10.14.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-jdbc-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hk2-utils-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/httpcore-4.4.14.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jaxb-api-2.2.11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-hk2-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jodd-core-3.5.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/orc-core-1.6.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/super-csv-2.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xml-apis-1.4.01.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zookeeper-3.6.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/JLargeArrays-1.5.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/activation-1.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/automaton-1.11-8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-dbcp-1.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-lang-2.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-text-1.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-serde-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javolution-5.5.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/libthrift-0.12.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/orc-shims-1.6.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/slf4j-api-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zjsonpatch-0.3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zstd-jni-1.5.0-4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/chill-java-0.10.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/chill_2.12-0.10.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/guice-servlet-3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-auth-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-hdfs-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-common-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hk2-locator-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/httpclient-4.5.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-xc-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jetty-util-6.1.26.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/joda-time-2.10.10.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kryo-shaded-4.0.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-jmx-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-jvm-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/rocksdbjni-6.20.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xercesImpl-2.12.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/aircompressor-0.21.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/algebra_2.12-2.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/annotations-17.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/antlr4-runtime-4.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/api-util-1.0.0-M20.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-format-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-vector-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/avro-mapred-1.10.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-codec-1.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-pool-1.5.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/compress-lzf-1.0.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-beeline-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javax.jdo-3.2.0-m3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jaxb-runtime-2.3.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-client-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-common-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-server-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/leveldbjni-all-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-core-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-json-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/RoaringBitmap-0.9.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/antlr-runtime-3.5.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-math3-3.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-client-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-core-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javassist-3.25.0-GA.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jul-to-slf4j-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/protobuf-java-2.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/snappy-java-1.1.8.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/transaction-api-1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/bonecp-0.8.0.RELEASE.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-crypto-1.1.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-digester-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-lang3-3.12.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/curator-client-2.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-exec-2.3.9-core.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-metastore-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-jaxrs-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.inject-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/orc-mapreduce-1.6.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-xml_2.12-1.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/shapeless_2.12-2.3.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-sql_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/threeten-extra-1.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zookeeper-jute-3.6.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-compress-1.21.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-logging-1.1.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/curator-recipes-2.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-api-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-0.23-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jcl-over-slf4j-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-column-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-common-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-hadoop-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-library-2.12.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-reflect-2.12.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-core_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-hive_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-repl_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-tags_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-yarn_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/api-asn1-api-1.0.0-M20.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/breeze-macros_2.12-1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/cats-kernel_2.12-2.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-httpclient-3.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/flatbuffers-java-1.9.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-llap-common-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-service-rpc-3.1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-storage-api-2.7.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jetty-sslengine-6.1.26.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-graphite-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/netty-all-4.1.68.Final.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-jackson-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-compiler-2.12.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-mesos_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-mllib_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire-util_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xbean-asm9-shaded-4.20.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/apacheds-i18n-2.0.0-M15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arpack_combined_all-0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-memory-core-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-beanutils-1.9.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-compiler-3.0.16.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/curator-framework-2.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/datanucleus-core-4.1.17.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-common-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-core-asl-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-databind-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.ws.rs-api-2.1.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-client-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/macro-compat_2.12-1.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-encoding-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-graphx_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-sketch_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-unsafe_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/univocity-parsers-2.9.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-memory-netty-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/datanucleus-rdbms-4.1.19.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-annotations-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-client-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-kvstore_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire-macros_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-collections-3.2.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-configuration-1.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/datanucleus-api-jdo-4.2.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-mapper-asl-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.servlet-api-4.0.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-ast_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-catalyst_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-launcher_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/audience-annotations-0.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-scheduler-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-vector-code-gen-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-annotations-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.xml.bind-api-2.3.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-core_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-streaming_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire-platform_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-apps-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-core-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-node-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-rbac-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/logging-interceptor-3.12.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/mesos-1.4.0-shaded-protobuf.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/osgi-resource-locator-1.0.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-kubernetes_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-tags_2.12-3.2.1-tests.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/aopalliance-repackaged-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/htrace-core-3.1.0-incubating.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/istack-commons-runtime-3.0.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.annotation-api-1.3.5.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.validation-api-2.0.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-scalap_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-batch-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-mllib-local_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-container-servlet-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-jackson_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-common-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-events-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-policy-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-dataformat-yaml-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-datatype-jsr310-2.11.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-metrics-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-server-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-network-common_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-module-scala_2.12-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-discovery-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-format-structures-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-network-shuffle_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-app-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-extensions-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-networking-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-scheduling-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-core-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-server-web-proxy-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-container-servlet-core-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-autoscaling-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-flowcontrol-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-collection-compat_2.12-2.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-hive-thriftserver_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-certificates-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-coordination-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-storageclass-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-parser-combinators_2.12-1.1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-apiextensions-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-shuffle-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-jobclient-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-admissionregistration-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar kkk.WordCount Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 25/06/02 20:17:03 INFO SparkContext: Running Spark version 3.2.1 25/06/02 20:17:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 25/06/02 20:17:04 INFO ResourceUtils: ============================================================== 25/06/02 20:17:04 INFO ResourceUtils: No custom resources configured for spark.driver. 25/06/02 20:17:04 INFO ResourceUtils: ============================================================== 25/06/02 20:17:04 INFO SparkContext: Submitted application: WordCount 25/06/02 20:17:04 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) 25/06/02 20:17:04 INFO ResourceProfile: Limiting resource is cpu 25/06/02 20:17:04 INFO ResourceProfileManager: Added ResourceProfile id: 0 25/06/02 20:17:04 INFO SecurityManager: Changing view acls to: root 25/06/02 20:17:04 INFO SecurityManager: Changing modify acls to: root 25/06/02 20:17:04 INFO SecurityManager: Changing view acls groups to: 25/06/02 20:17:04 INFO SecurityManager: Changing modify acls groups to: 25/06/02 20:17:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 25/06/02 20:17:05 INFO Utils: Successfully started service 'sparkDriver' on port 41615. 25/06/02 20:17:05 INFO SparkEnv: Registering MapOutputTracker 25/06/02 20:17:05 INFO SparkEnv: Registering BlockManagerMaster 25/06/02 20:17:05 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 25/06/02 20:17:05 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 25/06/02 20:17:05 INFO SparkEnv: Registering BlockManagerMasterHeartbeat 25/06/02 20:17:05 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-68486311-3f69-48e8-8f69-e7afffcf5979 25/06/02 20:17:05 INFO MemoryStore: MemoryStore started with capacity 258.5 MiB 25/06/02 20:17:05 INFO SparkEnv: Registering OutputCommitCoordinator 25/06/02 20:17:06 INFO Utils: Successfully started service 'SparkUI' on port 4040. 25/06/02 20:17:06 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://hadoopmaster:4040 25/06/02 20:17:06 INFO Executor: Starting executor ID driver on host hadoopmaster 25/06/02 20:17:06 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41907. 25/06/02 20:17:06 INFO NettyBlockTransferService: Server created on hadoopmaster:41907 25/06/02 20:17:06 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 25/06/02 20:17:06 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:06 INFO BlockManagerMasterEndpoint: Registering block manager hadoopmaster:41907 with 258.5 MiB RAM, BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:06 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:06 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:08 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 244.0 KiB, free 258.2 MiB) 25/06/02 20:17:09 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.4 KiB, free 258.2 MiB) 25/06/02 20:17:09 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoopmaster:41907 (size: 23.4 KiB, free: 258.5 MiB) 25/06/02 20:17:09 INFO SparkContext: Created broadcast 0 from textFile at WordCount.scala:9 Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/hadoopmaster/words.txt at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287) at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315) at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4(Partitioner.scala:78) at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4$adapted(Partitioner.scala:78) at scala.collection.immutable.List.map(List.scala:293) at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:78) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$reduceByKey$4(PairRDDFunctions.scala:322) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:414) at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:322) at kkk.WordCount$.main(WordCount.scala:10) at kkk.WordCount.main(WordCount.scala) 25/06/02 20:17:09 INFO SparkContext: Invoking stop() from shutdown hook 25/06/02 20:17:09 INFO SparkUI: Stopped Spark web UI at http://hadoopmaster:4040 25/06/02 20:17:09 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 25/06/02 20:17:09 INFO MemoryStore: MemoryStore cleared 25/06/02 20:17:09 INFO BlockManager: BlockManager stopped 25/06/02 20:17:09 INFO BlockManagerMaster: BlockManagerMaster stopped 25/06/02 20:17:09 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 25/06/02 20:17:10 INFO SparkContext: Successfully stopped SparkContext 25/06/02 20:17:10 INFO ShutdownHookManager: Shutdown hook called 25/06/02 20:17:10 INFO ShutdownHookManager: Deleting directory /tmp/spark-213e3a91-227c-4e0a-8254-ab5c0f00786d Process finished with exit code 1
06-04
<think>嗯,用户遇到了Spark作业中经典的输入路径错误问题。从描述看,用户在本地路径file:/home/hadoopmaster/words.txt运行WordCount时触发了InvalidInputException。有趣的是,用户已经引用了站内资料说明HDFS路径要求,但似乎仍在尝试使用本地文件系统路径。用户可能是个Spark初学者,正在本地搭建环境测试WordCount示例。ta的困惑点在于:为什么教程里写的file://路径实际运行时报错?深层需求其实是理解Spark文件读取机制的核心差异——HDFS分布式存储vs本地文件系统。需要特别强调spark-shell默认行为这个关键点,因为很多新手会忽略shell环境预置的HDFS配置。注意到用户引用了5个相关资料,其中[1][2]明确指出HDFS路径要求,[5]还展示了正确的RDD操作代码。这说明用户做过功课但未能贯通应用。解决方案需要分层呈现:1)最直接的workaround是用spark.read.textFile("file:///...")重写路径2)更专业的做法是通过SparkConf配置fs.defaultFS覆盖默认HDFS地址3)生产环境规范应该先用hdfsdfs-put上传考虑增加演示substitute机制的场景代码,因为用户提到的是绝对路径失效问题。对于/home开头的Linux路径,要强调file://需要三个斜杠的特殊语法。最后用表格对比local模式与集群模式的文件协议差异会更直观——毕竟visuallearner更容易记住对比表格。</think> Spark 作业中出现 `Input path does not exist` 错误的根本原因是 **Spark 默认从 HDFS 读取文件**,而非本地文件系统(除非显式指定)。以下是详细解决方案: --- ### **问题原因分析** 1. **Spark 默认文件系统 (HDFS)** Spark Shell/作业默认使用 `hdfs://` 协议,路径指向 HDFS(如 `hdfs://localhost:9000/user/hadoop/`),若文件未上传至 HDFS 则报错。 2. **本地路径协议未生效** `file:/home/hadoopmaster/words.txt` 未触发 `file://` 协议解析,导致路径被错误映射到 HDFS。 --- ### **解决方案** #### **方法 1:上传文件到 HDFS(推荐)** ```bash # 上传本地文件到 HDFS hdfs dfs -put /home/hadoopmaster/words.txt /user/hadoop/words.txt # Spark 代码中改用 HDFS 路径 val textFile = sc.textFile("hdfs://localhost:9000/user/hadoop/words.txt") ``` ✅ **优势**:符合分布式计算规范,适用于集群环境[^1][^2]。 --- #### **方法 2:显式指定本地文件协议** 在 Spark 代码中 **强制使用 `file://` 前缀**: ```scala // 明确声明本地文件路径 val textFile = sc.textFile("file:///home/hadoopmaster/words.txt") ``` ⚠️ **注意**: - 仅限单机模式(Local Mode)或 Driver 节点本地文件 - 集群模式下需确保所有节点均存在相同路径 --- #### **方法 3:修改默认文件系统(测试环境适用)** 在 `spark-shell` 启动时覆盖默认配置: ```bash spark-shell --conf spark.hadoop.fs.defaultFS=file:/// ``` 启动后直接读取本地路径: ```scala val textFile = sc.textFile("/home/hadoopmaster/words.txt") // 自动解析为 file:// ``` --- ### **完整 WordCount 修正示例** ```scala // 正确读本地文件(需显式声明协议) val textFile = sc.textFile("file:///home/hadoopmaster/words.txt") val wordCounts = textFile .flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey(_ + _) wordCounts.saveAsTextFile("output") // 结果保存到 HDFS 的 output 目录 ``` --- ### **关键验证步骤** 1. **检查文件是否存在** ```bash hdfs dfs -ls /user/hadoop/words.txt # 验证 HDFS ls /home/hadoopmaster/words.txt # 验证本地 ``` 2. **查看 Spark 日志** 日志中应显示正确协议: ```log INFO HadoopRDD: Input Path: file:///home/hadoopmaster/words.txt ``` --- ### **错误排查表** | 现象 | 原因 | 修复方式 | |------|------|----------| | `hdfs://localhost:9000/... not exist` | 路径被映射到 HDFS | 添加 `file://` 前缀或上传 HDFS | | `File does not exist` | 本地路径错误/权限不足 | 检查路径拼写及文件权限 | | 空结果集 | 文件无内容或分隔符错误 | 验证文件格式和 `split` 逻辑[^5] | > **生产环境建议**:始终使用 HDFS 存储输入数据,避免本地路径带来的集群兼容性问题[^2]。 ---
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值