Chapter 6.1.2: Hive on Spark problem roundup

This article records the various problems I ran into while using Hive on Spark and how they were resolved, including version compatibility issues, logging configuration, and connection failures, together with detailed exception messages and debugging steps.

I am writing the problems down as I go so that I have not forgotten them a month from now. A programmer's working life is short: newcomers will always wash the old guard up onto the beach, so all one can do is quietly change direction, gradually hand the technology over to the younger generation, and avoid competing head-on; that is the only way to keep a secure footing.
On Hive/Spark version pairing: from the article summarizing matching Hive and Spark versions I learned that Hive and Spark have real compatibility constraints. To find out which Spark version a given Hive release expects, just look at the Spark version referenced in Hive's pom. The Hive version I am using here is 2.3.2, so I chose Spark 2.0.2.
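
As a minimal sketch of that check (the path assumes the Hive 2.3.2 source tarball is unpacked locally; verify the value against your own download):

grep -m1 "<spark.version>" apache-hive-2.3.2-src/pom.xml
# prints the Spark version this Hive release was built and tested against;
# any release of that minor line (2.0.x for Hive 2.3.2) should be compatible
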
1 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark client.
This problem appeared at job startup when I was using Hive on Spark to write data from Hive into Elasticsearch.
In hive-log4j2.properties the Hive log path is configured as property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}, so if Hive is installed under the root user the log ends up in /tmp/root/hive.log.
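
To confirm the actual log location and pull the driver-side error details (commands are illustrative; adjust $HIVE_HOME and the user to your install):

grep "property.hive.log.dir" $HIVE_HOME/conf/hive-log4j2.properties
# with the default shown above and a root install, the file is /tmp/root/hive.log
tail -n 200 /tmp/root/hive.log
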
The author of the article "Pitfalls of Hive on Spark" believes this is a Hive/Spark version problem, but Hive 2.3.2 with Spark 2.0.2 had already been verified in the development environment, and the test environment uses exactly the same versions yet hits this error, so I can only guess that some configuration is wrong. The detailed exception is as follows:

Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
Running Spark using the REST application submission protocol.

	at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:212) ~[hive-exec-2.3.2.jar:2.3.2]
	at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:503) ~[hive-exec-2.3.2.jar:2.3.2]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
2018-04-17T01:39:48,451  WARN [Driver] client.SparkClientImpl: Child process exited with code 137
2018-04-17T01:39:48,578 ERROR [6aceaa43-4b2a-4d69-82a7-1ad2bacd5e5f main] spark.SparkTask: Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
	at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
	at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115)
	at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:126)
	at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:103)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client 'a0d9c70c-2852-4fd4-baf8-60164c002394'. Error: Child process exited before connecting back with error log Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
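
One detail in the log above is worth calling out: "Child process exited with code 137" means the spark-submit child process was killed with SIGKILL (137 = 128 + 9), which on Linux is very often the kernel OOM killer rather than a Hive or Spark bug. A couple of illustrative checks on the node where Hive launches the driver:

dmesg | grep -iE "killed process|out of memory"   # look for OOM-killer activity around the failure time
free -m                                           # check how much memory the node actually has free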

2 Unsupported major.minor version 51.0
When running ./start-all.sh, the following exception is thrown:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: org.apache.spark.launcher.Main. Program will exit.

The article "Big data tools: pitfalls of Spark configuration" suggests the Java environment may be getting lost over the ssh login, so I added the following to spark-env.sh:

export JAVA_HOME=/usr/java/jdk1.7.0_79
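
For context, class file major version 51 corresponds to Java 7, so the launcher classes were compiled for Java 7 while the JVM picked up by start-all.sh was older. A quick way to see what a non-interactive ssh session (which is how start-all.sh reaches the workers) actually uses, with an illustrative host name:

ssh worker1 'echo $JAVA_HOME; java -version'
# a non-interactive ssh session does not source /etc/profile, so this shows
# the JVM that start-all.sh really picks up on that node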

At that point a new problem appeared. The article "Pitfalls of setting up Spark" recommends export SPARK_DIST_CLASSPATH=$(hadoop classpath) as the fix, but I had already added that, so why was it still failing?

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
	at org.apache.spark.deploy.master.Master$.main(Master.scala:1008)
	at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 2 more

The article "Common problems when using Spark 2.0.1" explains that Spark's Java runtime version differs from the Java version the classes were compiled with; unfortunately it does not tell us how to bring the two into line.
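
A hedged way to check this yourself is to read the class file version straight out of a Spark jar (the jar name below matches a Spark 2.0.2 / Scala 2.11 distribution and may differ in yours; major version 51 means Java 7, 52 means Java 8):

cd /tmp
unzip -o $SPARK_HOME/jars/spark-launcher_2.11-2.0.2.jar org/apache/spark/launcher/Main.class
javap -verbose -cp /tmp org.apache.spark.launcher.Main | grep "major version"
# then compare with the runtime on each node: java -version
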
3 java.lang.IllegalStateException: unread block data
I followed the fix in "Spark-HBase Error Caused by: java.lang.IllegalStateException: unread block data", but the problem remained. A second article, "Hive-Spark error - java.lang.IllegalStateException: unread block data", offered some ideas, but it targets Spark 1.4.1 and Hive 1.2.1, and current Spark versions no longer have a SPARK_CLASSPATH entry in spark-env.sh.
Some people suggest adding the following to Hive's hive-site.xml; I tried it, and it did not help either.

 <property>
    <name>spark.driver.extraClassPath</name>
    <value>$SPARK_HOME/lib/mysql-connector-java-5.1.34.jar:$SPARK_HOME/lib/hbase-annotations-1.1.4.jar:$SPARK_HOME/lib/hbase-client-1.1.4.jar:$SPARK_HOME/lib/hbase-common-1.1.4.jar:$SPARK_HOME/lib/hbase-hadoop2-compat-1.1.4.jar:$SPARK_HOME/lib/hbase-hadoop-compat-1.1.4.jar:$SPARK_HOME/lib/hbase-protocol-1.1.4.jar:$SPARK_HOME/lib/hbase-server-1.1.4.jar:$SPARK_HOME/lib/hive-hbase-handler-2.3.2.jar:$SPARK_HOME/lib/htrace-core-3.1.0-incubating.jar</value>
  </property>

  <property>
    <name>spark.executor.extraClassPath</name>
    <value>$SPARK_HOME/lib/mysql-connector-java-5.1.34.jar:$SPARK_HOME/lib/hbase-annotations-1.1.4.jar:$SPARK_HOME/lib/hbase-client-1.1.4.jar:$SPARK_HOME/lib/hbase-common-1.1.4.jar:$SPARK_HOME/lib/hbase-hadoop2-compat-1.1.4.jar:$SPARK_HOME/lib/hbase-hadoop-compat-1.1.4.jar:$SPARK_HOME/lib/hbase-protocol-1.1.4.jar:$SPARK_HOME/lib/hbase-server-1.1.4.jar:$SPARK_HOME/lib/hive-hbase-handler-2.3.2.jar:$SPARK_HOME/lib/htrace-core-3.1.0-incubating.jar</value>
  </property>
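
One caveat I would add here (my own note, not from the referenced posts): as far as I know, hive-site.xml values are not run through shell expansion, so a literal $SPARK_HOME most likely reaches Spark unresolved. If you experiment with this approach, spell the jars out with absolute paths, for example (paths illustrative, jar list shortened):

  <property>
    <name>spark.driver.extraClassPath</name>
    <!-- absolute paths instead of $SPARK_HOME; jar list shortened for illustration -->
    <value>/opt/spark/lib/hbase-client-1.1.4.jar:/opt/spark/lib/hbase-common-1.1.4.jar</value>
  </property>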

4 Worker: Failed to connect to master master:7077
Continuing from the problem above, I traced Spark's own logs and found the following exception while Spark was starting up:

18/04/17 19:35:17 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3eabb48a{/metrics/json,null,AVAILABLE}
18/04/17 19:35:17 WARN worker.Worker: Failed to connect to master master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96)
	at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:216)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed to connect to master/192.168.4.61:7077
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
	at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
	at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
	at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
	... 4 more
Caused by: java.net.ConnectException: Connection refused: master/192.168.4.61:7077
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	... 1 more
18/04/17 19:35:32 INFO worker.Worker: Retrying connection to master (attempt # 1)
18/04/17 19:35:32 INFO worker.Worker: Connecting to master master:7077...
18/04/17 19:35:32 INFO client.TransportClientFactory: Successfully created connection to master/192.168.4.61:7077 after 1 ms (0 ms spent in bootstraps)
18/04/17 19:35:32 INFO worker.Worker: Successfully registered with master spark://master:7077

The article on fixing "Failed to connect to master master_hostname:7077" might have solved the problem, but it covers a standalone deployment, whereas I run in YARN mode. In the end the issue can be fixed in spark-env.sh: set the master address on the slave nodes and the worker registers successfully:

export SPARK_MASTER_HOST=192.168.4.61
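
Two quick sanity checks around this setting (illustrative commands, using the IP from the log above):

getent hosts master     # on every worker: confirm "master" resolves to 192.168.4.61
ss -lntp | grep 7077    # on the master node: confirm the Master process is listening on 7077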