Configuration issues


2020-09-06 17:59:10,424 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1599375175649_0007_m_000000_0: Error: java.io.IOException: java.io.IOException: java.lang.ClassCastException: java.lang.Integer cannot be cast to [B
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:205)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:191)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.io.IOException: java.lang.ClassCastException: java.lang.Integer cannot be cast to [B
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
	... 12 more
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to [B
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.fillColumnVector(VectorizedListColumnReader.java:307)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.convertValueListToListColumnVector(VectorizedListColumnReader.java:340)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:90)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:414)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:357)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:93)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
	... 16 more

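java.lang.Integer cannot be cast to [B
This error is caused by a bug in the Parquet support of Hive's vectorized execution engine. In the trace above, the innermost exception is thrown in VectorizedListColumnReader, called from VectorizedMapColumnReader, which is part of the vectorized Parquet reader.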
 


Solutions:
Turn off Hive's vectorized execution switch:
set hive.vectorized.execution.enabled=false;

Change the source table's file format to ORC.
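
The two workarounds above can be combined when migrating a table. The following is a minimal sketch, not a definitive procedure: it only builds the HiveQL text, and the function name and the table names are hypothetical. It assumes the ORC copy is made with CREATE TABLE ... AS SELECT, and it puts the switch first because the copy itself still has to read the Parquet files.

```python
def orc_migration_script(source_table: str, target_table: str) -> str:
    """Return HiveQL that copies a Parquet table into a new ORC table.

    Both table names are placeholders chosen by the caller.
    """
    statements = [
        # Session-level switch from the post. It comes first because the
        # CTAS below still reads the Parquet source table.
        "SET hive.vectorized.execution.enabled=false",
        # CTAS that writes the copy in ORC format.
        f"CREATE TABLE {target_table} STORED AS ORC AS SELECT * FROM {source_table}",
    ]
    return ";\n".join(statements) + ";"


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(orc_migration_script("demo_db.events_parquet", "demo_db.events_orc"))
```

The generated text can be pasted into a Hive session or saved to a file and run as a script. The SET statement only affects the session it runs in.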

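Spark reads the Hive configuration file and writes data read from MySQL into Hive

Configure the metastore service, hive.metastore.uris

If it is not configured

Spark calls the code bundled in its own jars to read MySQL directly, and changes the Hive version stored in MySQL to the version that Spark integrates, which easily leads to version conflicts

Fix

Configure the metastore service and change the version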
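
As a rough illustration of pointing Spark at the metastore service instead of letting it reach MySQL on its own, here is a PySpark sketch. The helper names, the thrift URI, the JDBC URL, the table names and the credentials are all assumptions made for the example. It also assumes PySpark is installed and that a MySQL JDBC driver is available on Spark's classpath.

```python
def hive_session_options(metastore_uri: str) -> dict:
    """Build the Spark config entries that point at a Hive metastore service."""
    if not metastore_uri.startswith("thrift://"):
        raise ValueError("hive.metastore.uris expects a thrift:// URI")
    return {"hive.metastore.uris": metastore_uri}


def copy_mysql_table_to_hive(metastore_uri, jdbc_url, mysql_table,
                             hive_table, user, password):
    """Read one MySQL table over JDBC and append it to a Hive table."""
    # Imported here so the helper above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("mysql-to-hive")
    for key, value in hive_session_options(metastore_uri).items():
        builder = builder.config(key, value)
    spark = builder.enableHiveSupport().getOrCreate()

    try:
        df = (spark.read.format("jdbc")
              .option("url", jdbc_url)
              .option("dbtable", mysql_table)
              .option("user", user)
              .option("password", password)
              .load())
        # Table metadata is resolved through the metastore service configured above.
        df.write.mode("append").saveAsTable(hive_table)
    finally:
        spark.stop()


# Example call with placeholder values:
# copy_mysql_table_to_hive("thrift://metastore-host:9083",
#                          "jdbc:mysql://mysql-host:3306/source_db",
#                          "orders", "demo_db.orders", "etl_user", "secret")
```

The same property can also be supplied through a hive-site.xml placed in Spark's conf directory, which matches the post's description of Spark reading the Hive configuration file.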
