Hive multi-table join exception, java.lang.ArrayIndexOutOfBoundsException: an official bug, fixed in version 3.0.0

This post examines a serialization bug that appears in Hive when joining multiple tables together with LIMIT: column pruning drops data fields, corrupting the serialized rows. A link to the official fix is provided.


Official fix: https://issues.apache.org/jira/browse/HIVE-14564
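The bug (tracked as HIVE-14564) is triggered by multi-table joins combined with LIMIT, where column pruning emits columns in the wrong order and the downstream deserializer reads garbage offsets. The sketch below is illustrative only: the table and column names are hypothetical, and the workaround of materializing an intermediate result is a common mitigation pattern on affected versions, not a fix taken from the source.

```sql
-- Hypothetical query shape that can trigger the ArrayIndexOutOfBoundsException
-- on Hive versions without the HIVE-14564 fix (names are illustrative):
SELECT a.id, b.name, c.amount
FROM t_a a
JOIN t_b b ON a.id = b.id
JOIN t_c c ON a.id = c.id
LIMIT 10;

-- Possible mitigation on affected versions: materialize one join first,
-- so the final stage sees a simpler plan for the pruner to handle.
CREATE TEMPORARY TABLE tmp_ab AS
SELECT a.id, b.name FROM t_a a JOIN t_b b ON a.id = b.id;

SELECT ab.id, ab.name, c.amount
FROM tmp_ab ab
JOIN t_c c ON ab.id = c.id
LIMIT 10;
```

Upgrading to a release containing the fix remains the proper solution.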

Exception details

2019-02-28 16:33:44,429 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2019-02-28 16:33:44,429 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigning container Container: [ContainerId: container_e25_1551269222015_0034_01_000005, NodeId: bigdata001:45454, NodeHttpAddress: bigdata001:8042, Resource: <memory:2048, vCores:1>, Priority: 5, Token: Token { kind: ContainerToken, service: 192.168.30.230:45454 }, ] to fast fail map
2019-02-28 16:33:44,431 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned from earlierFailedMaps
2019-02-28 16:33:44,432 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_e25_1551269222015_0034_01_000005 to attempt_1551269222015_0034_m_000000_3
2019-02-28 16:33:44,432 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:43008, vCores:1>
2019-02-28 16:33:44,432 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
2019-02-28 16:33:44,432 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:4 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:4 ContRel:0 HostLocal:0 RackLocal:1
2019-02-28 16:33:44,432 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved bigdata001 to /default-rack
2019-02-28 16:33:44,433 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1551269222015_0034_m_000000_3 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2019-02-28 16:33:44,433 INFO [ContainerLauncher #6] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_e25_1551269222015_0034_01_000005 taskAttempt attempt_1551269222015_0034_m_000000_3
2019-02-28 16:33:44,433 INFO [ContainerLauncher #6] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1551269222015_0034_m_000000_3
2019-02-28 16:33:44,434 INFO [ContainerLauncher #6] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : bigdata001:45454
2019-02-28 16:33:44,441 INFO [ContainerLauncher #6] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1551269222015_0034_m_000000_3 : 13562
2019-02-28 16:33:44,441 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1551269222015_0034_m_000000_3] using containerId: [container_e25_1551269222015_0034_01_000005 on NM: [bigdata001:45454]
2019-02-28 16:33:44,442 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1551269222015_0034_m_000000_3 TaskAttempt Transitioned from ASSIGNED to RUNNING
2019-02-28 16:33:45,434 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1551269222015_0034: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:43008, vCores:1> knownNMs=3
2019-02-28 16:33:45,785 INFO [Socket Reader #1 for port 35318] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1551269222015_0034 (auth:SIMPLE)
2019-02-28 16:33:45,796 INFO [IPC Server handler 4 on 35318] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1551269222015_0034_m_27487790694405 asked for a task
2019-02-28 16:33:45,796 INFO [IPC Server handler 4 on 35318] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1551269222015_0034_m_27487790694405 given task: attempt_1551269222015_0034_m_000000_3
2019-02-28 16:33:47,740 INFO [IPC Server handler 1 on 35318] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1551269222015_0034_m_000000_3 is : 0.0
2019-02-28 16:33:47,743 FATAL [IPC Server handler 2 on 35318] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1551269222015_0034_m_000000_3 - exited : java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: -1746617499
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
    at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:354)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:198)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:184)
    at org.apache.hadoop.hive.ql.exec.MapOperator.toErrorMessage(MapOperator.java:588)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:557)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
 ]
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: -1746617499
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
    at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:354)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:198)
    at org.apache.hadoop.hive.ser
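The negative index in the trace (-1746617499) comes from `LazyBinaryUtils.readVInt` interpreting bytes at a wrong offset as a variable-length integer, exactly the kind of corruption that mis-ordered columns produce. The following is a minimal Python sketch of Hadoop's WritableUtils-style vint encoding (my own simplified reconstruction with hand-picked illustrative bytes, not the actual Hive code), showing how a misaligned read yields such a value:

```python
def decode_vint(buf, off):
    """Decode a Hadoop WritableUtils-style variable-length int.

    Simplified sketch, not the real LazyBinaryUtils code: the header byte
    either holds the value itself, or says how many payload bytes follow
    and whether the decoded result is negative.
    """
    b = buf[off] - 256 if buf[off] > 127 else buf[off]  # reinterpret as signed byte
    if b >= -112:
        return b, 1                                  # value fits in the header byte
    negative = b < -120
    size = (-119 - b) if negative else (-111 - b)    # total encoded length in bytes
    val = 0
    for i in range(1, size):
        val = (val << 8) | buf[off + i]              # big-endian payload bytes
    return (~val if negative else val), size

# A buffer holding one well-formed single-byte vint (value 100), followed by
# bytes that, when misread as a vint header, decode to a huge negative number.
buf = bytes([0x64, 0x84, 0x68, 0x1B, 0x44, 0x9A])

print(decode_vint(buf, 0))   # aligned read: (100, 1)
print(decode_vint(buf, 1))   # misaligned read: (-1746617499, 5)
```

Using that bogus value as an array index is the Python analogue of the `ArrayIndexOutOfBoundsException` shown in the stack trace above.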
### Fixing the `Unable to instantiate` exception in Hive

The error `FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient` usually means Hive cannot instantiate the session Metastore client. This can have several causes, including but not limited to incorrect configuration, missing dependency libraries, or version incompatibilities.

#### Configuration adjustments

One common fix is to go into the `hive/conf` directory, edit `hive-site.xml`, and comment out a setting that may be causing a conflict:

```xml
<!--
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
</property>
-->
```

Another effective approach is to add a property to `hive-site.xml` that points the client at the Metastore URI explicitly:

```xml
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://<Hive-server-IP-address>:9083</value>
</property>
```

This change helps ensure the Hive client can connect to the remote Metastore service.

#### Resetting the metastore database

If the steps above do not resolve the problem, consider dropping the existing MySQL metastore database and re-initializing it with `schematool -initSchema -dbType mysql`. Note that you should back up the existing data before doing this, to guard against accidental loss.

#### Version-consistency check

It is also important to confirm that the Hadoop and Hive components in use have matching versions. Different versions may carry API changes or other differences that can trigger this class of error, so consult the official documentation for the recommended version pairings and adjust the software in your environment accordingly.