Hive 3.1.3: Fixing the Error When Running SQL

Environment

  • jdk 1.8
  • hive 3.1.3
  • hadoop 3.3.5

Problem description

Any SQL statement that triggers a MapReduce job fails with:
Caused by: java.lang.NoSuchFieldException: parentOffset
The full log is as follows:

Query ID = root_20240505213023_d9ac3bfb-99fa-488e-8b3a-aba349596f45
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1714899015727_0005, Tracking URL = http://rocky.shine.cn:8088/proxy/application_1714899015727_0005/
Kill Command = /opt/soft/hadoop/bin/mapred job  -kill job_1714899015727_0005
Hadoop job information for Stage-1: number of mappers: 13; number of reducers: 0
2024-05-05 21:30:31,585 Stage-1 map = 0%,  reduce = 0%
2024-05-05 21:31:32,564 Stage-1 map = 0%,  reduce = 0%
2024-05-05 21:31:38,078 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_1714899015727_0005 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1714899015727_0005_m_000004 (and more) from job job_1714899015727_0005
Examining task ID: task_1714899015727_0005_m_000005 (and more) from job job_1714899015727_0005

Task with the most failures(4):
-----
Task ID:
  task_1714899015727_0005_m_000004

URL:
  http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1714899015727_0005&tipid=task_1714899015727_0005_m_000004
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:388)
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:234)
    at org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:51)
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:278)
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:413)
    at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:335)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:435)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:881)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:874)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:716)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:176)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:445)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.lang.NoSuchFieldException: parentOffset
    at java.base/java.lang.Class.getDeclaredField(Class.java:2411)
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:382)
    ... 17 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

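Root cause analysis:

Opening the corresponding Hive branch in an IDE and locating the code that throws the error shows the following: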

public ArrayListSubListSerializer() {
    try {
        final Class<?> clazz = Class.forName("java.util.ArrayList$SubList");
        _parentField = clazz.getDeclaredField("parent");
        _parentOffsetField = clazz.getDeclaredField( "parentOffset" );
        _sizeField = clazz.getDeclaredField( "size" );
        _parentField.setAccessible( true );
        _parentOffsetField.setAccessible( true );
        _sizeField.setAccessible( true );
    } catch (final Exception e) {
        throw new RuntimeException(e);
    }
}


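This constructor uses reflection to look up the parentOffset field of ArrayList$SubList. That field exists in JDK 8, but from JDK 9 onward SubList no longer declares parentOffset; it only has offset. The java.base/ prefix on the JDK frames in the stack trace above indicates that the failing map task was running on JDK 9 or later, even though JDK 1.8 is the version listed for this environment. The SubList source in such a JDK looks like this: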

private static class SubList<E> extends AbstractList<E> implements RandomAccess {
    private final ArrayList<E> root;
    private final SubList<E> parent;
    private final int offset;
    private int size;

    ... 
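
To check which field layout a given JVM actually has, a minimal sketch like the following can be run with the same Java installation that the YARN containers use. The class name SubListFieldDump is made up for this illustration and is not part of Hive; it relies only on Class.forName and getDeclaredFields from the standard library and does not call setAccessible.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive.
class SubListFieldDump {

    // Returns the names of all fields declared by java.util.ArrayList$SubList
    // on the JVM that executes this code.
    static List<String> subListFieldNames() {
        try {
            Class<?> clazz = Class.forName("java.util.ArrayList$SubList");
            List<String> names = new ArrayList<>();
            for (Field field : clazz.getDeclaredFields()) {
                names.add(field.getName());
            }
            return names;
        } catch (ClassNotFoundException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version   = " + System.getProperty("java.version"));
        System.out.println("SubList fields = " + subListFieldNames());
    }
}
```

If the printed list contains parentOffset, the Hive 3.1.3 constructor shown earlier can find its field. If parentOffset is missing and only offset is present, that constructor throws the NoSuchFieldException seen in the log.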


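The error therefore comes from Hive hard-coding a field name that only exists in JDK 8 (parentOffset was dropped from SubList in JDK 9). The Hive 3.0.x and 3.1.x sources all contain this code, so the official prebuilt 3.0/3.1 binaries fail in the same way whenever the task JVM is newer than JDK 8.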
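
The JDK generation of the failing task can be read off the stack trace itself: from JDK 9 on, frames that belong to JDK classes are printed with their module name, such as java.base/. The sketch below checks a trace for that prefix. The class name StackTraceJdkHint is hypothetical, and the sample text is a shortened copy of two frames from the log above.

```java
// Hypothetical helper for illustration only; not part of Hive or Hadoop.
class StackTraceJdkHint {

    // Returns true if any frame carries the "java.base" module prefix,
    // which is only printed by JDK 9 and later.
    static boolean hasModulePrefixedFrames(String stackTrace) {
        for (String line : stackTrace.split("\n")) {
            String frame = line.trim();
            if (frame.startsWith("at java.base/") || frame.startsWith("at java.base@")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String fromLog =
              "    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)\n"
            + "    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)\n";
        // The second frame starts with "at java.base/", so this prints true.
        System.out.println(hasModulePrefixedFrames(fromLog));
    }
}
```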


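Solutions:

  • Avoid these two Hive release lines
  • Fix the bug, then recompile and repackage Hive
  • Or make sure the YARN containers really run on JDK 8 (for example, check the JAVA_HOME set in hadoop-env.sh), since parentOffset still exists there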

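Afterword

The Hive project has in fact already noticed this problem and fixed it for version 3.2 (on the branch-3 branch). At the time of writing, however, 3.2 is still a snapshot and no prebuilt binary package is available. If you must use 3.x, you can build it from source yourself or wait for a stable 3.2 release.

Below is part of the code from version 3.2: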

public ArrayListSubListSerializer() {
  try {
    final Class<?> clazz = Class.forName("java.util.ArrayList$SubList");
    _parentField = getParentField(clazz);
    _parentOffsetField = getOffsetField(clazz);
    _sizeField = clazz.getDeclaredField( "size" );
    _parentField.setAccessible( true );
    _parentOffsetField.setAccessible( true );
    _sizeField.setAccessible( true );
  } catch (final Exception e) {
    throw new RuntimeException(e);
  }
}

...

private static Field getOffsetField(Class<?> clazz) throws NoSuchFieldException {
  try {
    // up to jdk8 (which also has an "offset" field (we don't need) - therefore we check "parentOffset" first
    return clazz.getDeclaredField( "parentOffset" );
  } catch (NoSuchFieldException e) {
    // jdk9+ only has "offset" which is the parent offset
    return clazz.getDeclaredField( "offset" );
  }
}
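
The fallback idea can be tried outside Hive with a small, self-contained sketch. The class OffsetFieldLookup and its method names are hypothetical and written for this illustration only. It resolves the field by name but deliberately does not call setAccessible, which newer JDKs may refuse for java.util classes unless extra JVM options are passed.

```java
import java.lang.reflect.Field;

// Hypothetical helper for illustration only; not the Hive implementation.
class OffsetFieldLookup {

    // Tries each candidate name in order and returns the first declared field found.
    static Field firstDeclaredField(Class<?> clazz, String... candidates)
            throws NoSuchFieldException {
        for (String name : candidates) {
            try {
                return clazz.getDeclaredField(name);
            } catch (NoSuchFieldException e) {
                // This name does not exist on the current JDK; try the next one.
            }
        }
        throw new NoSuchFieldException(String.join(" / ", candidates));
    }

    // Returns the name of the SubList field that holds the parent offset
    // on the JVM that executes this code.
    static String subListOffsetFieldName() {
        try {
            Class<?> clazz = Class.forName("java.util.ArrayList$SubList");
            // Same order as the snippet above: "parentOffset" first, then "offset".
            return firstDeclaredField(clazz, "parentOffset", "offset").getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("offset field = " + subListOffsetFieldName());
    }
}
```

Checking parentOffset before offset matters on JDK 8, where both fields exist and, according to the comment in the snippet above, only parentOffset is the one Hive needs.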