Hive throws java.io.IOException: Filesystem closed on CREATE TABLE AS statements

While testing a newly built Hive environment, I ran into an unexplained error: creating tables and running queries worked at first, but a file I/O exception appeared later. The cause turned out to be a particular type of SQL statement that left the filesystem handle between Hive and HDFS closed while still in use. The test environment ran Hadoop and Hive 3.1.3, and the problem was resolved by editing the HDFS configuration file to disable the FileSystem cache.

Recently, while testing a newly set-up Hive environment, I hit a puzzling error: java.io.IOException: Filesystem closed. Right after the initial setup, creating tables and querying data worked fine; the problem only surfaced later. The full error output is shown below.

The error message itself did not point to a specific cause beyond a file I/O exception. It eventually turned out that SQL of the form create table ... as select ... (CTAS) caused the FileSystem handle Hive uses to talk with HDFS to be closed while still in use, so I am recording the problem and its solution here.

**Note:** the test environment uses version 3.1.3 for both Hadoop and Hive.
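For context, the failing statements were CTAS queries of roughly the following shape. The target table name matches the warehouse path seen in the error log below; the source table and filter are hypothetical placeholders, not the actual query:

```sql
-- Hypothetical CTAS statement of the failing type. The target table
-- corresponds to the ad.db warehouse path in the error log; the
-- source table and WHERE clause are illustrative placeholders.
CREATE TABLE ad.tmp_coarse_parsed_log_3 AS
SELECT *
FROM ad.coarse_log
WHERE dt = '2023-05-17';
```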

  • Error stack
2023-05-17 15:21:55,304 ERROR [d59814d2-1cc5-40ca-8411-6bc07def1859 HiveServer2-Handler-Pool: Thread-971] parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2079)) - org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if hdfs://hadoop101:8020/user/hive/warehouse/ad.db/tmp_coarse_parsed_log_3 is encrypted: java.io.IOException: Filesystem closed
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2441)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStrongestEncryptedTablePath(SemanticAnalyzer.java:2517)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:2549)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2363)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2075)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:12033)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12129)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197)
	at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260)
	at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)
	at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
	at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
	at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
	at com.sun.proxy.$Proxy39.executeStatementAsync(Unknown Source)
	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:312)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Filesystem closed
	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:474)
	at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2704)
	at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2530)
	at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2527)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2546)
	at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:356)
	at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1216)
	at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1211)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2436)
	... 40 more

FAILED: SemanticException Unable to determine if hdfs://hadoop101:8020/user/hive/warehouse/ad.db/tmp_coarse_parsed_log_3 is encrypted: java.io.IOException: Filesystem closed
2023-05-17 15:21:55,304 ERROR [d59814d2-1cc5-40ca-8411-6bc07def1859 HiveServer2-Handler-Pool: Thread-971] ql.Driver (SessionState.java:printError(1250)) - FAILED: SemanticException Unable to determine if hdfs://hadoop101:8020/user/hive/warehouse/ad.db/tmp_coarse_parsed_log_3 is encrypted: java.io.IOException: Filesystem closed
org.apache.hadoop.hive.ql.parse.SemanticException: Unable to determine if hdfs://hadoop101:8020/user/hive/warehouse/ad.db/tmp_coarse_parsed_log_3 is encrypted: java.io.IOException: Filesystem closed
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2083)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:12033)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12129)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197)
	at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260)
	at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)
	at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
	at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
	at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
	at com.sun.proxy.$Proxy39.executeStatementAsync(Unknown Source)
	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:312)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if hdfs://hadoop101:8020/user/hive/warehouse/ad.db/tmp_coarse_parsed_log_3 is encrypted: java.io.IOException: Filesystem closed
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2441)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStrongestEncryptedTablePath(SemanticAnalyzer.java:2517)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:2549)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2363)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2075)
	... 36 more
Caused by: java.io.IOException: Filesystem closed
	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:474)
	at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2704)
	at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2530)
	at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2527)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2546)
	at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:356)
	at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1216)
	at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1211)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2436)
	... 40 more
  • Solution

Edit the HDFS configuration file $HADOOP_HOME/etc/hadoop/hdfs-site.xml and add the following property (inside the <configuration> element) to disable the HDFS client's FileSystem cache, which resolves the problem.

<property>
    <name>fs.hdfs.impl.disable.cache</name>
    <value>true</value>
</property>
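Why disabling the cache helps: by default the Hadoop client caches FileSystem instances keyed by URI and user, so concurrent HiveServer2 sessions can end up sharing a single DFSClient. When one session closes that shared instance (as the CTAS code path can), every other holder of the cached instance subsequently fails with "Filesystem closed". With fs.hdfs.impl.disable.cache set to true, each caller gets its own instance. As an untested, hedged alternative when editing hdfs-site.xml cluster-wide is not an option, the same property can sometimes be set for a single Hive session, provided the cluster's configuration whitelist (hive.conf.restricted.list) does not forbid it:

```sql
-- Hypothetical per-session alternative (depends on whether fs.* keys
-- are allowed by hive.conf.restricted.list in your deployment):
set fs.hdfs.impl.disable.cache=true;

-- Then re-run the failing CTAS statement in the same session.
```

Note that disabling the cache means a new FileSystem instance per request, which trades a small amount of overhead for correctness under concurrent sessions.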