Hive: load data [local] inpath 'path' [overwrite] into table table_name fails with "Invalid path"

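This post walks through a puzzling error hit while importing data through the Hive client beeline, and explains how to fix it. When data is loaded from a local path into a Hive table, the file cannot be found because the Hive server and the client run on different machines. Two solutions are given: copy the file to the machine running the Hive server, or (recommended) upload the file to HDFS first and load it through an HDFS URI.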


Today I ran a data import through the Hive client beeline and got a strange error:

beeline>load data local inpath '/home/hhuang/2018-05-01_14.txt' overwrite into table data_idea_id_hourly  partition ( dt = '2018-05-27',hour = '14' );

Error: Error while compiling statement: FAILED: SemanticException Line 1:23 Invalid path ''/home/hhuang/2018-05-01_14.txt'': No files matching path file:/home/hhuang/2018-05-01_14.txt (state=42000,code=40000)

The error says the file cannot be found, yet the file definitely exists on the current machine.


Root cause:

When using the JDBC driver, the command executes on the HiveServer2 side. The file is evaluated to locally exist on the server, which is not true in your case (it exists on the local client program machine).

Try instead to load the file to HDFS first, and use a HDFS URI in the LOAD DATA statement to make the server find it.

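As the explanation above shows, when it is submitted through beeline the Hive import statement load data [local] inpath is a server-side command: it is compiled and executed on the server. Specifying local does mean that the file is a local file, but "local" here refers to the machine running the HiveServer2 service, not the machine running beeline or another JDBC client (in most production environments, HiveServer2 and the client are on different machines).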

Summary:

    Solution 1: Copy the file to the server running HiveServer2 (usually not feasible), then run load data local inpath [path] [overwrite] into table table_name.

    Solution 2: Upload the local file to HDFS first, then run load data inpath [hdfspath] [overwrite] into table table_name.

Solution 2 is the recommended approach.
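
Solution 2 can be sketched as a small shell script, run on the client machine. This is only an illustration under assumptions: the staging directory /tmp/hive_staging, the JDBC URL, and the DRY_RUN switch are hypothetical and not part of the original post; the file path, table, and partition are taken from the failing command above. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: stage a client-side file in HDFS, then load it WITHOUT the LOCAL keyword.
set -euo pipefail

LOCAL_FILE="/home/hhuang/2018-05-01_14.txt"             # file on the client machine (from the post)
HDFS_DIR="/tmp/hive_staging"                            # hypothetical staging directory in HDFS
JDBC_URL="jdbc:hive2://hiveserver2-host:10000/default"  # hypothetical HiveServer2 address

# Build the LOAD DATA statement for a file that already sits in HDFS.
build_load_stmt() {
  local hdfs_path="$1"
  printf "LOAD DATA INPATH '%s' OVERWRITE INTO TABLE data_idea_id_hourly PARTITION (dt = '2018-05-27', hour = '14');" "$hdfs_path"
}

HDFS_FILE="${HDFS_DIR}/$(basename "$LOCAL_FILE")"
STMT="$(build_load_stmt "$HDFS_FILE")"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Dry run (hypothetical default): print the commands instead of executing them.
  echo "hdfs dfs -mkdir -p ${HDFS_DIR}"
  echo "hdfs dfs -put -f ${LOCAL_FILE} ${HDFS_DIR}/"
  echo "beeline -u ${JDBC_URL} -e \"${STMT}\""
else
  hdfs dfs -mkdir -p "$HDFS_DIR"              # create the staging directory if missing
  hdfs dfs -put -f "$LOCAL_FILE" "$HDFS_DIR/" # upload, overwriting any earlier copy
  beeline -u "$JDBC_URL" -e "$STMT"           # the server resolves the path in HDFS
fi
```

Run it with DRY_RUN=0 to execute the commands. Note that load data inpath moves the staged file into the table's location rather than copying it, so the file no longer exists under the staging directory afterwards, and the user Hive runs as needs access to that directory.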
