FsUrlStreamHandlerFactory Has No Effect Due to an Incorrect Packaging Configuration

URLs with the hdfs:// prefix are not recognized by Java by default, so you must first install Hadoop's URL stream handler factory:

URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());

However, if the jar packaging is configured incorrectly, the program may run fine locally yet still fail to recognize the scheme after being built into a jar.
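To see why the MalformedURLException occurs and how the factory call fixes it, here is a minimal, self-contained sketch of the underlying JDK mechanism. It registers a stub handler for the hdfs scheme in place of Hadoop's real FsUrlStreamHandlerFactory; the stub handler and the example URL are assumptions for illustration only:

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

public class HdfsUrlDemo {
    public static void main(String[] args) throws Exception {
        // Without a registered handler, the JDK rejects the hdfs scheme.
        try {
            new URL("hdfs://namenode:9000/user/data.txt");
        } catch (MalformedURLException e) {
            System.out.println("before: " + e.getMessage());
        }

        // Register a factory for the hdfs scheme. Hadoop's
        // FsUrlStreamHandlerFactory does the real work; this stub only
        // illustrates the JDK mechanism. Note: the factory can be set
        // at most once per JVM.
        URL.setURLStreamHandlerFactory(protocol ->
            "hdfs".equals(protocol)
                ? new URLStreamHandler() {
                      @Override
                      protected URLConnection openConnection(URL u) {
                          throw new UnsupportedOperationException("stub");
                      }
                  }
                : null); // null falls back to the built-in handlers

        // Now the URL parses normally.
        URL url = new URL("hdfs://namenode:9000/user/data.txt");
        System.out.println("after: host=" + url.getHost());
    }
}
```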

Symptoms
Exception in thread "main" java.net.MalformedURLException: unknown protocol: hdfs
    at java.net.URL.<init>(URL.java:592)
    at java.net.URL.<init>(URL.java:482)
    at java.net.URL.<init>(URL.java:431)
    at in.ksharma.hdfs.FileReader.main(FileReader.java:29)

or

Exception in thread "main" java.io.IOException: No FileSystem for scheme: file
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
    at org.apache.mahout.classifier.naivebayes.NaiveBayesModel.materialize(NaiveBayesModel.java:100)

After a long round of googling, I found a few possible fixes.

Fix 1
Add hadoop-2.X/share/hadoop/hdfs/hadoop-hdfs-2.X.jar to your classpath.

Reference: https://stackoverflow.com/questions/25971333/malformedurlexception-on-reading-file-from-hdfs

This had no effect in my case.

Fix 2
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

This configures the maven-shade-plugin with the transformer

 <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>

Since my project is built with Gradle's shadow plugin instead, I looked up the equivalent in the shadow documentation:

Merging Service Descriptor Files
Java libraries often contain service descriptors files in the META-INF/services directory of the JAR. A service descriptor typically contains a line delimited list of classes that are supported for a particular service. At runtime, this file is read and used to configure library or application behavior.

Multiple dependencies may use the same service descriptor file name. In this case, it is generally desired to merge the content of each instance of the file into a single output file. The ServiceFileTransformer class is used to perform this merging. By default, it will merge each copy of a file under META-INF/services into a single file in the output JAR.
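To make the merging concrete: hadoop-common and hadoop-hdfs each ship their own META-INF/services/org.apache.hadoop.fs.FileSystem descriptor. Without merging, one copy silently overwrites the other in the fat jar and a scheme goes missing; with merging, the output descriptor keeps entries from both. The exact class list varies by Hadoop version, so the lines below are only representative:

```
# merged META-INF/services/org.apache.hadoop.fs.FileSystem (representative)
org.apache.hadoop.fs.LocalFileSystem            # from hadoop-common
org.apache.hadoop.hdfs.DistributedFileSystem    # from hadoop-hdfs
```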

// Merging Service Files
shadowJar {
  mergeServiceFiles()
}
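For completeness, a minimal build.gradle sketch showing where mergeServiceFiles() fits; the plugin version shown is only an example, so use whatever version your project already applies:

```groovy
plugins {
    id 'java'
    // shadow plugin; the version here is an assumption for illustration
    id 'com.github.johnrengelman.shadow' version '7.1.2'
}

shadowJar {
    // merge META-INF/services descriptors from all dependencies
    mergeServiceFiles()
}
```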

Adding the configuration above is equivalent to the following Maven setting:

 <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>

References:
https://www.codelast.com/原创-解决读写hdfs文件的错误:no-filesystem-for-scheme-hdfs/
https://imperceptiblethoughts.com/shadow/configuration/merging/#merging-service-descriptor-files

Conclusion

After repackaging, the problem was resolved and the program ran normally.
