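Implementing file comparison in Java: comparing files with Java and hash algorithms

As of 2013, when matching duplicate files, a byte-by-byte comparison is recommended if the probability of error must be essentially zero. If an extremely low false-positive rate is acceptable, a hash algorithm such as SHA-256 can be used. The Guava library offers a one-liner for byte-by-byte comparison, while comparing file hashes avoids comparing the file contents directly.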


I have to fingerprint files to find duplicates ("doublets"). What is recommended with Java in 2013? Should I also compare the file size, or is this an unnecessary check?

The probability of a false positive should be very close to 0.

EDIT: Lots of answers, thanks. What is the standard in backup software today? SHA-256? Something stronger? I guess MD5 is not suitable?

Solution

If the probability of false positives has to be zero, as opposed to "lower than the probability you will be struck by lightning," then no hash algorithm at all can be used; you must compare the files byte by byte.

For what it's worth, if you can use third-party libraries, you can use Guava to compare two files byte-by-byte with the one-liner

Files.asByteSource(file1).contentEquals(Files.asByteSource(file2));

which takes care of opening and closing the files as well as the details of comparison.
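If Guava is not available, roughly the same check can be sketched with the JDK alone. The class and method names below (`FileContentCompare`, `sameContent`) are made up for illustration, and the file-size test up front is an assumption about how one might answer the question's size check: files of different length can never be equal, so comparing sizes first is a cheap shortcut rather than a required step.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Guava or the JDK.
final class FileContentCompare {

    // Returns true only if both files contain exactly the same bytes.
    static boolean sameContent(Path a, Path b) throws IOException {
        // Cheap shortcut: files of different length cannot be identical.
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        // try-with-resources closes both streams, even on an exception.
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            int byteA;
            while ((byteA = inA.read()) != -1) {
                if (byteA != inB.read()) {
                    return false; // first differing byte, or b ended early
                }
            }
            // a is exhausted; b must be exhausted too.
            return inB.read() == -1;
        }
    }
}
```

Calling `FileContentCompare.sameContent(path1, path2)` reads each file at most once and stops at the first difference, so it has no false positives but needs both files to be readable at comparison time.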

If you're willing to accept false positives that are less likely than getting struck by lightning, then you could do

Files.hash(file, Hashing.sha1()); // or md5(), or sha256(), or...

which returns a HashCode, and then you can test that for equality with the hash of another file. (That version also deals with the messiness of MessageDigest, of opening and closing the file properly, etcetera.)
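For comparison, here is a minimal sketch of the "messiness" that the Guava call hides, using only the JDK's `MessageDigest`. The names `FileFingerprint`, `sha256` and `sameFingerprint` are hypothetical, and SHA-256 is chosen here only because the question mentions it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Guava or the JDK.
final class FileFingerprint {

    // Streams the file through SHA-256 and returns the 32-byte digest.
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n); // feed only the bytes actually read
            }
        }
        return digest.digest();
    }

    // Two files are treated as duplicates when their digests are equal.
    static boolean sameFingerprint(Path a, Path b)
            throws IOException, NoSuchAlgorithmException {
        return MessageDigest.isEqual(sha256(a), sha256(b));
    }
}
```

When many files have to be matched, the digest of each file can be computed once and stored (for example as a map key), which is the main advantage over pairwise byte-by-byte comparison.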
