Commonly Used Hash Functions

陈伟鹏2016

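This article presents nine classic string hash algorithms (RS, JS, PJW, ELF, BKDR, SDBM, DJB, DEK, and AP) together with Java implementations. The algorithms differ in hashing speed and in how evenly they distribute keys.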

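The General Purpose Hash Function library contains the following string hash algorithms, which mix addition with bit-shift operations. They differ in usage and behavior, but all of them work as examples for learning how hash functions are implemented. (Implementations in other languages are available in the download.)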
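
To use any of these functions in a table, the returned value usually has to be reduced to a bucket index. The following is a minimal sketch, assuming a hypothetical helper `bucketIndex` and a caller-chosen table size; neither comes from the original article. Because the functions return `long` and the arithmetic can wrap around to negative values, the sketch uses `Math.floorMod` instead of `%`.

```java
final class BucketIndexSketch {
    // Hypothetical helper: reduce any long hash to an index in [0, buckets).
    static int bucketIndex(long hash, int buckets) {
        // floorMod never returns a negative result for a positive modulus.
        return (int) Math.floorMod(hash, (long) buckets);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex(97L, 16));  // 97 mod 16
        System.out.println(bucketIndex(-1L, 16));  // stays non-negative
    }
}
```

With `%`, a negative hash would produce a negative index, which is why the sketch avoids it.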

1.RS

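A simple hash function taken from Robert Sedgewick's book Algorithms in C. The author of the original article added a few simple optimizations to speed up the hashing process.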

    public long RSHash(String str)
    {
       int b     = 378551;
       int a     = 63689;   // multiplier; changes every step and wraps on int overflow
       long hash = 0;
       for(int i = 0; i < str.length(); i++)
       {
          hash = hash * a + str.charAt(i);
          a    = a * b;
       }
       return hash;
    }

2.JS

A bitwise hash function written by Justin Sobel.

    public long JSHash(String str)
    {
       long hash = 1315423911;
       for(int i = 0; i < str.length(); i++)
       {
          // combine a left shift, the next character, and a right shift, then XOR into the hash
          hash ^= ((hash << 5) + str.charAt(i) + (hash >> 2));
       }
       return hash;
    }

3.PJW

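This hash algorithm is based on work by Peter J. Weinberger of Bell Labs. The book Compilers: Principles, Techniques, and Tools recommends using the hash function from this algorithm.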

    public long PJWHash(String str)
    {
       long BitsInUnsignedInt = (long)(4 * 8);                          // 32
       long ThreeQuarters     = (long)((BitsInUnsignedInt  * 3) / 4);   // 24
       long OneEighth         = (long)(BitsInUnsignedInt / 8);          // 4
       // (long)(0xFFFFFFFF) is -1L, so this mask has every bit from 28 upward set
       long HighBits          = (long)(0xFFFFFFFF) << (BitsInUnsignedInt - OneEighth);
       long hash              = 0;
       long test              = 0;
       for(int i = 0; i < str.length(); i++)
       {
          hash = (hash << OneEighth) + str.charAt(i);
          if((test = hash & HighBits)  != 0)
          {
             // fold the high bits back into the low part, then clear them
             hash = (( hash ^ (test >> ThreeQuarters)) & (~HighBits));
          }
       }
       return hash;
    }

4.ELF

Very similar to PJW, and widely used on Unix systems.

    public long ELFHash(String str)
    {
       long hash = 0;
       long x    = 0;
       for(int i = 0; i < str.length(); i++)
       {
          hash = (hash << 4) + str.charAt(i);
          if((x = hash & 0xF0000000L) != 0)   // any of bits 28..31 set?
          {
             hash ^= (x >> 24);               // fold them into bits 4..7
          }
          hash &= ~x;                         // clear those bits
       }
       return hash;
    }
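
The Java code above works on `long`, so bits above position 31 are not discarded the way they would be in a 32-bit integer. As a sketch only, assuming a result confined to 32 bits is wanted (for example to mirror a C implementation on `unsigned int`), the same steps can be written with `int` and the unsigned shift `>>>`. The class and method names are hypothetical.

```java
final class ElfHash32Sketch {
    // Hypothetical 32-bit variant of the ELF hash shown above.
    static int elfHash32(String str) {
        int hash = 0;
        for (int i = 0; i < str.length(); i++) {
            hash = (hash << 4) + str.charAt(i);
            int x = hash & 0xF0000000;   // top four bits of the 32-bit word
            if (x != 0) {
                hash ^= (x >>> 24);      // fold them into bits 4..7
            }
            hash &= ~x;                  // clear the top four bits
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(elfHash32("ab"));  // (97 << 4) + 98
    }
}
```

Since the top four bits are cleared after every character, this variant always returns a non-negative value below 2^28.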

5.BKDR

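This algorithm comes from The C Programming Language by Brian Kernighan and Dennis Ritchie. It is a very simple hash function that uses an odd-looking family of seeds of the form 31, 131, 1313, 13131, and so on, and it looks very similar to the DJB algorithm. (As noted in an earlier post on this blog, with seed 31 this is the hash function Java uses for strings.)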

    public long BKDRHash(String str)
    {
       long seed = 131; // 31 131 1313 13131 131313 etc..
       long hash = 0;
       for(int i = 0; i < str.length(); i++)
       {
          hash = (hash * seed) + str.charAt(i);
       }
       return hash;
    }
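
The remark about Java's string hash can be checked directly. The sketch below, using a hypothetical `bkdrHash` that takes the seed as a parameter, runs the same loop with seed 31 and compares the low 32 bits of the result with `String.hashCode()`.

```java
final class BkdrSeed31Sketch {
    // Same loop as BKDRHash above, with the seed passed in by the caller.
    static long bkdrHash(String str, long seed) {
        long hash = 0;
        for (int i = 0; i < str.length(); i++) {
            hash = (hash * seed) + str.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        String s = "hello";
        // Casting to int keeps the low 32 bits, which is what the int
        // arithmetic inside String.hashCode() produces.
        System.out.println((int) bkdrHash(s, 31) == s.hashCode());
    }
}
```

The comparison holds because wrap-around `long` arithmetic agrees with wrap-around `int` arithmetic on the low 32 bits.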

6.SDBM

This algorithm is used in the open-source SDBM project, and it seems to give a good distribution for many different kinds of data.

    public long SDBMHash(String str)
    {
       long hash = 0;
       for(int i = 0; i < str.length(); i++)
       {
          // (hash << 6) + (hash << 16) - hash is hash * 65599
          hash = str.charAt(i) + (hash << 6) + (hash << 16) - hash;
       }
       return hash;
    }
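
The two shifts and the subtraction in the loop amount to multiplying by a constant, since 2^6 + 2^16 - 1 = 65599. A small sketch with hypothetical method names that writes the step both ways and compares the results:

```java
final class SdbmMultiplierSketch {
    // Shift form, as in SDBMHash above.
    static long sdbmShift(String str) {
        long hash = 0;
        for (int i = 0; i < str.length(); i++) {
            hash = str.charAt(i) + (hash << 6) + (hash << 16) - hash;
        }
        return hash;
    }

    // Multiplication form: 64 + 65536 - 1 = 65599.
    static long sdbmMultiply(String str) {
        long hash = 0;
        for (int i = 0; i < str.length(); i++) {
            hash = str.charAt(i) + hash * 65599L;
        }
        return hash;
    }

    public static void main(String[] args) {
        String s = "hash functions";
        System.out.println(sdbmShift(s) == sdbmMultiply(s));
    }
}
```

Both forms wrap around identically on overflow, so they agree for every input, not only short ones.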

7.DJB

This algorithm was invented by Professor Daniel J. Bernstein and is often described as one of the most efficient hash functions published.

    public long DJBHash(String str)
    {
       long hash = 5381;
       for(int i = 0; i < str.length(); i++)
       {
          // (hash << 5) + hash is hash * 33
          hash = ((hash << 5) + hash) + str.charAt(i);
       }
       return hash;
    }
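
Each step of the loop computes hash * 33 + c, starting from 5381. As a sketch with a hypothetical method name, the step can be written as a plain multiplication and traced on a one-character input:

```java
final class DjbTraceSketch {
    // Same recurrence as DJBHash above, written with a multiplication.
    static long djbHash(String str) {
        long hash = 5381;
        for (int i = 0; i < str.length(); i++) {
            hash = hash * 33 + str.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        // 5381 * 33 = 177573, plus 'a' (97) gives 177670.
        System.out.println(djbHash("a"));
    }
}
```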

8.DEK

Proposed by Donald E. Knuth in Chapter 6 of The Art of Computer Programming, Volume 3: Sorting and Searching.

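    public long DEKHash(String str)
    {
       long hash = str.length();   // seeded with the string length
       for(int i = 0; i < str.length(); i++)
       {
          // on a 32-bit unsigned value these two shifts form a rotate left by 5
          hash = ((hash << 5) ^ (hash >> 27)) ^ str.charAt(i);
       }
       return hash;
    }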
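
On a 32-bit unsigned value, `(hash << 5) ^ (hash >> 27)` is a rotate left by 5 bits; on a Java `long` the two shifts no longer line up as a rotation. The following is a sketch of a 32-bit variant, assuming that rotation is the intended behavior. The names are hypothetical, and it is not a drop-in replacement for the method above because results can differ once the hash grows past 32 bits.

```java
final class DekHash32Sketch {
    // Hypothetical 32-bit variant using an explicit rotate.
    static int dekHash32(String str) {
        int hash = str.length();
        for (int i = 0; i < str.length(); i++) {
            hash = Integer.rotateLeft(hash, 5) ^ str.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        // length 1 -> rotate gives 32 -> 32 ^ 'a' (97) = 65
        System.out.println(dekHash32("a"));
    }
}
```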

9.AP

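This hash function was contributed by Arash Partow, the author of the original article. It builds on the shift (rotation) and addition operations used by the functions above.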

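    public long APHash(String str)
    {
       // 0xAAAAAAAA is an int literal; widened to long it becomes 0xFFFFFFFFAAAAAAAAL
       long hash = 0xAAAAAAAA;
       for(int i = 0; i < str.length(); i++)
       {
          if ((i & 1) == 0)   // even positions
          {
             // '*' binds tighter than '^'
             hash ^= ((hash << 7) ^ str.charAt(i) * (hash >> 3));
          }
          else                // odd positions
          {
             // '+' binds tighter than '^'
             hash ^= (~((hash << 11) + str.charAt(i) ^ (hash >> 5)));
          }
       }
       return hash;
    }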
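
One detail of the Java code above is the seed literal: `0xAAAAAAAA` without an `L` suffix is an `int`, so it is sign-extended when assigned to a `long`. Whether that is intended cannot be told from the text; the sketch below, with hypothetical field names, only shows the difference between the two spellings.

```java
final class ApSeedSketch {
    static final long SIGN_EXTENDED = 0xAAAAAAAA;   // int literal, widened to long
    static final long UNSIGNED_32   = 0xAAAAAAAAL;  // long literal

    public static void main(String[] args) {
        System.out.println(Long.toHexString(SIGN_EXTENDED)); // ffffffffaaaaaaaa
        System.out.println(Long.toHexString(UNSIGNED_32));   // aaaaaaaa
    }
}
```

A port that needs to match a 32-bit unsigned implementation would likely have to use the `L` form and mask intermediate results, which the code above does not do.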