Hashing
1. Hash Functions
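If the input keys are integers, a generally reasonable approach is simply to return Key mod TableSize. The hash function still has to be chosen with care: if the table size is 10 and every key ends in zero, for example, all keys land in the same slot. It is therefore usually a good idea to make the table size prime. When the input keys are random integers, this function is then both simple to compute and spreads the keys evenly.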
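To make the point about prime table sizes concrete, here is a minimal sketch. The class name `IntKeyHashDemo`, the helper `distinctSlots`, and the sample keys are illustrative assumptions, not part of the original notes. It counts how many distinct slots a set of keys ending in zero occupies for a table of size 10 versus size 11.

```java
class IntKeyHashDemo {
    // Key mod TableSize, for non-negative integer keys.
    static int hash(int key, int tableSize) {
        return key % tableSize;
    }

    // Counts how many distinct slots the given keys occupy.
    static int distinctSlots(int[] keys, int tableSize) {
        boolean[] used = new boolean[tableSize];
        int count = 0;
        for (int key : keys) {
            int slot = hash(key, tableSize);
            if (!used[slot]) {
                used[slot] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical keys: every one is a multiple of 10.
        int[] keys = {10, 20, 30, 40, 50, 60, 70, 80};
        System.out.println("size 10: " + distinctSlots(keys, 10) + " slot(s) used");
        System.out.println("size 11: " + distinctSlots(keys, 11) + " slot(s) used");
    }
}
```

With table size 10 every key maps to slot 0, while the prime size 11 places each of these eight keys in its own slot.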
More often, the keys are strings.
One option is to add up the ASCII (or Unicode) values of the characters in the string.
// Sums the character codes of the key, then reduces modulo the table size.
public static int hash(String key, int tableSize) {
    int hashVal = 0;
    for (int i = 0; i < key.length(); i++) {
        hashVal += key.charAt(i);
    }
    return hashVal % tableSize;
}
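The additive function is simple, but a short experiment suggests two weaknesses. The sketch below is illustrative only: the class name `AdditiveHashDemo`, the helper `repeat`, the table size 10007, and the sample words are assumptions made for the demonstration. Anagrams always collide because addition ignores character order, and short ASCII keys can reach only a small prefix of a large table.

```java
class AdditiveHashDemo {
    // Same additive scheme as in the text: sum of character codes mod table size.
    static int hash(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            hashVal += key.charAt(i);
        }
        return hashVal % tableSize;
    }

    // Builds a string of n copies of c (helper for the demo only).
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int tableSize = 10007; // assumed table size for the demo

        // Anagrams have the same character sum, so they share a slot.
        System.out.println(hash("listen", tableSize) == hash("silent", tableSize));

        // The largest 8-character ASCII key sums to 127 * 8 = 1016,
        // so keys of at most 8 ASCII characters never reach slots above 1016.
        System.out.println(hash(repeat((char) 127, 8), tableSize));
    }
}
```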
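Another hash function follows. It assumes that Key has at least 3 characters. The value 27 is the number of letters in the English alphabet plus the blank, and 729 is $27^2$.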
// Uses only the first three characters; the key must have at least 3 characters.
public static int hash(String key, int tableSize) {
    return (key.charAt(0) + 27 * key.charAt(1) + 729 * key.charAt(2)) % tableSize;
}
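Because this function looks only at the first three characters, any keys that share a three-character prefix collide, and real words use only a fraction of the $26^3 = 17{,}576$ possible letter triples. A minimal sketch of the effect is given here; the class name `PrefixHashDemo`, the length check, the table size, and the sample words are assumptions added for illustration.

```java
class PrefixHashDemo {
    // Three-character hash from the text, with an explicit length check added.
    static int hash(String key, int tableSize) {
        if (key.length() < 3) {
            throw new IllegalArgumentException("key needs at least 3 characters");
        }
        return (key.charAt(0) + 27 * key.charAt(1) + 729 * key.charAt(2)) % tableSize;
    }

    public static void main(String[] args) {
        int tableSize = 10007; // assumed table size for the demo
        // All three words start with "has", so they land in the same slot.
        System.out.println(hash("hashing", tableSize));
        System.out.println(hash("hashtable", tableSize));
        System.out.println(hash("hasty", tableSize));
    }
}
```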
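A third attempt uses Horner's rule to compute a polynomial function (in powers of 37). For example, another way to compute $h_k = k_0 + 37 k_1 + 37^2 k_2$ is to use the formula $h_k = (k_2 \cdot 37 + k_1) \cdot 37 + k_0$. Horner's rule extends this to an $n$th-degree polynomial. The repeated multiplication can overflow and leave a negative value, which is why the routine adds tableSize back when the remainder is negative.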
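public static int hash2(String key, int tableSize) {
    int hashVal = 0;
    // Horner's rule: each step multiplies the running value by 37 and adds the next character.
    for (int i = 0; i < key.length(); i++) {
        hashVal = 37 * hashVal + key.charAt(i);
    }
    hashVal %= tableSize;
    // int overflow can make hashVal negative; shift it back into [0, tableSize).
    if (hashVal < 0) {
        hashVal += tableSize;
    }
    return hashVal;
}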
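As a check on the Horner form, the sketch below compares the loop against the explicit polynomial. The class name `HornerHashDemo` and the methods `horner`, `explicit`, and `index` are assumptions introduced for this illustration. With this loop order the first character receives the highest power of 37. The sketch also uses `Math.floorMod`, which is one possible alternative to the manual negative-value fix, not something the original routine uses.

```java
class HornerHashDemo {
    // Horner's rule with the same loop shape as hash2 in the text (no modulo yet).
    static int horner(String key) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            hashVal = 37 * hashVal + key.charAt(i);
        }
        return hashVal;
    }

    // Explicit polynomial: sum of key[i] * 37^(n-1-i), using the same int wraparound.
    static int explicit(String key) {
        int n = key.length();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            int power = 1;
            for (int j = 0; j < n - 1 - i; j++) {
                power *= 37;
            }
            sum += key.charAt(i) * power;
        }
        return sum;
    }

    // Table index in [0, tableSize), even when horner(key) has overflowed to a negative value.
    static int index(String key, int tableSize) {
        return Math.floorMod(horner(key), tableSize);
    }

    public static void main(String[] args) {
        // 'a' = 97, 'b' = 98, 'c' = 99: (97 * 37 + 98) * 37 + 99 = 136518
        System.out.println(horner("abc"));
        String longKey = "a fairly long key that is likely to overflow an int";
        System.out.println(horner(longKey) == explicit(longKey));
        System.out.println(index(longKey, 10007));
    }
}
```

Both forms agree even after overflow, because int arithmetic wraps around consistently for addition and multiplication.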
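Separate Chaining
The first strategy for resolving collisions is separate chaining: the table keeps a list of all elements that hash to the same value.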
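Before the full class, a small sketch may help picture the chains. It is an illustration under stated assumptions: the class name `ChainingSketch`, the method `build`, the hash function x mod 10, and the choice of the first ten perfect squares as keys are all made up for the example. The table size 10 is not prime and is used only to keep the arithmetic easy to follow.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainingSketch {
    // Builds a chained table for non-negative integer keys using key mod tableSize.
    static List<List<Integer>> build(int[] keys, int tableSize) {
        List<List<Integer>> table = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            table.add(new LinkedList<>());
        }
        for (int key : keys) {
            List<Integer> chain = table.get(key % tableSize);
            if (!chain.contains(key)) {   // ignore duplicates
                chain.add(key);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] squares = {0, 1, 4, 9, 16, 25, 36, 49, 64, 81};
        List<List<Integer>> table = build(squares, 10);
        for (int i = 0; i < table.size(); i++) {
            System.out.println(i + ": " + table.get(i));
        }
    }
}
```

Keys such as 1 and 81, or 16 and 36, end up in the same chain, while slots 2, 3, 7, and 8 stay empty.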
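import java.util.LinkedList;
import java.util.List;

// Hash table that resolves collisions by separate chaining.
public class SeparateChainingHashTable<E> {
    private static final int DEFAULT_TABLE_SIZE = 101;

    private List<E>[] lists;   // one chain per table slot
    private int currentSize;   // number of elements stored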

    // Returns the smallest odd prime that is >= n.
    private static int nextPrime(int n) {
        if (n % 2 == 0)
            n++;
        for (; !isPrime(n); n += 2)
            ;
        return n;
    }

    // Trial division by odd numbers up to sqrt(n).
    private static boolean isPrime(int n) {
        if (n == 2 || n == 3)
            return true;
        if (n == 1 || n % 2 == 0)
            return false;
        for (int i = 3; i * i <= n; i += 2)
            if (n % i == 0)
                return false;
        return true;
    }

    // Grows the table to the next prime after twice its size and reinserts every element.
    @SuppressWarnings("unchecked")
    private void rehash() {
        List<E>[] oldLists = lists;
        lists = new List[nextPrime(2 * lists.length)];
        for (int j = 0; j < lists.length; j++) {
            lists[j] = new LinkedList<>();
        }
        currentSize = 0;
        for (int i = 0; i < oldLists.length; i++) {
            for (E item : oldLists[i])
                insert(item);
        }
    }

    // Inserts x if it is not already present; rehashes once the load factor exceeds 1.
    public void insert(E x) {
        List<E> whichList = lists[myHash(x)];
        if (!whichList.contains(x)) {
            whichList.add(x);
            if (++currentSize > lists.length) {
                rehash();
            }
        }
    }

    // Removes x if present. x has type E, so this calls List.remove(Object), not remove(int).
    public void remove(E x) {
        List<E> whichList = lists[myHash(x)];
        if (whichList.contains(x)) {
            whichList.remove(x);
            currentSize--;
        }
    }

    public boolean contains(E x) {
        List<E> whichList = lists[myHash(x)];
        return whichList.contains(x);
    }

    // Maps x.hashCode() to an index in [0, lists.length); hashCode() may be negative.
    private int myHash(E x) {
        int hashVal = x.hashCode();
        hashVal %= lists.length;
        if (hashVal < 0) {
            hashVal += lists.length;
        }
        return hashVal;
    }

    public SeparateChainingHashTable() {
        this(DEFAULT_TABLE_SIZE);
    }

    // Creates a table whose size is the first odd prime >= size.
    @SuppressWarnings("unchecked")
    public SeparateChainingHashTable(int size) {
        lists = new LinkedList[nextPrime(size)];
        for (int i = 0; i < lists.length; i++) {
            lists[i] = new LinkedList<>();
        }
    }

    // Empties every chain, then resets the element count once.
    public void makeEmpty() {
        for (int i = 0; i < lists.length; i++) {
            lists[i].clear();
        }
        currentSize = 0;
    }

    // Simple self-test: insert 1..NUMS-1, remove the odd keys, then verify what remains.
    public static void main(String[] args) {
        SeparateChainingHashTable<Integer> H = new SeparateChainingHashTable<>();
        long startTime = System.currentTimeMillis();
        final int NUMS = 2000000;
        final int GAP = 37;

        System.out.println("Checking... (no more output means success)");

        // GAP and NUMS are coprime, so this loop visits every value in 1..NUMS-1 exactly once.
        for (int i = GAP; i != 0; i = (i + GAP) % NUMS)
            H.insert(i);
        for (int i = 1; i < NUMS; i += 2)
            H.remove(i);

        for (int i = 2; i < NUMS; i += 2)
            if (!H.contains(i))
                System.out.println("Find fails " + i);
        for (int i = 1; i < NUMS; i += 2) {
            if (H.contains(i))
                System.out.println("OOPS!!! " + i);
        }

        long endTime = System.currentTimeMillis();
        System.out.println("Elapsed time: " + (endTime - startTime));
    }
}
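The class above rehashes as soon as currentSize exceeds lists.length, that is, when the load factor rises above 1, and the new size is nextPrime(2 * lists.length). To see how the table grows from the default size of 101, the following sketch replays only the size computation. The class name `RehashGrowthSketch` is an assumption, and the two helpers are copied in spirit from the class so that the example stands alone.

```java
class RehashGrowthSketch {
    // Trial division by odd numbers up to sqrt(n).
    static boolean isPrime(int n) {
        if (n == 2 || n == 3) return true;
        if (n < 2 || n % 2 == 0) return false;
        for (int i = 3; i * i <= n; i += 2) {
            if (n % i == 0) return false;
        }
        return true;
    }

    // Smallest odd prime >= n, as in the class above.
    static int nextPrime(int n) {
        if (n % 2 == 0) n++;
        while (!isPrime(n)) n += 2;
        return n;
    }

    public static void main(String[] args) {
        int size = 101; // DEFAULT_TABLE_SIZE in the class above
        for (int step = 0; step < 5; step++) {
            size = nextPrime(2 * size);
            System.out.println("after rehash " + (step + 1) + ": table size " + size);
        }
    }
}
```

Since each rehash roughly doubles the table, the total reinsertion work stays proportional to the number of elements inserted.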