Algorithms_6_HashTable

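This article explores the basic concept of a hash table and how it differs from a direct-address array. It describes how hash tables reduce storage requirements, explains several common ways to build hash functions, including the division method and the multiplication method, and discusses how to handle collisions and how to choose a suitable hash function.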


A hash table generalizes the simpler notion of an ordinary array. Directly addressing into an ordinary array takes advantage of our ability to examine an arbitrary position in an array in O(1) time. Direct addressing is applicable when we can afford to allocate an array that has one position for every possible key.

  When the number of keys actually stored is small relative to the total number of possible keys, hash tables become an effective alternative to directly addressing an array, since a hash table typically uses an array of size proportional to the number of keys actually stored.

  Instead of using the key as an array index directly, the array index is computed from the key.

  Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.
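
To make direct addressing concrete, here is a minimal sketch in Python. It assumes keys are integers in the range 0 to universe_size - 1; the class name `DirectAddressTable` and its method names are illustrative choices, not something defined in these notes.

```python
class DirectAddressTable:
    """Direct-address table: the key itself is the array index."""

    def __init__(self, universe_size):
        # One slot for every possible key; None marks an empty slot.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value      # O(1): store in slot `key`

    def search(self, key):
        return self.slots[key]       # O(1): None if the key is absent

    def delete(self, key):
        self.slots[key] = None       # O(1): clear the slot
```

Every operation is a single array access, but the array needs one slot per possible key, which is why this only pays off when the universe of keys is small.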

  When the set K of keys stored in a dictionary is much smaller than the universe U of all possible keys, a hash table requires much less storage than a direct-address table.

  With direct addressing, an element with key k is stored in slot k. With hashing, the element is stored in slot h(k), where h(k) is the hash value of key k. There is one hitch: two keys may hash to the same slot. We call this situation a collision. In chaining, we put all the elements that hash to the same slot in a linked list.
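
The following is a rough sketch of collision resolution by chaining. To keep it short it makes two assumptions that are not from the text: keys are non-negative integers hashed with k mod m, and each chain is a plain Python list rather than a hand-built linked list. The class name `ChainedHashTable` is hypothetical.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value) pairs."""

    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]   # one empty chain per slot

    def _h(self, key):
        # Assumed hash function for this sketch: k mod m on integer keys.
        return key % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))            # new key: extend the chain

    def search(self, key):
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None                           # not found

    def delete(self, key):
        h = self._h(key)
        self.slots[h] = [(k, v) for k, v in self.slots[h] if k != key]


table = ChainedHashTable(m=8)
table.insert(1, "one")
table.insert(9, "nine")        # 9 mod 8 == 1, so this collides with key 1
print(table.slots[1])          # both pairs live in the same chain
```

Search time is proportional to the length of the chain in the slot, so the quality of the hash function matters directly.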

  A good hash function satisfies the assumption of simple uniform hashing: each key is equally likely to hash to any of the m slots, independently of where any other key has hashed to.

  The division method computes the hash value as the remainder when the key is divided by a specified prime number. This method frequently gives good results, assuming that the prime number is chosen to be unrelated to any patterns in the distribution of keys.

 In the division method for creating hash functions, we map a key k into one of m slots by taking the remainder of k divided by m. That is, h(k) = k mod m. For example, if the table size is m = 12 and the key is k = 100, then h(k) = 4.

 When using the division method, we usually avoid certain values of m. For example, m should not be a power of 2, since if m = 2^p then h(k) is just the p lowest-order bits of k.

 A prime not too close to an exact power of 2 is often a good choice for m.
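
As a small illustration of the division method and of why a power of 2 can be a poor table size, consider the sketch below. The function name `division_hash` and the sample keys are made up for this example; the sample keys are deliberately all multiples of 16.

```python
def division_hash(k, m):
    """Division method: h(k) = k mod m, for a non-negative integer key k."""
    return k % m


print(division_hash(100, 12))                 # 4, the example from the text

keys = [16, 32, 48, 64]                       # keys that share their low 4 bits
print([division_hash(k, 16) for k in keys])   # [0, 0, 0, 0]  (m = 2**4)
print([division_hash(k, 13) for k in keys])   # [3, 6, 9, 12] (m prime)
```

With m = 16 only the low four bits of each key are used, so every key lands in slot 0, while the prime m = 13 spreads the same keys over four slots.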

 The multiplication method for creating hash functions operates in two steps. First, we multiply the key k by a constant A in the range 0 < A < 1 and extract the fractional part of kA. Then, we multiply this value by m and take the floor of the result.

 h(k) = ⌊m (kA mod 1)⌋

 Here kA mod 1 means the fractional part of kA, that is, kA - ⌊kA⌋.

 An advantage of the multiplication method is that the value of m is not critical. We typically choose it to be a power of 2, since the function is then easy to implement on most computers.
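
One way the multiplication method could look in Python is sketched below. The constant A = (sqrt(5) - 1) / 2 is an assumed choice (a commonly suggested value), and the word size W = 32, the precomputed integer S = floor(A * 2^32), and both function names are assumptions of this sketch rather than anything fixed by the text. The second function covers the case m = 2^p, where the hash is simply the top p bits of the low 32-bit word of k * S.

```python
import math

A = (math.sqrt(5) - 1) / 2     # assumed constant, roughly 0.618


def multiplication_hash(k, m, a=A):
    """h(k) = floor(m * (k*a mod 1)), computed with floating point."""
    frac = (k * a) % 1.0       # fractional part of k*a
    return math.floor(m * frac)


W = 32                         # assumed machine word size in bits
S = 2654435769                 # floor(A * 2**32) for the A above


def multiplication_hash_pow2(k, p):
    """Same idea for m = 2**p using only integer multiply, mask and shift."""
    r0 = (k * S) & ((1 << W) - 1)   # low W bits of the product k * S
    return r0 >> (W - p)            # the p most significant bits of r0


print(multiplication_hash_pow2(123456, 14))   # 67
```

The integer version avoids floating point entirely, which is the practical reason for picking m as a power of 2.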

 The main idea behind universal hashing is to select the hash function at random from a carefully designed class of functions at the beginning of execution.

 To design a universal class of hash functions, we begin by choosing a prime number p large enough that every possible key k is in the range 0 to p-1, inclusive. Let Z1 denote the set {0, 1, 2, ..., p-1} and let Z2 denote the set {1, 2, 3, ..., p-1}. Because we assume that the size of the universe of keys is greater than the number of slots in the hash table, we have p > m.

 h_{a,b}(k) = ((ak + b) mod p) mod m, for any a in Z2 and any b in Z1

 For example, with p = 17 and m = 6:

 h_{3,4}(8) = ((3*8 + 4) mod 17) mod 6 = (28 mod 17) mod 6 = 11 mod 6 = 5
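
The family above can be sketched in a few lines of Python. The helper names `h_ab` and `make_universal_hash` are hypothetical, and drawing a and b with the standard `random` module is just one reasonable way to pick a function at random when the table is created.

```python
import random


def h_ab(a, b, p, m, k):
    """One member of the family: h_{a,b}(k) = ((a*k + b) mod p) mod m."""
    return ((a * k + b) % p) % m


def make_universal_hash(p, m, rng=random):
    """Pick a from {1, ..., p-1} and b from {0, ..., p-1} at random."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda k: h_ab(a, b, p, m, k)


print(h_ab(3, 4, 17, 6, 8))        # 5, matching the worked example

h = make_universal_hash(p=17, m=6)
print(h(8))                        # some slot in 0..5, depends on the random a, b
```

Because a and b are chosen at run time, no fixed set of keys can be bad for every run of the program.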

Perfect hashing: The basic idea behind a perfect hashing scheme is simple. We use a two-level hashing scheme with universal hashing at each level.
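
A rough sketch of such a two-level scheme for a fixed set of keys follows. Several details here are assumptions beyond what the notes state: the first level uses as many slots as there are keys, each secondary table has size equal to the square of the number of keys in its bucket (the usual textbook construction), a secondary hash function is redrawn until it has no collisions, and keys are distinct non-negative integers smaller than the prime p. All function names and the sample key set are illustrative.

```python
import random


def random_hash(p, m, rng):
    """Draw h(k) = ((a*k + b) mod p) mod m with random a != 0 and b."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda k: ((a * k + b) % p) % m


def build_perfect_table(keys, p, seed=0):
    """Build a static two-level table for a non-empty set of integer keys < p."""
    rng = random.Random(seed)
    keys = list(set(keys))                    # duplicates would never separate
    m = len(keys)
    h1 = random_hash(p, m, rng)               # first-level hash function
    buckets = [[] for _ in range(m)]
    for k in keys:
        buckets[h1(k)].append(k)

    secondary = []
    for bucket in buckets:
        size = len(bucket) ** 2               # assumed secondary size: n_j squared
        while True:                           # retry until collision-free
            h2 = random_hash(p, size, rng) if size else None
            table = [None] * size
            ok = True
            for k in bucket:
                j = h2(k)
                if table[j] is not None:      # collision: draw a new h2
                    ok = False
                    break
                table[j] = k
            if ok:
                break
        secondary.append((h2, table))
    return h1, secondary


def contains(structure, k):
    """Look up k with exactly two hash evaluations and no chain to scan."""
    h1, secondary = structure
    h2, table = secondary[h1(k)]
    return bool(table) and table[h2(k)] == k


sample_keys = [10, 22, 37, 40, 52, 60, 70, 72, 75]
structure = build_perfect_table(sample_keys, p=101)
print(all(contains(structure, k) for k in sample_keys))   # True
print(contains(structure, 11))                            # False
```

Since every secondary table is collision-free by construction, a lookup inspects at most one slot at each level.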

 