Introduction to Algorithms, 3rd Edition, Chapter 11 (Hash Tables): Personal Answers to the Exercises and Problems

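This post works through the exercises of Chapter 11 (hash tables) of Introduction to Algorithms, 3rd edition, covering direct-address tables, hash functions, and open addressing. It discusses the expected running time of the dictionary operations (search, insert, delete) and proves properties of certain hash functions.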


11.1 Direct-address tables

11.1-1

Solution:

DIRECT-ADDRESS-FINDMAX(T)
for i = T.length - 1 downto 0
    if T[i] != NIL
        return T[i]
return NIL    // the table is empty

The worst-case running time is $O(m)$.
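The scan can be sketched in Python as follows. This is only an illustration: it assumes the table is a plain list indexed by key, with `None` standing in for NIL, and the function name is made up for this example.

```python
def direct_address_find_max(table):
    """Return the element stored under the largest key, or None if empty."""
    # Walk the slots from the highest index down to 0.
    for i in range(len(table) - 1, -1, -1):
        if table[i] is not None:
            return table[i]
    return None  # every slot was empty


table = [None] * 8
table[2] = "two"
table[5] = "five"
print(direct_address_find_max(table))  # prints: five
```

The loop visits every slot once in the worst case (an empty table, or a single element stored at index 0), which is where the $O(m)$ bound comes from.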

11.1-2

Idea: a bit value of 1 means the key is present and 0 means it is absent; insertion sets the bit and deletion clears it.
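A minimal Python sketch of this idea, assuming keys are integers in the range 0 to m - 1 and that no satellite data is stored. The class name `BitVectorSet` and the byte-packed layout are choices made for this example only.

```python
class BitVectorSet:
    """Dynamic set of distinct keys from {0, ..., m-1}, one bit per key."""

    def __init__(self, m):
        self.m = m
        self.bits = bytearray((m + 7) // 8)  # m bits, all initially 0

    def insert(self, k):
        self.bits[k >> 3] |= 1 << (k & 7)  # set bit k

    def delete(self, k):
        self.bits[k >> 3] &= ~(1 << (k & 7)) & 0xFF  # clear bit k

    def search(self, k):
        return bool(self.bits[k >> 3] & (1 << (k & 7)))


s = BitVectorSet(20)
s.insert(3)
s.insert(17)
s.delete(3)
print(s.search(3), s.search(17))  # prints: False True
```

Each operation touches a single byte, so all three run in $O(1)$ time, and the structure uses $m$ bits.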

11.1-3

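Idea: let each slot of the direct-address table point to a circular doubly linked list holding all elements with that key, then implement the operations with the linked-list routines from Chapter 10.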

11.1-4

Solution (from the reference answer):
We denote the huge array by $T$ and, taking the hint from the book, we also have a stack implemented by an array $S$. The size of $S$ equals the number of keys actually stored, so that $S$ should be allocated at the dictionary's maximum size. The stack has an attribute $S.top$, so that only entries $S[1..S.top]$ are valid.
The idea of this scheme is that entries of $T$ and $S$ validate each other. If key $k$ is actually stored in $T$, then $T[k]$ contains the index, say $j$, of a valid entry in $S$, and $S[j]$ contains the value $k$. Let us call this situation, in which $1 \le T[k] \le S.top$, $S[T[k]] = k$, and $T[S[j]] = j$, a validating cycle.
Assuming that we also need to store pointers to objects in our direct-address table, we can store them in an array that is parallel to either $T$ or $S$. Since $S$ is smaller than $T$, we'll use an array $S'$, allocated to be the same size as $S$, for these pointers. Thus, if the dictionary contains an object $x$ with key $k$, then there is a validating cycle and $S'[T[k]]$ points to $x$.
The operations on the dictionary work as follows:

Initialization: Simply set $S.top = 0$, so that there are no valid entries in the stack.
SEARCH: Given key $k$, we check whether we have a validating cycle, i.e., whether $1 \le T[k] \le S.top$ and $S[T[k]] = k$. If so, we return $S'[T[k]]$, and otherwise we return NIL.
INSERT: To insert object $x$ with key $k$, assuming that this object is not already in the dictionary, we increment $S.top$, set $S[S.top] = k$, set $S'[S.top] = x$, and set $T[k] = S.top$.
DELETE: To delete object $x$ with key $k$, assuming that this object is in the dictionary, we need to break the validating cycle. The trick is to also ensure that we don't leave a "hole" in the stack, and we solve this problem by moving the top entry of the stack into the position that we are vacating, and then fixing up that entry's validating cycle. That is, we execute the following sequence of assignments:

$$
\begin{aligned}
& S[T[k]] = S[S.top] \\
& S'[T[k]] = S'[S.top] \\
& T[S[T[k]]] = T[k] \\
& T[k] = 0 \\
& S.top = S.top - 1
\end{aligned}
$$
Each of these operations (initialization, SEARCH, INSERT, and DELETE) takes $O(1)$ time.
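The scheme can be sketched in Python as below. Python cannot allocate a truly uninitialized array, so this sketch fills `T` with pseudo-random garbage to imitate one; that, the class name `HugeArrayDict`, and the attribute name `Sp` (standing for $S'$) are assumptions of this example, not part of the reference answer.

```python
import random


class HugeArrayDict:
    """Direct-address dictionary that never relies on T being initialized."""

    def __init__(self, universe_size, capacity, seed=0):
        rng = random.Random(seed)
        # Simulated garbage: T may hold any integer before a key is inserted.
        self.T = [rng.randrange(-5, capacity + 5) for _ in range(universe_size)]
        self.S = [None] * (capacity + 1)   # keys, 1-based as in the text
        self.Sp = [None] * (capacity + 1)  # S': objects, parallel to S
        self.top = 0                       # S.top = 0: no valid entries

    def _has_cycle(self, k):
        j = self.T[k]
        return 1 <= j <= self.top and self.S[j] == k

    def search(self, k):
        return self.Sp[self.T[k]] if self._has_cycle(k) else None

    def insert(self, k, x):
        # Assumes k is not already in the dictionary.
        self.top += 1
        self.S[self.top] = k
        self.Sp[self.top] = x
        self.T[k] = self.top

    def delete(self, k):
        # Assumes k is in the dictionary.
        j = self.T[k]
        self.S[j] = self.S[self.top]    # move the top key into the hole
        self.Sp[j] = self.Sp[self.top]
        self.T[self.S[j]] = j           # repair the moved key's cycle
        self.T[k] = 0                   # break k's cycle
        self.top -= 1


d = HugeArrayDict(universe_size=100, capacity=10)
d.insert(3, "a")
d.insert(7, "b")
d.delete(3)
print(d.search(3), d.search(7))  # prints: None b
```

Note that `search` on a key that was never inserted returns `None` whatever garbage `T` holds, because a stray index either falls outside `1..top` or points at a stack entry that names a different key.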

11.2 Hash tables

11.2-1

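Solution: under simple uniform hashing, any two distinct keys $k$ and $l$ collide with probability $1/m$. Summing $1/m$ over all $\binom{n}{2}$ unordered pairs of distinct keys gives an expected number of collisions of $\frac{n(n-1)}{2m}$.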
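As a sanity check, the expectation can be computed exactly for small cases by enumerating every possible assignment of keys to slots. The sketch below does this in Python; the function name is made up for this example, and the brute-force enumeration is only feasible for tiny $n$ and $m$.

```python
from fractions import Fraction
from itertools import combinations, product


def expected_collisions(n, m):
    """Exact expected number of colliding pairs for n keys and m slots."""
    total = 0
    # Every assignment of the n keys to slots is equally likely.
    for slots in product(range(m), repeat=n):
        total += sum(1 for a, b in combinations(range(n), 2)
                     if slots[a] == slots[b])
    return Fraction(total, m ** n)


print(expected_collisions(3, 4) == Fraction(3 * 2, 2 * 4))  # prints: True
```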

11.2-2

Solution: the intermediate steps are omitted; the final table is
0
1→10→19→28
2→20
3→12
4
5→5
6→33→15
7
8→17
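The table above can be reproduced with a few lines of Python. The key sequence and the hash function $h(k) = k \bmod 9$ are taken from the exercise statement in the textbook rather than from this post, and new keys are placed at the head of each chain, as CHAINED-HASH-INSERT does.

```python
def build_chained_table(keys, m):
    """Insert keys into m chains using h(k) = k mod m, newest key first."""
    table = [[] for _ in range(m)]
    for k in keys:
        table[k % m].insert(0, k)  # insert at the head of the chain
    return table


keys = [5, 28, 19, 15, 20, 33, 12, 17, 10]
for slot, chain in enumerate(build_chained_table(keys, 9)):
    print(slot, "→".join(map(str, chain)))
```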

11.2-3

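Solution:
Search: the expected time is unchanged, but the larger the key being searched for, the longer the search takes (assuming each chain is a singly linked list in ascending order).
Insertion: the sorted position must be found by scanning the chain before the $O(1)$ splice, so insertion takes expected $\Theta(1 + \alpha)$ time. This is unchanged only if the unsorted version also searches the chain before inserting; otherwise it is slower than the $O(1)$ insertion at the head.
Deletion: the expected time is unchanged, but the larger the key being deleted, the longer the deletion takes (again assuming a singly linked list in ascending order).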

11.2-4

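Idea: a flag marks whether each slot is occupied. If the slot is free, its two pointers point to the previous and the next free slot (as in a doubly linked list); if it is occupied, one pointer is used to reach the next element of its chain.
Solution (from the reference answer):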
The flag in each slot will indicate whether the slot is free.
A free slot is in the free list, a doubly linked list of all free slots in the table. The slot thus contains two pointers.
A used slot contains an element and a pointer (possibly NIL) to the next element that hashes to this slot. (Of course, that pointer points to another slot in the table.)
Operations

Insertion:

If the element hashes to a free slot, just remove the slot from the free list and store the element there (with a NIL pointer). The free list must be doubly linked in order for this deletion to run in $O(1)$ time.

If the element hashes to a used slot $j$, check whether the element $x$ already there "belongs" there (its key also hashes to slot $j$).

If so, add the new element to the chain of elements in this slot. To do so, allocate a free slot (e.g., take the head of the free list) for the new element and put this new slot at the head of the list pointed to by the hashed-to slot ($j$).
If not, $x$ is part of another slot's chain. Move it to a new slot by allocating one from the free list, copying the old slot's ($j$'s) contents (element $x$ and pointer) to the new slot, and updating the pointer in the slot that pointed to $j$ to point to the new slot. Then insert the new element in the now-empty slot as usual.

To update the pointer to $j$, it is necessary to find it by searching the chain of elements starting in the slot $x$ hashes to.

Deletion:
Let $j$ be the slot the element $x$ to be deleted hashes to.

If $x$ is the only element in $j$ ($j$ doesn't point to any other entries), just free the slot, returning it to the head of the free list.
If $x$ is in $j$ but there's a pointer to a chain of other elements, move the first pointed-to entry to slot $j$ and free the slot it was in.
If $x$ is found by following a pointer from $j$, just free $x$'s slot and splice it out of the chain (i.e., update the slot that pointed to $x$ to point to $x$'s successor).

Searching:
Check the slot the key hashes to, and if that is not the desired element, follow the chain of pointers from the slot.
All the operations take expected $O(1)$ time for the same reason they do with the version in the book: the expected time to search the chains is $O(1 + \alpha)$ regardless of where the chains are stored, and the fact that all the elements are stored in the table means that $\alpha \le 1$. If the free list were singly linked, then operations that involved removing an arbitrary slot from the free list would not run in $O(1)$ time.

11.2-5

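This is just the pigeonhole principle. Since $|U| > nm$, i.e. $\frac{|U|}{m} > n$, at least one slot must receive more than $n$ keys of $U$. Any $n$ of those keys form the required subset, and if exactly those keys are stored, they all sit in one chain, so a search takes $\Theta(n)$ time in the worst case.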

11.2-6

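Idea: let $L$ be the length of the longest chain. With $m$ chains, the table can be viewed as a matrix with $m$ rows and $L$ columns. Repeatedly call RANDOM(1, m) and RANDOM(1, L) until the chosen position actually holds an element. Each trial succeeds with probability $n/(mL)$, so the expected number of trials is $mL/n = L/\alpha$. Walking down the chosen chain to that position then costs at most $O(L)$, for an expected total of $O(L \cdot (1 + 1/\alpha))$.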
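The rejection-sampling procedure can be sketched in Python as follows. The chains are modeled as Python lists for brevity, so indexing into a chain hides the $O(L)$ walk that a real linked list would need; the function name and the explicit `rng` parameter are assumptions of this example.

```python
import random


def random_element(table, longest, rng):
    """Pick a stored element uniformly at random by rejection sampling.

    table   -- list of chains (each chain is a list of elements)
    longest -- L, the length of the longest chain
    """
    m = len(table)
    while True:
        i = rng.randrange(m)        # RANDOM over the m rows
        j = rng.randrange(longest)  # RANDOM over the L columns
        if j < len(table[i]):       # the cell (i, j) holds an element
            return table[i][j]


table = [[10, 19, 28], [], [5]]
print(random_element(table, 3, random.Random(1)))
```

Every occupied cell of the $m \times L$ grid is hit with the same probability $1/(mL)$ on each trial, which is why the element finally returned is uniformly distributed over the stored elements.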

11.3 Hash functions

11.3-1

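Idea: first compare the stored hash value of each list element with the hash value of the given key, and compare the full strings only when the two hash values match.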
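A small Python sketch of this shortcut, assuming each chain entry stores the pair (hash value, key). The toy hash function `toy_hash` and the function name `search_chain` are invented for this example.

```python
def search_chain(chain, key, h):
    """Search a chain of (hash_value, key) pairs for key."""
    target = h(key)  # computed once per search
    for stored_hash, stored_key in chain:
        # Cheap integer comparison first; the costly string comparison
        # runs only when the hash values agree.
        if stored_hash == target and stored_key == key:
            return stored_key
    return None


def toy_hash(s):
    return sum(map(ord, s)) % 1009


chain = [(toy_hash(w), w) for w in ["alpha", "beta", "gamma"]]
print(search_chain(chain, "beta", toy_hash))   # prints: beta
print(search_chain(chain, "delta", toy_hash))  # prints: None
```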

11.3-2

Solution:

sum = 0
for i = 1 to r
    sum = (sum * 128 + s[i]) mod m
return sum    // sum is the hash value
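The same Horner-style loop in Python, as a sketch. It assumes the string contains only 7-bit ASCII characters, so that each character is a digit in radix 128, and the function name is made up for this example.

```python
def string_hash(s, m):
    """Hash a radix-128 string by the division method, O(1) extra words."""
    total = 0
    for ch in s:
        # Reduce after every step so the running value stays below m.
        total = (total * 128 + ord(ch)) % m
    return total


s, m = "hash", 701
# Reference value: build the full radix-128 number first, then reduce.
full = sum(ord(c) * 128 ** (len(s) - 1 - i) for i, c in enumerate(s))
print(string_hash(s, m) == full % m)  # prints: True
```

Reducing modulo $m$ inside the loop is valid because $(a \cdot 128 + b) \bmod m$ depends only on $a \bmod m$, so the intermediate value never needs more than a constant number of machine words.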

11.3-3

Solution (from the reference answer):
First, we observe that we can generate any permutation by a sequence of interchanges of pairs of characters. One can prove this property formally, but informally, consider that both heapsort and quicksort work by interchanging pairs of elements and that they have to be able to produce any permutation of their input array. Thus, it suffices to show that if string $x$ can be derived from string $y$ by interchanging a single pair of characters, then $x$ and $y$ hash to the same value.
Let us denote the $i$th character in $x$ by $x_i$, and similarly for $y$. The interpretation of $x$ in radix $2^p$ is $\sum_{i = 0}^{n - 1} x_i 2^{ip}$, and so $h(x) = \big(\sum_{i = 0}^{n - 1} x_i 2^{ip}\big) \bmod (2^p - 1)$. Similarly, $h(y) = \big(\sum_{i = 0}^{n - 1} y_i 2^{ip}\big) \bmod (2^p - 1)$.
Suppose that $x$ and $y$ are identical strings of $n$ characters except that the characters in positions $a$ and $b$ are interchanged: $x_a = y_b$ and $y_a = x_b$. Without loss of generality, let $a > b$. We have
$$h(x) - h(y) = \Big(\sum_{i = 0}^{n - 1} x_i 2^{ip}\Big) \bmod (2^p - 1) - \Big(\sum_{i = 0}^{n - 1} y_i 2^{ip}\Big) \bmod (2^p - 1).$$
Since $0 \le h(x), h(y) < 2^p - 1$, we have that $-(2^p - 1) < h(x) - h(y) < 2^p - 1$. If we show that $(h(x) - h(y)) \bmod (2^p - 1) = 0$, then $h(x) = h(y)$.
Since the sums in the hash functions are the same except for indices $a$ and $b$, we have
$$
\begin{aligned}
(h(x) - h(y)) \bmod (2^p - 1) & = ((x_a 2^{ap} + x_b 2^{bp}) - (y_a 2^{ap} + y_b 2^{bp})) \bmod (2^p - 1) \\
& = ((x_a 2^{ap} + x_b 2^{bp}) - (x_b 2^{ap} + x_a 2^{bp})) \bmod (2^p - 1) \\
& = ((x_a - x_b) 2^{ap} - (x_a - x_b) 2^{bp}) \bmod (2^p - 1) \\
& = ((x_a - x_b)(2^{ap} - 2^{bp})) \bmod (2^p - 1) \\
& = ((x_a - x_b) 2^{bp} (2^{(a - b)p} - 1)) \bmod (2^p - 1).
\end{aligned}
$$
By equation (A.5),
$$\sum_{i = 0}^{a - b - 1} 2^{pi} = \frac{2^{(a - b)p} - 1}{2^p - 1},$$
and multiplying both sides by $2^p - 1$, we get $2^{(a - b)p} - 1 = \big(\sum_{i = 0}^{a - b - 1} 2^{pi}\big)(2^p - 1)$. Thus,
$$
\begin{aligned}
(h(x) - h(y)) \bmod (2^p - 1) & = \Bigg((x_a - x_b) 2^{bp} \Bigg(\sum_{i = 0}^{a - b - 1} 2^{pi}\Bigg)(2^p - 1)\Bigg) \bmod (2^p - 1) \\
& = 0,
\end{aligned}
$$
since one of the factors is $2^p - 1$.
We have shown that $(h(x) - h(y)) \bmod (2^p - 1) = 0$, and so $h(x) = h(y)$.
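The result is easy to observe numerically. The Python sketch below takes $p = 8$ (so the radix is 256 and $m = 255$) and hashes every permutation of a short ASCII string; the choice of $p$, the sample string, and the function name are assumptions of this example.

```python
from itertools import permutations


def radix_hash(s, p=8):
    """Interpret s in radix 2**p and reduce modulo 2**p - 1."""
    m = (1 << p) - 1
    value = 0
    for ch in s:
        value = ((value << p) + ord(ch)) % m
    return value


# Hash all 24 orderings of the same four characters.
hashes = {radix_hash("".join(perm)) for perm in permutations("stop")}
print(len(hashes))  # prints: 1
```

Because $2^p \equiv 1 \pmod{2^p - 1}$, the hash collapses to the sum of the character codes modulo $2^p - 1$, which does not depend on the order of the characters.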

11.3-4

Solution:
$h(61) = 700$
$h(62) = 318$
$h(63) = 936$
$h(64) = 554$
$h(65) = 172$
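These values can be checked with a short Python sketch of the multiplication method. The parameters $m = 1000$ and $A = (\sqrt{5} - 1)/2$ come from the exercise statement in the textbook rather than from this post, and double-precision floating point is assumed to be accurate enough for keys of this size.

```python
import math


def multiplication_hash(k, m=1000, a=(math.sqrt(5) - 1) / 2):
    """h(k) = floor(m * (k * A mod 1))."""
    return math.floor(m * ((k * a) % 1.0))


for k in range(61, 66):
    print(k, multiplication_hash(k))
```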

11.3-5

Solution (from the reference answer):
Let $b = |B|$ and $u = |U|$. We start by showing that the total number of collisions is minimized by a hash function that maps $u / b$ elements of $U$ to each of the $b$ values in $B$. For a given hash function, let $u_j$ be the number of elements that map to $j \in B$. We have $u = \sum_{j \in B} u_j$
