LintCode 129. Rehashing (a good problem combining linked lists and pointers)

Hash table resizing and rehashing

Original post, last modified 2022-12-05 09:10:08


License: CC 4.0 BY-SA

Tags: #LintCode

First published 2019-10-24 23:38:20


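This article walks through how a hash table grows and rehashes once the number of keys exceeds its capacity limit, using a worked example to show how the capacity is doubled and the existing keys are redistributed.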

Rehashing
The size of the hash table is not determined at the very beginning. If the total number of keys is too large (e.g. size >= capacity / 10), we should double the size of the hash table and rehash every key. Say you have a hash table that looks like the one below:

size=3, capacity=4

[null,  21,  14,  null]
        ↓    ↓
        9    null
        ↓
        null
The hash function is:

int hashcode(int key, int capacity) {
    return key % capacity;
}
Here we have three numbers, 9, 14 and 21, where 21 and 9 share the same position because they both have the same hashcode 1 (21 % 4 = 9 % 4 = 1). We store them in the hash table as a linked list.

Rehashing this hash table with double the capacity gives:

size=3, capacity=8

index: 0 1 2 3 4 5 6 7
hash : [null, 9, null, null, null, 21, 14, null]
Given the original hash table, return the new hash table after rehashing.

Example
Example 1

Input : [null, 21->9->null, 14->null, null]
Output : [null, 9->null, null, null, null, 21->null, 14->null, null]
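The bucket positions in this example can be checked with a few lines of C++. This is only a minimal sketch: `bucketsFor` is a hypothetical helper that is not part of the problem, and it assumes non-negative keys so that plain `%` is enough.

```cpp
#include <vector>

// Hypothetical helper: group keys by bucket index.
// Assumes every key is non-negative and capacity > 0.
std::vector<std::vector<int>> bucketsFor(const std::vector<int>& keys, int capacity) {
    std::vector<std::vector<int>> buckets(capacity);
    for (int key : keys) {
        // Same computation as hashcode(key, capacity) in the problem statement.
        buckets[key % capacity].push_back(key);
    }
    return buckets;
}
```

Calling `bucketsFor({21, 9, 14}, 4)` puts 21 and 9 in bucket 1 and 14 in bucket 2, while `bucketsFor({21, 9, 14}, 8)` puts 9 in bucket 1, 21 in bucket 5 and 14 in bucket 6, which matches the expected output.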
Notice
For a negative integer in the hash table, the position can be calculated as follows:

C++/Java: if you directly calculate -4 % 3 you will get -1. You can use the expression (a % b + b) % b to turn the result into a non-negative integer.
Python: you can directly use -1 % 3; you will get 2 automatically.
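The C++ formula from the note can be wrapped in a small function. This is a sketch: the name `nonNegativeMod` is hypothetical, and it assumes `capacity` is positive and small enough that `key % capacity + capacity` does not overflow an `int`.

```cpp
// Hypothetical helper: map any int key to an index in [0, capacity).
// Assumes capacity > 0.
int nonNegativeMod(int key, int capacity) {
    // key % capacity lies in (-capacity, capacity); adding capacity makes it
    // positive, and the second % brings it back into [0, capacity).
    return (key % capacity + capacity) % capacity;
}
```

For instance, `nonNegativeMod(-4, 3)` evaluates to 2, whereas the bare `-4 % 3` evaluates to -1.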

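Solution 1: note that there are two loops. One walks the buckets of the old array, the other walks the chain in the new array to find its tail.
The code is as follows: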

/**
 * Definition of ListNode
 * class ListNode {
 * public:
 *     int val;
 *     ListNode *next;
 *     ListNode(int val) {
 *         this->val = val;
 *         this->next = NULL;
 *     }
 * }
 */
class Solution {
public:
    /**
     * @param hashTable: A list of The first node of linked list
     * @return: A list of The first node of linked list which have twice size
     */    
    vector<ListNode*> rehashing(vector<ListNode*> hashTable) {
        int capacity = hashTable.size();
        int capacity2 = capacity << 1;  // doubled capacity
        vector<ListNode *> result(capacity2, NULL);
        // Outer loop: visit every bucket of the old table.
        for (int i = 0; i < capacity; ++i) {
            ListNode * curNode = hashTable[i];
            while(curNode) {
                int val = curNode->val;
                ListNode * newNode = new ListNode(val);
                // Keep the index non-negative for negative keys.
                int newIndex = val < 0 ? (val % capacity2 + capacity2) % capacity2 : val % capacity2;
                if (result[newIndex] == NULL) {
                    result[newIndex] = newNode;
                } else {
                    // Inner loop: walk to the tail of the new bucket and append there.
                    ListNode * tmp = result[newIndex];
                    while(tmp->next) {
                        tmp = tmp->next;
                    }
                    tmp->next = newNode;
                }
                curNode = curNode->next;
            }
        }

        return result;
    }
};

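Second pass: when hash keys collide, Solution 1 appends the new node to the tail of the list. Here the new node is inserted at the head of the list instead. This removes the inner loop, and because recently inserted keys tend to be accessed again soon (temporal locality), placing them at the head makes them quicker to find. Note that head insertion reverses the relative order of keys that land in the same bucket.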

/**
 * Definition of ListNode
 * class ListNode {
 * public:
 *     int val;
 *     ListNode *next;
 *     ListNode(int val) {
 *         this->val = val;
 *         this->next = NULL;
 *     }
 * }
 */
class Solution {
public:
    /**
     * @param hashTable: A list of The first node of linked list
     * @return: A list of The first node of linked list which have twice size
     */    
    vector<ListNode*> rehashing(vector<ListNode*> hashTable) {
        int capacity = hashTable.size();
        int newCapacity = capacity << 1;
        vector<ListNode*> newHashTable(newCapacity, nullptr);
        for (int i = 0; i < capacity; i++) {
            // An empty bucket simply skips the inner loop.
            for (ListNode *ptr = hashTable[i]; ptr; ptr = ptr->next) {
                int value = ptr->val;
                // This form also handles negative keys.
                int newPlace = (value % newCapacity + newCapacity) % newCapacity;
                // Insert at the head: the old head (possibly nullptr) becomes next.
                ListNode *node = new ListNode(value);
                node->next = newHashTable[newPlace];
                newHashTable[newPlace] = node;
            }
        }
        return newHashTable;
    }
};
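
One way to keep Solution 1's tail ordering without its inner loop is to remember the last node of each new bucket. The following is a minimal sketch under that assumption, not the author's code: `rehashWithTails` and the `tails` vector are hypothetical names, and `ListNode` is redefined here only so that the block compiles on its own.

```cpp
#include <vector>

// Same shape as the ListNode definition that LintCode provides.
class ListNode {
public:
    int val;
    ListNode *next;
    ListNode(int val) : val(val), next(nullptr) {}
};

// Sketch: rehash into twice the capacity, appending in O(1)
// by tracking the tail of every new bucket.
std::vector<ListNode*> rehashWithTails(const std::vector<ListNode*>& hashTable) {
    int newCapacity = static_cast<int>(hashTable.size()) * 2;
    std::vector<ListNode*> heads(newCapacity, nullptr);
    std::vector<ListNode*> tails(newCapacity, nullptr);  // last node of each new chain
    for (ListNode* cur : hashTable) {
        for (; cur != nullptr; cur = cur->next) {
            // Non-negative index, valid for negative keys as well.
            int idx = (cur->val % newCapacity + newCapacity) % newCapacity;
            ListNode* node = new ListNode(cur->val);
            if (tails[idx] == nullptr) {
                heads[idx] = node;        // first node in this bucket
            } else {
                tails[idx]->next = node;  // append after the current tail
            }
            tails[idx] = node;
        }
    }
    return heads;
}
```

For the example input `[null, 21->9->null, 14->null, null]` this yields 9 in bucket 1, 21 in bucket 5 and 14 in bucket 6, and keys that collide in the new table keep their original relative order. Like both solutions above, the sketch allocates new nodes and leaves the old ones for the caller to free.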