Hash Table - Reviews

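This post introduces the basic concepts of hash tables, how they work, and how to implement them, covering key techniques such as hash functions and collision handling, and demonstrates insertion, retrieval, and deletion with a simple C++ example.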

Intro

The concept, usage, and implementation of hash tables come up constantly in software engineering interviews. Google's interview guidance lists hash tables as a requirement, stating: "Hashtables: Arguably the single most important data structure known to mankind." There is a large body of knowledge and technique around hash tables (hash functions, collisions, etc.), but a short interview cannot test all of it thoroughly. With that in mind, in this post I'd like to review the basics of hash tables and try to implement some sample code.

What is a Hash Table?

This is a very common question in IT interviews. I summarize the concept in my own words: "A hash table is a data structure that stores key-value pairs. A value can be accessed by its key in O(1) time on average, and a hash function is used to map the key to the index of the value."

You can find many definitions of a hash table. Generally speaking, you can imagine a hash table as an array. Normally we access an array element by its index, e.g. A[1], A[2]. In a hash table, however, we access an element by its key, e.g. A["Monday"], D["Marry"]. Its great advantage is the speed of looking up an element (O(1) time on average).
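As a small illustration of access by key, here is a sketch using std::unordered_map, the hash table in the C++ standard library. The helper names (make_schedule, hours_on) and the day/hours data are made up for this example and are not part of the original post.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: builds a small table keyed by day name.
std::unordered_map<std::string, int> make_schedule() {
  std::unordered_map<std::string, int> hours;
  hours["Monday"] = 8;   // insert by key
  hours["Tuesday"] = 6;
  hours["Monday"] = 9;   // same key again: the old value is overwritten
  assert(hours.size() == 2);  // still only two distinct keys
  return hours;
}

// Hypothetical helper: looks up a key and returns fallback if it is absent.
int hours_on(const std::unordered_map<std::string, int> &hours,
             const std::string &day, int fallback) {
  auto it = hours.find(day);
  return it == hours.end() ? fallback : it->second;
}
```

Calling hours_on(make_schedule(), "Monday", -1) looks the value up by key rather than by position, which is the A["Monday"] style of access described above.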

How Does a Hash Table Work?

First, a hash table can be built on top of many data structures, e.g. a linked list, an array combined with linked lists, a binary search tree, etc. The idea is to store each <key, value> pair and provide a way to find it again. For simplicity, consider an array in which each <key, value> pair is placed in a specific slot. Locating a <key, value> pair from its key is called hashing: a hash function takes the key as input and outputs the location of the pair in the array. A simple hash function uses the "mod" operation: "key mod array size" gives the hash, which is the index of the desired value.
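The "key mod array size" idea can be sketched in a few lines. The function name hash_index is hypothetical, and the sketch assumes non-negative integer keys, since in C++ the % operator can yield a negative result for a negative key.

```cpp
#include <cassert>

// Hypothetical helper: maps a non-negative integer key to a slot index
// in the range [0, table_size).
int hash_index(int key, int table_size) {
  assert(table_size > 0 && key >= 0);
  return key % table_size;
}
```

With a table of 5 slots, hash_index(12, 5) and hash_index(27, 5) both give 2, so two different keys can land on the same index.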


An example

Let's see a simple example.
We have a storage of  size 5:
idx      key       value
0         -1           0
1         -1           0
2         -1           0
3         -1           0
4         -1           0
key = -1 means the slot is empty.
The hash function is hash(key) = key % 5.
First we insert <12, 12> (the first number is the key, the second is the value).
Compute hash(12) = 2.
Store the <key, value> pair in slot 2 of the storage.
idx      key       value
0         -1           0
1         -1           0
2         12          12
3         -1           0
4         -1           0
Next we insert <29, 29>; hash(29) = 4:
idx      key       value
0         -1           0
1         -1           0
2         12          12
3         -1           0
4         29          29
Then we insert <27, 27>, whose hash code is 2. When we check location 2, it is already in use.
This is called a collision: different keys are mapped to the same hash code. There are many ways to deal with collisions, such as chaining (each location holds a linked list) and rehashing (a second function maps the key to another location). You should know at least these two methods.
Here we use rehashing.

The rehashing function takes the occupied location idx and returns the next one to try: rehash(idx) = (idx + 1) % 5. (Stepping to the next slot like this is commonly known as linear probing.)
Continuing the step above, rehash(2) = 3; location 3 is empty, so we store <27, 27> in location 3.
idx      key       value
0         -1           0
1         -1           0
2         12          12
3         27          27
4         29          29

If we then insert <32, 32>, hash(32) = 2; location 2 is in use, so rehash(2) = 3, and location 3 is also in use.
We rehash again: rehash(3) = 4, which is not available either; rehash(4) = 0, which is free. Store <32, 32> in slot 0.

idx      key       value
0         32          32
1         -1           0
2         12          12
3         27          27
4         29          29

That is the basic insert operation for a hash table.
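The insert steps traced in the tables above can be replayed with a short sketch. It stores keys only, uses -1 as the empty marker as in the tables, and returns the list of slots it visited. The names insert_with_probing and kEmpty are hypothetical, and the sketch assumes non-negative keys.

```cpp
#include <cassert>
#include <vector>

const int kEmpty = -1;  // same "empty slot" marker as in the tables above

// Returns the slot indices visited while inserting key, ending with the slot
// where the key is stored. Returns an empty vector if no slot is free.
std::vector<int> insert_with_probing(std::vector<int> &keys, int key) {
  assert(!keys.empty() && key >= 0);
  const int n = static_cast<int>(keys.size());
  std::vector<int> visited;
  int idx = key % n;                 // hash(key)
  for (int i = 0; i < n; i++) {
    visited.push_back(idx);
    if (keys[idx] == kEmpty) {
      keys[idx] = key;               // free slot found: store here
      return visited;
    }
    idx = (idx + 1) % n;             // rehash(idx): try the next slot
  }
  return {};                         // every slot was occupied
}
```

Starting from five empty slots and inserting 12, 29, 27, and 32 in that order, the visited slots are {2}, {4}, {2, 3}, and {2, 3, 4, 0}, matching the walkthrough.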

To retrieve a value, e.g. the value for key 27: hash(27) = 2. The key stored in location 2 is 12 != 27, so rehashing is needed: rehash(2) = 3. The key stored there is 27, so we return the value 27.
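The example above resolves collisions by rehashing. For comparison, chaining, the other method mentioned earlier, might look like the following minimal sketch. The class name ChainedTable and its interface are assumptions made for illustration, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

// Hypothetical chained table: each slot holds a list of <key, value> pairs.
class ChainedTable {
  std::vector<std::list<std::pair<int, int>>> buckets;

  std::size_t index_of(int key) const {
    assert(key >= 0);
    return static_cast<std::size_t>(key) % buckets.size();
  }

public:
  explicit ChainedTable(std::size_t size) : buckets(size) { assert(size > 0); }

  void insert(int key, int value) {
    auto &chain = buckets[index_of(key)];
    for (auto &kv : chain) {
      if (kv.first == key) { kv.second = value; return; }  // update existing key
    }
    chain.push_back({key, value});  // colliding keys share the same list
  }

  // Returns true and writes the value to out when the key is present.
  bool retrieve(int key, int &out) const {
    const auto &chain = buckets[index_of(key)];
    for (const auto &kv : chain) {
      if (kv.first == key) { out = kv.second; return true; }
    }
    return false;
  }
};
```

With 5 slots, keys 12, 27, and 32 all end up in the list at index 2, and a lookup walks only that list. No other slot is touched, at the cost of extra memory for the list nodes.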

A simple implementation (in C++)

#include <iostream>

const int sz = 5;  // number of slots in the table

// One <key, value> entry.
// id == -1 marks an empty slot, id == -2 marks a deleted slot.
struct data {
  int id;
  int val;
};

class Hashtable {
  data dt[sz];
  int numel;          // number of stored entries
  int find(int id);   // slot index holding id, or -1 if absent
public:
  Hashtable();
  int hash(int id);
  int rehash(int idx);
  int insert(const data &d);
  int remove(const data &d);
  int retrieve(int id);
  void output();
};

Hashtable::Hashtable() {
  for (int i = 0; i < sz; i++) {
    dt[i].id = -1;
    dt[i].val = 0;
  }
  numel = 0;
}

// Map a key to its home slot. Negative keys are not supported
// (-1 and -2 are reserved as slot markers).
int Hashtable::hash(int id) {
  return id % sz;
}

// Next slot to try after a collision at slot idx.
int Hashtable::rehash(int idx) {
  return (idx + 1) % sz;
}

// Follow the probe sequence of id until it is found.
int Hashtable::find(int id) {
  int idx = hash(id);
  if (idx < 0) return -1;
  for (int i = 0; i < sz; i++) {
    if (dt[idx].id == id) return idx;
    if (dt[idx].id == -1) return -1;  // a never-used slot ends the sequence
    idx = rehash(idx);                // deleted (-2) slots are skipped
  }
  return -1;
}

// Returns 0 on success, -1 if the table is full or the key is invalid.
int Hashtable::insert(const data &d) {
  if (numel >= sz) return -1;
  int idx = hash(d.id);
  if (idx < 0) return -1;
  for (int i = 0; i < sz; i++) {
    if (dt[idx].id == -1 || dt[idx].id == -2) {
      dt[idx].id = d.id;
      dt[idx].val = d.val;
      numel++;
      return 0;
    }
    if (i == 0) std::cout << "collision! rehashing..." << std::endl;
    idx = rehash(idx);
  }
  return -1;
}

// Returns 0 on success, -1 if the key is not in the table.
int Hashtable::remove(const data &d) {
  int idx = find(d.id);
  if (idx < 0) return -1;
  dt[idx].id = -2;  // mark as deleted so later lookups keep probing past it
  dt[idx].val = 0;
  numel--;
  return 0;
}

// Returns the stored value, or 0 if the key is not in the table
// (so a stored value of 0 cannot be told apart from "not found").
int Hashtable::retrieve(int id) {
  int idx = find(id);
  return idx < 0 ? 0 : dt[idx].val;
}

void Hashtable::output() {
  std::cout << "idx  id  val" << std::endl;
  for (int i = 0; i < sz; i++) {
    std::cout << i << "    " << dt[i].id << "    " << dt[i].val << std::endl;
  }
}

int main() {
  Hashtable hashtable;
  data d;
  d.id = 27;
  d.val = 27;
  hashtable.insert(d);
  hashtable.output();

  d.id = 99;
  d.val = 99;
  hashtable.insert(d);
  hashtable.output();

  d.id = 32;
  d.val = 32;
  hashtable.insert(d);
  hashtable.output();

  d.id = 77;
  d.val = 77;
  hashtable.insert(d);
  hashtable.output();

  // retrieve data
  int id = 77;
  int val = hashtable.retrieve(id);
  std::cout << std::endl;
  std::cout << "Retrieving ... " << std::endl;
  std::cout << "hashtable[" << id << "]=" << val << std::endl;
  std::cout << std::endl;

  // delete element
  d.id = 32;
  d.val = 32;
  hashtable.remove(d);
  hashtable.output();

  d.id = 77;
  d.val = 77;
  hashtable.remove(d);
  hashtable.output();

  return 0;
}


The original post can be found at:

http://yucoding.blogspot.com/2013/08/re-viewhash-table-basics.html


