How does HashMap work in Java?

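This article takes an in-depth look at how Java HashMap works, covering hash table basics, the storage and retrieval logic, collision handling, the resizing mechanism, and the problems that can arise under multithreaded access. It also explains how to use HashMap, how key concepts such as hashCode() and equals() are used, and how to optimize HashMap performance. In addition, it compares HashMap with Hashtable and discusses the race condition HashMap can run into in a multithreaded environment.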


How HashMap works in Java, or sometimes how the get() method works in HashMap, is a common question in Java interviews nowadays. Almost everybody who has worked in Java knows about HashMap, where to use HashMap, and the difference between Hashtable and HashMap, so why has this interview question become so special? Because of the depth it offers. It has become a very popular Java interview question in almost any senior or mid-senior level Java interview. Investment banks especially like to ask this question, and sometimes even ask you to implement your own HashMap, depending on your coding aptitude. The introduction of ConcurrentHashMap and other concurrent collections has also made this question a starting point for delving into more advanced features. Let's start the journey.

Questions start with a simple statement:

"Have you used HashMap before?" or "What is HashMap? Why do we use it?"
Almost everybody answers this with yes, and then the interviewee keeps talking about common facts about HashMap, like HashMap accepts null while Hashtable doesn't, HashMap is not synchronized, HashMap is fast and so on, along with basics like it stores key and value pairs. This shows that the person has used HashMap and is quite familiar with the functionality HashMap offers, but the interview takes a sharp turn from here, and the next set of follow-up questions gets more detailed about the fundamentals involved with HashMap in Java. The interviewer strikes back with questions like:

"Do you know how HashMap works in Java?" or "How does the get() method of HashMap work in Java?"
And then you get answers like "I don't bother, it's a standard Java API", "you'd better look at the Java source or OpenJDK", or "I can find it on Google at any time". But some interviewees definitely answer this and will say: "HashMap works on the principle of hashing. We have the put(key, value) and get(key) methods for storing and retrieving objects from HashMap. When we pass a key and a value object to the put() method on Java HashMap, the HashMap implementation calls the hashCode() method on the key object and applies the returned hashcode to its own hashing function to find a bucket location for storing the Entry object." An important point to mention is that HashMap in Java stores both the key and the value object as a Map.Entry in the bucket, which is essential to understanding the retrieval logic. If people fail to recognize this and say it only stores the value in the bucket, they will fail to explain the retrieval logic of any object stored in Java HashMap. This answer is very much acceptable and shows that the interviewee has a fair bit of knowledge of how hashing works and how HashMap works in Java. But this is just the start of the story, and confusion increases when you put the interviewee in scenarios faced by Java developers on a day-to-day basis. The next question could be about collision detection and collision resolution in Java HashMap.
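Before moving on to that question, the put() path described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name BucketPutSketch, the fixed table of 16 buckets, and the index function are all made up here and are not the JDK's code. The point it shows is that the whole entry, key and value together, is what lands in the bucket.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class BucketPutSketch {
    static final int BUCKETS = 16; // illustrative table size

    // Assumed index function, not the JDK's: clear the sign bit, then take the remainder.
    static int bucketIndex(Object key) {
        return (key.hashCode() & 0x7fffffff) % BUCKETS;
    }

    // Creates BUCKETS empty buckets.
    static List<List<Map.Entry<String, Integer>>> newTable() {
        List<List<Map.Entry<String, Integer>>> table = new ArrayList<>();
        for (int i = 0; i < BUCKETS; i++) {
            table.add(new ArrayList<>());
        }
        return table;
    }

    // Stores the whole entry (key AND value) in the bucket chosen by the key's hash code.
    // Duplicate keys are not handled in this sketch.
    static void put(List<List<Map.Entry<String, Integer>>> table, String key, Integer value) {
        table.get(bucketIndex(key)).add(new AbstractMap.SimpleEntry<>(key, value));
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> table = newTable();
        put(table, "apple", 1);
        int index = bucketIndex("apple");
        System.out.println("bucket " + index + " holds " + table.get(index));
    }
}
```

Because the key travels with the value, a later lookup has something to compare against, which is exactly what the retrieval questions further down rely on.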

"What will happen if two different objects have the same hashcode?"
Now from here onwards the real confusion starts. Sometimes the candidate will say that since the hashcode is equal, both objects are equal and HashMap will throw an exception or not store them again, etc. Then you might want to remind them about the equals() and hashCode() contract: two unequal objects in Java can have the same hashcode. Some will give up at this point, and a few will move ahead and say: "Since the hashcode is the same, the bucket location will be the same and a collision will occur in HashMap. Since HashMap uses a linked list to store objects, this entry (an object of Map.Entry comprising key and value) will be stored in the linked list." Great, this answer makes sense; though there are many collision resolution methods available, this is the simplest, and HashMap in Java does follow it. But the story does not end here, and the interviewer has a follow-up question.
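Before that follow-up, a quick check of the claim that unequal objects can share a hashcode and still both be stored. The strings "Aa" and "BB" are a commonly cited pair with the same String hash code; the class name SameHashCodeDemo is just a label for this example.

```java
import java.util.HashMap;
import java.util.Map;

class SameHashCodeDemo {
    // Puts two unequal keys that share a hash code into one map.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false
        Map<String, Integer> map = build();
        System.out.println(map.size());                         // 2
        System.out.println(map.get("Aa") + " " + map.get("BB")); // 1 2
    }
}
```

No exception is thrown and neither entry overwrites the other, because the keys are not equal according to equals().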

"How will you retrieve the value object if two keys have the same hashcode?"
The interviewee will say we will call the get() method, and then HashMap uses the key object's hashcode to find the bucket location and retrieves the value object. But then you need to remind them that there are two value objects stored in the same bucket, so they will talk about traversing the linked list until we find the value object. Then you ask how they identify the value object, because they don't have a value object to compare with. Until they know that HashMap stores both key and value in the linked list node, or as Map.Entry, they won't be able to resolve this issue and will try and fail.

But those who remember this key piece of information will say that after finding the bucket location, we call the key's equals() method to identify the correct node in the linked list and return the associated value object for that key in Java HashMap. Perfect, this is the correct answer.
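The traversal in that answer can be sketched as follows. This is a toy chain, not the JDK's node class: the names ChainLookupSketch, Node and find are hypothetical, and the bucket is assumed to have been located already.

```java
class ChainLookupSketch {
    // One node of a bucket's chain: it holds the key as well as the value.
    static final class Node {
        final String key;
        final int value;
        final Node next;

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Walks the chain and uses key.equals() to pick the matching node; returns null if none matches.
    static Integer find(Node head, String key) {
        for (Node n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share a hash code, so assume both ended up in the same bucket.
        Node bucket = new Node("BB", 2, new Node("Aa", 1, null));
        System.out.println(find(bucket, "Aa")); // 1
        System.out.println(find(bucket, "BB")); // 2
        System.out.println(find(bucket, "Cc")); // null
    }
}
```

Without the key stored in each node there would be nothing for equals() to compare, which is why the "only the value is stored" answer cannot explain retrieval.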

In many cases interviewees fail at this stage because they get confused between hashCode() and equals(), or between key and value objects in Java HashMap, which is understandable because they have been dealing with hashCode() in all the previous questions and equals() comes into the picture only when retrieving a value object from HashMap in Java. Some good developers point out here that immutable, final objects with proper equals() and hashCode() implementations make perfect Java HashMap keys and improve the performance of Java HashMap by reducing collisions. Immutability also allows the hashcode of each key to be cached, which makes the overall retrieval process very fast, and suggests that String and the various wrapper classes, e.g. Integer, are very good keys in Java HashMap.
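As a rough illustration of such a key, here is a minimal sketch of an immutable, final class that computes its hash code once and caches it. The class CurrencyPair and the rate used in main are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type: final class, final fields, hash code cached at construction.
final class CurrencyPair {
    private final String base;
    private final String quote;
    private final int hash; // safe to cache because the fields can never change

    CurrencyPair(String base, String quote) {
        this.base = Objects.requireNonNull(base);
        this.quote = Objects.requireNonNull(quote);
        this.hash = Objects.hash(base, quote);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CurrencyPair)) {
            return false;
        }
        CurrencyPair other = (CurrencyPair) o;
        return base.equals(other.base) && quote.equals(other.quote);
    }

    @Override
    public int hashCode() {
        return hash; // equal pairs always produce the same cached value
    }

    public static void main(String[] args) {
        Map<CurrencyPair, Double> rates = new HashMap<>();
        rates.put(new CurrencyPair("EUR", "USD"), 1.25); // made-up rate
        System.out.println(rates.get(new CurrencyPair("EUR", "USD"))); // 1.25
    }
}
```

A second, separately constructed CurrencyPair finds the entry because equals() and hashCode() depend only on the field values.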

Now if you clear this entire Java HashMap interview, you will be surprised by this very interesting question: "What happens to HashMap in Java if the size of the HashMap exceeds a given threshold defined by the load factor?" Until you know exactly how HashMap works, you won't be able to answer this question. If the size of the map exceeds a given threshold defined by the load factor, e.g. if the load factor is 0.75, it will resize the map once it is 75% full. Similar to other collection classes like ArrayList, Java HashMap resizes itself by creating a new bucket array twice the size of the previous one, and then starts putting every old element into that new bucket array. This process is called rehashing because it also applies the hash function to find the new bucket location.
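The arithmetic and the rehashing step can be sketched like this. It is a toy model under assumptions: the class ResizeSketch, the list-of-lists table and the remainder-based index function are illustrative stand-ins, not the JDK implementation.

```java
import java.util.ArrayList;
import java.util.List;

class ResizeSketch {
    // Number of entries at which a table of this capacity gets resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Assumed index function (not the JDK's): clear the sign bit, then take the remainder.
    static int indexFor(int hash, int capacity) {
        return (hash & 0x7fffffff) % capacity;
    }

    // Creates a table twice as large and re-places every key according to its hash code.
    static List<List<String>> rehash(List<List<String>> old) {
        int newCapacity = old.size() * 2;
        List<List<String>> bigger = new ArrayList<>();
        for (int i = 0; i < newCapacity; i++) {
            bigger.add(new ArrayList<>());
        }
        for (List<String> bucket : old) {
            for (String key : bucket) {
                bigger.get(indexFor(key.hashCode(), newCapacity)).add(key);
            }
        }
        return bigger;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24

        // "a" (hash 97) and "e" (hash 101) share bucket 1 when the capacity is 4 ...
        System.out.println(indexFor("a".hashCode(), 4) + " " + indexFor("e".hashCode(), 4)); // 1 1
        // ... but land in different buckets once the capacity doubles to 8.
        System.out.println(indexFor("a".hashCode(), 8) + " " + indexFor("e".hashCode(), 8)); // 1 5
    }
}
```

So with the default-style numbers from the text, a 16-bucket table with load factor 0.75 is resized around the 12-entry mark, and doubling the table gives colliding keys a chance to spread out.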

If you manage to answer this question on HashMap in Java, you will be greeted by "Do you see any problem with resizing of HashMap in Java?" You might not be able to pick up the context, and then the interviewer will try to give you a hint about multiple threads accessing the Java HashMap, potentially looking for a race condition on HashMap in Java.

So the answer is yes, a potential race condition exists while resizing HashMap in Java, if two threads find at the same time that HashMap needs resizing and both try to resize it. In the process of resizing HashMap in Java, the elements in a bucket, which are stored in a linked list, get reversed in order during their migration to the new bucket, because Java HashMap doesn't append the new element at the tail; instead it adds the new element at the head to avoid traversing to the tail. If the race condition happens, you will end up with an infinite loop. Though at this point you can potentially argue with the interviewer: what the hell makes you think of using HashMap in a multi-threaded environment? :)
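The order reversal mentioned above is easy to see in a single-threaded toy. The sketch below only mirrors the head-insertion transfer as the text describes it; it assumes every node stays in the same bucket, it does not reproduce the race itself, and the names HeadInsertTransferSketch, Node, transfer and render are hypothetical.

```java
class HeadInsertTransferSketch {
    // Minimal chain node; only the key matters for this illustration.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Moves every node onto a new chain by inserting at the head, so the order flips.
    static Node transfer(Node oldHead) {
        Node newHead = null;
        Node current = oldHead;
        while (current != null) {
            Node next = current.next; // remember the rest of the old chain
            current.next = newHead;   // link this node in front of the new chain
            newHead = current;
            current = next;
        }
        return newHead;
    }

    // Renders a chain as "A -> B -> C".
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node("A", new Node("B", new Node("C", null)));
        System.out.println(render(chain));           // A -> B -> C
        System.out.println(render(transfer(chain))); // C -> B -> A
    }
}
```

The loop rewrites next pointers on shared nodes; when two threads interleave those writes on the same chain, a cycle can form, and a later traversal of that bucket never terminates.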

 
A few more questions on HashMap in Java, contributed by readers of the Javarevisited blog:
1) Why are String, Integer and other wrapper classes considered good keys?
String, Integer and other wrapper classes are natural candidates for HashMap keys, and String is the most frequently used key as well, because String is immutable and final, and overrides the equals() and hashCode() methods. Other wrapper classes share similar properties. Immutability is required in order to prevent changes to the fields used to calculate hashCode(), because if a key object returns a different hashCode during insertion and retrieval, it won't be possible to get the object back from HashMap. Immutability is best as it offers other advantages as well, like thread-safety. If you can keep your hashCode the same by making only certain fields final, then you can go for that as well. Since the equals() and hashCode() methods are used during retrieval of the value object from HashMap, it's important that the key object correctly overrides these methods and follows the contract. If unequal objects return different hashcodes, the chances of collision will be lower, which subsequently improves the performance of HashMap.
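To make the risk concrete, here is a small sketch of what goes wrong when a field used by hashCode() changes after insertion. The classes MutableKeyDemo and MutableKey are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MutableKeyDemo {
    // Hypothetical key whose hash code depends on a field that can change.
    static final class MutableKey {
        String name; // deliberately mutable

        MutableKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Inserts an entry, mutates the key, then tries to read the entry back.
    static String lookupAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey("Aa");
        map.put(key, "value");
        key.name = "Ab";     // hash code changes after insertion
        return map.get(key); // the lookup now uses the new hash code
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // null
    }
}
```

The entry is still inside the map, but the lookup is made with a hash code that no longer matches the one recorded at insertion time, so get() cannot find it.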

2) Can we use any custom object as a key in HashMap?
This is an extension of the previous question. Of course you can use any object as a key in Java HashMap, provided it follows the equals() and hashCode() contract and its hashCode does not vary once the object is inserted into the map. If the custom object is immutable, this is already taken care of, because you cannot change it once created.
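A short comparison shows why the contract matters for custom keys. Both key classes below are hypothetical: RawId inherits the identity-based equals() and hashCode() from Object, while GoodId overrides them based on its field.

```java
import java.util.HashMap;
import java.util.Map;

class CustomKeyDemo {
    // Inherits identity-based equals()/hashCode() from Object.
    static final class RawId {
        final int id;

        RawId(int id) {
            this.id = id;
        }
    }

    // Same data, but equality and hash code are derived from the field.
    static final class GoodId {
        final int id;

        GoodId(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodId && ((GoodId) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Looks up with a different instance carrying the same id.
    static String lookupRaw() {
        Map<RawId, String> map = new HashMap<>();
        map.put(new RawId(7), "seven");
        return map.get(new RawId(7));
    }

    static String lookupGood() {
        Map<GoodId, String> map = new HashMap<>();
        map.put(new GoodId(7), "seven");
        return map.get(new GoodId(7));
    }

    public static void main(String[] args) {
        System.out.println(lookupRaw());  // null
        System.out.println(lookupGood()); // seven
    }
}
```

RawId is a legal key, but only the exact same instance can ever retrieve its value, which is rarely what a caller wants.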

3) Can we use ConcurrentHashMap in place of Hashtable?
This is another question which is getting popular due to the increasing popularity of ConcurrentHashMap. We know Hashtable is synchronized, but ConcurrentHashMap provides better concurrency by locking only the portion of the map determined by the concurrency level. ConcurrentHashMap was certainly introduced as an alternative to Hashtable and can be used in place of it, but Hashtable provides stronger thread-safety than ConcurrentHashMap. See my post on the difference between Hashtable and ConcurrentHashMap for more details.
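As a minimal sketch of using ConcurrentHashMap where a Hashtable might otherwise be shared between threads, the example below counts increments from several threads through the Map interface. The class name ConcurrentCounterDemo, the key "hits" and the thread counts are arbitrary choices for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCounterDemo {
    // Several threads increment one shared counter; returns the final count.
    static int countWithThreads(int threads, int incrementsPerThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // 4000
    }
}
```

Because the code depends only on the Map interface, swapping the implementation is a one-line change; the compound update goes through merge() rather than a separate get() followed by put(), which would not be atomic.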
 

Personally, I like this question because of its depth and the number of concepts it touches indirectly. If you look at the questions asked during the interview, this HashMap question has verified:
  • Concept of hashing
  • Collision resolution in HashMap
  • Use of equals() and hashCode() and their importance in HashMap
  • Benefit of immutable objects
  • Race condition on HashMap in Java
  • Resizing of Java HashMap

Just to summarize, here are the answers which make sense for the above questions:

How HashMap works in Java
HashMap works on the principle of hashing. We have the put() and get() methods for storing and retrieving objects from HashMap. When we pass both a key and a value to the put() method to store them in HashMap, it uses the key object's hashCode() method to calculate the hashcode, and then by applying hashing on that hashcode it identifies the bucket location for storing the value object. While retrieving, it uses the key object's equals() method to find the correct key-value pair and returns the value object associated with that key. HashMap uses a linked list in case of collision, and the object will be stored in the next node of the linked list.
Also, HashMap stores the key and value together as a tuple in every node of the linked list.

What will happen if two different HashMap key objects have the same hashcode?
They will be stored in the same bucket, but in the next node of the linked list. And the key's equals() method will be used to identify the correct key-value pair in HashMap.

In terms of usage, Java HashMap is very versatile, and I have mostly used HashMap as a cache in the electronic trading applications I have worked on. Since the finance domain uses Java heavily and, for performance reasons, we need caching, HashMap and ConcurrentHashMap come in very handy there. The Javarevisited blog has further articles on HashMap and Hashtable in Java.
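### Java HashMap: implementation principles and working mechanism in detail

HashMap is the core hash table implementation in the Java Collections Framework. It is built on a composite data structure of **array + linked list + red-black tree** (an optimization in JDK 8+) and uses a hashing algorithm to store and retrieve key-value pairs efficiently. Its core principles and workflow are as follows:

---

#### I. Underlying data structure (JDK 8+)

1. **Main structure**:
   ```java
   transient Node<K,V>[] table; // main array (array of hash buckets)
   ```
   - **Array** (hash buckets): each slot stores the head node of a linked list or the root node of a red-black tree.
   - **Linked list node** (`Node`): nodes form a linked list when hashes collide.
     ```java
     static class Node<K,V> {
         final int hash;    // hash of the key
         final K key;       // key
         V value;           // value
         Node<K,V> next;    // pointer to the next node
     }
     ```
   - **Red-black tree node** (`TreeNode`): a list that grows too long is converted into a tree (JDK 8 optimization).
     ```java
     static final class TreeNode<K,V> extends Node<K,V> {
         TreeNode<K,V> parent, left, right, prev;
         boolean red; // red-black tree color flag
     }
     ```

2. **How the structure evolves**:

   | Scenario | Behavior | Threshold condition |
   |----------|----------|---------------------|
   | Ordinary collision | A linked list hangs off the array slot | List length < 8 |
   | Severe collisions | Linked list is converted to a red-black tree | List length ≥ 8 **and** array length ≥ 64 |
   | Fewer nodes after removal or resize | Red-black tree reverts to a linked list | Tree node count ≤ 6 |

---

#### II. Core working mechanism

1. **Hash computation (choosing the bucket index)**:
   - **Perturbation function**: reduces hash collisions by mixing the high and low 16 bits[^4]:
     ```java
     static final int hash(Object key) {
         int h;
         return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
     }
     ```
   - **Index calculation**:
     $$ \text{index} = (n - 1) \ \&\ \text{hash} $$
     where $n$ is the array length (**always a power of two**), so the `&` operation is equivalent to the modulo `hash % n` for a non-negative `hash`, but more efficient[^3].

2. **PUT operation flow**:
   ```mermaid
   graph TD
       A[Compute hash of key] --> B[Compute bucket index]
       B --> C{Is the bucket empty?}
       C -- empty --> D[Insert new node directly]
       C -- not empty --> E{Does the head node match?}
       E -- yes --> F[Update value]
       E -- no --> G{Linked list or tree?}
       G -- list --> H[Traverse list for a matching node]
       H -- found --> F
       H -- not found --> I[Insert new node at tail]
       I --> J{List length ≥ 8?}
       J -- yes --> K[Treeify]
       G -- tree --> L[Call red-black tree insert method]
   ```

3. **GET operation flow**:
   - Compute the hash of `key` and the bucket index.
   - If the head node of the bucket matches, return it directly.
   - Otherwise traverse the linked list or red-black tree to find the matching node.

4. **Resize mechanism**:
   - **Trigger condition**: number of elements > capacity × load factor (default 0.75).
   - **Steps**:
     1. Create a new array (capacity doubled to $2n$).
     2. Traverse the old array and recompute each node's position in the new array.
     3. **JDK 8 optimization**: use the extra high bit of the hash to avoid rehashing[^4]:
        $$ \text{newIndex} = \text{oldIndex} \quad \text{or} \quad \text{oldIndex} + \text{oldCapacity} $$
     4. Split list or tree nodes into the new buckets as needed.

---

#### III. Key characteristics

1. **Allows `null` keys and values**: when `key=null` the hash is fixed at 0.
2. **Not thread-safe**: use `ConcurrentHashMap` for multithreaded access.
3. **Unordered iteration**: traversal order is not guaranteed to match insertion order.
4. **Initial capacity optimization**[^3]:
   ```java
   // Ensure the array length is a power of two (e.g. a requested 17 becomes 32)
   int capacity = 1;
   while (capacity < initialCapacity)
       capacity <<= 1; // bit-shift optimization
   ```

---

#### IV. Performance analysis

| Operation | Average time complexity | Worst-case time complexity |
|-----------|-------------------------|----------------------------|
| `get()` | $O(1)$ | $O(\log n)$ (after treeification) |
| `put()` | $O(1)$ | $O(\log n)$ |
| `remove()` | $O(1)$ | $O(\log n)$ |

> **Tip**: hash collisions significantly degrade performance, so make sure `key.hashCode()` is evenly distributed[^1].

---

### Related questions

1. **Why does HashMap require `hashCode()` to be overridden whenever `equals()` is overridden?**
2. **Why is the treeify threshold 8 and the untreeify threshold 6 in JDK 8?**
3. **How does the infinite loop in HashMap arise under multithreading?**
4. **How does `ConcurrentHashMap` achieve thread safety, and how does it differ from `HashMap`?**
5. **Why is the array length of HashMap always a power of two?**

[^1]: HashMap underlying implementation principles in detail
[^2]: Core working principles of Java HashMap
[^3]: HashMap hash computation optimization
[^4]: HashMap data structure and resize mechanism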