Preface
The earlier installments of this HashMap source-code reading series covered HashMap's internal structure as well as the conditions and process of resizing. Today we wrap things up with an overall summary, focusing on a comparison with JDK 1.7 and on the flow of the main methods.
Main Text
Optimizations over JDK 1.7
1. Perturbation in the hash method
The hash() method is used to obtain the hash value of a key. As we know, the more uniformly a hash function maps its keys, the lower the probability of collisions.
The hash method in JDK 1.7:
/**
* Retrieve object hash code and applies a supplemental hash function to the
* result hash, which defends against poor quality hash functions. This is
* critical because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
The hash method in JDK 1.8:
/**
* Computes key.hashCode() and spreads (XORs) higher bits of hash
* to lower. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so don't benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
Comparing the two, the perturbation function in JDK 1.7 perturbs the key's hash four times, while JDK 1.8 perturbs it only once, XORing the hashCode with its own high 16 bits.
As the earlier analysis showed, HashMap computes a node's index with hash & (length - 1), which keeps only the low bits of the hash; the high bits would otherwise never take part. The XOR inside the hash method folds the high bits into the low ones, so every bit of the hashCode influences the index calculation. The result is a more uniform mapping and a lower probability of hash collisions, as the sketch below illustrates.
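To make this concrete, here is a minimal self-contained sketch (the helper names rawIndex and spreadIndex are ours, not the JDK's) with two hashCodes that differ only in their high bits: without the spread they collide, with it they land in different slots.
public class HashSpreadDemo {

    // Index from the raw hashCode, as if hash() did no perturbation at all.
    static int rawIndex(int hashCode, int length) {
        return hashCode & (length - 1);
    }

    // Index after a JDK 1.8-style spread: XOR the high 16 bits into the low 16.
    static int spreadIndex(int hashCode, int length) {
        return (hashCode ^ (hashCode >>> 16)) & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;     // table size, always a power of two
        int h1 = 0x00010001; // these two values differ only above bit 15
        int h2 = 0x00020001;

        System.out.println(rawIndex(h1, length) + ", " + rawIndex(h2, length));       // 1, 1 -> collision
        System.out.println(spreadIndex(h1, length) + ", " + spreadIndex(h2, length)); // 0, 3 -> spread apart
    }
}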
2. Introduction of red-black trees
In JDK 1.7, entries whose keys hash to the same bucket form a linked list. As the list grows, lookup performance degrades from O(1) to O(n). This is a weakness of the JDK 1.7 HashMap.
JDK 1.8 compensates by introducing red-black trees: when the number of nodes in a bucket exceeds 8 (and the table capacity has reached MIN_TREEIFY_CAPACITY, 64; below that the table is resized instead), the linked list is converted into a red-black tree, raising the degraded lookup performance from O(n) to O(log n).
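For reference, these are the constants in the JDK 1.8 HashMap source that govern the conversion:
/* From java.util.HashMap (JDK 1.8) */
static final int TREEIFY_THRESHOLD = 8;     // convert a bin's list to a tree when it grows past this
static final int UNTREEIFY_THRESHOLD = 6;   // convert back to a list when a split bin shrinks to this
static final int MIN_TREEIFY_CAPACITY = 64; // below this table size, resize instead of treeifying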
3. Resizing strategy
In JDK 1.7, HashMap resizes by rehashing its contents into a new table. Below are resize and the key method it calls:
/**
* Rehashes the contents of this map into a new array with a
* larger capacity. This method is called automatically when the
* number of keys in this map reaches its threshold.
*
* If current capacity is MAXIMUM_CAPACITY, this method does not
* resize the map, but sets threshold to Integer.MAX_VALUE.
* This has the effect of preventing future calls.
*
* @param newCapacity the new capacity, MUST be a power of two;
* must be greater than current capacity unless current
* capacity is MAXIMUM_CAPACITY (in which case value
* is irrelevant).
*/
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
The transfer method:
/**
* Transfers all entries from current table to newTable.
*/
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
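For completeness, the indexFor method that transfer calls is, in the JDK 1.7 source, just a bitwise AND with length - 1:
/**
 * Returns index for hash code h.
 */
static int indexFor(int h, int length) {
    // length is always a non-zero power of two, so this equals h mod length
    return h & (length - 1);
}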
A brief summary of the resize flow:
- Compute the new table's size and threshold
- Allocate a new bucket array of the new size
- Transfer the elements of the current table:
  - Iterate over the buckets
  - Iterate over the nodes awaiting transfer in the bucket at each index
  - Compute each node's index in the new table from its hash
  - Insert each node at the head of the singly linked list in its new bucket (head insertion; see the sketch after this list)
- Store the new bucket array, together with its size and threshold, on the current HashMap object
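Here is a minimal standalone sketch of head insertion (the Node class is our own simplification, not the JDK's Entry), showing that moving a bucket this way reverses the relative order of its nodes:
public class HeadInsertDemo {

    static class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    public static void main(String[] args) {
        // Old bucket: a -> b -> c
        Node old = new Node("a", new Node("b", new Node("c", null)));

        // Transfer by head insertion, the way JDK 1.7's transfer does it.
        Node newHead = null;
        for (Node e = old; e != null; ) {
            Node next = e.next;
            e.next = newHead; // link the current node in front of the new list
            newHead = e;      // the moved node becomes the new head
            e = next;
        }

        // New bucket: c b a -- the order is reversed, which is exactly what
        // makes concurrent resizes in JDK 1.7 dangerous (see section 4 below).
        for (Node e = newHead; e != null; e = e.next) {
            System.out.print(e.key + " ");
        }
    }
}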
The resize method in JDK 1.8:
/**
* Initializes or doubles table size. If null, allocates in
* accord with initial capacity target held in field threshold.
* Otherwise, because we are using power-of-two expansion, the
* elements from each bin must either stay at same index, or move
* with a power of two offset in the new table.
*
* @return the table
*/
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
A brief summary of the resize flow:
- Compute the new table's size and threshold
- Allocate a new bucket array of the new size; if the current table is null, build one of the default length (JDK 1.7, by contrast, checks for an empty table before resize is even called)
- Store the new bucket array, together with its size and threshold, on the current HashMap object
- Transfer the elements of the current table:
  - Iterate over the buckets
  - Iterate over the nodes awaiting transfer in the bucket at each index
  - If the bucket holds only a single node, compute its slot in the new table directly and move it there
  - If the bucket holds a tree, split it following the tree structure (not covered here)
  - If the bucket holds a linked list:
    - Create head and tail pointers for a low list and a high list
    - Append each node to the low list or the high list by tail insertion (see the sketch after this list)
    - Point the low bucket and the high bucket at the low list and the high list respectively
- Return the new bucket array
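The split rule is easy to verify in isolation. In this minimal sketch (our own demo code, not the JDK's), whether a node stays at index j or moves to j + oldCap depends only on the single hash bit selected by oldCap:
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16;                // old table size, a power of two
        int[] hashes = {1, 17, 33, 49}; // sample hashes, all in old bucket j = 1

        for (int h : hashes) {
            int j = h & (oldCap - 1);   // index in the old table
            if ((h & oldCap) == 0) {    // the newly exposed bit is 0: index unchanged
                System.out.println("hash " + h + " stays at " + j);
            } else {                    // the newly exposed bit is 1: index shifts by oldCap
                System.out.println("hash " + h + " moves to " + (j + oldCap));
            }
        }
    }
}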
4. Avoiding the multi-threaded infinite loop
In JDK 1.7, HashMap transfers elements during resize by head insertion. Under concurrent modification, two nodes can end up forming a cyclic linked list, and resizing then falls into an infinite loop.
JDK 1.8 transfers elements by tail insertion instead, so nodes keep their original order and the infinite loop is avoided. HashMap is still not thread-safe, though: resize points table at the new array before transferring the elements, so under concurrent access a get may fail to find a value that is actually present. Where thread safety is required, ConcurrentHashMap remains the recommended choice.
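A minimal usage sketch, using nothing beyond the standard library:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeMapDemo {
    public static void main(String[] args) {
        // A thread-safe drop-in replacement for HashMap in concurrent code.
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        counts.put("a", 1);

        // Atomic read-modify-write; a plain HashMap gives no such guarantee under contention.
        counts.merge("a", 1, Integer::sum);
        System.out.println(counts.get("a")); // 2
    }
}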
Flow of the Main Methods
Here we walk through the flow of HashMap's main methods.
1. The put method
- Compute the hash of the key being put
- If the table is empty, resize first
- If the bucket the key maps to holds no node, create the node directly in that bucket
- Otherwise, compare the head node's hash and key with the key being put; if they match, overwrite the head node
- Otherwise, determine whether the head node sits in a red-black tree or a linked list
  - If in a tree, search the tree for the node; if in a list, traverse the list
  - If a node with the same hash and key as the one being put exists in the list (or tree), overwrite it
  - If not, create a new node and add it to the list (or tree)
- If the number of elements in the HashMap now exceeds the threshold, resize
- If put() overwrote a node, return that node's old value; otherwise return null (see the snippet after this list)
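The return-value behavior of the last step is easy to observe directly:
import java.util.HashMap;
import java.util.Map;

public class PutReturnDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("k", 1)); // null: there was no previous mapping
        System.out.println(map.put("k", 2)); // 1: the overwritten node's value is returned
        System.out.println(map.get("k"));    // 2
    }
}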
2. The get method
Below are the get method and the getNode method it delegates to:
/**
* Returns the value to which the specified key is mapped,
* or {@code null} if this map contains no mapping for the key.
*
* <p>More formally, if this map contains a mapping from a key
* {@code k} to a value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise
* it returns {@code null}. (There can be at most one such mapping.)
*
* <p>A return value of {@code null} does not <i>necessarily</i>
* indicate that the map contains no mapping for the key; it's also
* possible that the map explicitly maps the key to {@code null}.
* The {@link #containsKey containsKey} operation may be used to
* distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
The getNode method:
/**
* Implements Map.get and related methods
*
* @param hash hash for key
* @param key the key
* @return the node, or null if none
*/
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
- Compute the hash of the key being queried
- If the table is empty, or the bucket the hash maps to holds no node, return null
- Compare the head node's hash and key with the queried key; if they match, return the head node
- Otherwise, determine whether the head node sits in a red-black tree or a linked list
  - If in a tree, search the tree for the node
  - If in a list, traverse the list
- Return the matching node's value, or null if no node matches (see the snippet after this list)
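As the javadoc above notes, a null return does not by itself mean the key is absent; a short usage sketch:
import java.util.HashMap;
import java.util.Map;

public class GetDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("present", null);

        // get returns null both for a missing key and for a key mapped to null...
        System.out.println(map.get("present")); // null
        System.out.println(map.get("missing")); // null

        // ...so use containsKey to tell the two cases apart.
        System.out.println(map.containsKey("present")); // true
        System.out.println(map.containsKey("missing")); // false
    }
}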