HashMap Source Code Analysis

License: CC 4.0 BY-SA

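This article introduces HashMap. It is built on a hash table, stores key-value pairs, resolves collisions with singly linked lists, and grows automatically when it runs out of capacity. Reading the source shows that put first locates the bucket: if the bucket is empty the entry is inserted directly, otherwise it is appended to the list or the existing entry is updated. The article closes with a summary of the data structure, the array capacity, the hash function, collision handling, and the resize condition.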

1. Definition

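HashMap is a hash table based implementation of the Map interface. Each element is a key-value pair, collisions are resolved internally with singly linked lists, and the table grows automatically when the number of entries exceeds the threshold.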

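HashMap works on the principle of hashing: put(key, value) stores an object in the map and get(key) retrieves it. When a key and value are passed to put(), hashCode() is first called on the key to compute a hash value, and that hash value is used to locate the bucket in which the key-value entry is stored.
[Figure: diagram of the HashMap data structure]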
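
To make the key-to-bucket step concrete, here is a minimal sketch. The class and method names (BucketIndexSketch, spread, indexFor) are hypothetical; the two expressions mirror the hash() and (n - 1) & hash expressions quoted from the JDK source elsewhere in this article, and the table length of 16 is assumed to be the default capacity.

```java
final class BucketIndexSketch {
    // Same expression as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int tableLength = 16; // assumed: the default initial capacity
        for (String key : new String[] {"name", "age", "city"}) {
            int hash = spread(key);
            System.out.println(key + " -> hash=" + hash
                    + ", bucket=" + indexFor(hash, tableLength));
        }
    }
}
```

Whatever the key, the resulting index always falls in the range 0 to tableLength - 1, and a null key always maps to bucket 0.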

2. Source code walkthrough

1. After creating a map, the first thing we usually do is put data into it. Let's look at what happens underneath.

Map<String, String> map = new HashMap<>();
map.put("name", "zhang");

// Put a key and a value into the HashMap
public V put(K key, V value) {
     return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
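        // Node implements the Map.Entry interface
        // tab: the bucket array; p: the node currently being examined in the bucket
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // Initialize the bucket array; creation of table is deferred until the first insertion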
        if ((tab = table) == null || (n = tab.length) == 0)
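            /*
             * If the table is empty, resize() allocates the bucket array and computes the resize threshold.
             * On the first call (for a map created with the no-argument constructor), resize() runs these two lines:
             *     newCap = DEFAULT_INITIAL_CAPACITY;
             *     newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
             * Constant in the source: static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
             * Constant in the source: static final float DEFAULT_LOAD_FACTOR = 0.75f;
             * So by default the bucket array has 16 slots and the resize threshold is 16 * 0.75f = 12.
             */
            n = (tab = resize()).length;
        // If the bucket holds no node, store the new node in it directly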
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            // If the first node in the bucket has the same hash and an equal key, point e at it
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // If the bucket holds a TreeNode, delegate to the red-black tree insertion method
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // Walk the linked list, counting the nodes visited
                for (int binCount = 0; ; ++binCount) {
                    // Reached the tail without finding the key: append the new node
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
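                        /*
                         * Constant in the source: static final int TREEIFY_THRESHOLD = 8;
                         * binCount is the zero-based index of the tail node p, so this condition
                         * first holds when the bin already held 8 nodes and the new node is the 9th.
                         */
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // treeifyBin converts the bin to a red-black tree only when the table has
                            // at least MIN_TREEIFY_CAPACITY (64) buckets; otherwise it resizes instead
                            treeifyBin(tab, hash);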
                            treeifyBin(tab, hash);
                        break;
                    }
                    // Found a node with the same key: stop traversing
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            // e != null means the key already exists in the HashMap
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // onlyIfAbsent: when true, the existing value is replaced only if it is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        // Resize when the number of mappings exceeds the threshold
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

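Code analysis: 1. First determine which bucket the key-value pair belongs to. 2. Once the bucket is located, check whether it is empty. If it is, insert the pair directly. If it is not, either append the pair to the end of the linked list or update the existing pair that has the same key. The code above also shows that JDK 1.8 introduced red-black trees into HashMap to optimize overly long lists.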
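
The two steps in the analysis can be sketched as a stripped-down chained hash table. This is a simplified illustration, not the JDK implementation: SimpleChainedMap and its members are hypothetical names, the capacity is fixed and assumed to be a power of two, and resizing and treeification are left out.

```java
import java.util.Objects;

final class SimpleChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    SimpleChainedMap(int capacity) { // assumed to be a power of two
        table = (Node<K, V>[]) new Node[capacity];
    }

    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    V put(K key, V value) {
        int hash = hash(key);
        int i = (table.length - 1) & hash;   // step 1: locate the bucket
        Node<K, V> p = table[i];
        if (p == null) {                     // step 2a: empty bucket, insert directly
            table[i] = new Node<>(hash, key, value, null);
            size++;
            return null;
        }
        while (true) {
            if (p.hash == hash && Objects.equals(p.key, key)) {
                V old = p.value;             // step 2b: same key, update and return old value
                p.value = value;
                return old;
            }
            if (p.next == null) {            // step 2c: reached the tail, append
                p.next = new Node<>(hash, key, value, null);
                size++;
                return null;
            }
            p = p.next;
        }
    }

    V get(Object key) {
        int hash = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & hash]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleChainedMap<String, String> map = new SimpleChainedMap<>(16);
        System.out.println(map.put("name", "zhang")); // null
        System.out.println(map.put("name", "li"));    // zhang
        System.out.println(map.get("name"));          // li
    }
}
```

Constructing the map with a capacity of 1 forces every key into the same bucket, which is a quick way to exercise the append and update branches.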

2. After putting data, we read a value back with get(key). This path is considerably simpler. Let's keep digging into the source.

Map<String, String> map = new HashMap<>();
map.put("name", "zhang");
String name = map.get("name");

public V get(Object key) {
   Node<K,V> e;
   return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        // Locate the bucket that would hold the key
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                // If first is a TreeNode, use the red-black tree lookup
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                // Otherwise search the linked list
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
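
One consequence of get() returning null when getNode() finds nothing: a null result does not tell you whether the key is absent or is mapped to a null value. The short example below uses only the standard java.util.HashMap API; the class name GetLookupDemo and the helper sample() are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class GetLookupDemo {
    // Builds a small map with one ordinary mapping and one null value.
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("name", "zhang");
        map.put("nickname", null); // HashMap permits null values
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        System.out.println(map.get("name"));             // zhang
        System.out.println(map.get("nickname"));         // null (key present, value is null)
        System.out.println(map.get("missing"));          // null (key absent)
        System.out.println(map.containsKey("nickname")); // true
        System.out.println(map.containsKey("missing"));  // false
    }
}
```

When the distinction matters, containsKey() is the call that separates the two cases.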

3. Summary

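1. The HashMap data structure combines an array, linked lists, and red-black trees.
2. The array capacity is always a power of two (visible in how resize() grows the table). This lets the bucket index be computed with the bit mask (n - 1) & hash instead of a slower modulo, spreads entries evenly across buckets to reduce collisions, and is also said to reduce memory fragmentation.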

// Shift left by one bit (double the capacity), so the capacity stays a power of two
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
     oldCap >= DEFAULT_INITIAL_CAPACITY)
     newThr = oldThr << 1;
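
Why a power of two matters for the index computation can be checked directly. The sketch below (class and method names are hypothetical) compares the mask (n - 1) & hash against Math.floorMod(hash, n): the two agree for power-of-two capacities and disagree for a capacity such as 12.

```java
final class PowerOfTwoMaskDemo {
    // True when masking with (capacity - 1) gives the same bucket as a floor modulo.
    static boolean maskMatchesModulo(int hash, int capacity) {
        return ((capacity - 1) & hash) == Math.floorMod(hash, capacity);
    }

    public static void main(String[] args) {
        int[] hashes = {-7, 0, 5, 17, 123456789, Integer.MIN_VALUE};

        // Doubling with << 1 keeps the capacity a power of two: 16, 32, 64.
        for (int capacity = 16; capacity <= 64; capacity <<= 1) {
            boolean allMatch = true;
            for (int hash : hashes) {
                allMatch &= maskMatchesModulo(hash, capacity);
            }
            System.out.println("capacity " + capacity + ": mask == modulo? " + allMatch); // true
        }

        // With a capacity that is not a power of two the mask skips buckets.
        System.out.println("capacity 12, hash 5: " + maskMatchesModulo(5, 12)); // false
    }
}
```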

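3. Hash function: the high 16 bits of hashCode() are XORed into the low 16 bits, so that the high bits also influence the bucket index. This improves the distribution and reduces collisions.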

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
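
A small worked example of the effect, using two hand-picked hash codes that differ only in their high bits (0x10000 and 0x20000 are illustrative values, and the class and method names are hypothetical). In a 16-bucket table both land in bucket 0 without the XOR step and in different buckets with it.

```java
final class HashSpreadDemo {
    // The spreading step from HashMap.hash(), applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int a = 0x10000; // only bit 16 set
        int b = 0x20000; // only bit 17 set
        int n = 16;

        // Without spreading, the low 4 bits are all zero: both keys collide.
        System.out.println(index(a, n));         // 0
        System.out.println(index(b, n));         // 0

        // With spreading, the high bits reach the low bits: the keys separate.
        System.out.println(index(spread(a), n)); // 1
        System.out.println(index(spread(b), n)); // 2
    }
}
```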

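4. Insertion collisions: collisions are resolved with a singly linked list. When a node is appended to a bin that already holds TREEIFY_THRESHOLD = 8 nodes, treeifyBin is called; it converts the list to a red-black tree if the table has at least 64 buckets (MIN_TREEIFY_CAPACITY) and otherwise just resizes the table. The tree speeds up lookups in long bins.
5. Resizing: the table is resized when the number of mappings exceeds the threshold, capacity * load factor, which is 3/4 of the capacity by default.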
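
The resize arithmetic can be sketched in a few lines. This is an illustration of the numbers only, with hypothetical names; it follows the (int)(loadFactor * capacity) formula and the ++size > threshold check quoted from the source above, and assumes the default load factor of 0.75.

```java
final class ResizeThresholdSketch {
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Mirrors the check "++size > threshold" in putVal.
    static boolean needsResize(int sizeAfterInsert, int threshold) {
        return sizeAfterInsert > threshold;
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f; // assumed: the default load factor
        for (int capacity = 16; capacity <= 64; capacity <<= 1) {
            System.out.println("capacity " + capacity
                    + " -> threshold " + threshold(capacity, loadFactor)); // 12, 24, 48
        }
        int thr = threshold(16, loadFactor);
        System.out.println(needsResize(12, thr)); // false: the 12th entry still fits
        System.out.println(needsResize(13, thr)); // true: the 13th entry triggers resize()
    }
}
```

So with the defaults, the table is not resized when the 12th entry is added; the resize happens on the 13th.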