HashMap

最新推荐文章于 2024-04-14 21:05:40 发布

原创最新推荐文章于 2024-04-14 21:05:40 发布 · 171 阅读

0 ·

CC 4.0 BY-SA版权

文章标签：

#hashmap #java

java 专栏收录该内容

66 篇文章

订阅专栏

本文深入解析HashMap的基本原理，包括Key-Value存储、get和put操作，哈希函数的应用，解决冲突的链表法，以及扩容（Rehash）机制。重点讲解了HashMap的默认长度、负载因子和如何确保Hash均匀性。

HashMap

HashMap简介

HashMap是一个用于存储Key-Value键值对的集合，每一个键值对也叫做Entry。这些个键值对（Entry）分散存储在一个数组当中，这个数组就是HashMap的主干。HashMap数组每一个元素的初始值都是Null。

HashMap中的Get和Set方法

put方法

map.put("apple", 0);//插入一个key为“apple”的元素，接着利用一个哈希函数确定Entry的插入位置
index = hash(key) //计算出插入位置

源码中：
/**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

HashMap长度是有限的，当插入Entry越来越多是，会出现index冲突，这时可以用链表解决。

HashMap数组的每一个元素不止是一个Entry对象，也是一个链表的头节点。每一个Entry对象通过Next指针指向它的下一个Entry节点。当新来的Entry映射到冲突的数组位置时，只需要插入到对应的链表即可：

get方法

map.get("apple");//首先会把输入的key做一次Hash映射，得到相应的index
index = hash(key) //计算出index

 /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

对于Hash冲突，同一个位置有可能匹配到多个Entry，这时候就需要顺着对应链表的头节点，一个一个向下来查找。
插入的节点一般放到头结点，HashMap的发明者认为最后插入的Entry被查找的可能性更大。

HashMap默认长度

HashMap默认长度16,每次自动扩展或是手动初始化时，长度必须是2的幂。初始化长度为16，是为了服务从key映射到index的Hash算法。

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

为了实现高效的Hash算法，HashMap的发明者使用了为运算的方式。

1.计算book的hashcode，结果为十进制的3029737，二进制的101110001110101110 1001。

2.假定HashMap长度是默认的16，计算Length-1的结果为十进制的15，二进制的1111。

3.把以上两个结果做与运算，101110001110101110 1001 & 1111 = 1001，十进制是9，所以 index=9。

综上所述：

Hash算法最终得到得index结果，完全取决于HashCode值得最后几位。

#进行位运算,Length是HashMap的长度
index = HashCode(key) & (Length - 1)

注意：

长度16或者其他2的幂，Length-1的值是所有二进制位全为1，这种情况下，index的结果等同于HashCode后几位的值。只要输入的HashCode本身分布均匀，Hash算法的结果就是均匀的。

高并发下的HashMap

Rehash

Rehash是HashMap在扩容时候的一个步骤。HashMap的容量是有限的。当经过多次元素插入，使得HashMap达到一定饱和度时，Key映射位置发生冲突的几率会逐渐提高。这时候，HashMap需要扩展它的长度，也就是进行Resize。

  # HashMap的当前长度，其长度是2的幂。
  /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
  # HashMap负载因子，默认值是0.75f
    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

衡量HashMap是否进行Resize的条件如下：

HashMap.Size >= Capacity * LoadFactor