Note: the HashMap we examine here is the one in android.jar from the Android API 30 platform (Android 11.0, R), not the JDK's HashMap; the two differ in a few small details.
HashMap
Key-value pairs, entries, and hash chaining
HashMap is a key-value data structure whose entries are stored in a hash table.
java.util.HashMap is implemented as a standard hash table plus linked lists.
public class HashMap<K,V> extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable {
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2 and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6;
/**
* The smallest table capacity for which bins may be treeified.
* (Otherwise the table is resized if too many nodes in a bin.)
* Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
* between resizing and treeification thresholds.
*/
static final int MIN_TREEIFY_CAPACITY = 64;
/**
* Basic hash bin node, used for most entries. (See below for
* TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
*/
static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next;
Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}
public final K getKey() { return key; }
public final V getValue() { return value; }
public final String toString() { return key + "=" + value; }
public final int hashCode() {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
public final boolean equals(Object o) {
if (o == this)
return true;
if (o instanceof Map.Entry) {
Map.Entry<?,?> e = (Map.Entry<?,?>)o;
if (Objects.equals(key, e.getKey()) &&
Objects.equals(value, e.getValue()))
return true;
}
return false;
}
}
/* ---------------- Fields -------------- */
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;
/**
* Holds cached entrySet(). Note that AbstractMap fields are used
* for keySet() and values().
*/
transient Set<Map.Entry<K,V>> entrySet;
/**
* The number of key-value mappings contained in this map.
*/
transient int size;
/**
* The number of times this HashMap has been structurally modified
* Structural modifications are those that change the number of mappings in
* the HashMap or otherwise modify its internal structure (e.g.,
* rehash). This field is used to make iterators on Collection-views of
* the HashMap fail-fast. (See ConcurrentModificationException).
*/
transient int modCount;
/**
* The next size value at which to resize (capacity * load factor).
*
* @serial
*/
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;
/**
* The load factor for the hash table.
*
* @serial
*/
final float loadFactor;
}
public abstract class AbstractMap<K, V> implements java.util.Map<K,V> { /* ... */ }
/**
*
* @param <K> the type of keys maintained by this map
* @param <V> the type of mapped values
*
* @author Josh Bloch
* @see HashMap
* @see TreeMap
* @see Hashtable
* @see SortedMap
* @see Collection
* @see Set
* @since 1.2
*/
public interface Map<K, V> {}
Key points
1. Why can a HashMap key not repeat, while a value can?
Look at how keys and values are exposed by the Map interface at the root of the hierarchy, where both views are declared:
Set<K> keySet();
Collection<V> values();
As the source shows, the keys of a HashMap form a Set (which forbids duplicates), while the values form a plain Collection (which allows them).
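A small sketch of this behavior (the class name KeyUniquenessDemo is illustrative, not from the source):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyUniquenessDemo {
    // Builds a map by putting a duplicate key and a duplicate value.
    static Map<String, String> build() {
        Map<String, String> m = new HashMap<>();
        m.put("a", "1");
        m.put("a", "2");   // same key: the old value "1" is silently replaced
        m.put("b", "2");   // the same value under a different key is fine
        return m;
    }

    public static void main(String[] args) {
        Map<String, String> m = build();
        System.out.println(m.size());   // 2: "a" appears only once
        System.out.println(m.get("a")); // 2: the later put won
    }
}
```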
2. Can a HashMap key or value be null?
This, too, can be traced back to the Map interface's contract.
A HashMap accepts both a null key and null values.
A Hashtable accepts neither null keys nor null values.
In both, keys cannot repeat: putting an entry whose key already exists silently replaces the old value with the new one, without raising an error.
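A quick sketch of the null-handling difference (NullKeyDemo and its method names are illustrative):

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    // HashMap: one null key and any number of null values are accepted.
    static Map<String, String> hashMapWithNulls() {
        Map<String, String> m = new HashMap<>();
        m.put(null, "value-for-null-key"); // the null key is stored in bucket 0
        m.put("k", null);                  // a null value is also accepted
        return m;
    }

    // Hashtable: a null key is rejected with a NullPointerException.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "x");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(hashMapWithNulls().get(null)); // value-for-null-key
        System.out.println(hashtableRejectsNullKey());    // true
    }
}
```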
3. How many key-value pairs can a HashMap hold? What are its maximum and minimum capacities?
In theory it can keep storing entries for as long as memory does not run out, but the Map interface shows that size() can report at most Integer.MAX_VALUE:
/**
* Returns the number of key-value mappings in this map. If the
* map contains more than {@code Integer.MAX_VALUE} elements, returns
* {@code Integer.MAX_VALUE}.
*
* @return the number of key-value mappings in this map
*/
int size();
The size() method returns at most Integer.MAX_VALUE; once the mapping count exceeds that, it still returns that value.
The HashMap source shows the capacity constants:
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
The maximum table capacity is 2^30 (MAXIMUM_CAPACITY), whereas Integer.MAX_VALUE is 2^31 - 1. The initial capacity is 2^4, i.e. 16 buckets.
So the default (minimum) capacity is 16.
DEFAULT_LOAD_FACTOR is the load factor: once the table is 75% full, i.e. the number of entries exceeds capacity * 0.75, the map resizes.
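The resize threshold arithmetic can be sketched as below (ThresholdDemo is an illustrative name; HashMap itself computes the threshold internally in resize()):

```java
public class ThresholdDemo {
    // Resize threshold = capacity * loadFactor, using the constants quoted above.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the defaults, the table grows from 16 to 32 buckets
        // once the 13th mapping is inserted (size exceeds 12).
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
    }
}
```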
4. HashMap lookup performance, with a brief explanation of its hashing scheme.
Lookup cost is largely independent of the element count. Read performance depends mainly on the hashCode() implementation of the keys placed in the map, i.e. on the proportion of hash collisions its return values cause: the more collisions, the worse the performance.
java.util.HashMap is implemented as a standard hash table plus linked lists:
- HashMap holds an array member variable, table
- each element of table is a linked list (used to resolve collisions)
- each node of a list stores one key/value pair
When HashMap.put(key, value) executes:
- HashMap uses key.hashCode() to pick the slot in the table array where the pair is stored
- if that slot already holds a list (a collision), the pair is appended to the list; otherwise it becomes the head of a new list
The hashing scheme:
In other words, a standard hash table is a vertical array whose slots fan out into horizontal lists, much like the A/B/C/D... letter index of an address book: the vertical array is the letter index, and each horizontal list holds the names filed under one letter. The first name filed under C sits at the head of C's list; when further names hash to C, they are chained behind it in the same list.
HashMap.get(key) then performs the same two steps:
- use key.hashCode() to locate the slot in table
- walk the list at that slot; if a key that equals() the argument is found, return its value, otherwise return null
So the number of collisions produced by key.hashCode() determines the map's lookup performance.
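The two-step put/get flow and the effect of collisions can be sketched with a deliberately bad key whose hashCode() is constant (CollisionDemo and BadKey are illustrative names):

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // A deliberately bad key: every instance hashes to the same bucket,
    // so all entries pile up in one chain (treeified once it grows past
    // TREEIFY_THRESHOLD), and lookups degrade toward a scan of that chain.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; } // all instances collide
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    static String lookup(int n, int probe) {
        Map<BadKey, String> m = new HashMap<>();
        for (int i = 0; i < n; i++) m.put(new BadKey(i), "v" + i);
        // get() lands in the single shared bucket, then walks it,
        // comparing with equals() until the matching key is found.
        return m.get(new BadKey(probe));
    }

    public static void main(String[] args) {
        System.out.println(lookup(100, 57)); // "v57": still correct, just slower
    }
}
```

Correctness is preserved under collisions; only performance suffers, which is exactly why key.hashCode() quality matters.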
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;
table is the array of buckets; every key-value pair in the HashMap hangs off this array. The transient modifier excludes the field from default Java serialization (HashMap writes its entries out itself during serialization).
5. Remove HashMap entries through an Iterator; otherwise exceptions are likely.
Modifying a map through its Iterator is the safest route, because the iterators are inner classes of the map and keep their view consistent with it.
If you delete without the iterator, especially under concurrent modification, you can easily hit situations such as a five-element map from which one element has already been removed while a reader still tries to fetch the fifth element; the fail-fast check then throws a ConcurrentModificationException.
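A sketch of both paths, unsafe and safe (IteratorRemoveDemo and its method names are illustrative):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class IteratorRemoveDemo {
    static Map<String, Integer> fresh() {
        Map<String, Integer> m = new HashMap<>();
        for (int i = 0; i < 5; i++) m.put("k" + i, i);
        return m;
    }

    // Removing through the map while iterating trips the fail-fast modCount check.
    static boolean unsafeRemoveThrows() {
        Map<String, Integer> m = fresh();
        try {
            for (String k : m.keySet()) {
                if (k.equals("k2")) m.remove(k); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Iterator.remove() keeps the iterator's expected modCount in sync, so it is safe.
    static Map<String, Integer> safeRemove() {
        Map<String, Integer> m = fresh();
        for (Iterator<Map.Entry<String, Integer>> it = m.entrySet().iterator(); it.hasNext(); ) {
            if (it.next().getValue() == 2) it.remove();
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(unsafeRemoveThrows()); // true
        System.out.println(safeRemove().size());  // 4
    }
}
```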
6. A HashMap key should not itself be a map, but a value can be. (Strictly speaking a map can be used as a key, but because its hashCode() changes whenever its contents change, the entry can become unfindable; nesting a map as a value is common and safe.)
7. If two keys have the same hash value, must the keys be the same?
No. If key1 and key2 have the same hash value, the keys are not necessarily equal: hash values are produced by a hash algorithm, and two distinct keys computing to the same value is what we call a collision.
However, two equal keys always have equal hash values.
Equal keys imply equal hashes; equal hashes do not imply equal keys.
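A concrete instance: the strings "Aa" and "BB" famously share a hash code in Java (HashContractDemo is an illustrative name):

```java
public class HashContractDemo {
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Two distinct keys can share one hash value (a collision)...
        System.out.println(sameHash("Aa", "BB")); // true: both hash to 2112
        System.out.println("Aa".equals("BB"));    // false: still different keys
        // ...but equal keys always share the same hash (the hashCode contract).
        System.out.println(sameHash("Aa", new String("Aa"))); // true
    }
}
```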
8. LruCache and LinkedHashMap
package android.util;
import android.compat.annotation.UnsupportedAppUsage;
import java.util.LinkedHashMap;
import java.util.Map;
public class LruCache<K, V> {
@UnsupportedAppUsage
private final LinkedHashMap<K, V> map;
/** Size of this cache in units. Not necessarily the number of elements. */
private int size;
private int maxSize;
private int putCount;
private int createCount;
private int evictionCount;
private int hitCount;
private int missCount;
/**
* @param maxSize for caches that do not override {@link #sizeOf}, this is
* the maximum number of entries in the cache. For all other caches,
* this is the maximum sum of the sizes of the entries in this cache.
*/
public LruCache(int maxSize) {
if (maxSize <= 0) {
throw new IllegalArgumentException("maxSize <= 0");
}
this.maxSize = maxSize;
this.map = new LinkedHashMap<K, V>(0, 0.75f, true);
}
LinkedHashMap extends HashMap, and its inner class LinkedHashMapEntry extends HashMap's node class (HashMap.Node in this API 30 source; older Android sources call it HashMapEntry), so every field and method of the node class is also available in LinkedHashMapEntry. HashMap's node class in turn implements the interface Entry<K, V> nested inside Map.
public class LinkedHashMap<K,V>
extends HashMap<K,V>
implements Map<K,V>
{
/**
* HashMap.Node subclass for normal LinkedHashMap entries.
*/
static class LinkedHashMapEntry<K,V> extends HashMap.Node<K,V> {
LinkedHashMapEntry<K,V> before, after;
LinkedHashMapEntry(int hash, K key, V value, Node<K,V> next) {
super(hash, key, value, next);
}
}
private static final long serialVersionUID = 3801124242820219131L;
/**
* The head (eldest) of the doubly linked list.
*/
transient LinkedHashMapEntry<K,V> head;
/**
* The tail (youngest) of the doubly linked list.
*/
transient LinkedHashMapEntry<K,V> tail;
/**
* The iteration ordering method for this linked hash map: <tt>true</tt>
* for access-order, <tt>false</tt> for insertion-order.
*
* @serial
*/
final boolean accessOrder;
}
1. The core of LruCache is a LinkedHashMap.
LruCache creates a LinkedHashMap instance internally.
LruCache is essentially a wrapper around LinkedHashMap that applies the LRU policy to an in-memory cache: entries are stored via the LinkedHashMap's put and released via remove.
2. What value should maxSize take in LruCache(int maxSize)?
maxSize is the cache capacity; a commonly recommended size is one eighth of the app's maximum heap.
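The one-eighth rule can be computed from the VM as below. This is a plain-Java sketch (CacheSizeDemo is an illustrative name); on Android the same Runtime call applies, and the unit must match your sizeOf() override:

```java
public class CacheSizeDemo {
    // A common sizing rule: give the cache one eighth of the VM's max heap, in bytes.
    static long suggestedCacheBytes() {
        return Runtime.getRuntime().maxMemory() / 8;
    }

    public static void main(String[] args) {
        long maxSize = suggestedCacheBytes();
        System.out.println(maxSize > 0);
        // On Android this value would be passed to new LruCache<K, V>((int) maxSize),
        // together with a sizeOf() override returning each entry's size in bytes.
    }
}
```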
3. A brief note on the LRU algorithm.
The core idea, taking image caching as an example: cache the bitmaps, and when memory pressure rises, free some of them. Which ones? The oldest cached image, the one that has gone longest without being used, is evicted and released. The eviction triggers a callback, entryRemoved(); this method is protected with an empty default implementation, so a subclass such as MyLruCache needs to override entryRemoved() to actually release the evicted image.
/**
* Called for entries that have been evicted or removed. This method is
* invoked when a value is evicted to make space, removed by a call to
* {@link #remove}, or replaced by a call to {@link #put}. The default
* implementation does nothing.
*
* <p>The method is called without synchronization: other threads may
* access the cache while this method is executing.
*
* @param evicted true if the entry is being removed to make space, false
* if the removal was caused by a {@link #put} or {@link #remove}.
* @param newValue the new value for {@code key}, if it exists. If non-null,
* this removal was caused by a {@link #put} or a {@link #get}. Otherwise it was caused by
* an eviction or a {@link #remove}.
*/
protected void entryRemoved(boolean evicted, K key, V oldValue, V newValue) {}
4. How is the oldest unused element identified?
The element to evict is the least recently used, i.e. the oldest one; how is it determined?
LinkedHashMap supports two orderings.
Insertion order: like a queue, the head of the list holds the element inserted first, which is the oldest; the most recently inserted element sits at the tail.
Access order: every time an element is accessed it moves to the tail of the list, so the head always holds the least recently accessed element. Eviction removes the head, which means the most recently used entries are the last to go.
/**
* The iteration ordering method for this linked hash map: <tt>true</tt>
* for access-order, <tt>false</tt> for insertion-order.
*
* @serial
*/
final boolean accessOrder;
As the LinkedHashMap source shows, the boolean field accessOrder selects the ordering: true means access order, false means insertion order.
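The access-order behavior can be observed directly through LinkedHashMap's three-argument constructor (AccessOrderDemo is an illustrative name):

```java
import java.util.LinkedHashMap;

public class AccessOrderDemo {
    // accessOrder=true: every get()/put() moves the touched entry to the tail,
    // so the head of the linked list is always the least recently used entry.
    static String eldestAfterAccess() {
        LinkedHashMap<String, Integer> m = new LinkedHashMap<>(16, 0.75f, true);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        m.get("a");                          // "a" moves to the tail
        return m.keySet().iterator().next(); // head of the list = eviction candidate
    }

    public static void main(String[] args) {
        System.out.println(eldestAfterAccess()); // "b": "a" was refreshed by the get
    }
}
```

This is exactly why LruCache constructs its map as new LinkedHashMap<K, V>(0, 0.75f, true) in the source above: evicting from the head implements LRU.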
LinkedHashMap is a hash table plus a doubly linked list threaded through all entries.
How do HashMap and LinkedHashMap differ and relate?
HashMap stores entries in an array whose buckets are singly linked lists.
LinkedHashMap adds a doubly linked list across all entries on top of that same array; the extra list is what preserves ordering.
Links:
How much do you know about common uses of Java collections: https://blog.youkuaiyun.com/weixin_35874254/article/details/112366244
Using LruCache, HashMap and LinkedHashMap: https://blog.youkuaiyun.com/WLX10428/article/details/119053394