Java Map Classes: Source Code Analysis

Latest recommended article published on 2022-04-26 00:23:49


License: CC 4.0 BY-SA


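This article takes a close look at Java's Map interface and its main implementations, including HashMap, LinkedHashMap, IdentityHashMap, and TreeMap. It focuses on each class's internal structure, its characteristics, and how insertion and lookup work, in particular HashMap's hash table, LinkedHashMap's doubly linked list, and TreeMap's red-black tree. It also discusses WeakHashMap's weak-reference behavior and its use in caching systems.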


I. Interface APIs

1. Map<K,V>

int size();

boolean isEmpty();

boolean containsKey(Object key);

boolean containsValue(Object value);

V get(Object key);

V put(K key, V value);

V remove(Object key);

void putAll(Map<? extends K, ? extends V> m);

void clear();

Set<K> keySet();

Collection<V> values();

Set<Map.Entry<K, V>> entrySet();

interface Entry<K,V> {

K getKey();

V getValue();

V setValue(V value);

boolean equals(Object o);

int hashCode();

}

boolean equals(Object o);

int hashCode();

2. SortedMap<K,V>

Comparator<? super K> comparator();

SortedMap<K,V> subMap(K fromKey, K toKey);

SortedMap<K,V> headMap(K toKey);

SortedMap<K,V> tailMap(K fromKey);

K firstKey();

K lastKey();

Set<K> keySet();

Collection<V> values();

Set<Map.Entry<K, V>> entrySet();

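A SortedMap holds its key-value pairs in order. As Comparator<? super K> indicates, the order is determined by the keys: by the comparator if one is supplied, otherwise by the keys' natural ordering (in which case comparator() returns null).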
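To make the SortedMap range views listed above concrete, here is a minimal sketch using TreeMap, the standard SortedMap implementation. The class name SortedMapDemo and the sample keys are made up for this illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

class SortedMapDemo {
    // Builds a small map; keys are inserted out of order on purpose.
    static SortedMap<Integer, String> sample() {
        SortedMap<Integer, String> m = new TreeMap<>();
        m.put(30, "c");
        m.put(10, "a");
        m.put(20, "b");
        m.put(40, "d");
        return m;
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> m = sample();
        System.out.println(m.firstKey());      // smallest key
        System.out.println(m.lastKey());       // largest key
        System.out.println(m.headMap(30));     // keys strictly less than 30
        System.out.println(m.tailMap(30));     // keys greater than or equal to 30
        System.out.println(m.subMap(20, 40));  // keys in the half-open range [20, 40)
        System.out.println(m.comparator());    // null: natural ordering is in use
    }
}
```

Note that headMap and subMap exclude their upper bound, while tailMap includes its lower bound.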

3. NavigableMap<K,V>

Map.Entry<K,V> lowerEntry(K key);

K lowerKey(K key);

Map.Entry<K,V> floorEntry(K key);

K floorKey(K key);

Map.Entry<K,V> ceilingEntry(K key);

K ceilingKey(K key);

Map.Entry<K,V> higherEntry(K key);

K higherKey(K key);

Map.Entry<K,V> firstEntry();

Map.Entry<K,V> lastEntry();

Map.Entry<K,V> pollFirstEntry();

Map.Entry<K,V> pollLastEntry();

NavigableMap<K,V> descendingMap();

NavigableSet<K> navigableKeySet();

NavigableSet<K> descendingKeySet();

NavigableMap<K,V> subMap(K fromKey, boolean fromInclusive,

K toKey, boolean toInclusive);

NavigableMap<K,V> headMap(K toKey, boolean inclusive);

NavigableMap<K,V> tailMap(K fromKey, boolean inclusive);

SortedMap<K,V> subMap(K fromKey, K toKey);

SortedMap<K,V> headMap(K toKey);

SortedMap<K,V> tailMap(K fromKey);
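The navigation methods above are easiest to tell apart on a small example. The following sketch uses TreeMap as the NavigableMap; the class name NavigableMapDemo and the sample data are assumptions made for this illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class NavigableMapDemo {
    static NavigableMap<Integer, String> sample() {
        NavigableMap<Integer, String> m = new TreeMap<>();
        m.put(10, "a");
        m.put(20, "b");
        m.put(30, "c");
        return m;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> m = sample();
        System.out.println(m.lowerKey(20));        // greatest key strictly less than 20
        System.out.println(m.floorKey(20));        // greatest key less than or equal to 20
        System.out.println(m.ceilingKey(25));      // least key greater than or equal to 25
        System.out.println(m.higherKey(30));       // least key strictly greater than 30, or null
        System.out.println(m.descendingMap());     // reverse-order view of the same map
        System.out.println(m.headMap(20, true));   // inclusive upper bound
        System.out.println(m.pollFirstEntry());    // removes and returns the first entry
    }
}
```

The two-argument headMap and tailMap and the four-argument subMap let the caller choose whether each bound is inclusive, which the SortedMap versions do not.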

II. Abstract Class Source Analysis

1. AbstractMap<K,V>

public abstract class AbstractMap<K,V> implements Map<K,V>

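1. Entry

AbstractMap provides two nested implementations of Entry: SimpleEntry and SimpleImmutableEntry. The value of the former can be changed; the value of the latter cannot (its setValue throws UnsupportedOperationException). In both classes the key is immutable.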
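The difference between the two entry classes can be checked directly. This is a small sketch; the class name EntryDemo and the helper rejectsSetValue are hypothetical names introduced for the example.

```java
import java.util.AbstractMap;
import java.util.Map;

class EntryDemo {
    // Returns true if the entry refuses to change its value.
    static boolean rejectsSetValue(Map.Entry<String, Integer> entry) {
        try {
            entry.setValue(99);
            return false;
        } catch (UnsupportedOperationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> mutable = new AbstractMap.SimpleEntry<>("k", 1);
        Map.Entry<String, Integer> frozen = new AbstractMap.SimpleImmutableEntry<>("k", 1);

        System.out.println(rejectsSetValue(mutable)); // the value was replaced
        System.out.println(mutable.getValue());
        System.out.println(rejectsSetValue(frozen));  // setValue threw
        System.out.println(frozen.getValue());
    }
}
```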

III. Implementation Class Source Analysis

1. HashMap<K,V>

public class HashMap<K,V>

extends AbstractMap<K,V>

implements Map<K,V>, Cloneable, Serializable

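1. Entry

static class Entry<K,V> implements Map.Entry<K,V>

HashMap's static nested class Entry has four fields (the code quoted in this section is from JDK 7; JDK 8 renamed this class to Node and added tree bins):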

final K key;

V value;

Entry<K,V> next;

int hash;

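Note the next field: it means entries can be linked into a singly linked list.

2. HashMap's data structure

At its core, HashMap is an array of Entry:

transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

Each element of the array is the head of a singly linked list formed through the entries' next fields.

When a new Entry is added to a HashMap, a hash value is first computed from the entry's key: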

int hash = hash(key);

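Note: the hash is an int, and different keys may produce the same hash. Entries whose hashes map to the same index are linked into a chain that is stored in one slot of the table array. That index is computed from the hash.

Next, the hash is converted into the storage index i in table: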

int i = indexFor(hash, table.length);

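HashMap then goes to table[i] and walks that chain of entries. If the key already exists, the old value is overwritten and returned as oldValue.

If the key does not exist, a new Entry is created and placed at the head of the chain in that bucket.

Note that at this point HashMap also checks whether the table array needs to grow. The threshold is capacity * loadFactor, and the array length doubles on each expansion: resize(2 * table.length);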
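The put steps described above (hash, index, chain walk, head insertion, doubling) can be put together in a small sketch. This is not the JDK code: TinyChainedMap, its hash function, and its initial capacity of 4 are simplifications assumed for illustration, and null keys are not supported.

```java
// A minimal sketch of a chained hash table; not the JDK implementation.
class TinyChainedMap<K, V> {
    // One link of a bucket chain: key, value, next pointer, cached hash.
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        final int hash;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private static final float LOAD_FACTOR = 0.75f;
    private Node<K, V>[] table = newTable(4); // capacity stays a power of two
    private int size;

    @SuppressWarnings("unchecked")
    private static <K, V> Node<K, V>[] newTable(int capacity) {
        return (Node<K, V>[]) new Node[capacity];
    }

    // Simplified stand-in for HashMap.hash(); an assumption of this sketch.
    static int hash(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // Maps a hash to a bucket index; valid because length is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    V put(K key, V value) {
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        // Walk the chain: an existing key gets its value overwritten.
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }
        // New key: grow at capacity * loadFactor, then insert at the chain head.
        if (size >= (int) (table.length * LOAD_FACTOR)) {
            resize(2 * table.length);
            i = indexFor(hash, table.length);
        }
        table[i] = new Node<>(hash, key, value, table[i]);
        size++;
        return null;
    }

    V get(Object key) {
        int hash = hash(key);
        for (Node<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                return e.value;
            }
        }
        return null;
    }

    // Re-buckets every node into a table of the new capacity.
    private void resize(int newCapacity) {
        Node<K, V>[] fresh = newTable(newCapacity);
        for (Node<K, V> head : table) {
            while (head != null) {
                Node<K, V> next = head.next;
                int i = indexFor(head.hash, newCapacity);
                head.next = fresh[i];
                fresh[i] = head;
                head = next;
            }
        }
        table = fresh;
    }

    int size() { return size; }

    int capacity() { return table.length; }
}
```

Because the capacity is always a power of two, hash & (length - 1) is equivalent to a non-negative modulo, which is why the index computation needs no division.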

3. How is HashMap.keySet() implemented?

(1) First, the source of keySet():

public Set<K> keySet() {

Set<K> ks = keySet;

return (ks != null ? ks : (keySet = new KeySet()));

}

The keySet field is declared in AbstractMap, HashMap's superclass:

/**

* Each of these fields are initialized to contain an instance of the

* appropriate view the first time this view is requested. The views are

* stateless, so there's no reason to create more than one of each.

*/

transient volatile Set<K> keySet = null;

transient volatile Collection<V> values = null;

Neither field is actually assigned a view here.

So the answer must lie in keySet = new KeySet().

(2) Next, the implementation of the KeySet class

private final class KeySet extends AbstractSet<K> {

public Iterator<K> iterator() {

return newKeyIterator();

}

public int size() {

return size;

}

public boolean contains(Object o) {

return containsKey(o);

}

public boolean remove(Object o) {

return HashMap.this.removeEntryForKey(o) != null;

}

public void clear() {

HashMap.this.clear();

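}

}

Notice that neither KeySet nor any of its superclasses has a field that holds a copy of the keys kept in sync with the map. At this point it should be clear that we have been asking the wrong question: the assumption that keySet() returns a stored collection of keys is itself wrong. So where do the keys come from? See the analysis below.

(3) KeySet's toString() method (inherited from AbstractCollection)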

public String toString() {

Iterator<E> it = iterator();

if (! it.hasNext())

return "[]";

StringBuilder sb = new StringBuilder();

sb.append('[');

for (;;) {

E e = it.next();

sb.append(e == this ? "(this Collection)" : e);

if (! it.hasNext())

return sb.append(']').toString();

sb.append(',').append(' ');

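}

}

Printing an object actually calls its toString() method, and toString() turns out to be implemented on top of iterator().

(4) Is the question answered?

Clearly not yet. Printing the keySet goes through toString(), but an enhanced for loop also yields the keySet's elements. The explanation is the same: an enhanced for loop over a Set is compiled into calls to the set's iterator() method, which here is KeySet.iterator().

With that, the question is answered.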
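The claim that an enhanced for loop is just iterator() in disguise can be checked with a short sketch. The class and method names below are made up for the example; the second method is roughly what the compiler produces for the first.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class KeySetIterationDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        return map;
    }

    // Enhanced for loop over the key view.
    static String joinWithForEach(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (String key : map.keySet()) {
            sb.append(key).append(';');
        }
        return sb.toString();
    }

    // Roughly what the compiler generates for the loop above.
    static String joinWithIterator(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = map.keySet().iterator(); it.hasNext(); ) {
            String key = it.next();
            sb.append(key).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(joinWithForEach(map));
        System.out.println(joinWithIterator(map));
        System.out.println(map.keySet()); // println calls toString(), which also iterates
    }
}
```

Both methods visit the keys in the same order because both walk the same table through the same iterator implementation.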

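(5) The KeySet iterator (which extends HashIterator)

(6) Reflection: why is it designed this way?

Back to where we started (see the Javadoc comment below):

The keySet field is declared in AbstractMap, HashMap's superclass: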

/**

* Each of these fields are initialized to contain an instance of the

* appropriate view the first time this view is requested. The views are

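* stateless, so there's no reason to create more than one of each.

*/

transient volatile Set<K> keySet = null;

transient volatile Collection<V> values = null;

The JDK designers treat entries, keys, and values as different views of the same HashMap. Since all three look at the same underlying object, there is no need to allocate extra memory for the keys or values; each view just reads HashMap's table in its own way. HashMap's values() and entrySet() are implemented the same way, and all three share the same iterator base class, HashIterator. They differ only in what next() returns: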

private final class KeyIterator extends HashIterator<K> {

public K next() {

return nextEntry().getKey();

}

}

private final class ValueIterator extends HashIterator<V> {

public V next() {

return nextEntry().value;

}

}

private final class EntryIterator extends HashIterator<Map.Entry<K,V>> {

public Map.Entry<K,V> next() {

return nextEntry();

}

}
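Because the key set is a view rather than a copy, changes flow in both directions between it and the map. The sketch below shows this with the public API only; KeySetViewDemo and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class KeySetViewDemo {
    // A key added after the view was obtained is still visible through the view.
    static boolean viewSeesLaterPut() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        Set<String> keys = map.keySet();
        map.put("b", 2);
        return keys.contains("b") && keys.size() == 2;
    }

    // Removing through the view removes the mapping from the map itself.
    static boolean removeThroughViewAffectsMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.keySet().remove("a");
        return map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(viewSeesLaterPut());
        System.out.println(removeThroughViewAffectsMap());
    }
}
```

This matches the KeySet source quoted earlier: contains delegates to containsKey, size reads the map's size field, and remove calls back into the enclosing HashMap.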

2. LinkedHashMap<K,V>

public class LinkedHashMap<K,V>

extends HashMap<K,V>

implements Map<K,V>

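1. LinkedHashMap extends HashMap, so it has all of HashMap's characteristics.

2. LinkedHashMap additionally maintains a closed, circular doubly linked list through its entries, which gives it a predictable iteration order.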

private transient Entry<K,V> header;

@Override

void init() {

header = new Entry<>(-1, null, null, null);

header.before = header.after = header;

}

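3. Note that although LinkedHashMap maintains this circular doubly linked list, the entries are not stored twice in memory. The list is implemented by adding two fields, before and after, to Entry.

4. Two iteration orders are supported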

/**

* The iteration ordering method for this linked hash map: <tt>true</tt>

* for access-order, <tt>false</tt> for insertion-order.

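* @serial

*/

private final boolean accessOrder;

When accessOrder is false, iteration follows insertion order, which is straightforward. LinkedHashMap overrides addEntry().

When accessOrder is true, iteration follows access order, from least recently accessed to most recently accessed. Entries are still linked in insertion order to begin with; whenever an entry is accessed, it is moved to the end of the list.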
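Access order is what makes LinkedHashMap a convenient base for a small LRU cache. The following is a minimal sketch using the public three-argument constructor and the removeEldestEntry hook; the class names LruCache and LruCacheDemo and the capacity of 2 are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // third argument: accessOrder = true
        this.maxEntries = maxEntries;
    }

    // Called by put after an insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class LruCacheDemo {
    static String keysAfterEviction() {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // "a" becomes the most recently accessed entry
        cache.put("c", 3);  // exceeds maxEntries, so the eldest entry ("b") is dropped
        return cache.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(keysAfterEviction());
    }
}
```

With accessOrder left at false, the same sequence would evict "a" instead, because the get call would not change the order.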

3. IdentityHashMap<K,V>

public class IdentityHashMap<K,V>

extends AbstractMap<K,V>

implements Map<K,V>, java.io.Serializable, Cloneable

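1. IdentityHashMap differs from HashMap as follows: HashMap treats two keys as duplicates when they are equal according to equals(), whereas IdentityHashMap treats them as duplicates only when key1 == key2 (that is, when they are the same reference). In other words, an IdentityHashMap can hold "duplicate" keys that are equal in value but are distinct objects.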
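A quick way to see the difference is to insert two keys that are equal but not identical into each map. The class name IdentityDemo and the helper method are hypothetical; new String(...) is used deliberately to get two distinct objects.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityDemo {
    // Puts two equal-but-distinct keys and reports how many entries result.
    static int sizeAfterTwoEqualKeys(Map<String, Integer> map) {
        String k1 = new String("key"); // two distinct objects...
        String k2 = new String("key"); // ...that are equals() to each other
        map.put(k1, 1);
        map.put(k2, 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoEqualKeys(new HashMap<>()));         // keys collapse into one entry
        System.out.println(sizeAfterTwoEqualKeys(new IdentityHashMap<>())); // keys stay separate
    }
}
```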

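2. IdentityHashMap stores its data in a single Object[] whose length is a power of two. The interesting part is the layout: the other hash maps use an Entry as the basic storage unit, whereas IdentityHashMap stores the key and the value as two elements in adjacent slots of table.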

tab[i] = k;

tab[i + 1] = value;

3. The source of put():

public V put(K key, V value) {

Object k = maskNull(key);

Object[] tab = table;

int len = tab.length;

int i = hash(k, len);

Object item;

// if the key already exists, the new value overwrites the old one

while ( (item = tab[i]) != null) {

if (item == k) {

V oldValue = (V) tab[i + 1];

tab[i + 1] = value;

return oldValue;

}

i = nextKeyIndex(i, len);

}

// if the key does not exist, add it

modCount++;

tab[i] = k;

tab[i + 1] = value;

if (++size >= threshold)

resize(len); // len == 2 * current capacity.

return null;

}

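The source shows that put() first computes an index i from the key and the table length, then compares the keys at i, i+2, i+4, ... (wrapping around to 0 at the end of the table) with the key being stored. If one of them is the same reference, the new value replaces the old one; otherwise the search moves on to the next even index. When a null key slot is found, the table does not contain the key, so the while loop ends and the new pair is added there.

4. Lookup works the same way: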

public V get(Object key) {

Object k = maskNull(key);

Object[] tab = table;

int len = tab.length;

int i = hash(k, len);

while (true) {

Object item = tab[i];

if (item == k)

return (V) tab[i + 1];

if (item == null)

return null;

i = nextKeyIndex(i, len);

}

}

get() computes i, then steps by 2 until it either finds the key or reaches a null slot.
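The flat key/value layout and the step-by-2 probing can be reproduced in a small standalone sketch. TinyIdentityTable is a hypothetical class: it uses a fixed 16-slot array, a simplified hash, no resizing, and no null-key support, so it only illustrates the probing scheme, not the JDK class.

```java
import java.util.Objects;

class TinyIdentityTable {
    // Keep at least one key slot empty so that every probe sequence terminates.
    private static final int MAX_ENTRIES = 7;
    // Keys at even indices, values at the following odd indices.
    private final Object[] tab = new Object[16];
    private int size;

    // Simplified hash (an assumption of this sketch): an even index below len.
    static int hash(Object k, int len) {
        return (System.identityHashCode(k) << 1) & (len - 1);
    }

    // Steps to the next key slot, wrapping around at the end of the array.
    static int nextKeyIndex(int i, int len) {
        return (i + 2 < len) ? i + 2 : 0;
    }

    Object put(Object key, Object value) {
        Objects.requireNonNull(key, "null keys are not supported in this sketch");
        int len = tab.length;
        int i = hash(key, len);
        Object item;
        while ((item = tab[i]) != null) {
            if (item == key) {           // reference comparison, not equals()
                Object oldValue = tab[i + 1];
                tab[i + 1] = value;
                return oldValue;
            }
            i = nextKeyIndex(i, len);
        }
        if (size == MAX_ENTRIES) {
            throw new IllegalStateException("table full; this sketch does not resize");
        }
        tab[i] = key;
        tab[i + 1] = value;
        size++;
        return null;
    }

    Object get(Object key) {
        int len = tab.length;
        int i = hash(key, len);
        while (true) {
            Object item = tab[i];
            if (item == key) {
                return tab[i + 1];
            }
            if (item == null) {
                return null;
            }
            i = nextKeyIndex(i, len);
        }
    }
}
```

Storing pairs inline avoids allocating an Entry object per mapping, at the cost of probing on collisions instead of chaining.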

4. TreeMap<K,V>

public class TreeMap<K,V>

extends AbstractMap<K,V>

implements NavigableMap<K,V>, Cloneable, java.io.Serializable

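1. Distinguishing the Comparator<T> and Comparable<T> interfaces

As the names suggest, Comparator<T> is a comparator: a separate object dedicated to comparing instances of T (including instances of T's subclasses). Comparable<T> means "able to be compared": it is a capability of the class itself, and objects of any class that implements it can be compared with each other through its comparison method. The methods each interface declares:

public interface Comparator<T> {

int compare(T o1, T o2); // compares two objects

boolean equals(Object obj);

}

public interface Comparable<T> {

public int compareTo(T o); // compares this object with another object

}

2. Now look at two of TreeMap's constructors: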

/**

* Constructs a new, empty tree map, using the natural ordering of its

* keys. All keys inserted into the map must implement the {@link

* Comparable} interface. Furthermore, all such keys must be

* <em>mutually comparable</em>: {@code k1.compareTo(k2)} must not throw

* a {@code ClassCastException} for any keys {@code k1} and

* {@code k2} in the map. If the user attempts to put a key into the

* map that violates this constraint (for example, the user attempts to

* put a string key into a map whose keys are integers), the

* {@code put(Object key, Object value)} call will throw a

* {@code ClassCastException}.

*/

public TreeMap() {

comparator = null;

}

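TreeMap provides a no-argument constructor that sets the comparator field to null. Used this way, TreeMap does not need a comparator, but every key put into it must implement the Comparable interface, as the Javadoc comment above states.

The second constructor: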

/**

* Constructs a new, empty tree map, ordered according to the given

* comparator. All keys inserted into the map must be <em>mutually

* comparable</em> by the given comparator: {@code comparator.compare(k1,

* k2)} must not throw a {@code ClassCastException} for any keys

* {@code k1} and {@code k2} in the map. If the user attempts to put

* a key into the map that violates this constraint, the {@code put(Object

* key, Object value)} call will throw a

* {@code ClassCastException}.

* @param comparator the comparator that will be used to order this map.

* If {@code null}, the {@linkplain Comparable natural

* ordering} of the keys will be used.

*/

public TreeMap(Comparator<? super K> comparator) {

this.comparator = comparator;

}

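This constructor takes a comparator that is used to order the keys put into the map, as its Javadoc comment states.

In other words, TreeMap has two ways of comparing keys: with the comparator passed in, or, if none was passed, through the Comparable interface implemented by the keys. TreeMap's internal compare method shows the same thing: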

final int compare(Object k1, Object k2) {

return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)

: comparator.compare((K)k1, (K)k2);

}
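The two comparison paths can be exercised from the outside. The sketch below builds one TreeMap with natural ordering and one with an explicit comparator, and also shows the ClassCastException that the no-argument constructor's Javadoc warns about; TreeMapOrderingDemo and its method names are hypothetical.

```java
import java.util.Comparator;
import java.util.TreeMap;

class TreeMapOrderingDemo {
    private static void fill(TreeMap<String, Integer> m) {
        m.put("banana", 1);
        m.put("apple", 2);
        m.put("cherry", 3);
    }

    // No comparator: keys are ordered through String's Comparable implementation.
    static String naturalOrderKeys() {
        TreeMap<String, Integer> m = new TreeMap<>();
        fill(m);
        return m.keySet().toString();
    }

    // Explicit comparator: the same keys in reverse order.
    static String reverseOrderKeys() {
        TreeMap<String, Integer> m = new TreeMap<>(Comparator.<String>reverseOrder());
        fill(m);
        return m.keySet().toString();
    }

    // A key that is not Comparable is rejected when no comparator is supplied.
    static boolean rejectsNonComparableKey() {
        TreeMap<Object, Integer> m = new TreeMap<>();
        try {
            m.put(new Object(), 1);
            return false;
        } catch (ClassCastException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(naturalOrderKeys());
        System.out.println(reverseOrderKeys());
        System.out.println(rejectsNonComparableKey());
    }
}
```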

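3. Before looking at TreeMap itself, we need to understand the structure of a red-black tree.

A red-black tree is a kind of binary search tree. Starting from a root node (TreeMap's root field), every node, including the root, has up to two children: a left child and a right child. The left child is smaller than its parent and the right child is larger. Note that "left" and "right" here refer to a node's own children, not to its siblings on the same level. In the original illustration, for example, the left node of 17 is 15, not 8.

Next, the Entry class: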

static final class Entry<K,V> implements Map.Entry<K,V> {

K key;

V value;

Entry<K,V> left = null;

Entry<K,V> right = null;

Entry<K,V> parent;

boolean color = BLACK;

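Each Entry is a node of the red-black tree. It holds its own key-value pair, its parent node (parent), its left child (left), its right child (right), and its color (color). Again, left and right are the node's own children.

4. The put method and what it reveals about TreeMap's data structure:

put handles roughly three cases: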

public V put(K key, V value) {

Entry<K,V> t = root;

Case 1: root is null. A new Entry is created and assigned to root.

if (t == null) {

compare(key, key); // type (and possibly null) check

root = new Entry<>(key, value, null);

size = 1;

modCount++;

return null;

}

int cmp;

Entry<K,V> parent;

// split comparator and comparable paths

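Comparator<? super K> cpr = comparator;

Case 2: the key already exists. The method walks down from root until it finds an equal key (note that TreeMap decides whether a key exists through compare, not equals) and simply overwrites the old value with the new one. If no equal key is found, the loop only records in parent the node under which the new Entry belongs, in preparation for case 3.

Note that case 2 has two branches: one for a non-null comparator and one for a null comparator.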

if (cpr != null) {

do {

parent = t;

cmp = cpr.compare(key, t.key);

if (cmp < 0)

t = t.left;

else if (cmp > 0)

t = t.right;

else

return t.setValue(value);

} while (t != null);

}

else {

if (key == null)

throw new NullPointerException();

Comparable<? super K> k = (Comparable<? super K>) key;

do {

parent = t;

cmp = k.compareTo(t.key);

if (cmp < 0)

t = t.left;

else if (cmp > 0)

t = t.right;

else

return t.setValue(value);

} while (t != null);

}

Case 3: the key does not exist. Using the parent recorded in case 2, the new Entry is attached to the red-black tree.

// create the new Entry

Entry<K,V> e = new Entry<>(key, value, parent);

if (cmp < 0)

parent.left = e;

else

parent.right = e;

fixAfterInsertion(e);

size++;

modCount++;

return null;

}

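At this point the new key-value pair sits at the correct position in the tree, but the job is not finished. The steps so far only guarantee correct ordering, not balance: all the data could end up hanging off one side of root, and such an unbalanced tree makes lookups slow. So how does TreeMap keep its red-black tree balanced?

It does so in fixAfterInsertion(e); see the left and right rotations of red-black trees.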
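To show why rebalancing matters and what a rotation does, here is a deliberately simplified sketch. It is not TreeMap's fixAfterInsertion, which also recolors nodes and maintains parent pointers; RotationDemo, its Node class, and the int keys are assumptions made for illustration.

```java
class RotationDemo {
    static final class Node {
        final int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    // Plain binary search tree insert with no rebalancing.
    static Node insert(Node root, int key) {
        if (root == null) {
            return new Node(key);
        }
        if (key < root.key) {
            root.left = insert(root.left, key);
        } else if (key > root.key) {
            root.right = insert(root.right, key);
        }
        return root;
    }

    // Number of nodes on the longest path from this node down to a leaf.
    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Left rotation: p's right child r becomes the subtree root and p becomes
    // r's left child. The in-order sequence of keys is unchanged.
    // The caller must ensure that p.right is not null.
    static Node rotateLeft(Node p) {
        Node r = p.right;
        p.right = r.left;
        r.left = p;
        return r;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k = 1; k <= 3; k++) {
            root = insert(root, k);   // sorted input degenerates into a right-leaning chain
        }
        System.out.println(height(root));
        root = rotateLeft(root);      // 2 becomes the root, with children 1 and 3
        System.out.println(height(root));
    }
}
```

A right rotation is the mirror image. A red-black tree applies such rotations together with recoloring after each insertion so that the height stays logarithmic in the number of entries.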

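5. The containsValue() method

Because TreeMap's red-black tree is ordered by key, checking whether a value exists cannot follow a single path down from the root; every node has to be visited in some traversal order. TreeMap traverses the keys in ascending order. The successor() method finds the entry with the next key after the current one in that order.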

static <K,V> TreeMap.Entry<K,V> successor(Entry<K,V> t) {

if (t == null)

return null;

else if (t.right != null) {

Entry<K,V> p = t.right;

while (p.left != null)

p = p.left;

return p;

} else {

Entry<K,V> p = t.parent;

Entry<K,V> ch = t;

while (p != null && ch == p.right) {

ch = p;

p = p.parent;

}

return p;

}

}
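The traversal that containsValue() relies on can be sketched on a minimal tree with parent links. SuccessorDemo and its Node class are hypothetical stand-ins for TreeMap and its Entry; the tree here is not balanced, and only the successor logic mirrors the method quoted above.

```java
class SuccessorDemo {
    static final class Node {
        final int key;
        final String value;
        Node left;
        Node right;
        Node parent;

        Node(int key, String value, Node parent) {
            this.key = key;
            this.value = value;
            this.parent = parent;
        }
    }

    // Unbalanced insert that maintains parent links; returns the root.
    static Node insert(Node root, int key, String value) {
        if (root == null) {
            return new Node(key, value, null);
        }
        Node t = root;
        Node parent;
        int cmp;
        do {
            parent = t;
            cmp = Integer.compare(key, t.key);
            if (cmp == 0) {
                return root; // duplicate keys are ignored in this sketch
            }
            t = cmp < 0 ? t.left : t.right;
        } while (t != null);
        Node e = new Node(key, value, parent);
        if (cmp < 0) {
            parent.left = e;
        } else {
            parent.right = e;
        }
        return root;
    }

    // Leftmost node, which holds the smallest key.
    static Node first(Node root) {
        Node p = root;
        if (p != null) {
            while (p.left != null) {
                p = p.left;
            }
        }
        return p;
    }

    // Same logic as the successor() quoted above.
    static Node successor(Node t) {
        if (t == null) {
            return null;
        } else if (t.right != null) {
            Node p = t.right;
            while (p.left != null) {
                p = p.left;
            }
            return p;
        } else {
            Node p = t.parent;
            Node ch = t;
            while (p != null && ch == p.right) {
                ch = p;
                p = p.parent;
            }
            return p;
        }
    }

    // Linear scan in ascending key order; this sketch expects a non-null value.
    static boolean containsValue(Node root, String value) {
        for (Node e = first(root); e != null; e = successor(e)) {
            if (value.equals(e.value)) {
                return true;
            }
        }
        return false;
    }

    static String keysInOrder(Node root) {
        StringBuilder sb = new StringBuilder();
        for (Node e = first(root); e != null; e = successor(e)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.key);
        }
        return sb.toString();
    }
}
```

The scan visits every node once, so looking up a value costs time proportional to the size of the map, unlike looking up a key.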

5. WeakHashMap<K,V>

public class WeakHashMap<K,V>

extends AbstractMap<K,V>

implements Map<K,V>

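1. Weak references

Java has four kinds of references: strong, soft, weak, and phantom.

Strong references are the common kind, for example Object obj = new Object();. As long as the reference exists, the object will not be reclaimed by the GC.

Soft reference:

SoftReference<String> sr = new SoftReference<String>(new String("hello"));

A softly reachable object is reclaimed when memory runs low: the GC clears all soft references before the JVM throws an OutOfMemoryError (OOM), even though the SoftReference object itself still exists.

Weak reference:

WeakReference<String> sr = new WeakReference<String>(new String("hello"));

A weakly reachable object is reclaimed whenever the GC runs, even if memory is not running low.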
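The behavior of a weak reference can be observed with a short sketch. Only the first method is deterministic; what System.gc() does in main is up to the JVM, so no particular output is promised there. WeakRefDemo is a hypothetical class name.

```java
import java.lang.ref.WeakReference;

class WeakRefDemo {
    // While a strong reference exists, the GC must not clear the weak reference.
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();
        return weak.get() == strong; // 'strong' is still in use here, so the object is reachable
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileStronglyHeld());

        // No strong reference is kept to the referent this time.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // only a hint; the JVM may or may not collect right away
        System.out.println(weak.get() == null ? "cleared" : "still reachable");
    }
}
```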

2. The static nested class Entry

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V>

The fields and constructor of Entry:

V value;

int hash;

Entry<K,V> next;

/**

* Creates new entry.

*/

Entry(Object key, V value, ReferenceQueue<Object> queue, int hash, Entry<K,V> next){

super(key, queue);

this.value = value;

this.hash = hash;

this.next = next;

}

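Entry extends WeakReference, and its constructor calls the WeakReference constructor super(key, queue), which makes the key weakly referenced. The value, however, is not weakly referenced; it is still a strong reference, stored in the Entry's value field.

I have not yet read the source of the Reference class hierarchy, so this is only a rough description. The call to super(key, queue) stores the key as a weak reference (the private T referent field in the Reference class) and registers the queue. When the GC runs and the key object is no longer strongly reachable, the object is reclaimed and the reference to it (the Entry) is placed in queue.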
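The interplay between a weak entry and its queue can be sketched without depending on GC timing. TinyWeakKeyStore below is a hypothetical, heavily simplified stand-in for WeakHashMap: it keeps entries in a plain list instead of a hash table, and its expungeStaleEntries method only illustrates the idea of draining the queue to drop entries whose keys are gone.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class TinyWeakKeyStore<V> {
    // The entry itself is the weak reference to the key; the value is held strongly.
    static final class Entry<V> extends WeakReference<Object> {
        final V value;

        Entry(Object key, V value, ReferenceQueue<Object> queue) {
            super(key, queue);
            this.value = value;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final List<Entry<V>> entries = new ArrayList<>();

    Entry<V> put(Object key, V value) {
        expungeStaleEntries();
        Entry<V> e = new Entry<>(key, value, queue);
        entries.add(e);
        return e;
    }

    int size() {
        expungeStaleEntries();
        return entries.size();
    }

    // Drains the queue and drops every entry whose key has been reclaimed.
    private void expungeStaleEntries() {
        for (Object ref; (ref = queue.poll()) != null; ) {
            entries.remove(ref); // Reference does not override equals, so this matches by identity
        }
    }
}
```

In a test, clearing and enqueueing an entry by hand with clear() and enqueue() simulates what the GC does once the key is no longer strongly reachable, after which size() no longer counts that entry.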