ConcurrentHashMap Illustrated: Understanding How It Controls Concurrency at the Source-Code Level

every__day

Last modified: 2022-03-16 20:48:02

First published: 2021-03-10 23:20:22

Article link: https://blog.youkuaiyun.com/every__day/article/details/114293107


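In the post before last, "The Basic Principles of HashMap and ConcurrentHashMap",

I gave a broad overview of the data structure these two maps share.

In the previous post, "HashMap Source Code Analysis", I walked through the HashMap source in detail.

This post analyzes the ConcurrentHashMap source, focusing on where it differs from HashMap.

If you are not familiar with those two posts, go read them first.

The source code here is taken from Java 1.8.

~~Fair warning: this analysis is extremely detailed, and the post is very long!~~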

1. Adding an Element


  public V put(K key, V value) {
      return putVal(key, value, false);
  }

  /** Implementation for put and putIfAbsent */
  final V putVal(K key, V value, boolean onlyIfAbsent) {
      if (key == null || value == null) throw new NullPointerException();
      int hash = spread(key.hashCode());
      int binCount = 0;
      for (Node<K,V>[] tab = table;;) {
          Node<K,V> f; int n, i, fh;
          if (tab == null || (n = tab.length) == 0)
              tab = initTable();
          else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
              if (casTabAt(tab, i, null,
                           new Node<K,V>(hash, key, value, null)))
                  break;                   // no lock when adding to empty bin
          }
          else if ((fh = f.hash) == MOVED)
              tab = helpTransfer(tab, f);
          else {
              V oldVal = null;
              synchronized (f) {
                  if (tabAt(tab, i) == f) {
                      if (fh >= 0) {
                          binCount = 1;
                          for (Node<K,V> e = f;; ++binCount) {
                              K ek;
                              if (e.hash == hash &&
                                  ((ek = e.key) == key ||
                                   (ek != null && key.equals(ek)))) {
                                  oldVal = e.val;
                                  if (!onlyIfAbsent)
                                      e.val = value;
                                  break;
                              }
                              Node<K,V> pred = e;
                              if ((e = e.next) == null) {
                                  pred.next = new Node<K,V>(hash, key,
                                                            value, null);
                                  break;
                              }
                          }
                      }
                      else if (f instanceof TreeBin) {
                          Node<K,V> p;
                          binCount = 2;
                          if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                         value)) != null) {
                              oldVal = p.val;
                              if (!onlyIfAbsent)
                                  p.val = value;
                          }
                      }
                  }
              }
              if (binCount != 0) {
                  if (binCount >= TREEIFY_THRESHOLD)
                      treeifyBin(tab, i);
                  if (oldVal != null)
                      return oldVal;
                  break;
              }
          }
      }
      addCount(1L, binCount);
      return null;
  }

if (key == null || value == null) throw new NullPointerException();

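This line shows one difference from HashMap.

In ConcurrentHashMap neither the key nor the value may be null, whereas HashMap has no such restriction (it allows one null key and any number of null values).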
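
The difference is easy to check from the outside. The following is a minimal sketch using only the public `java.util` API; the class name `NullKeyDemo` and the helper `accepts` are made up for illustration and are not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NullKeyDemo {
    /** Returns true if the map accepts the pair, false if put throws NullPointerException. */
    static boolean accepts(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // HashMap tolerates a null key and null values
        System.out.println("HashMap, null key:             " + accepts(new HashMap<String, String>(), null, "v"));
        System.out.println("HashMap, null value:           " + accepts(new HashMap<String, String>(), "k", null));
        // ConcurrentHashMap rejects both, because of the check at the top of putVal
        System.out.println("ConcurrentHashMap, null key:   " + accepts(new ConcurrentHashMap<String, String>(), null, "v"));
        System.out.println("ConcurrentHashMap, null value: " + accepts(new ConcurrentHashMap<String, String>(), "k", null));
    }
}
```

The first two lines print `true` and the last two print `false`.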


  int hash = spread(key.hashCode());

  static final int spread(int h) {
      return (h ^ (h >>> 16)) & HASH_BITS;
  }
  static final int HASH_BITS = 0x7fffffff; // usable bits of normal node hash

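This is its hash-spreading function. The array index is then derived from its result, as covered in the HashMap source analysis.

If that is unclear, see "hash & (n - 1)".

HASH_BITS = 0x7fffffff; in binary this is 31 ones, with the sign bit cleared.

ANDing with it therefore always yields a result greater than or equal to 0. This is important!

The hash of a normal node is never negative; negative hash values are reserved for special nodes (for example MOVED = -1, seen below).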
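
To make the "never negative" claim concrete, here is a small sketch that copies `spread` and `HASH_BITS` from the source above and feeds it a few edge cases. The class name `SpreadDemo` and the helper `indexFor` are illustrative names, not JDK members.

```java
class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // same constant as in ConcurrentHashMap

    /** Same formula as ConcurrentHashMap.spread: mix the high bits down, then clear the sign bit. */
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    /** Index into a table of length n, where n is a power of two. */
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : samples) {
            int s = spread(h);
            System.out.println(h + " -> spread " + s + ", index in a 16-slot table: " + indexFor(s, 16));
        }
    }
}
```

Note that `spread(0)` is exactly 0, which is why the guarantee is "not negative" rather than "greater than 0".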

Initialization

  if (tab == null || (n = tab.length) == 0)
      tab = initTable();

This branch handles an empty table by initializing it, the job that resize() did on first use in the previous post. Details come later.

Target slot is empty: set it directly


   else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
       if (casTabAt(tab, i, null,
                    new Node<K,V>(hash, key, value, null)))
           break;                   // no lock when adding to empty bin
   }

(n - 1) & hash is what computes the index, as covered in the previous post.


   static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
       return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
   }

   static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                       Node<K,V> c, Node<K,V> v) {
       return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
   }

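Both methods go straight to the Unsafe class.

tabAt returns the element at the given index of the array,

casTabAt sets the value at the given index using CAS.

Let's look more closely: what does ((long)i << ASHIFT) + ABASE compute?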

    Class<?> ak = Node[].class;
    ABASE = U.arrayBaseOffset(ak); // offset of element 0 from the start of the array object
    int scale = U.arrayIndexScale(ak); // size of one element in bytes (e.g. 4 for int[], 8 for long[])
    if ((scale & (scale - 1)) != 0)
        throw new Error("data type scale not a power of two");
    ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);

    ((long)i << ASHIFT) + ABASE  // equivalent to the array addressing formula

In "Why Do Array Indexes Start at 0", I mentioned that

the array addressing formula is a[i]_address = base_address + i * data_type_size.


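  import java.lang.reflect.Field;
  import sun.misc.Unsafe;

  public class AddressDemo {
      public static void main(String[] args) throws Exception {
          // Obtain the Unsafe singleton via reflection (fine on JDK 8)
          Field f = Unsafe.class.getDeclaredField("theUnsafe");
          f.setAccessible(true);
          Unsafe U = (Unsafe) f.get(null);
          Class<String[]> ak = String[].class;
          int base = U.arrayBaseOffset(ak);
          System.out.println("base:" + base);   // 16 on the author's JVM: elements start at byte 16
          int scale = U.arrayIndexScale(ak);
          System.out.println("scale:" + scale); // 4 on the author's JVM: each reference takes 4 bytes
          int shift = 31 - Integer.numberOfLeadingZeros(scale);
          System.out.println("shift:" + shift); // 2 when scale is 4, since 4 == 1 << 2
          for (int i = 0; i < 5; i++) {
              long result = ((long) i << shift) + base; // i times 4, plus base
              System.out.println("result:" + result);
          }
      }
  }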

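I wrote this demo to mimic the process, using a String array.

((long) i << shift) + base; in this example is (i << 2) + 16. The parentheses matter, because in Java + binds tighter than <<.

The addressing formula gives 16 + i * 4, and the two are equivalent.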

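Why is base 16?

In an array object, the mark word takes 8 bytes, the class pointer 4 bytes (with compressed class pointers, the default on 64-bit HotSpot), and the array length 4 bytes. So the data starts at offset 16.

When you new an array object, a contiguous block of memory is allocated:

first comes the object header (mark word and class pointer), then the length, and only after that the data.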

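After all that rambling: (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE)

is essentially the addressing formula, i.e. tab[i].

HashMap uses tab[i]: simple and clear.

ConcurrentHashMap uses native Object getObjectVolatile(Object var1, long var2);

Both read the element at a given array index; the latter is chosen with concurrency in mind.

transient volatile Node<K,V>[] table; is declared volatile, but that only makes the array reference itself visible across threads.

Reads and writes of individual array elements get no volatile semantics from that declaration.

So ConcurrentHashMap uses a lower-level method to read each element with volatile semantics.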
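
For experimenting with the same two operations without touching Unsafe, `java.util.concurrent.atomic.AtomicReferenceArray` offers a volatile per-element `get` and a per-element `compareAndSet`. This is only an analogy for tabAt and casTabAt, not what ConcurrentHashMap itself uses; the class name `VolatileSlotDemo` and the helper `claim` are made up for this sketch.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class VolatileSlotDemo {
    /** Tries to claim an empty slot, like casTabAt(tab, i, null, node); true only for the winner. */
    static boolean claim(AtomicReferenceArray<String> slots, int i, String value) {
        return slots.compareAndSet(i, null, value);
    }

    public static void main(String[] args) {
        AtomicReferenceArray<String> slots = new AtomicReferenceArray<>(4);
        System.out.println(claim(slots, 1, "first"));  // true: the slot was empty
        System.out.println(claim(slots, 1, "second")); // false: the slot is already taken
        System.out.println(slots.get(1));              // first (volatile read, like tabAt)
    }
}
```

The second `claim` fails in the same way a losing casTabAt does in putVal: the caller has to look at the slot again and take another branch.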

Inserting while a resize is in progress

  else if ((fh = f.hash) == MOVED)
      tab = helpTransfer(tab, f);
      
  static final int MOVED     = -1; // hash for forwarding nodes

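For now, remember this: a hash of -1 marks a forwarding node, which means a resize is in progress.

In other words, if an insert arrives in the middle of a resize, it calls helpTransfer(tab, f); and joins the resize.

That is, thread A triggered the resize, and thread B now inserts an element,

so thread B works with thread A to complete the resize.

When I first read this I was confused too: B came to insert an element and went off to resize, so does it still insert?

Of course B still inserts. Why?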

  for (Node<K,V>[] tab = table;;) {
    
	……
  }

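Scroll back up: this is an infinite loop.

After B has helped with the resize, it loops again and eventually performs its insert.

helpTransfer(tab, f); the method that helps with the resize, will be covered in detail later.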
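
One way to convince yourself that no insert gets lost while the table is growing is to hammer a deliberately small map from several threads. The sketch below uses only the public API; the class name `PutDuringResizeDemo`, the method `fill`, and the thread and key counts are arbitrary choices for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

class PutDuringResizeDemo {
    /** Inserts threads * perThread distinct keys concurrently and returns the final size. */
    static int fill(int threads, int perThread) {
        // A tiny initial capacity forces many resizes while the writers are running
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>(2);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread writes its own key range
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    map.put(base + k, k);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 10_000)); // 40000: every put completed despite the resizes
    }
}
```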

Hash collision

   else {
       V oldVal = null;
       synchronized (f) {
       		……
       }
       if (binCount != 0) {
           if (binCount >= TREEIFY_THRESHOLD)
               treeifyBin(tab, i);
           if (oldVal != null)
               return oldVal;
           break;
       }
   }

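On a hash collision the logic is much like HashMap's: the bin is handled either as a linked list or as a red-black tree.

The difference is the synchronized keyword, i.e. the bin is locked.

What is f? As noted above, f = tabAt(tab, i = (n - 1) & hash) is the element at that index of the array.

In "The Basic Principles of HashMap and ConcurrentHashMap" I discussed this locking.

The granularity is very fine: it locks the element at a single array index and does not affect any other slot of the array.

It balances efficiency and safety. Doug Lea is brilliant.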
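
The idea of "CAS into an empty bin, lock the head node otherwise" can be tried in miniature. The following toy map is a sketch under heavy simplifying assumptions: no resizing, no removal, no trees, String keys and int values only. Every name in it (`TinyBinLockMap`, `add`, `hammer`) is invented for illustration. Because a bin's head never changes once installed, it can skip the tabAt(tab, i) == f re-check that the real putVal needs.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class TinyBinLockMap {
    static final class Node {
        final String key;
        volatile int val;
        volatile Node next;
        Node(String key, int val) { this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node> bins;

    TinyBinLockMap(int capacityPowerOfTwo) {
        bins = new AtomicReferenceArray<>(capacityPowerOfTwo);
    }

    private int indexFor(String key) {
        int h = key.hashCode();
        return (bins.length() - 1) & (h ^ (h >>> 16));
    }

    /** Adds delta to the value stored under key, creating the entry if needed. */
    void add(String key, int delta) {
        int i = indexFor(key);
        for (;;) {
            Node head = bins.get(i);
            if (head == null) {
                // Empty bin: no lock, just try to CAS the new node in
                if (bins.compareAndSet(i, null, new Node(key, delta))) return;
                // Lost the race: loop and take the locked path
            } else {
                synchronized (head) { // locks this bin only; other bins stay free
                    for (Node e = head;; e = e.next) {
                        if (e.key.equals(key)) { e.val += delta; return; }
                        if (e.next == null) { e.next = new Node(key, delta); return; }
                    }
                }
            }
        }
    }

    int get(String key) {
        for (Node e = bins.get(indexFor(key)); e != null; e = e.next)
            if (e.key.equals(key)) return e.val;
        return 0;
    }

    /** Runs the given number of workers, each adding 1 perThread times; returns the grand total. */
    static int hammer(int threads, int perThread) {
        TinyBinLockMap map = new TinyBinLockMap(4);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) map.add("k" + (n % 8), 1);
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (int k = 0; k < 8; k++) total += map.get("k" + k);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 5_000)); // 20000: no update was lost
    }
}
```

Threads that hit different bins never contend for the same monitor, which is the point of locking on f rather than on the whole table.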

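addCount(1L, binCount); This line updates the element count and, if needed, triggers a resize. Details come later.

That covers the overall logic of put(), which is very similar to HashMap's.

Concurrency is carefully controlled throughout. To summarize:

Initialization is guarded against concurrent access
Resizing is guarded against concurrent access
Reading the element at an array index uses a native method of the Unsafe class
A thread that runs into an ongoing resize helps with it
On a hash collision, the element at that array index is locked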

2. Initializing the Array

As mentioned above, when put runs and the array has not been initialized yet, it calls initTable().


 if (tab == null || (n = tab.length) == 0)
     tab = initTable();


  private final Node<K,V>[] initTable() {
      Node<K,V>[] tab; int sc;
      while ((tab = table) == null || tab.length == 0) {
          if ((sc = sizeCtl) < 0)
              Thread.yield(); // lost initialization race; just spin
          else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
              try {
                  if ((tab = table) == null || tab.length == 0) {
                      int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                      @SuppressWarnings("unchecked")
                      Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                      table = tab = nt;
                      sc = n - (n >>> 2);
                  }
              } finally {
                  sizeCtl = sc;
              }
              break;
          }
      }
      return tab;
  }

There is a field here that tracks the initialization state:

    /**
     * Table initialization and resizing control.  When negative, the
     * table is being initialized or resized: -1 for initialization,
     * else -(1 + the number of active resizing threads).  Otherwise,
     * when table is null, holds the initial table size to use upon
     * creation, or 0 for default. After initialization, holds the
     * next element count value upon which to resize the table.
     */
    private transient volatile int sizeCtl;

If sizeCtl = -1, the table is being initialized; if it is -(1 + n), n threads are resizing.

The ConcurrentHashMap constructor sets sizeCtl:

  public ConcurrentHashMap(int initialCapacity,
                           float loadFactor, int concurrencyLevel) {
      if (!(loadFactor > 0.0f) || initialCapacity < 0 || concurrencyLevel <= 0)
          throw new IllegalArgumentException();
      if (initialCapacity < concurrencyLevel)   // Use at least as many bins
          initialCapacity = concurrencyLevel;   // as estimated threads
      long size = (long)(1.0 + (long)initialCapacity / loadFactor);
      int cap = (size >= (long)MAXIMUM_CAPACITY) ?
          MAXIMUM_CAPACITY : tableSizeFor((int)size);
      this.sizeCtl = cap;
  }

If you have read the previous post, "HashMap Source Code Analysis", this code should make sense: sizeCtl is a power of two.
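
To see which power of two comes out, the arithmetic can be replayed in isolation. The sketch below reproduces the size computation from the constructor above; `tableSizeFor` is written the way I recall it from the JDK 8 source (round up to the next power of two), so treat it as an assumption rather than a verbatim copy. The class name `InitialSizeDemo` and the helper `initialSizeCtl` are illustrative.

```java
class InitialSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Rounds c up to the next power of two (assumed to match JDK 8's tableSizeFor). */
    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    /** Same arithmetic as the three-argument constructor, minus the concurrencyLevel adjustment. */
    static int initialSizeCtl(int initialCapacity, float loadFactor) {
        long size = (long) (1.0 + (long) initialCapacity / loadFactor);
        return (size >= (long) MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : tableSizeFor((int) size);
    }

    public static void main(String[] args) {
        int cap = initialSizeCtl(16, 0.75f);
        System.out.println(cap);               // 32: 1 + 16 / 0.75 is about 22.3, rounded up to 32
        System.out.println(cap - (cap >>> 2)); // 24: the threshold initTable stores afterwards
    }
}
```

So with initialCapacity 16 and loadFactor 0.75 the table gets 32 slots, not 16, and after initialization sizeCtl becomes n - (n >>> 2), i.e. 0.75 * n.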

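With that understood, the following needs little explanation: a negative sizeCtl means another thread is already initializing, so this thread yields.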

   if ((sc = sizeCtl) < 0)
       Thread.yield(); // lost initialization race; just spin

Now the next line: else if (U.compareAndSwapInt(this, SIZECTL, sc, -1))

This uses CAS to set the field sizeCtl to -1; if it fails, the loop goes around again.
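
The same "one winner initializes, everyone else yields" pattern can be sketched with an AtomicInteger standing in for sizeCtl. This is a simplified model, not the JDK code: the class `LazyInitDemo`, the field `ctl`, the `allocations` counter and the `race` helper are all invented for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyInitDemo {
    // Plays the role of sizeCtl: > 0 is the requested size, -1 means "someone is initializing"
    private final AtomicInteger ctl = new AtomicInteger(16);
    private volatile Object[] table;
    final AtomicInteger allocations = new AtomicInteger(); // counts how often the array is created

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            int sc = ctl.get();
            if (sc < 0) {
                Thread.yield(); // lost the race; wait for the winner to finish
            } else if (ctl.compareAndSet(sc, -1)) {
                try {
                    if ((tab = table) == null) { // re-check: a late winner must not allocate again
                        allocations.incrementAndGet();
                        table = tab = new Object[sc];
                        sc = sc - (sc >>> 2); // next threshold, 0.75 * n
                    }
                } finally {
                    ctl.set(sc); // always release the -1 marker
                }
                break;
            }
        }
        return tab;
    }

    /** Starts several threads that all call initTable; returns the number of allocations. */
    static int race(int threads) {
        LazyInitDemo demo = new LazyInitDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(demo::initTable);
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.allocations.get();
    }

    public static void main(String[] args) {
        System.out.println(race(8)); // 1: only one thread created the array
    }
}
```

The re-check of table inside the CAS-protected section matters: a thread may read table == null, get descheduled, and win a later CAS after another thread has already finished initializing.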

  try {
      if ((tab = table) == null || tab.length == 0) {
          int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
          @SuppressWarnings("unchecked")
          Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
          table = tab = nt;
          sc = n - (n >>> 2);
      }
  } finally {
      sizeCtl =
