A Detailed Look at ConcurrentHashMap's Counting Rules

Latest recommended article published on 2022-08-30 15:16:12

License: CC 4.0 BY-SA

Tags:

#java

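This post explains in detail how ConcurrentHashMap counts its elements safely under multithreading. The sumCount method computes the element count from the global baseCount plus the CounterCell array. Walking through addCount and fullAddCount shows how CounterCell reduces the cost of CAS spinning, and how a thread-bound probe value locates the cell to update. Under contention, the CounterCell array is expanded to reduce collisions and keep counting efficient.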

ConcurrentHashMap must keep every operation thread-safe, and that includes counting the elements in the map.

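The usual way to maintain a global count under multithreading is a CAS retry loop (or simply one of the Atomic classes), but the author of ConcurrentHashMap did not implement it that way.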
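For comparison, here is a minimal sketch of the "CAS + loop" approach mentioned above. The class name `CasLoopCounter` and its methods are hypothetical, chosen only for illustration. Every thread retries a CAS on the same variable, which is exactly the contention ConcurrentHashMap tries to avoid.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-variable counter: every thread CASes the same AtomicLong.
final class CasLoopCounter {
    private final AtomicLong count = new AtomicLong();

    // Classic CAS + retry loop; behaves like count.addAndGet(delta).
    long add(long delta) {
        for (;;) {
            long current = count.get();
            long next = current + delta;
            if (count.compareAndSet(current, next)) {
                return next;
            }
            // The CAS lost a race with another thread: re-read and retry.
        }
    }

    long get() {
        return count.get();
    }

    // Starts `threads` workers that each add 1 `perThread` times, then returns the total.
    static long run(int threads, int perThread) {
        CasLoopCounter counter = new CasLoopCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

The result is always correct, but under heavy contention most CAS attempts fail and are retried, so threads burn CPU spinning on one memory location.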

Start with how the element count is obtained, by calling size():

public int size() {
    long n = sumCount();
    return ((n < 0L) ? 0 :
            (n > (long)Integer.MAX_VALUE) ? Integer.MAX_VALUE :
            (int)n);
}

ConcurrentHashMap computes the element count of the whole map with sumCount(), whereas a plain HashMap simply returns its global size field.

final long sumCount() {
    CounterCell[] as = counterCells; CounterCell a;
    long sum = baseCount;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}

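As sumCount shows, the total comes from two parts. The first is the global baseCount variable, which plays much the same role as size in HashMap. The second is the value held by each cell in the CounterCell array.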
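The two steps, summing base plus cells and then clamping the long into an int, can be tried in isolation. The sketch below uses `AtomicLong` as a stand-in for CounterCell; the class `SumCountSketch` and its method names are assumptions made for this example, not JDK code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative stand-in for sumCount() and the clamping done in size().
final class SumCountSketch {

    // base plays the role of baseCount; cells plays the role of counterCells.
    // Both the array and individual slots may be null, as in the original.
    static long sum(long base, AtomicLong[] cells) {
        long sum = base;
        if (cells != null) {
            for (AtomicLong cell : cells) {
                if (cell != null) {
                    sum += cell.get();
                }
            }
        }
        return sum;
    }

    // Same clamping as size(): negative becomes 0, overflow becomes Integer.MAX_VALUE.
    static int clampToInt(long n) {
        if (n < 0L) {
            return 0;
        }
        return n > (long) Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) n;
    }

    public static void main(String[] args) {
        AtomicLong[] cells = { new AtomicLong(2), null, new AtomicLong(5) };
        System.out.println(sum(3L, cells));         // prints 10
        System.out.println(clampToInt(1L << 40));   // prints 2147483647
    }
}
```

Because of this clamping, ConcurrentHashMap also offers mappingCount(), which returns the count as a long.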

The author of ConcurrentHashMap uses the CounterCell array to reduce the cost of CAS spinning in a multithreaded environment. (This idea is well worth studying 🤔)

In fact, a CounterCell object holds nothing but a single value field declared volatile.

@sun.misc.Contended static final class CounterCell {
    volatile long value;
    CounterCell(long x) { value = x; }
}
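The same base-plus-cells idea is available as a public class, `java.util.concurrent.atomic.LongAdder`; the JDK source describes CounterCell as adapted from LongAdder and Striped64. The sketch below (the class `LongAdderDemo` is a name made up for this example) shows the public API doing the kind of distributed counting described here. The `@sun.misc.Contended` annotation on CounterCell pads each cell so that two cells are unlikely to share a cache line.

```java
import java.util.concurrent.atomic.LongAdder;

// Demonstrates striped counting through the public LongAdder API.
final class LongAdderDemo {

    // Starts `threads` workers that each increment `perThread` times, then returns the sum.
    static long run(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    adder.increment(); // lands on base or on a per-thread cell
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return adder.sum(); // base + all cells, like sumCount()
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```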

Next, a closer look at how ConcurrentHashMap actually implements the counting.

**addCount()** is called after methods such as put() whenever a new node has been added; x is usually 1.

private final void addCount(long x, int check) {
    CounterCell[] as; long b, s;
    // When the counter cells are null, update baseCount directly
    // If the CAS on baseCount fails, store the delta in a counter cell
    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        // Whether the cell CAS succeeded (was uncontended); defaults to true
        boolean uncontended = true;
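        // If the counter cells exist, CAS the value in this thread's cell
        // If the cells are null or empty, the cell is null, or that CAS fails, call fullAddCount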
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
            !(uncontended =
              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        s = sumCount();
    }
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
               (n = tab.length) < MAXIMUM_CAPACITY) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
            s = sumCount();
        }
    }
}

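When counterCells has not been initialized yet, addCount first tries a single CAS on the global baseCount. Once counterCells exists, or when that baseCount CAS fails, it instead tries a CAS on the cell selected for the current thread. If the array or that cell is missing, or the cell CAS fails, the work is handed to fullAddCount, which uses a CAS on baseCount only as a last fallback.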

Initializing counterCells, expanding it, and handling a failed CAS on a cell under contention are all done by **fullAddCount(x, uncontended)**.
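The ordering in addCount (base first while no cells exist, then the thread's cell, then a slow path) can be sketched as below. This is a simplified illustration under loud assumptions, not the JDK code: `AddCountFastPathSketch`, `PROBE`, and `slowAdd` are hypothetical names, the probe is a plain per-thread sequence number rather than ThreadLocalRandom's internal probe, and `slowAdd` replaces fullAddCount with a much simpler lock-based initialization that never expands the array.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Simplified sketch of addCount's ordering; not the JDK implementation.
final class AddCountFastPathSketch {
    private final AtomicLong base = new AtomicLong();   // stand-in for baseCount
    private volatile AtomicLong[] cells;                // stand-in for counterCells, null at first

    // Hypothetical per-thread probe (assumption: a simple sequence number per thread).
    private static final AtomicInteger NEXT_PROBE = new AtomicInteger(1);
    private static final ThreadLocal<Integer> PROBE =
            ThreadLocal.withInitial(() -> NEXT_PROBE.getAndIncrement());

    void add(long x) {
        AtomicLong[] as = cells;
        if (as == null) {
            // No cells yet: one CAS attempt on base.
            long b = base.get();
            if (base.compareAndSet(b, b + x)) {
                return;
            }
        } else {
            // Cells exist: one CAS attempt on this thread's cell.
            AtomicLong a = as[PROBE.get() & (as.length - 1)];
            long v = a.get();
            if (a.compareAndSet(v, v + x)) {
                return;
            }
        }
        slowAdd(x); // contention detected
    }

    // Stand-in for fullAddCount: create a fixed two-cell array once, then add to a cell.
    private void slowAdd(long x) {
        AtomicLong[] as = cells;
        if (as == null) {
            synchronized (this) {
                if ((as = cells) == null) {
                    as = new AtomicLong[] { new AtomicLong(), new AtomicLong() };
                    cells = as;
                }
            }
        }
        as[PROBE.get() & (as.length - 1)].addAndGet(x);
    }

    long sum() {
        long sum = base.get();
        AtomicLong[] as = cells;
        if (as != null) {
            for (AtomicLong a : as) {
                sum += a.get();
            }
        }
        return sum;
    }

    // Starts `threads` workers that each add 1 `perThread` times, then returns the total.
    static long run(int threads, int perThread) {
        AddCountFastPathSketch counter = new AddCountFastPathSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

Each call adds x exactly once, whichever of the three paths it takes, so the sum of base and all cells equals the number of increments once the workers have finished.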

The implementation of fullAddCount is as follows (the heart of the design):

// wasUncontended: false if the caller's CAS on a cell failed; true if no such CAS was attempted
private final void fullAddCount(long x, boolean wasUncontended) {
    int h;
    // If the current thread's probe is 0, initialize ThreadLocalRandom, fetch the new probe, and reset the CAS flag to true
    if ((h = ThreadLocalRandom.getProbe()) == 0) {
        ThreadLocalRandom.localInit();      // force initialization
        h = ThreadLocalRandom.getProbe();
        wasUncontended = true;
    }
    boolean collide = false;                // True if last slot nonempty
    for (;;) {
        CounterCell[] as; CounterCell a; int n; long v;
        if ((as = counterCells) != null && (n = as.length) > 0) {
            // No CounterCell exists yet at this thread's index
            if ((a = as[(n - 1) & h]) == null) {
                if (cellsBusy == 0) {            // Try to attach new Cell
                    CounterCell r = new CounterCell(x); // Optimistic create
                    // Set cellsBusy to 1 before initializing the slot
                    if (cellsBusy == 0 &&
                        U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
                        boolean created = false;
                        try {               // Recheck under lock
                            CounterCell[] rs; int m, j;
                            // Double-check that the slot at this index is still empty
                            if ((rs = counterCells) != null &&
                                (m = rs.length) > 0 &&
                                rs[j = (m - 1) & h] == null) {
                                rs[j] = r;
                                created = true;
                            }
                        } finally {
                            cellsBusy = 0;
                        }
                        if (created)
                            break;
                        // The slot was filled by another thread; loop again and re-read it
                        continue;           // Slot is now non-empty
                    }
                }
                // Another thread holds cellsBusy and is modifying the CounterCell array; clear collide and rehash
                collide = false;
            }
            // wasUncontended == false means the earlier CAS failed, so recompute the index first
            else if (!wasUncontended)       // CAS already known to fail
                wasUncontended = true;      // Continue after rehash
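            // 1. The slot was null earlier and another thread has since initialized it: try the CAS
            // 2. An earlier CAS failed and the index has been recomputed: try the CAS again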
            else if (U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))
                break;
            // If counterCells was replaced by another thread, or its length is >= the CPU count, do not expand
            else if (counterCells != as || n >= NCPU)
                collide = false;            // At max size or stale
            // Set the expansion flag; if the CAS after the next rehash fails again, expand
            else if (!collide)
                collide = true;
            // Expand when the CAS on a non-null cell has failed on consecutive iterations (the cell may have been null at first and then been initialized by another thread)
            else if (cellsBusy == 0 &&
                     U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
                try {
                    // Double the length of counterCells (it stays a power of two)
                    if (counterCells == as) {// Expand table unless stale
                        CounterCell[] rs = new CounterCell[n << 1];
                        for (int i = 0; i < n; ++i)
                            rs[i] = as[i];
                        counterCells = rs;
                    }
                } finally {
                    cellsBusy = 0;
                }
                collide = false;
                continue;                   // Retry with expanded table
            }
            // Cases that reach this rehash:
            //      1. The previous CAS failed
            //      2. counterCells was replaced by another thread, or its length is >= the CPU count
            //      3. One more probe is tried before expanding
            h = ThreadLocalRandom.advanceProbe(h);
        }
        // Initialize the counterCells array
        else if (cellsBusy == 0 && counterCells == as &&
                 U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
            boolean init = false;
            try {                           // Initialize table
                if (counterCells == as) {
                    CounterCell[] rs = new CounterCell[2];
                    rs[h & 1] = new CounterCell(x);
                    counterCells = rs;
                    init = true;
                }
            } finally {
                cellsBusy = 0;
            }
            if (init)
                break;
        }
        // If none of the branches above applies, fall back to a CAS on baseCount
        else if (U.compareAndSwapLong(this, BASECOUNT, v = baseCount, v + x))
            break;                          // Fall back on using base
    }
}

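Flag variables

cellsBusy: indicates whether counterCells may be modified; it works as a spin lock.

1: counterCells is being modified (initializing the whole array, initializing a single CounterCell element, or expanding the array)

0: no modification in progress; the lock can be taken

wasUncontended:

true: the default; no CAS has failed

false: the CAS performed before calling fullAddCount failed

collide: the expansion flag

true: expansion is allowed (the next failed CAS triggers it)

false: expansion is not allowed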
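The role of cellsBusy is easier to see in isolation. The sketch below models it with an `AtomicInteger` that is CASed from 0 to 1 to take the lock and reset to 0 in a finally block, first to attach a new cell and then to double the array. The class `CellsBusySketch` and its methods are hypothetical, and `AtomicLong` again stands in for CounterCell.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative model of the cellsBusy spin lock; not the JDK code.
final class CellsBusySketch {
    private final AtomicInteger cellsBusy = new AtomicInteger(); // 0 = free, 1 = held
    private volatile AtomicLong[] cells = new AtomicLong[2];

    // Tries to install a new cell holding x at index h & (length - 1).
    // Returns true only if this call created the cell.
    boolean tryAttach(int h, long x) {
        AtomicLong r = new AtomicLong(x);              // optimistic create, outside the lock
        if (cellsBusy.get() == 0 && cellsBusy.compareAndSet(0, 1)) {
            try {
                AtomicLong[] rs = cells;               // recheck under the lock
                int j = h & (rs.length - 1);
                if (rs[j] == null) {
                    rs[j] = r;
                    return true;
                }
            } finally {
                cellsBusy.set(0);                      // always release
            }
        }
        return false;                                  // lock busy or slot already filled
    }

    // Tries to double the array; existing cells keep their indices.
    boolean tryExpand() {
        if (cellsBusy.get() == 0 && cellsBusy.compareAndSet(0, 1)) {
            try {
                AtomicLong[] as = cells;
                AtomicLong[] rs = new AtomicLong[as.length << 1];
                System.arraycopy(as, 0, rs, 0, as.length);
                cells = rs;
            } finally {
                cellsBusy.set(0);
            }
            return true;
        }
        return false;
    }

    int length() {
        return cells.length;
    }

    AtomicLong cellAt(int i) {
        return cells[i];
    }

    public static void main(String[] args) {
        CellsBusySketch s = new CellsBusySketch();
        System.out.println(s.tryAttach(1, 5)); // prints true
        System.out.println(s.tryAttach(1, 7)); // prints false, slot 1 is taken
        System.out.println(s.tryExpand());     // prints true
        System.out.println(s.length());        // prints 4
    }
}
```

A thread that fails to take the lock does not block. In fullAddCount it simply moves on to another branch or another loop iteration, which is why the flag is checked with a plain read before the CAS.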

How is the index in counterCells to increment chosen?

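The counterCells array follows the same design as the array that stores the map's entries. First, the array length is always a power of two (counterCells starts at length 2). Second, the index is computed by ANDing some value with the array length minus one. In HashMap that value is the hash code; for counterCells in ConcurrentHashMap it is the probe held by ThreadLocalRandom. In general the probe differs from thread to thread, and a given thread reads the same value each time until it is rehashed with advanceProbe. A thread therefore keeps counting on the same object in counterCells, which raises the success rate of the CAS.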
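The index arithmetic can be checked with a few lines of code. In this sketch the probe values are arbitrary ints picked for the example (the real probe is internal to ThreadLocalRandom), and `CellIndexSketch.indexFor` is a hypothetical helper.

```java
// Illustrates index = probe & (length - 1) for power-of-two lengths.
final class CellIndexSketch {

    // length must be a power of two, as counterCells always is.
    static int indexFor(int probe, int length) {
        return probe & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(13, 4));  // prints 1
        System.out.println(indexFor(13, 8));  // prints 5
        System.out.println(indexFor(-1, 4));  // prints 3

        // For a power-of-two length the mask agrees with floorMod, including negative probes.
        for (int probe = -16; probe <= 16; probe++) {
            if (indexFor(probe, 8) != Math.floorMod(probe, 8)) {
                throw new AssertionError("mask and floorMod disagree for " + probe);
            }
        }
    }
}
```

The mask is a single AND instruction and never produces a negative index, which is why the array length is kept at a power of two. After the array doubles from 4 to 8, probe 13 maps to index 5 instead of 1, so the same thread may land on a different cell.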

Summary:

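ConcurrentHashMap counts by introducing an array similar to the one HashMap uses to store nodes, so that fewer threads write to the same variable. It uses the probe bound to each thread to compute an array index quickly. If the slot at that index is null, the cell (or the whole array) is initialized using an optimistic lock (a CAS on cellsBusy) plus a double check. Normally a thread accumulates into one cell of counterCells, and as long as no other thread competes for it the CAS almost always succeeds. When another thread computes the same index, the two compete for that cell. The thread whose CAS fails computes a new index and continues accumulating there, changing index again if it still meets contention. After repeated failures, ConcurrentHashMap assumes that most cells in the array already have threads working on them and expands the whole array, provided its length is still below the number of CPUs, leaving the existing cells at their original indices. (In effect, each cell in counterCells tends to be used by a single thread; if several threads share one, either their CAS operations still complete efficiently or some of them move to a different index.)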