AbstractQueuedSynchronizer Exclusive Lock: Source Code Analysis

Latest recommended article published on 2024-04-21 17:55:58

虎虎他爹

License: CC 4.0 BY-SA

Category: java. Tags: java

Article link: https://blog.youkuaiyun.com/weixin_43727164/article/details/121147716


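This article takes a close look at AbstractQueuedSynchronizer (AQS) from the Java concurrency library and how it is used in ThreadPoolExecutor. It analyzes the internal structure of AQS, such as the wait queue and the synchronization state, explains how an exclusive lock works, and shows through an example how to build a custom lock on top of AQS. It also walks through the source of lock() and unlock() in detail, showing how thread state changes and how concurrency is controlled while a lock is acquired and released.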


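Preface:

My study of AbstractQueuedSynchronizer has always been on and off. Recently, while rereading the ThreadPoolExecutor source, I noticed that its inner class Worker is also a custom synchronizer that extends AbstractQueuedSynchronizer, so I decided to go through the AbstractQueuedSynchronizer source carefully once more and write down the questions I ran into along with what I learned from them. I hope it helps. My knowledge is limited, so if you spot a mistake, please point it out and I will correct it as soon as I can.

This article covers only the exclusive lock; the shared lock implementation will be analyzed in a later article.

Key fields you need to know when studying AbstractQueuedSynchronizer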

    // Note that all of these are declared volatile
    // Head of the wait queue
    private transient volatile Node head;
    // Tail of the wait queue
    private transient volatile Node tail;
    // Synchronization state
    private volatile int state;

static final class Node {
        /** Marker to indicate a node is waiting in shared mode */
        static final Node SHARED = new Node();
        /** Marker to indicate a node is waiting in exclusive mode */
        static final Node EXCLUSIVE = null;

        /** waitStatus value to indicate thread has cancelled */
        static final int CANCELLED =  1;
        /** waitStatus value to indicate successor's thread needs unparking */
        static final int SIGNAL    = -1;
        /** waitStatus value to indicate thread is waiting on condition */
        static final int CONDITION = -2;
        /**
         * waitStatus value to indicate the next acquireShared should
         * unconditionally propagate
         */
        static final int PROPAGATE = -3;
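        // Wait status of the node: one of the constants above, initially 0
        volatile int waitStatus;
        // Predecessor node
        volatile Node prev;
        // Successor node
        volatile Node next;
        // The thread wrapped by this node
        volatile Thread thread;
        // Marks whether the node is shared or exclusive. An exclusive node keeps the default null,
        // a shared node points at the SHARED marker (a new Node), so comparing this field with the
        // two constants at the top tells you the node's mode. Quite neat.
        // It is also used by Condition, which keeps its own singly linked list of Nodes and uses this
        // field as the "next" pointer. That is unrelated to this section.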
        Node nextWaiter;
    }

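The wait queue in AbstractQueuedSynchronizer is a variant of the CLH queue; think of it as "the queue of threads waiting to acquire the lock". Node is the element of that queue, and it is essentially a wrapper around a thread.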

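How an exclusive lock works

An exclusive lock uses state as a flag. When a thread tries to take the resource, it tries to set state to 1. Other threads that come for the resource see that state is 1, understand that it is taken, and wait. Once the holding thread sets state back to 0, the waiting threads compete for the resource again.
I like to compare lock contention to grabbing a toilet stall. Someone who takes a stall closes the door, and the closed door shows a red "occupied" sign that tells later arrivals to wait outside. When that person is done, they open the door and walk off humming, and the open door shows a green "vacant" sign that tells the next person a stall is free.
In a civilized city people queue up: first come, first served, and latecomers wait in line. That is a fair lock.
In an uncivilized city everyone who arrives first tries to pull the door open and walk straight in, and only joins the back of the queue if that fails. That is a non-fair lock.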
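
The fair versus non-fair distinction can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical Sync class of my own; hasQueuedPredecessors() is a real AbstractQueuedSynchronizer method, while freeWithoutWakeup() is a demo-only trick that marks the lock free without waking the queue, so that a parked waiter and a free state coexist for a moment.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical demo class; the names below are illustrative only.
class FairnessSketch {
    static class Sync extends AbstractQueuedSynchronizer {
        private final boolean fair;
        Sync(boolean fair) { this.fair = fair; }

        @Override
        protected boolean tryAcquire(int arg) {
            // Fair mode: refuse to jump ahead of threads that queued earlier.
            if (fair && hasQueuedPredecessors()) {
                return false;
            }
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        void lock() { acquire(1); }
        boolean tryLock() { return tryAcquire(1); }
        void unlock() { release(1); }

        // Demo-only: free the state WITHOUT waking the queued thread.
        void freeWithoutWakeup() {
            setExclusiveOwnerThread(null);
            setState(0);
        }
    }

    // Returns true if a newcomer grabbed the lock ahead of a parked waiter.
    static boolean newcomerCanBarge(boolean fair) {
        Sync sync = new Sync(fair);
        sync.lock(); // the calling thread holds the lock
        Thread waiter = new Thread(() -> { sync.lock(); sync.unlock(); });
        waiter.start();
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.yield(); // wait until the waiter is parked in the queue
        }
        sync.freeWithoutWakeup();
        boolean barged = sync.tryLock(); // the "newcomer" attempt
        sync.unlock();                   // release and wake the waiter
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return barged;
    }

    public static void main(String[] args) {
        System.out.println("non-fair newcomer barges: " + newcomerCanBarge(false));
        System.out.println("fair newcomer barges: " + newcomerCanBarge(true));
    }
}
```

In the non-fair case the newcomer pulls the door open while someone is still waiting in line; in the fair case it sees the queue and backs off.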

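Under the hood this relies on the volatile keyword and CAS. For the details I recommend the book 《java并发编程的艺术》 (The Art of Java Concurrency Programming), which explains them well, so I won't repeat them here.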
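
As a rough picture of "volatile plus CAS", here is a minimal sketch that uses AtomicInteger as a stand-in for the state field. CasFlagDemo is a hypothetical class of my own, not AQS code, and it has no wait queue at all; it only shows that exactly one caller can flip the flag from 0 to 1.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of the JDK or of AQS.
class CasFlagDemo {
    // 0 = free, 1 = held; AtomicInteger gives volatile reads/writes plus CAS.
    private final AtomicInteger state = new AtomicInteger(0);

    // Succeeds only for the one caller whose CAS flips 0 -> 1.
    boolean tryLock() {
        return state.compareAndSet(0, 1);
    }

    // Only the holder calls this, so a plain volatile write is enough.
    void unlock() {
        state.set(0);
    }

    public static void main(String[] args) {
        CasFlagDemo flag = new CasFlagDemo();
        System.out.println(flag.tryLock()); // true: 0 -> 1
        System.out.println(flag.tryLock()); // false: already 1
        flag.unlock();
        System.out.println(flag.tryLock()); // true again
    }
}
```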

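A custom synchronizer:

Learn how to use it first, then why it works. So let's start with a simple custom synchronizer. All it has to do is override a few of the hook methods that the AbstractQueuedSynchronizer framework leaves to subclasses (tryAcquire, tryRelease, isHeldExclusively).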

import java.util.concurrent.locks.AbstractQueuedSynchronizer;

public class MyLock extends AbstractQueuedSynchronizer {

    private static final long serialVersionUID = -1L;

    public void lock() {
        acquire(1);
    }

    public boolean tryLock() {
        return tryAcquire(1);
    }

    public void unlock() {
        release(1);
    }

    public boolean isLocked() {
        return isHeldExclusively();
    }

    @Override
    protected boolean tryAcquire(int arg) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int arg) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    @Override
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }
}
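
MyLock trusts its callers: any thread can call unlock(), even one that never held the lock. A stricter variant might check the owner first. The sketch below is one possible hardening under that assumption, not what the article's MyLock does; OwnerCheckedLock is a hypothetical name.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical stricter variant of MyLock: unlock() by a non-owner fails fast.
class OwnerCheckedLock extends AbstractQueuedSynchronizer {
    private static final long serialVersionUID = 1L;

    public void lock() { acquire(1); }
    public void unlock() { release(1); }
    public boolean isLocked() { return getState() != 0; }

    @Override
    protected boolean tryAcquire(int arg) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int arg) {
        // Reject a release from a thread that does not hold the lock.
        if (getExclusiveOwnerThread() != Thread.currentThread()) {
            throw new IllegalMonitorStateException("current thread does not hold the lock");
        }
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    @Override
    protected boolean isHeldExclusively() {
        return getExclusiveOwnerThread() == Thread.currentThread();
    }
}
```

Clearing the owner before writing state keeps the same order as MyLock, so the volatile write to state publishes the cleared owner.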

A simple use of the custom lock:

import lombok.extern.slf4j.Slf4j;

@Slf4j
public class MyLockTest {
    public static void main(String[] args) {
        MyLock lock = new MyLock();
        new Thread(() -> {
            lock.lock();
            log.info(Thread.currentThread().getName() + " acquired the lock");
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of swallowing it
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
                log.info(Thread.currentThread().getName() + " released the lock");
            }
        }).start();

        new Thread(() -> {
            lock.lock();
            log.info(Thread.currentThread().getName() + " acquired the lock");
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of swallowing it
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
                log.info(Thread.currentThread().getName() + " released the lock");
            }
        }).start();
    }
}
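
The log output alone does not prove mutual exclusion. A quick way to check it is to have several threads increment a plain int under the lock and compare the total. The sketch below repeats a MyLock-shaped class (named SimpleLock here, a hypothetical name) so that it compiles on its own.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class CounterDemo {
    // Same shape as MyLock, repeated so the sketch is self-contained.
    static class SimpleLock extends AbstractQueuedSynchronizer {
        void lock() { acquire(1); }
        void unlock() { release(1); }

        @Override
        protected boolean tryAcquire(int arg) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }
    }

    // Increments a non-atomic counter from several threads, guarded by the lock.
    static int run(int threads, int incrementsPerThread) {
        SimpleLock lock = new SimpleLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000)); // 40000 when no increment is lost
    }
}
```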

Source code analysis

First, step into the lock() entry method. It calls AbstractQueuedSynchronizer's acquire(int arg) method.

acquire(int arg)

	public final void acquire(int arg) {
        if (!tryAcquire(arg) &&
            acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
            selfInterrupt();
    }

This method first calls tryAcquire(arg) to try to acquire the lock.

tryAcquire(arg)

	protected boolean tryAcquire(int arg) {
        throw new UnsupportedOperationException();
    }

AbstractQueuedSynchronizer does not provide an implementation of this method; the actual implementation lives in our own synchronizer.

@Override
    protected boolean tryAcquire(int arg) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

This implementation is simple: it uses a CAS to set the lock state.

	protected final boolean compareAndSetState(int expect, int update) {
        // See below for intrinsics setup to support this
        return unsafe.compareAndSwapInt(this, stateOffset, expect, update);
    }

If tryAcquire(arg) gets the lock, acquire returns right away. If not, it goes on to call acquireQueued(addWaiter(Node.EXCLUSIVE), arg).

addWaiter(Node mode)

	private Node addWaiter(Node mode) {
        Node node = new Node(Thread.currentThread(), mode);
        // Try the fast path of enq; backup to full enq on failure
        Node pred = tail;
        // If the tail is not null, use compareAndSetTail to append the new node after it, then return
        if (pred != null) {
            node.prev = pred;
            if (compareAndSetTail(pred, node)) {
                pred.next = node;
                return node;
            }
        }
        // If the tail is null, or compareAndSetTail(pred, node) returned false (the CAS can lose a race
        // and fail to swap the tail), fall back to enq(node)
        enq(node);
        return node;
    }

	private Node enq(final Node node) {
        // Spin until the node has been set as the tail
        for (;;) {
            Node t = tail;
            // No tail means the list is empty, so initialize it first
            if (t == null) { // Must initialize
                if (compareAndSetHead(new Node()))
                    tail = head;
            } else {
            // Otherwise keep spinning until the tail CAS succeeds; this is optimistic locking at work
                node.prev = t;
                if (compareAndSetTail(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }
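
The enqueue logic can be tried out in isolation. Below is a simplified sketch that mirrors enq() with AtomicReference in place of the Unsafe-based CAS helpers. CasEnqueueSketch and its size() helper are hypothetical names of my own; the real AQS nodes carry more fields.

```java
import java.util.concurrent.atomic.AtomicReference;

class CasEnqueueSketch {
    static final class Node {
        volatile Node prev;
        volatile Node next;
    }

    private final AtomicReference<Node> head = new AtomicReference<>();
    private final AtomicReference<Node> tail = new AtomicReference<>();

    // Mirrors enq(): lazily create a dummy head, then CAS the node in as the new tail.
    Node enq(Node node) {
        for (;;) {
            Node t = tail.get();
            if (t == null) {
                if (head.compareAndSet(null, new Node())) {
                    tail.set(head.get());
                }
            } else {
                node.prev = t; // prev is set before the CAS, so it is always reliable
                if (tail.compareAndSet(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }

    // Counts nodes by walking prev links from the tail, excluding the dummy head.
    int size() {
        int n = 0;
        for (Node p = tail.get(); p != null && p != head.get(); p = p.prev) {
            n++;
        }
        return n;
    }

    // Appends from several threads at once and returns the final queue length.
    static int concurrentAppend(int threads, int perThread) {
        CasEnqueueSketch queue = new CasEnqueueSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    queue.enq(new Node());
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAppend(4, 1000)); // 4000 when no node is lost
    }
}
```

No append is lost even though no lock is taken: a thread whose CAS fails simply rereads the tail and tries again.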

Next, acquireQueued(final Node node, int arg) is called:

	final boolean acquireQueued(final Node node, int arg) {
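        // Tracks whether acquisition failed; it starts as true, meaning the lock has not been acquired yet
        boolean failed = true;
        try {
            // Records whether the thread was interrupted while acquiring the lock
            boolean interrupted = false;
            for (;;) {
                // node.predecessor() returns node's predecessor; step into the source to see it
                final Node p = node.predecessor();
                // If the predecessor is the head, try to acquire the resource.
                // The p == head check means queued threads acquire in FIFO order: if the predecessor
                // is not head, someone joined the queue earlier. Whether the lock as a whole is fair
                // still depends on tryAcquire, because a newly arriving thread calls tryAcquire in
                // acquire() before it queues.
                if (p == head && tryAcquire(arg)) {
                    // On success, make the current node the head
                    setHead(node);
                    // The source annotates this step: unlink the old head so the GC can reclaim it
                    p.next = null; // help GC
                    // Clear the failure flag
                    failed = false;
                    // Return the interrupt flag; the Javadoc says: @return {@code true} if interrupted while waiting
                    return interrupted;
                }
                // As the names suggest: decide whether to park after the failed acquire, then park and check for interruption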
                if (shouldParkAfterFailedAcquire(p, node) &&
                    parkAndCheckInterrupt())
                    interrupted = true;
            }
        } finally {
            if (failed)
                cancelAcquire(node);
        }
    }
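
The FIFO hand-off among threads that are already queued can be observed from outside. In the sketch below (hypothetical class names, a MyLock-shaped lock repeated for self-containment), each waiter is started only after the previous one has parked, so the queue order is known; the waiters then record the order in which they actually get the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class FifoOrderDemo {
    static class SimpleLock extends AbstractQueuedSynchronizer {
        void lock() { acquire(1); }
        void unlock() { release(1); }

        @Override
        protected boolean tryAcquire(int arg) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }
    }

    // Queues the waiters one by one, then releases and records the acquisition order.
    static List<Integer> run(int waiters) {
        SimpleLock lock = new SimpleLock();
        List<Integer> order = new ArrayList<>(); // only touched while holding the lock
        Thread[] threads = new Thread[waiters];
        lock.lock(); // the calling thread holds the lock while the queue builds up
        for (int i = 0; i < waiters; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                lock.lock();
                try {
                    order.add(id);
                } finally {
                    lock.unlock();
                }
            });
            threads[i].start();
            // Wait until this waiter is parked before starting the next one.
            while (threads[i].getState() != Thread.State.WAITING) {
                Thread.yield();
            }
        }
        lock.unlock(); // wakes the first queued thread
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(run(3)); // [0, 1, 2]
    }
}
```

No new thread arrives after the release in this setup, so nothing can barge and the queue order is the acquisition order.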

Next is the shouldParkAfterFailedAcquire(p, node) method:

	private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
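        // This method sets the predecessor's status to SIGNAL so that the current thread can go to sleep.
        // A node whose status is SIGNAL wakes up the next usable thread when it releases the resource.
        // Read the predecessor's status
        int ws = pred.waitStatus;
        // The predecessor is already SIGNAL, so return true right away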
        if (ws == Node.SIGNAL)
            /*
             * This node has already set status asking a release
             * to signal it, so it can safely park.
             */
            return true;
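        // If the predecessor is cancelled (unusable), walk backwards to the nearest usable node and point
        // that node's next at ourselves, which in effect removes the CANCELLED nodes from the list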
        if (ws > 0) {
            /*
             * Predecessor was cancelled. Skip over predecessors and
             * indicate retry.
             */
            do {
                node.prev = pred = pred.prev;
            } while (pred.waitStatus > 0);
            pred.next = node;
        } else {
            /*
             * waitStatus must be 0 or PROPAGATE.  Indicate that we
             * need a signal, but don't park yet.  Caller will need to
             * retry to make sure it cannot acquire before parking.
             */
             // The predecessor is usable: call compareAndSetWaitStatus to change its status to SIGNAL
            compareAndSetWaitStatus(pred, ws, Node.SIGNAL);
        }
        // Note that this returns false. Why not true? Keep the question in mind; it is answered later.
        return false;
    }

	private final boolean parkAndCheckInterrupt() {
        // Block; after waking up (by unpark or by interrupt), return the thread's interrupt flag and clear it
        LockSupport.park(this);
        return Thread.interrupted();
    }
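
The two calls inside parkAndCheckInterrupt() behave as follows: LockSupport.park() returns without throwing when the thread is interrupted, and Thread.interrupted() reports the flag and clears it. A minimal sketch to confirm this (ParkInterruptDemo is a hypothetical name):

```java
import java.util.concurrent.locks.LockSupport;

class ParkInterruptDemo {
    // Returns {first Thread.interrupted() result, second Thread.interrupted() result}.
    static boolean[] run() {
        boolean[] result = new boolean[2];
        Thread t = new Thread(() -> {
            // Loop to guard against spurious wakeups; park() never throws InterruptedException.
            while (!Thread.currentThread().isInterrupted()) {
                LockSupport.park();
            }
            result[0] = Thread.interrupted(); // reports the flag and clears it
            result[1] = Thread.interrupted(); // already cleared by the previous call
        });
        t.start();
        t.interrupt();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1]); // true false
    }
}
```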

A few points deserve attention here:

if (shouldParkAfterFailedAcquire(p, node) &&
                    parkAndCheckInterrupt())
                    interrupted = true;

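1. In this method an interrupt does not cause a direct return. After waking up, parkAndCheckInterrupt() calls Thread.interrupted(), which reports and clears the interrupt flag; acquireQueued only records the result in interrupted and keeps spinning for the resource. Only after the resource has been acquired is that recorded value returned. Back in acquire(int arg), a return value of true means the thread was interrupted while waiting, so selfInterrupt() is called to set the interrupt flag again.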

public final void acquire(int arg) {
        if (!tryAcquire(arg) &&
            acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
            // Key point: the interrupt is ignored inside the if condition and re-asserted here
            selfInterrupt();
    }
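
This behavior is visible from the caller's side: a thread interrupted while blocked in lock() does not get an exception, it still ends up with the lock, and its interrupt flag is set when lock() returns. A minimal sketch, with hypothetical class names and a MyLock-shaped lock repeated for self-containment:

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class InterruptDuringAcquireDemo {
    static class SimpleLock extends AbstractQueuedSynchronizer {
        void lock() { acquire(1); }
        void unlock() { release(1); }

        @Override
        protected boolean tryAcquire(int arg) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }
    }

    // Returns the waiter's interrupt flag as observed right after lock() returned.
    static boolean run() {
        SimpleLock lock = new SimpleLock();
        boolean[] flagAfterAcquire = new boolean[1];
        lock.lock(); // the calling thread holds the lock
        Thread waiter = new Thread(() -> {
            lock.lock(); // blocks; an interrupt does not abort the wait
            try {
                flagAfterAcquire[0] = Thread.currentThread().isInterrupted();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();
        while (!lock.hasQueuedThreads()) {
            Thread.yield(); // wait until the waiter has joined the queue
        }
        waiter.interrupt(); // interrupt while it is still waiting
        lock.unlock();      // only now can the waiter acquire
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return flagAfterAcquire[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // true
    }
}
```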

2. The second point is that acquireQueued() has a finally block.

		finally {
            if (failed)
                cancelAcquire(node);
        }

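When I read this code I wondered when cancelAcquire(node) actually runs. Looking closely, the loop in acquireQueued() has a single return, and failed is set to false just before it, so the normal flow never reaches cancelAcquire(node). That leaves only an exception thrown inside the for loop.
Could it be thread interruption? That was my first guess, but once I understood the first point I ruled it out: calling interrupt() on a thread that is parked or running normally does not raise InterruptedException. As described in the first point, the interrupt is only recorded, and selfInterrupt() re-asserts it after the method returns.
With interruption ruled out, the remaining candidate is tryAcquire(arg). That method is our own implementation, so if it throws an exception, execution ends up in cancelAcquire(node).
Let's look at that method next.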

cancelAcquire(node)

	private void cancelAcquire(Node node) {
        // Ignore if node doesn't exist
        if (node == null)
            return;

        node.thread = null;

        // Skip cancelled predecessors
        Node pred = node.prev;
        while (pred.waitStatus > 0)
            node.prev = pred = pred.prev;

        // predNext is the apparent node to unsplice. CASes below will
        // fail if not, in which case, we lost race vs another cancel
        // or signal, so no further action is necessary.
        Node predNext = pred.next;

        // Can use unconditional write instead of CAS here.
        // After this atomic step, other Nodes can skip past us.
        // Before, we are free of interference from other threads.
        node.waitStatus = Node.CANCELLED;

        // If we are the tail, remove ourselves.
        if (node == tail && compareAndSetTail(node, pred)) {
            compareAndSetNext(pred, predNext, null);
        } else {
            // If successor needs signal, try to set pred's next-link
            // so it will get one. Otherwise wake it up to propagate.
            int ws;
            if (pred != head &&
                ((ws = pred.waitStatus) == Node.SIGNAL ||
                 (ws <= 0 && compareAndSetWaitStatus(pred, ws, Node.SIGNAL))) &&
                pred.thread != null) {
                Node next = node.next;
                if (next != null && next.waitStatus <= 0)
                    compareAndSetNext(pred, predNext, next);
            } else {
                unparkSuccessor(node);
            }

            node.next = node; // help GC
        }
    }

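The method is a bit long, but in one sentence: kick yourself out of the list, then clean up after yourself. The steps are:

Find the nearest valid predecessor.
If the node is the tail, set the predecessor's next to null.
If it is a middle node, make sure the predecessor's waitStatus is SIGNAL and hand its own successor over to the predecessor.
If its predecessor is the head (or the predecessor cannot be set to SIGNAL), wake up the next valid node.
Finally drop its own references and wait for the GC to clean it up.
The way I picture it: the node realizes it has done something unforgivable, decides to repent, writes a will to settle its affairs, then dials 110 to turn itself in and waits for the police to arrive.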
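
The cancel path can be triggered on purpose. The sketch below assumes a hypothetical FaultySync whose tryAcquire throws once a demo-only switch (poisoned) is turned on. A queued thread wakes up, its retry of tryAcquire throws, and afterwards the queue no longer contains a waiting thread and the lock is still usable.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class CancelOnExceptionDemo {
    static class FaultySync extends AbstractQueuedSynchronizer {
        volatile boolean poisoned; // demo-only switch, not part of AQS

        @Override
        protected boolean tryAcquire(int arg) {
            if (poisoned) {
                throw new IllegalStateException("tryAcquire failed on purpose");
            }
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        void lock() { acquire(1); }
        boolean tryLock() { return tryAcquire(1); }
        void unlock() { release(1); }
    }

    // Returns {waiter saw the exception, no thread left in the queue, lock still reusable}.
    static boolean[] run() {
        FaultySync sync = new FaultySync();
        boolean[] caught = new boolean[1];
        sync.lock(); // the calling thread holds the lock
        Thread waiter = new Thread(() -> {
            try {
                sync.lock();
                sync.unlock();
            } catch (IllegalStateException e) {
                caught[0] = true; // thrown out of acquire() after the node was cancelled
            }
        });
        waiter.start();
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.yield(); // wait until the waiter is parked in the queue
        }
        sync.poisoned = true;
        sync.unlock(); // wakes the waiter, whose retry of tryAcquire now throws
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        sync.poisoned = false;
        boolean queueEmpty = sync.getQueueLength() == 0;
        boolean reusable = sync.tryLock();
        return new boolean[] {caught[0], queueEmpty, reusable};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true true true
    }
}
```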

That completes the source of lock(). Next is unlock().

unlock() calls AbstractQueuedSynchronizer's release(int arg) method.

release(int arg)

	public final boolean release(int arg) {
        if (tryRelease(arg)) {
            Node h = head;
            if (h != null && h.waitStatus != 0)
                unparkSuccessor(h);
            return true;
        }
        return false;
    }

It first calls tryRelease(int arg). This method has no default implementation either; the implementation is in our own MyLock class.

	protected boolean tryRelease(int arg) {
        throw new UnsupportedOperationException();
    }

	@Override
    protected boolean tryRelease(int arg) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

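No CAS is used to set the state here, because releasing a lock presupposes holding it, so state can simply be written. Let's focus on the release(int arg) method:

It takes the head node directly, because the head represents the thread that currently holds the resource.
If the head's status is not 0, it calls unparkSuccessor(h) to wake up the next node.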

unparkSuccessor(Node node)

	private void unparkSuccessor(Node node) {
        /*
         * If status is negative (i.e., possibly needing signal) try
         * to clear in anticipation of signalling.  It is OK if this
         * fails or if status is changed by waiting thread.
         */
        int ws = node.waitStatus;
        // First reset this node's own status to 0
        if (ws < 0)
            compareAndSetWaitStatus(node, ws, 0);

        /*
         * Thread to unpark is held in successor, which is normally
         * just the next node.  But if cancelled or apparently null,
         * traverse backwards from tail to find the actual
         * non-cancelled successor.
         */
        // Find the nearest valid successor
        Node s = node.next;
        if (s == null || s.waitStatus > 0) {
            s = null;
            for (Node t = tail; t != null && t != node; t = t.prev)
                if (t.waitStatus <= 0)
                    s = t;
        }
        // Wake up that successor
        if (s != null)
            LockSupport.unpark(s.thread);
    }

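That completes the release path.

Open question

Earlier I left one question unanswered: why does shouldParkAfterFailedAcquire() return false at the end?

The first reason: compareAndSetWaitStatus(pred, ws, Node.SIGNAL) is not guaranteed to succeed. If it fails, the thread keeps spinning; once the CAS has succeeded, the next call to this method returns true at the first check, ws == Node.SIGNAL, and the method is done. So we can conclude that shouldParkAfterFailedAcquire() normally runs at least twice (unless the retry on the second spin acquires the resource directly). Every iteration of the for loop also retries acquiring the lock, which is another use of optimistic locking (ps: the enq() method above uses it too).
The second reason: in multithreaded programming there is always the notion of time slices. Suppose a releasing thread finishes step 3 in Figure 1 and yields its time slice, and then a thread adding a new node executes step 1 in Figure 2. At this point the releasing thread's waitStatus gets set to SIGNAL, but its condition check has already completed and will not run again. If the thread adding the node returned true right after setting the status, it would park, its predecessor would never wake it, and the program would block forever. By returning false it spins once more to acquire the lock, and since Figure 1 has already passed step 1, tryAcquire(arg) gets the lock right away.
Looked at from another angle: the thread adding the node set its predecessor's waitStatus to SIGNAL, yet the predecessor never woke it; it acquired the lock by itself. So the waitStatus of a node that has released the resource is not always 0 (ps: unparkSuccessor(Node node) resets the node's own waitStatus to 0); it can also be -1, that is, SIGNAL.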
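
The "at least twice" behavior can be reproduced with a stripped-down model. The sketch below keeps only the SIGNAL branch and the CAS branch of shouldParkAfterFailedAcquire(); the CANCELLED branch is left out on purpose, and ShouldParkSketch with its AtomicInteger status is a hypothetical stand-in for a real Node.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ShouldParkSketch {
    static final int SIGNAL = -1;

    // Simplified model: the first call arms the predecessor and returns false,
    // a later call sees SIGNAL and returns true. Cancelled nodes are ignored here.
    static boolean shouldPark(AtomicInteger predStatus) {
        int ws = predStatus.get();
        if (ws == SIGNAL) {
            return true; // predecessor already promised to wake us: safe to park
        }
        predStatus.compareAndSet(ws, SIGNAL); // may fail under contention; the caller retries
        return false; // do not park yet: the caller must retry tryAcquire first
    }

    public static void main(String[] args) {
        AtomicInteger pred = new AtomicInteger(0);
        System.out.println(shouldPark(pred)); // false: status was 0, now set to SIGNAL
        System.out.println(shouldPark(pred)); // true: status is SIGNAL
    }
}
```

Between the two calls the real acquireQueued() loop runs tryAcquire once more, which is exactly the retry that closes the race described above.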