Netty Source Code Analysis (4): EventLoop and EventLoopGroup

Latest recommended article published on 2025-02-12 08:00:14

枫_Maple

Category: Netty Source Code Analysis. Tags: netty, server operations, java

Article link: https://blog.youkuaiyun.com/benjam1n77/article/details/122964042

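This article explains the EventLoopGroup and EventExecutorGroup concepts in Netty, describes their role as thread pools, and takes a close look at how MultithreadEventExecutorGroup and NioEventLoop work. The focus is on how an EventLoop is chosen and how tasks are submitted and executed.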

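The previous articles walked through Netty's server startup and client connection handling in some detail, so you should now have a fairly complete, high-level picture of Netty. In the articles that follow I will analyze the source code of Netty's main components one by one. I find this order, from the big picture down to the details, the most sensible. This article starts with EventLoop and EventLoopGroup.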

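I. EventLoopGroup and EventExecutorGroup

As mentioned many times before, an EventLoopGroup is essentially a thread pool. With that in mind, let's look at the main methods of EventLoopGroup, including those it inherits from its parent interface EventExecutorGroup.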

EventLoop next();

This method returns the next EventLoop to use, much like a thread pool handing out the next thread to process a task.

ChannelFuture register(Channel channel);

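This method registers (binds) a Channel to an EventLoop. A Channel is normally bound to a single EventLoop for its whole lifetime, while one EventLoop can serve many Channels. All subsequent IO events of the Channel are handled by that EventLoop, that is, by a single thread, which is how Netty keeps the IO handling of a Channel thread-safe without extra synchronization.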

Future<?> shutdownGracefully();
boolean isShuttingDown();
void shutdown();

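All three methods are inherited from the parent interface EventExecutorGroup. shutdownGracefully() shuts the EventLoopGroup down "gracefully": tasks submitted before the call still get to run, and the group only really shuts down after the remaining tasks have been executed. The method is asynchronous, so it returns a Future. Its counterpart shutdown() gives no such guarantee for the remaining tasks (it is already deprecated as of version 4.1.47). isShuttingDown() returns true in only two cases: 1. all EventLoops of the group have already been shut down; 2. shutdownGracefully() has been called on the group and the shutdown is still in progress.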

Future<?> submit(Runnable task);
ScheduledFuture<?> schedule(Runnable command, long delay, TimeUnit unit);
ScheduledFuture<?> scheduleAtFixedRate(Runnable command, long initialDelay, long period, TimeUnit unit);
ScheduledFuture<?> scheduleWithFixedDelay(Runnable command, long initialDelay, long delay, TimeUnit unit);

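These methods are also inherited from EventExecutorGroup, and their names describe what they do: submit an ordinary task, a delayed task, or a periodic task to the EventLoopGroup. For the detailed semantics, see the Javadoc of java.util.concurrent.ScheduledExecutorService; I won't repeat it here.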
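Since these signatures mirror java.util.concurrent.ScheduledExecutorService, a plain JDK executor is enough to illustrate the semantics. The following is a minimal sketch, not Netty code: the class name SchedulingSketch and its two helper methods are made up for illustration, and a JDK single-thread scheduler stands in for an EventLoopGroup.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class; a JDK scheduler stands in for an EventLoopGroup.
final class SchedulingSketch {
    /** One-shot delayed task: schedule(...) runs the callable once after the delay. */
    static int delayedAnswer() {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<Integer> future = pool.schedule(() -> 42, 10, TimeUnit.MILLISECONDS);
            return future.get(); // blocks until the delayed task has run
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Periodic task: scheduleAtFixedRate(...) keeps firing until the future is cancelled. */
    static int periodicRuns(int times) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger fired = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(times);
        try {
            ScheduledFuture<?> future = pool.scheduleAtFixedRate(() -> {
                if (done.getCount() > 0) { // ignore runs that slip in before cancel()
                    fired.incrementAndGet();
                    done.countDown();
                }
            }, 0, 5, TimeUnit.MILLISECONDS);
            done.await();
            future.cancel(false);
            return fired.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With an EventLoopGroup the calls look the same, except that the returned futures are Netty's own Future and ScheduledFuture types.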

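One way to think about the two interfaces: EventExecutorGroup is a general-purpose thread pool that can run any kind of task and manage its own lifecycle, while EventLoopGroup builds on it as a thread pool dedicated to the IO work of Channels.

With the interface methods covered, let's see how the main EventLoopGroup implementations realize them.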

MultithreadEventExecutorGroup

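MultithreadEventExecutorGroup has two important fields, children and chooser. children is an array of EventExecutor, comparable to the array of threads inside a thread pool; chooser is the selector that picks the next EventExecutor to use.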

private final EventExecutor[] children;

private final EventExecutorChooserFactory.EventExecutorChooser chooser;

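The constructor of MultithreadEventExecutorGroup is shown below, and its logic is simple. If no executor is passed in, it first instantiates a ThreadPerTaskExecutor; it then calls newChild() in a loop to create each child, and finally obtains the chooser from chooserFactory. That gives MultithreadEventExecutorGroup its two most important fields.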

protected MultithreadEventExecutorGroup(int nThreads, Executor executor,
                                        EventExecutorChooserFactory chooserFactory, Object... args) {
    if (nThreads <= 0) {
        throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
    }

    if (executor == null) {
        executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
    }

    children = new EventExecutor[nThreads];

    for (int i = 0; i < nThreads; i ++) {
        boolean success = false;
        try {
            children[i] = newChild(executor, args);
            success = true;
        } catch (Exception e) {
            // TODO: Think about if this is a good exception type
            throw new IllegalStateException("failed to create a child event loop", e);
        } finally {
            if (!success) {
                for (int j = 0; j < i; j ++) {
                    children[j].shutdownGracefully();
                }

                for (int j = 0; j < i; j ++) {
                    EventExecutor e = children[j];
                    try {
                        while (!e.isTerminated()) {
                            e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                        }
                    } catch (InterruptedException interrupted) {
                        // Let the caller handle the interruption.
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
    }

    chooser = chooserFactory.newChooser(children);

    final FutureListener<Object> terminationListener = new FutureListener<Object>() {
        @Override
        public void operationComplete(Future<Object> future) throws Exception {
            if (terminatedChildren.incrementAndGet() == children.length) {
                terminationFuture.setSuccess(null);
            }
        }
    };

    for (EventExecutor e: children) {
        e.terminationFuture().addListener(terminationListener);
    }

    Set<EventExecutor> childrenSet = new LinkedHashSet<EventExecutor>(children.length);
    Collections.addAll(childrenSet, children);
    readonlyChildren = Collections.unmodifiableSet(childrenSet);
}

newChild() is abstract in MultithreadEventExecutorGroup and left to subclasses. Take NioEventLoopGroup: its newChild() simply instantiates a NioEventLoop (the constructor arguments are explained later, in the NioEventLoop part). NioEventLoopGroup inherits execute() from AbstractEventExecutorGroup, which hands the task to the next EventLoop to use. It inherits register() from MultithreadEventLoopGroup, which works the same way and delegates the channel's registration to an EventLoop.

//NioEventLoopGroup.java
protected EventLoop newChild(Executor executor, Object... args) throws Exception {
    EventLoopTaskQueueFactory queueFactory = args.length == 4 ? (EventLoopTaskQueueFactory) args[3] : null;
    return new NioEventLoop(this, executor, (SelectorProvider) args[0],
            ((SelectStrategyFactory) args[1]).newSelectStrategy(), (RejectedExecutionHandler) args[2], queueFactory);
}


//AbstractEventExecutorGroup.java
public void execute(Runnable command) {
    next().execute(command);
}

//MultithreadEventExecutorGroup.java
public EventExecutor next() {
    return chooser.next();
}


//MultithreadEventLoopGroup.java
public ChannelFuture register(Channel channel) {
    return next().register(channel);
}

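At this point we can summarize how an EventLoopGroup works as a whole. It is a thread pool dedicated to the IO operations of Channels, and internally it holds a children array and a chooser. When the group is instantiated, it calls newChild() to create each EventLoop, and a Channel can bind itself to one of them through register(). The group picks an EventLoop by calling next(), which returns whichever EventLoop (or rather EventExecutor) the chooser's strategy selects.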
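The chooser itself is not shown in this article, so here is a minimal sketch of what a round-robin chooser could look like. It is an assumption for illustration only: the class name RoundRobinChooser is made up, and Netty's actual chooser implementations may differ in detail.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin chooser; a stand-in for the chooser used by next().
final class RoundRobinChooser<T> {
    private final AtomicInteger idx = new AtomicInteger();
    private final T[] children;

    RoundRobinChooser(T[] children) {
        if (children.length == 0) {
            throw new IllegalArgumentException("children must not be empty");
        }
        this.children = children.clone();
    }

    /** Returns the children in turn and wraps around after the last one. */
    T next() {
        // floorMod keeps the index non-negative even after the counter overflows
        return children[Math.floorMod(idx.getAndIncrement(), children.length)];
    }
}
```

Because the counter is an AtomicInteger, several threads can call next() concurrently, which matters since register() and execute() on a group may be called from any thread.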

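II. EventLoop and EventExecutor

The previous section covered how EventLoopGroup and EventExecutorGroup relate and differ; the same holds for EventLoop and EventExecutor. In most contexts there is no need to distinguish the two, and they can be treated as the same thing.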

EventExecutor next();

Yes, EventExecutor has a next() method too, but it simply returns a reference to itself.

EventExecutorGroup parent();

Returns the EventExecutorGroup this EventExecutor belongs to.

boolean inEventLoop();

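Checks whether the current thread is this EventLoop's own thread.

The interface methods are simple. What we really care about is how the implementation classes provide this behavior, which is the focus of this article.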

NioEventLoop

NioEventLoop is the EventLoop that Netty users encounter most often, so we concentrate on its logic.

public ChannelFuture register(Channel channel)

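NioEventLoop does not perform the registration itself; the actual work is done by the Channel, as covered in detail in the earlier article on server startup:

Netty Source Code Analysis (2): Server Startup Source Code (benjam1n77's blog, 优快云)

All NioEventLoop does is wrap the result of the channel registration in a ChannelFuture.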

@Override
public ChannelFuture register(Channel channel) {
    return register(new DefaultChannelPromise(channel, this));
}

@Override
public ChannelFuture register(final ChannelPromise promise) {
    ObjectUtil.checkNotNull(promise, "promise");
    promise.channel().unsafe().register(this, promise);
    return promise;
}

private void execute(Runnable task, boolean immediate)

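The logic of execute() is the core of NioEventLoop, and it is inherited from SingleThreadEventExecutor. When we call execute() to submit a task (a Runnable) to a NioEventLoop, the task is added to a queue: SingleThreadEventExecutor keeps a task queue that stores the tasks handed to it.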

private final Queue<Runnable> taskQueue;

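Next, if the thread calling execute() is not the NioEventLoop's own thread (inEventLoop() returns false), startThread() is called. Note that the first submission to a NioEventLoop always triggers startThread(), because the first call to execute() necessarily comes from an outside thread and never from the NioEventLoop's own thread.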

private void execute(Runnable task, boolean immediate) {
    boolean inEventLoop = inEventLoop();
    addTask(task);
    if (!inEventLoop) {
        startThread();
        if (isShutdown()) {
            boolean reject = false;
            try {
                if (removeTask(task)) {
                    reject = true;
                }
            } catch (UnsupportedOperationException e) {
                // The task queue does not support removal so the best thing we can do is to just move on and
                // hope we will be able to pick-up the task before its completely terminated.
                // In worst case we will log on termination.
            }
            if (reject) {
                reject();
            }
        }
    }

    if (!addTaskWakesUp && immediate) {
        wakeup(inEventLoop);
    }
}
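The combination of a task queue, an inEventLoop() check and a lazily started thread can be sketched with plain JDK classes. This is a simplified model under stated assumptions, not Netty's implementation: MiniEventLoop and probe() are hypothetical names, and shutdown handling, rejection and wakeups are left out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, heavily simplified model of a single-threaded event loop.
final class MiniEventLoop {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean started = new AtomicBoolean();
    private volatile Thread thread; // set by the loop thread itself once it runs

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    void execute(Runnable task) {
        taskQueue.add(task); // enqueue first, as in the code above
        // Only an outside thread can see "not started"; the CAS lets exactly one caller start the thread.
        if (!inEventLoop() && started.compareAndSet(false, true)) {
            Thread t = new Thread(this::run, "mini-event-loop");
            t.setDaemon(true);
            t.start();
        }
    }

    private void run() {
        thread = Thread.currentThread();
        try {
            for (;;) {
                taskQueue.take().run(); // block until a task arrives, then run it
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {inEventLoop() seen by the caller, inEventLoop() seen inside a submitted task}. */
    static boolean[] probe() {
        MiniEventLoop loop = new MiniEventLoop();
        boolean outside = loop.inEventLoop();
        CompletableFuture<Boolean> inside = new CompletableFuture<>();
        loop.execute(() -> inside.complete(loop.inEventLoop()));
        return new boolean[] {outside, inside.join()};
    }
}
```

The first execute() call comes from an outside thread, so it is the one that starts the loop thread; every later call only enqueues.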

private void startThread()

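startThread() is simple: if the EventLoop's state is ST_NOT_STARTED, it sets the state to ST_STARTED and calls doStartThread() to start the thread. In other words, doStartThread() normally runs only once, on the first task submission, and acts as an initializer.

The lifecycle of an EventLoop goes through the following states:

ST_NOT_STARTED: the EventLoop has not started working yet.
ST_STARTED: the EventLoop is running normally and accepts tasks as usual.
ST_SHUTTING_DOWN: entered after shutdownGracefully() is called; tasks can still be submitted to the EventLoop in this state.
ST_SHUTDOWN: entered once the remaining queued tasks and runShutdownHooks() have been executed. Tasks can no longer be submitted (except from within shutdown hooks), but that does not mean the queue is empty, because a hook may have submitted new tasks.
ST_TERMINATED: all tasks of the EventLoop have completed and the selector has been closed (in the case of NioEventLoop).

Note that the confirmShutdown() calls made for the ST_SHUTTING_DOWN to ST_SHUTDOWN transition and for the ST_SHUTDOWN to ST_TERMINATED transition behave differently. During the former, shutdown hooks may still add tasks; the latter only runs whatever is left in the queue, and no new tasks can be added. confirmShutdown() deserves a closer look of its own, so we set it aside for now.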

private void startThread() {
    if (state == ST_NOT_STARTED) {
        if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
            boolean success = false;
            try {
                doStartThread();
                success = true;
            } finally {
                if (!success) {
                    STATE_UPDATER.compareAndSet(this, ST_STARTED, ST_NOT_STARTED);
                }
            }
        }
    }
}
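The state changes all follow the same pattern: move forward with a compare-and-set, and never move backward. A minimal sketch of that pattern follows. LifecycleSketch and advanceTo() are hypothetical names, and the numeric values are assumptions chosen only so that later states compare greater than earlier ones.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a forward-only lifecycle; the constants are illustrative values.
final class LifecycleSketch {
    static final int ST_NOT_STARTED = 1;
    static final int ST_STARTED = 2;
    static final int ST_SHUTTING_DOWN = 3;
    static final int ST_SHUTDOWN = 4;
    static final int ST_TERMINATED = 5;

    private final AtomicInteger state = new AtomicInteger(ST_NOT_STARTED);

    /** Advances to target unless the state is already at or past it; true if this call changed it. */
    boolean advanceTo(int target) {
        for (;;) {
            int old = state.get();
            if (old >= target) {
                return false; // someone else already got this far
            }
            if (state.compareAndSet(old, target)) {
                return true;
            }
            // CAS lost a race: re-read the state and try again
        }
    }

    int state() {
        return state.get();
    }
}
```

The retry loop is what makes concurrent shutdown calls safe: whichever thread loses the race simply observes the newer state and stops.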

private void doStartThread()

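NioEventLoop calls doStartThread() to begin executing tasks. The method uses its internal executor to run an anonymous Runnable; this executor is what really runs the work, and the EventLoop is essentially a wrapper around it. The run() method of the anonymous Runnable does the following:

First it stores the underlying thread used by this EventLoop.
It calls SingleThreadEventExecutor.run(), an abstract method that subclasses must implement. Typically this method loops forever, taking tasks from the task queue and executing them. We analyze NioEventLoop.run() next.
When run() returns, the EventLoop is supposed to stop working: its lifecycle has reached at least ST_SHUTTING_DOWN, and everything that follows deals with shutting the EventLoop down, as described above.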

private void doStartThread() {
    assert thread == null;
    executor.execute(new Runnable() {
        @Override
        public void run() {
            thread = Thread.currentThread();
            if (interrupted) {
                thread.interrupt();
            }

            boolean success = false;
            updateLastExecutionTime();
            try {
                SingleThreadEventExecutor.this.run();
                success = true;
            } catch (Throwable t) {
                logger.warn("Unexpected exception from an event executor: ", t);
            } finally {
                for (;;) {
                    int oldState = state;
                    if (oldState >= ST_SHUTTING_DOWN || STATE_UPDATER.compareAndSet(
                            SingleThreadEventExecutor.this, oldState, ST_SHUTTING_DOWN)) {
                        break;
                    }
                }

                // Check if confirmShutdown() was called at the end of the loop.
                if (success && gracefulShutdownStartTime == 0) {
                    if (logger.isErrorEnabled()) {
                        logger.error("Buggy " + EventExecutor.class.getSimpleName() + " implementation; " +
                                SingleThreadEventExecutor.class.getSimpleName() + ".confirmShutdown() must " +
                                "be called before run() implementation terminates.");
                    }
                }

                try {
                    // Run all remaining tasks and shutdown hooks. At this point the event loop
                    // is in ST_SHUTTING_DOWN state still accepting tasks which is needed for
                    // graceful shutdown with quietPeriod.
                    for (;;) {
                        if (confirmShutdown()) {
                            break;
                        }
                    }

                    // Now we want to make sure no more tasks can be added from this point. This is
                    // achieved by switching the state. Any new tasks beyond this point will be rejected.
                    for (;;) {
                        int oldState = state;
                        if (oldState >= ST_SHUTDOWN || STATE_UPDATER.compareAndSet(
                                SingleThreadEventExecutor.this, oldState, ST_SHUTDOWN)) {
                            break;
                        }
                    }

                    // We have the final set of tasks in the queue now, no more can be added, run all remaining.
                    // No need to loop here, this is the final pass.
                    confirmShutdown();
                } finally {
                    try {
                        cleanup();
                    } finally {
                        // Lets remove all FastThreadLocals for the Thread as we are about to terminate and notify
                        // the future. The user may block on the future and once it unblocks the JVM may terminate
                        // and start unloading classes.
                        // See https://github.com/netty/netty/issues/6596.
                        FastThreadLocal.removeAll();

                        STATE_UPDATER.set(SingleThreadEventExecutor.this, ST_TERMINATED);
                        threadLock.countDown();
                        int numUserTasks = drainTasks();
                        if (numUserTasks > 0 && logger.isWarnEnabled()) {
                            logger.warn("An event executor terminated with " +
                                    "non-empty task queue (" + numUserTasks + ')');
                        }
                        terminationFuture.setSuccess(null);
                    }
                }
            }
        }
    });
}

protected void run()

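run() is an abstract method defined by SingleThreadEventExecutor; here we look at NioEventLoop's implementation, which is the most central method of NioEventLoop. It runs in an endless loop and repeats the following steps:

Call selector.select() or selectNow() to check for IO events.
If there are any, call processSelectedKeys() to handle them.
Then call runAllTasks() to process the other tasks.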

protected void run() {
    int selectCnt = 0;
    for (;;) {
        try {
            int strategy;
            try {
                strategy = selectStrategy.calculateStrategy(selectNowSupplier, hasTasks());
                switch (strategy) {
                    case SelectStrategy.CONTINUE:
                        continue;

                    case SelectStrategy.BUSY_WAIT:
                        // fall-through to SELECT since the busy-wait is not supported with NIO

                    case SelectStrategy.SELECT:
                        long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
                        if (curDeadlineNanos == -1L) {
                            curDeadlineNanos = NONE; // nothing on the calendar
                        }
                        nextWakeupNanos.set(curDeadlineNanos);
                        try {
                            if (!hasTasks()) {
                                strategy = select(curDeadlineNanos);
                            }
                        } finally {
                            // This update is just to help block unnecessary selector wakeups
                            // so use of lazySet is ok (no race condition)
                            nextWakeupNanos.lazySet(AWAKE);
                        }
                        // fall through
                    default:
                }
            } catch (IOException e) {
                // If we receive an IOException here its because the Selector is messed up. Let's rebuild
                // the selector and retry. https://github.com/netty/netty/issues/8566
                rebuildSelector0();
                selectCnt = 0;
                handleLoopException(e);
                continue;
            }

            selectCnt++;
            cancelledKeys = 0;
            needsToSelectAgain = false;
            final int ioRatio = this.ioRatio;
            boolean ranTasks;
            if (ioRatio == 100) {
                try {
                    if (strategy > 0) {
                        processSelectedKeys();
                    }
                } finally {
                    // Ensure we always run tasks.
                    ranTasks = runAllTasks();
                }
            } else if (strategy > 0) {
                final long ioStartTime = System.nanoTime();
                try {
                    processSelectedKeys();
                } finally {
                    // Ensure we always run tasks.
                    final long ioTime = System.nanoTime() - ioStartTime;
                    ranTasks = runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
                }
            } else {
                ranTasks = runAllTasks(0); // This will run the minimum number of tasks
            }

            if (ranTasks || strategy > 0) {
                if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                            selectCnt - 1, selector);
                }
                selectCnt = 0;
            } else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
                selectCnt = 0;
            }
        } catch (CancelledKeyException e) {
            // Harmless exception - log anyway
            if (logger.isDebugEnabled()) {
                logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?",
                        selector, e);
            }
        } catch (Throwable t) {
            handleLoopException(t);
        }
        // Always handle shutdown even if the loop processing threw an exception.
        try {
            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    return;
                }
            }
        } catch (Throwable t) {
            handleLoopException(t);
        }
    }
}

Let's go through it step by step.

1. Computing the select strategy

selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())

This step computes the select strategy: if the task queue is empty, it returns SelectStrategy.SELECT; if there are pending tasks, it calls selectNow() and returns its result.

public int calculateStrategy(IntSupplier selectSupplier, boolean hasTasks) throws Exception {
    return hasTasks ? selectSupplier.get() : SelectStrategy.SELECT;
}



private final IntSupplier selectNowSupplier = new IntSupplier() {
    @Override
    public int get() throws Exception {
        return selectNow();
    }
};
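The decision can be tried out in isolation. The sketch below uses the JDK's java.util.function.IntSupplier instead of Netty's own IntSupplier type, and the class name SelectStrategySketch is made up; only the branching logic is taken from the code above.

```java
import java.util.function.IntSupplier;

// Hypothetical stand-alone version of the strategy calculation.
final class SelectStrategySketch {
    static final int SELECT = -1; // same value the article lists for SelectStrategy.SELECT

    static int calculateStrategy(IntSupplier selectNow, boolean hasTasks) {
        // With pending tasks: poll without blocking and report how many keys are ready.
        // Without tasks: ask the caller to perform a (possibly blocking) select.
        return hasTasks ? selectNow.getAsInt() : SELECT;
    }
}
```

Note that the supplier is only invoked when tasks are pending, so an idle loop never pays for a selectNow() call at this point.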

2. select or selectNow

try {
    switch (strategy) {
        case SelectStrategy.CONTINUE:
            continue;

        case SelectStrategy.BUSY_WAIT:
            // fall-through to SELECT since the busy-wait is not supported with NIO

        case SelectStrategy.SELECT:
            long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
            if (curDeadlineNanos == -1L) {
                curDeadlineNanos = NONE; // nothing on the calendar
            }
            nextWakeupNanos.set(curDeadlineNanos);
            try {
                if (!hasTasks()) {
                    strategy = select(curDeadlineNanos);
                }
            } finally {
                // This update is just to help block unnecessary selector wakeups
                // so use of lazySet is ok (no race condition)
                nextWakeupNanos.lazySet(AWAKE);
            }
// fall through
        default:
    }
}

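SelectStrategy defines three values:

SelectStrategy.CONTINUE (-2): skip this iteration of the loop
SelectStrategy.BUSY_WAIT (-3): busy-wait, polling for IO events without blocking; not supported by NioEventLoop
SelectStrategy.SELECT (-1): perform a select

When the task queue still has tasks, the strategy is the value returned by selectNow(), which is a non-negative count and matches none of the constants above. With no tasks, the strategy is SelectStrategy.SELECT, and curDeadlineNanos then decides which select method is called:

No scheduled task: call selector.select(). The call blocks, so if no IO event ever arrives, the thread stays blocked here.
A scheduled task whose deadline is less than 5 microseconds away: a scheduled task is about to run and the thread should not block, so selector.selectNow() is called.
A scheduled task whose deadline is more than 5 microseconds away: call selector.select(timeoutMillis), which only needs to make sure the thread wakes up in time for the deadline.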

private int select(long deadlineNanos) throws IOException {
    if (deadlineNanos == NONE) {
        return selector.select();
    }
    // Timeout will only be 0 if deadline is within 5 microsecs
    long timeoutMillis = deadlineToDelayNanos(deadlineNanos + 995000L) / 1000000L;
    return timeoutMillis <= 0 ? selector.selectNow() : selector.select(timeoutMillis);
}
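The constant 995000 is what produces the 5-microsecond threshold: adding 995,000 ns before the integer division by 1,000,000 rounds the delay up to the next millisecond unless it is below 5,000 ns. A small sketch of the arithmetic, assuming the remaining delay is clamped at zero; the class and method names are hypothetical, and the delay is passed in directly rather than derived from a deadline.

```java
// Hypothetical helper that reproduces only the rounding arithmetic of select(long).
final class SelectTimeoutSketch {
    /** Converts the remaining delay in nanoseconds into the millisecond timeout used for select. */
    static long timeoutMillis(long delayNanos) {
        long padded = Math.max(0L, delayNanos + 995_000L); // assumption: negative delays clamp to 0
        return padded / 1_000_000L;
    }

    /** Describes which selector call the timeout leads to. */
    static String selectCall(long delayNanos) {
        long millis = timeoutMillis(delayNanos);
        return millis <= 0 ? "selectNow()" : "select(" + millis + ")";
    }
}
```

For example, a delay of 4,999 ns gives (4999 + 995000) / 1000000 = 0 and therefore selectNow(), while 5,000 ns gives exactly 1000000 / 1000000 = 1 and therefore select(1).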

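There is one more thing to note. If the NioEventLoop has no ordinary tasks queued and either no scheduled task or only one with a distant deadline, the select call blocks the thread until an IO event occurs. If a new task is submitted while the thread is blocked, will it be stuck waiting? It will not, because submitting a task through execute(Runnable task) can wake the thread up, as the following code shows.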

public void execute(Runnable task) {
    ObjectUtil.checkNotNull(task, "task");
    // If task is not a LazyRunnable (and wakesUpForTask(task) returns true), the second argument is true
    execute(task, !(task instanceof LazyRunnable) && wakesUpForTask(task));
}

// This method always returns true; subclasses may override it, but NioEventLoop does not
protected boolean wakesUpForTask(Runnable task) {
    return true;
}

private void execute(Runnable task, boolean immediate) {
    boolean inEventLoop = inEventLoop();
    addTask(task);
    if (!inEventLoop) {
        startThread();
        if (isShutdown()) {
            boolean reject = false;
            try {
                if (removeTask(task)) {
                    reject = true;
                }
            } catch (UnsupportedOperationException e) {
                // The task queue does not support removal so the best thing we can do is to just move on and
                // hope we will be able to pick-up the task before its completely terminated.
                // In worst case we will log on termination.
            }
            if (reject) {
                reject();
            }
        }
    }
    // For NioEventLoop, addTaskWakesUp defaults to false, so when immediate is true, wakeup() is called after the task has been queued
    if (!addTaskWakesUp && immediate) {
        wakeup(inEventLoop);
    }
}


protected void wakeup(boolean inEventLoop) {
    if (!inEventLoop && nextWakeupNanos.getAndSet(AWAKE) != AWAKE) {
        selector.wakeup();
    }
}
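The getAndSet(AWAKE) call in wakeup() also works as a de-duplication step: only the first submitter after the loop went to sleep actually wakes the selector. The sketch below models this with a counter in place of selector.wakeup(). The class name, the counter and the concrete sentinel values AWAKE = -1 and NONE = Long.MAX_VALUE are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the wakeup handshake; a counter replaces selector.wakeup().
final class WakeupSketch {
    static final long AWAKE = -1L;          // assumed sentinel: the loop is not blocked in select
    static final long NONE = Long.MAX_VALUE; // assumed sentinel: blocked with no scheduled deadline

    final AtomicLong nextWakeupNanos = new AtomicLong(AWAKE);
    final AtomicInteger selectorWakeups = new AtomicInteger();

    /** Called by the loop thread right before it blocks in select. */
    void beforeBlockingSelect(long deadlineNanos) {
        nextWakeupNanos.set(deadlineNanos);
    }

    /** Called by task submitters; only the first caller after a sleep triggers a wakeup. */
    void wakeup(boolean inEventLoop) {
        if (!inEventLoop && nextWakeupNanos.getAndSet(AWAKE) != AWAKE) {
            selectorWakeups.incrementAndGet();
        }
    }
}
```

Skipping redundant wakeups is worthwhile because waking a selector is a comparatively expensive operation.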

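3. Rebuilding the selector after an exception

There is not much to say here: rebuildSelector0() is called to rebuild the selector. Its logic is not complicated either. It opens a new selector, moves the registered channels over to it, and closes the old one, so I won't go into further detail.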

catch(IOException e){
    // If we receive an IOException here its because the Selector is messed up. Let's rebuild
    // the selector and retry. https://github.com/netty/netty/issues/8566
    rebuildSelector0();
    selectCnt = 0;
    handleLoopException(e);
    continue;
}

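4. Running IO tasks and ordinary tasks

The code first checks ioRatio, which expresses the ratio of time the NioEventLoop spends on IO work versus ordinary tasks. The smaller the value, the more time is left for ordinary tasks. For example, with ioRatio=20 the time split between IO and ordinary tasks is 20:80, that is, 1:4. The default is ioRatio=50. If ioRatio=100, all ordinary tasks are executed after the IO work, no matter how long they take.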

selectCnt++;
cancelledKeys = 0;
needsToSelectAgain = false;
final int ioRatio = this.ioRatio;
boolean ranTasks;
if (ioRatio == 100) {
    try {
        if (strategy > 0) {
            processSelectedKeys();
        }
    } finally {
        // Ensure we always run tasks.
        // ioRatio is 100: call the no-arg runAllTasks(), which runs every queued task regardless of how long that takes
        ranTasks = runAllTasks();   
    }
} else if (strategy > 0) {
    final long ioStartTime = System.nanoTime();
    try {
        // Process IO events
        processSelectedKeys();
    } finally {
        // Ensure we always run tasks.
        final long ioTime = System.nanoTime() - ioStartTime;
        // Run ordinary tasks with the time-bounded runAllTasks(long); it returns once the budget is used up, even if tasks remain queued
        ranTasks = runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
    }
} else {
    // No IO events: run only a minimal batch of ordinary tasks (at most 64, because the deadline is only checked every 64 tasks)
    ranTasks = runAllTasks(0); // This will run the minimum number of tasks
}

if (ranTasks || strategy > 0) {
    if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
        logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                selectCnt - 1, selector);
    }
    selectCnt = 0;
} else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
    selectCnt = 0;
}
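The time budget passed to runAllTasks(long) follows directly from the expression ioTime * (100 - ioRatio) / ioRatio. A small worked sketch of that formula; the class and method names are hypothetical, and the range check is added only to keep the example safe against division by zero.

```java
// Hypothetical helper that evaluates the task-time budget formula from the loop above.
final class IoRatioSketch {
    /** Time allowed for ordinary tasks, given the time just spent on IO. */
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            // ioRatio == 100 takes the separate untimed branch, so it is excluded here
            throw new IllegalArgumentException("ioRatio: " + ioRatio + " (expected: 0 < ioRatio < 100)");
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }
}
```

With 2 ms of IO time, ioRatio=50 yields a 2 ms task budget, ioRatio=20 yields 8 ms (the 1:4 split mentioned above), and ioRatio=80 yields 0.5 ms.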

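5. Shutting down the NioEventLoop

At the end of every iteration the loop checks the state of the NioEventLoop. If isShuttingDown() returns true, some thread has called shutdownGracefully() and the NioEventLoop is about to be shut down, so closeAll() is called to close all SelectionKeys and Channels. Then confirmShutdown() is called to check whether the loop may really stop; if so, run() returns and the loop ends.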

try {
    if (isShuttingDown()) {
        closeAll();
        if (confirmShutdown()) {
            return;
        }
    }
} catch (Throwable t) {
    handleLoopException(t);
}

That covers the logic of NioEventLoop's run() method. Two important methods remain: processSelectedKeys(), which handles IO events, and runAllTasks(), which handles ordinary tasks.

private void processSelectedKeys()

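Depending on whether selectedKeys is null, this method calls processSelectedKeysOptimized() or processSelectedKeysPlain(). selectedKeys is an optimization NioEventLoop applies on top of the JDK's selected-key set: selector.selectedKeys() returns a Set<SelectionKey>, whereas selectedKeys here is backed by a plain SelectionKey array, which is cheaper to fill and iterate than a hash-based Set.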

private void processSelectedKeys() {
    if (selectedKeys != null) {
        processSelectedKeysOptimized();
    } else {
        processSelectedKeysPlain(selector.selectedKeys());
    }
}
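To make the array-backed idea concrete, here is a minimal sketch of a Set whose storage is a growable array. It is an illustration under assumptions, not Netty's class: the name ArrayBackedKeySet is made up, and like any append-only array it does not check for duplicates, so it only behaves like a set when the caller never adds the same key twice.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical array-backed set: add() appends, iteration is a simple index walk.
final class ArrayBackedKeySet<T> extends AbstractSet<T> {
    Object[] keys = new Object[4];
    int size;

    @Override
    public boolean add(T key) {
        if (key == null) {
            return false;
        }
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, keys.length << 1); // double the capacity when full
        }
        keys[size++] = key;
        return true;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int i;

            @Override
            public boolean hasNext() {
                return i < size;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return (T) keys[i++];
            }
        };
    }

    /** Clears the used slots so the stored objects can be garbage collected. */
    void reset() {
        Arrays.fill(keys, 0, size, null);
        size = 0;
    }
}
```

Adding a key is a single array store with no hashing, and processing the keys is a linear walk over keys[0..size).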

private void processSelectedKeysOptimized()

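The method calls processSelectedKey() in a loop to handle each selected key in turn. At the end of each iteration it checks needsToSelectAgain; if it is true, the remaining entries of selectedKeys are reset and selectNow() is called again. needsToSelectAgain is only set to true once a sufficient number of keys have been cancelled.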

private void processSelectedKeysOptimized() {
    for (int i = 0; i < selectedKeys.size; ++i) {
        final SelectionKey k = selectedKeys.keys[i];
        // null out entry in the array to allow to have it GC'ed once the Channel close
        // See https://github.com/netty/netty/issues/2363
        selectedKeys.keys[i] = null;

        final Object a = k.attachment();

        if (a instanceof AbstractNioChannel) {
            processSelectedKey(k, (AbstractNioChannel) a);
        } else {
            @SuppressWarnings("unchecked")
            NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
            processSelectedKey(k, task);
        }

        if (needsToSelectAgain) {
            // null out entries in the array to allow to have it GC'ed once the Channel close
            // See https://github.com/netty/netty/issues/2363
            selectedKeys.reset(i + 1);

            selectAgain();
            i = -1;
        }
    }
}


private void selectAgain() {
    needsToSelectAgain = false;
    try {
        selector.selectNow();
    } catch (Throwable t) {
        logger.warn("Failed to update SelectionKeys.", t);
    }
}

private void processSelectedKey(SelectionKey k, AbstractNioChannel ch)

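The method first checks whether the SelectionKey is valid. If it is not valid and the channel is still registered with this EventLoop, the channel should be closed; if the channel is no longer registered with this EventLoop, this code has no authority to close it.

Next it checks the type of the IO event and calls the corresponding method on the channel's unsafe. For OP_CONNECT it removes OP_CONNECT from the interest set (otherwise Selector.select() would keep returning without blocking) and calls unsafe.finishConnect(); for OP_WRITE it calls unsafe().forceFlush(); for OP_ACCEPT or OP_READ it calls unsafe.read().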

private void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
    final AbstractNioChannel.NioUnsafe unsafe = ch.unsafe();
    if (!k.isValid()) {
        final EventLoop eventLoop;
        try {
            eventLoop = ch.eventLoop();
        } catch (Throwable ignored) {
            // If the channel implementation throws an exception because there is no event loop, we ignore this
            // because we are only trying to determine if ch is registered to this event loop and thus has authority
            // to close ch.
            return;
        }
        // Only close ch if ch is still registered to this EventLoop. ch could have deregistered from the event loop
        // and thus the SelectionKey could be cancelled as part of the deregistration process, but the channel is
        // still healthy and should not be closed.
        // See https://github.com/netty/netty/issues/5125
        if (eventLoop == this) {
            // close the channel if the key is not valid anymore
            unsafe.close(unsafe.voidPromise());
        }
        return;
    }

    try {
        int readyOps = k.readyOps();
        // We first need to call finishConnect() before try to trigger a read(...) or write(...) as otherwise
        // the NIO JDK channel implementation may throw a NotYetConnectedException.
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            // remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
            // See https://github.com/netty/netty/issues/924
            int ops = k.interestOps();
            ops &= ~SelectionKey.OP_CONNECT;
            k.interestOps(ops);

            unsafe.finishConnect();
        }

        // Process OP_WRITE first as we may be able to write some queued buffers and so free memory.
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            // Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
            ch.unsafe().forceFlush();
        }

        // Also check for readOps of 0 to workaround possible JDK bug which may otherwise lead
        // to a spin loop
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            unsafe.read();
        }
    } catch (CancelledKeyException ignored) {
        unsafe.close(unsafe.voidPromise());
    }
}
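The bit tests on readyOps can be tried out without a real channel. The sketch below uses the real java.nio.channels.SelectionKey constants but replaces the unsafe calls with strings; the class name ReadyOpsSketch and the dispatch() method are hypothetical.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayList;
import java.util.List;

// Hypothetical dry run of the readyOps checks: records which handler would be called.
final class ReadyOpsSketch {
    static List<String> dispatch(int readyOps) {
        List<String> calls = new ArrayList<>();
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            calls.add("finishConnect"); // connect must complete before any read or write
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            calls.add("forceFlush");    // writes first, to free queued buffers
        }
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            calls.add("read");          // also taken for readyOps == 0, as in the code above
        }
        return calls;
    }
}
```

Because the checks are independent if statements rather than an if/else chain, a single key that is both writable and readable triggers forceFlush and then read in the same pass.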

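protected boolean runAllTasks() and protected boolean runAllTasks(long timeoutNanos)

The no-arg runAllTasks() runs every task in the task queue. The variant with the timeoutNanos parameter only runs for a limited time; once that time is exceeded the method returns and the tasks still in the queue are left for later.

The no-arg runAllTasks() keeps moving due tasks from the scheduled task queue into the ordinary task queue and then runs everything in the ordinary queue, until fetchFromScheduledTaskQueue() returns true. Finally it calls afterRunningAllTasks() once.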

protected boolean runAllTasks() {
    assert inEventLoop();
    boolean fetchedAll;
    boolean ranAtLeastOne = false;

    do {
        fetchedAll = fetchFromScheduledTaskQueue();
        if (runAllTasksFrom(taskQueue)) {
            ranAtLeastOne = true;
        }
    } while (!fetchedAll); // keep on processing until we fetched all scheduled tasks.

    if (ranAtLeastOne) {
        lastExecutionTime = ScheduledFutureTask.nanoTime();
    }
    afterRunningAllTasks();
    return ranAtLeastOne;
}

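fetchFromScheduledTaskQueue() keeps moving tasks that are due from the scheduled task queue into the ordinary queue until either the ordinary queue is full or no scheduled task is due any more. In the first case it returns false, in the second it returns true.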

private boolean fetchFromScheduledTaskQueue() {
    // No scheduled tasks at all: return true right away
    if (scheduledTaskQueue == null || scheduledTaskQueue.isEmpty()) {
        return true;
    }
    long nanoTime = AbstractScheduledEventExecutor.nanoTime();
    for (;;) {
        // Poll a scheduled task whose deadline has already passed
        Runnable scheduledTask = pollScheduledTask(nanoTime);
        if (scheduledTask == null) {
            // None is due: return true
            return true;
        }
        if (!taskQueue.offer(scheduledTask)) {
            // Move the due task into the ordinary task queue; if that queue is full, return false so the caller fetches again
            // No space left in the task queue add it back to the scheduledTaskQueue so we pick it up again.
            scheduledTaskQueue.add((ScheduledFutureTask<?>) scheduledTask);
            return false;
        }
    }
}
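The put-it-back behavior when the ordinary queue is full is easier to see with a deliberately tiny queue. The following is a sketch under assumptions: ScheduledFetchSketch, Timed and fetchDue() are hypothetical names, a PriorityQueue ordered by deadline stands in for the scheduled task queue, and a bounded ArrayBlockingQueue stands in for the ordinary task queue.

```java
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical model of moving due scheduled tasks into a bounded task queue.
final class ScheduledFetchSketch {
    static final class Timed implements Comparable<Timed> {
        final long deadline;
        final String name;

        Timed(long deadline, String name) {
            this.deadline = deadline;
            this.name = name;
        }

        @Override
        public int compareTo(Timed other) {
            return Long.compare(deadline, other.deadline);
        }
    }

    final PriorityQueue<Timed> scheduled = new PriorityQueue<>();
    final Queue<Timed> taskQueue;

    ScheduledFetchSketch(int capacity) {
        taskQueue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns true if every due task was moved, false if the task queue filled up first. */
    boolean fetchDue(long now) {
        for (;;) {
            Timed head = scheduled.peek();
            if (head == null || head.deadline > now) {
                return true; // nothing (more) is due
            }
            scheduled.poll();
            if (!taskQueue.offer(head)) {
                scheduled.add(head); // no space left: put it back and report "not done"
                return false;
            }
        }
    }
}
```

A false result is exactly what keeps the do/while loop in the no-arg runAllTasks() going: the caller drains the task queue and then fetches again.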

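The main difference in the timed runAllTasks(long) is that it checks for a timeout after every 64 tasks; once the deadline has passed it stops executing tasks and returns.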

protected boolean runAllTasks(long timeoutNanos) {
    fetchFromScheduledTaskQueue();
    Runnable task = pollTask();
    if (task == null) {
        afterRunningAllTasks();
        return false;
    }

    final long deadline = timeoutNanos > 0 ? ScheduledFutureTask.nanoTime() + timeoutNanos : 0;
    long runTasks = 0;
    long lastExecutionTime;
    for (;;) {
        safeExecute(task);

        runTasks ++;

        // Check timeout every 64 tasks because nanoTime() is relatively expensive.
        // XXX: Hard-coded value - will make it configurable if it is really a problem.
        if ((runTasks & 0x3F) == 0) {
            lastExecutionTime = ScheduledFutureTask.nanoTime();
            if (lastExecutionTime >= deadline) {
                break;
            }
        }

        task = pollTask();
        if (task == null) {
            lastExecutionTime = ScheduledFutureTask.nanoTime();
            break;
        }
    }

    afterRunningAllTasks();
    this.lastExecutionTime = lastExecutionTime;
    return true;
}
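The effect of checking the clock only every 64 tasks can be reproduced with a plain queue. This is a simplified sketch, not the Netty method: TimedDrainSketch is a hypothetical name, the method returns the number of executed tasks instead of a boolean, and the clock is measured relative to class initialization so that it never goes negative.

```java
import java.util.Queue;

// Hypothetical stand-alone version of the time-bounded drain loop.
final class TimedDrainSketch {
    private static final long START = System.nanoTime();

    /** Nanoseconds since this class was loaded; always non-negative. */
    static long nanoTime() {
        return System.nanoTime() - START;
    }

    /** Runs queued tasks until the queue is empty or the deadline check fails; returns the count. */
    static int runAllTasks(Queue<Runnable> queue, long timeoutNanos) {
        Runnable task = queue.poll();
        if (task == null) {
            return 0;
        }
        final long deadline = timeoutNanos > 0 ? nanoTime() + timeoutNanos : 0;
        int ran = 0;
        for (;;) {
            task.run();
            ran++;
            // The clock is consulted only after every 64th task (0x3F == 63).
            if ((ran & 0x3F) == 0 && nanoTime() >= deadline) {
                break;
            }
            task = queue.poll();
            if (task == null) {
                break;
            }
        }
        return ran;
    }
}
```

With a timeout of 0 the deadline is 0, so the very first check after 64 tasks ends the loop. That is why runAllTasks(0) runs at most 64 tasks, and a queue holding 100 tasks is left with 36.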

(End)