rust, smol

Kingwel2020

Article link: https://blog.youkuaiyun.com/m0_37889044/article/details/107041749

Reactor
Async IO
Timer
- Timer Btree
- Reactor timer expiry handling

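Smol is a lightweight async I/O runtime library. It implements an efficient multi-threaded Executor and a Reactor, and supports timers. smol also wraps asynchronous socket I/O nicely, and it uses a separate thread pool to run synchronous operations asynchronously (the Blocking Executor). At the time of writing, async-std uses smol as its underlying runtime.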

Dependencies

async-task, the abstraction of a task (runnable)
blocking, runs blocking I/O on its own dedicated thread pool, independent of the smol Executor
concurrent-queue, a concurrent multi-producer multi-consumer queue
fastrand, a simple random number generator
futures-io
futures-util
libc
once_cell, a cell that is initialized only once; used to initialize global variables
scoped-tls, scoped_thread_local!() macro
slab, like a Vec, but returns the index (key) on insertion
socket2, provides direct access to the system’s socket functionality

Cross-platform, I/O multiplexing

Reactor is built on top of:

Linux, [epoll]: https://en.wikipedia.org/wiki/Epoll
BSD, [kqueue]: https://en.wikipedia.org/wiki/Kqueue
Windows, [wepoll]: https://github.com/piscisaureus/wepoll

Executors

Thread-local executor for tasks created by Task::local().
Work-stealing executor for tasks created by Task::spawn().
Blocking executor for tasks created by Task::blocking(), blocking!, iter(), reader() and writer().

Task

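Task is a wrapper around async-task. Spawning a task actually creates an async-task and pushes it onto the task queue of the current thread's Worker, or onto the global queue when no Worker exists, and returns a Task handle. The Task is later scheduled and run by the Executor.

Creating a Task requires a Future. Internally, the Future is wrapped in a RawTask, which is referenced by both a runnable and a handle. RawTask holds the basic definition of the task, such as its state, waker and output. The runnable is the basic unit of scheduling: it is pushed onto a queue and run when its turn comes, and runnable.run() is what finally polls the Future. The handle is itself a Future that returns Pending or Ready depending on the state of the RawTask, which drives the Executor forward. This part is somewhat intricate and is worth reading the code for.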

pub struct Task<T>(pub(crate) Option<async_task::JoinHandle<T, ()>>);


/// Raw pointers to the fields inside a task.
pub(crate) struct RawTask<F, R, S, T> {
    /// The task header.
    pub(crate) header: *const Header,

    /// The schedule function.
    pub(crate) schedule: *const S,

    /// The tag inside the task.
    pub(crate) tag: *mut T,

    /// The future.
    pub(crate) future: *mut F,

    /// The output of the future.
    pub(crate) output: *mut R,
}

	...
    pub fn spawn(future: impl Future<Output = T> + Send + 'static) -> Task<T> {
        QUEUE.spawn(future)
    }
    
    ...	
    // The `spawn` method invoked on QUEUE above.
    pub fn spawn<T: Send + 'static>(
        &self,
        future: impl Future<Output = T> + Send + 'static,
    ) -> Task<T> {
        let global = self.global.clone();

        // The function that schedules a runnable task when it gets woken up.
        let schedule = move |runnable| {
            if WORKER.is_set() {
                WORKER.with(|w| {
                    if Arc::ptr_eq(&global, &w.global) {
                        if let Err(err) = w.shard.push(runnable) {
                            global.queue.push(err.into_inner()).unwrap();
                        }
                    } else {
                        global.queue.push(runnable).unwrap();
                    }
                });
            } else {
                global.queue.push(runnable).unwrap();
            }

            global.notify();
        };

        // Create a task, push it into the queue by scheduling it, and return its `Task` handle.
        let (runnable, handle) = async_task::spawn(future, schedule, ());
        runnable.schedule();
        Task(Some(handle))
    }
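
The split between runnable and handle described above can be sketched with the standard library alone. This is a simplified illustration, not the async-task implementation: `Shared`, `Runnable`, `Handle`, `NoopWaker` and `poll_handle` are hypothetical names, the shared state uses `Arc` and `Mutex` instead of raw pointers, and the handle does not register a waker as a real join handle would.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for RawTask: state shared by the runnable and the handle.
struct Shared<T> {
    future: Mutex<Option<Pin<Box<dyn Future<Output = T> + Send>>>>,
    output: Mutex<Option<T>>,
}

struct Runnable<T>(Arc<Shared<T>>);
struct Handle<T>(Arc<Shared<T>>);

// A waker that does nothing; a real runtime would reschedule the runnable here.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

impl<T> Runnable<T> {
    // Polls the wrapped future once and stores the output if it completed.
    fn run(&self) {
        let waker = Waker::from(Arc::new(NoopWaker));
        let mut cx = Context::from_waker(&waker);
        let mut slot = self.0.future.lock().unwrap();
        let polled = match slot.as_mut() {
            Some(fut) => fut.as_mut().poll(&mut cx),
            None => return, // already completed
        };
        if let Poll::Ready(value) = polled {
            *slot = None;
            *self.0.output.lock().unwrap() = Some(value);
        }
    }
}

// The handle is a future that becomes Ready once the runnable has produced the output.
impl<T> Future for Handle<T> {
    type Output = T;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<T> {
        match self.0.output.lock().unwrap().take() {
            Some(value) => Poll::Ready(value),
            None => Poll::Pending,
        }
    }
}

// Creates the (runnable, handle) pair around one future.
fn spawn<T, F>(future: F) -> (Runnable<T>, Handle<T>)
where
    T: Send + 'static,
    F: Future<Output = T> + Send + 'static,
{
    let boxed: Pin<Box<dyn Future<Output = T> + Send>> = Box::pin(future);
    let shared = Arc::new(Shared {
        future: Mutex::new(Some(boxed)),
        output: Mutex::new(None),
    });
    (Runnable(shared.clone()), Handle(shared))
}

// Helper for experiments: polls the handle once with a no-op waker.
fn poll_handle<T>(handle: &mut Handle<T>) -> Poll<T> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    Pin::new(handle).poll(&mut cx)
}
```

In this sketch the handle stays Pending until `run()` has been called on the runnable, which mirrors how a Task handle only resolves after the Executor has run the task.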

Blocking Executor

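The Blocking Executor is fairly simple; its logic is roughly that of block_on. It runs synchronous operations on a thread pool. This feature matters more than it might seem: async-std implements ordinary file operations this way.

The basic idea is to create a task and push it onto a global queue. The pool threads take tasks from this queue and run them, sleeping on a condition variable when the queue is empty. The result of a task is returned through the async-task JoinHandle, which is a Future.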

/// The blocking executor. This is a global singleton, lazily initialized.
pub(crate) struct BlockingExecutor {
    /// The current state of the executor.
    state: Mutex<State>,

    /// Used to put idle threads to sleep and wake them up when new work comes in.
    cvar: Condvar,
}

/// Current state of the blocking executor. It also records the state of the thread pool in use.
struct State {
    /// Number of idle threads in the pool.
    ///
    /// Idle threads are sleeping, waiting to get a task to run.
    idle_count: usize,

    /// Total number of threads in the pool.
    ///
    /// This is the number of idle threads + the number of active threads.
    thread_count: usize,

    /// The queue of blocking tasks.
    queue: VecDeque<Runnable>,
}

...
impl BlockingExecutor {
    /// Returns a reference to the blocking executor.
    pub fn get() -> &'static BlockingExecutor {
        static EXECUTOR: Lazy<BlockingExecutor> = Lazy::new(|| BlockingExecutor {
            state: Mutex::new(State {
                idle_count: 0,
                thread_count: 0,
                queue: VecDeque::new(),
            }),
            cvar: Condvar::new(),
        });
        &EXECUTOR
    }
    // Note how the task is created here: Task::spawn is not used, and the schedule function is specific to the blocking executor.
    pub fn spawn<T: Send + 'static>(
        &'static self,
        future: impl Future<Output = T> + Send + 'static,
    ) -> Task<T> {
        // Create a task, schedule it, and return its `Task` handle.
        let (runnable, handle) = async_task::spawn(future, move |r| self.schedule(r), ());
        runnable.schedule();
        Task(Some(handle))
    }
    // The main loop of a pool thread. It runs the queued tasks.
    fn main_loop(&'static self) {
        let mut state = self.state.lock().unwrap();
        loop {
            // This thread is not idle anymore because it's going to run tasks.
            state.idle_count -= 1;

            // Run tasks in the queue.
            while let Some(runnable) = state.queue.pop_front() {
                // We have found a task - grow the pool if needed.
                self.grow_pool(state);

                // Run the task.
                let _ = panic::catch_unwind(|| runnable.run());

                // Re-lock the state and continue.
                state = self.state.lock().unwrap();
            }

            // This thread is now becoming idle.
            state.idle_count += 1;

            // Put the thread to sleep until another task is scheduled.
            let timeout = Duration::from_millis(500);
            let (s, res) = self.cvar.wait_timeout(state, timeout).unwrap();
            state = s;

            // If there are no tasks after a while, stop this thread.
            if res.timed_out() && state.queue.is_empty() {
                state.idle_count -= 1;
                state.thread_count -= 1;
                break;
            }
        }
    }

    /// Schedules a runnable task for execution. The task is pushed onto the global queue.
    fn schedule(&'static self, runnable: Runnable) {
        let mut state = self.state.lock().unwrap();
        state.queue.push_back(runnable);

        // Notify a sleeping thread and spawn more threads if needed.
        self.cvar.notify_one();
        self.grow_pool(state);
    }
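
The queue-plus-condition-variable idea can be shown in a much smaller form. The sketch below is an assumption-laden simplification rather than smol's code: `Pool`, `Job` and `spawn_blocking` are hypothetical names, the pool has a fixed number of threads instead of growing and shrinking, and the result comes back through an `mpsc` channel instead of a JoinHandle future.

```rust
use std::collections::VecDeque;
use std::sync::{mpsc, Arc, Condvar, Mutex};
use std::thread;

// A queued blocking job (hypothetical type for this sketch).
type Job = Box<dyn FnOnce() + Send + 'static>;

struct Pool {
    // The global queue of blocking jobs.
    queue: Mutex<VecDeque<Job>>,
    // Puts idle threads to sleep and wakes them when new work comes in.
    cvar: Condvar,
}

impl Pool {
    // Starts a fixed number of worker threads (the real executor sizes the pool dynamically).
    fn new(threads: usize) -> Arc<Pool> {
        let pool = Arc::new(Pool {
            queue: Mutex::new(VecDeque::new()),
            cvar: Condvar::new(),
        });
        for _ in 0..threads {
            let pool = pool.clone();
            thread::spawn(move || loop {
                // Take the next job, sleeping on the condition variable while the queue is empty.
                let job = {
                    let mut queue = pool.queue.lock().unwrap();
                    loop {
                        match queue.pop_front() {
                            Some(job) => break job,
                            None => queue = pool.cvar.wait(queue).unwrap(),
                        }
                    }
                };
                // Run the job outside the lock so other threads can keep taking work.
                job();
            });
        }
        pool
    }

    // Queues a blocking closure and returns a receiver for its result.
    fn spawn_blocking<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // The receiver may already be gone; ignore the send error in that case.
            let _ = tx.send(f());
        });
        self.queue.lock().unwrap().push_back(job);
        self.cvar.notify_one();
        rx
    }
}
```

Calling `Pool::new(2).spawn_blocking(|| 6 * 7).recv()` would block the caller until a pool thread has produced the value; the real executor avoids that blocking by handing back a future.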

Local Executor

Task::local() creates a task that runs on the local executor. It goes through the worker's spawn_local, which uses the worker's local queue.

pub fn spawn_local<T: 'static>(&self, future: impl Future<Output = T> + 'static) -> Task<T> {
        let queue = self.local.queue.clone();
        let callback = self.callback.clone();
        let id = thread_id();

        // The function that schedules a runnable task when it gets woken up.
        let schedule = move |runnable| {
            if thread_id() == id && WORKER.is_set() {
                WORKER.with(|w| {
                    if Arc::ptr_eq(&queue, &w.local.queue) {
                        w.local.push(runnable).unwrap();
                    } else {
                        queue.push(runnable).unwrap();
                    }
                });
            } else {
                queue.push(runnable).unwrap();
            }

            callback.call();
        };

        // Create a task, push it into the queue by scheduling it, and return its `Task` handle.
        let (runnable, handle) = async_task::spawn_local(future, schedule, ());
        runnable.schedule();
        Task(Some(handle))
    }

Work-stealing Executor

Task::spawn creates a task that runs on the multi-threaded work-stealing executor.

Multi-thread

To start a multi-threaded executor, the OS threads have to be created explicitly:

    for _ in 0..num_threads {
        thread::spawn(|| smol::run( future::pending::<u8>()));
    }

smol::run

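Runs the executor and polls the reactor. It must be called on at least one thread. If the top-level future returns Ready, the function returns and the executor thread exits. For that reason, the top-level future passed in when starting an executor is usually future::pending(), which stays Pending forever. In that case the executor tries to take spawned tasks from the queues and run them. When no task is runnable, it parks on the Reactor, waiting for I/O or timer events.

QUEUE.worker(move || unparker.unpark()) creates a worker for the Executor. The worker maintains the queues of all runnable tasks. Note that the worker is placed in scoped TLS for the duration of the closure that contains the loop: WORKER.set(&worker ...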

pub fn run<T>(future: impl Future<Output = T>) -> T {
    let parker = Parker::new();
    let unparker = parker.unparker();
    // Create a worker.
    let worker = QUEUE.worker(move || unparker.unpark());

    // Create a waker that triggers an I/O event in the thread-local scheduler.
    let unparker = parker.unparker();
    let waker = async_task::waker_fn(move || unparker.unpark());
    let cx = &mut Context::from_waker(&waker);
    futures_util::pin_mut!(future);

    // Set up tokio if enabled.
    context::enter(|| {
        WORKER.set(&worker, || {
            'start: loop {
                // Poll the main future.
                if let Poll::Ready(val) = future.as_mut().poll(cx) {
                    return val;
                }
                for _ in 0..200 {
                    // tick() tries to run one task.
                    if !worker.tick() {
                        // No task for now: park on the reactor.
                        parker.park();
                        continue 'start;
                    }
                }
                // Process ready I/O events without blocking.
                parker.park_timeout(Duration::from_secs(0));
            }
        })
    })
}
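
The core of this loop, poll the main future and park when nothing is ready, can be reproduced without a reactor. The following is a minimal sketch using only the standard library: `block_on`, `ThreadWaker` and `YieldOnce` are names chosen for this example, and `thread::park` stands in for smol's Parker, so no I/O or timers are processed while parked.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// Waker that unparks the thread that called block_on.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Polls the future until it is Ready, parking the thread between polls.
fn block_on<T>(future: impl Future<Output = T>) -> T {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
        // Nothing to do until the waker fires. If wake() already ran,
        // the unpark token is set and park() returns immediately.
        thread::park();
    }
}

// A future that returns Pending once, after waking itself, then completes.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

For example, `block_on(async { YieldOnce(false).await; 7 })` goes around the loop twice before returning. smol::run differs in that, between polls of the main future, it also runs spawned tasks through worker.tick() and parks on the reactor rather than on the bare thread.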

If there is currently no task to run, the Executor calls parker.park() and thereby enters the reactor.

Call stack:

ntdll!ZwRemoveIoCompletionEx 0x00007ffec7e6db84
KERNELBASE!GetQueuedCompletionStatusEx 0x00007ffec5b07414
port__poll wepoll.c:1235
port_wait wepoll.c:1292
epoll_wait wepoll.c:680
smol::reactor::sys::Reactor::wait reactor.rs:820
smol::reactor::ReactorLock::react reactor.rs:241
smol::parking::Inner::park parking.rs:203
smol::parking::Parker::park parking.rs:40
smol::run::run::{{closure}}::{{closure}} run.rs:125
scoped_tls::ScopedKey<T>::set lib.rs:137
smol::run::run::{{closure}} run.rs:116
smol::context::enter context.rs:8
smol::run::run run.rs:115
second::main::{{closure}} main.rs:110

Queue::worker()

A Worker maintains several task queues and implements the policy for running tasks.

    pub fn worker(&self, notify: impl Fn() + Send + Sync + 'static) -> Worker {
        let mut shards = self.global.shards.write().unwrap();
        let vacant = shards.vacant_entry();

        // Create a worker and put its stealer handle into the executor.
        let worker = Worker {
            key: vacant.key(),
            global: Arc::new(self.global.clone()),
            shard: SlotQueue {
                slot: Cell::new(None),
                queue: Arc::new(ConcurrentQueue::bounded(512)),
            },
            local: SlotQueue {
                slot: Cell::new(None),
                queue: Arc::new(ConcurrentQueue::unbounded()),
            },
            callback: Callback::new(notify),
            sleeping: Cell::new(false),
            ticker: Cell::new(0),
        };
        vacant.insert(worker.shard.queue.clone());

        worker
    }

worker.tick()

Takes one task from the queues and runs it. self.search() implements the work-stealing mechanism.

/// Runs a single task and returns `true` if one was found.
    pub fn tick(&self) -> bool {
        loop {
            match self.search() {
                None => {
                    // Move to sleeping and unnotified state.
                    if !self.sleep() {
                        // If already sleeping and unnotified, return.
                        return false;
                    }
                }
                Some(r) => {
                    // Wake up.
                    if !self.wake() {
                        // If already woken, notify another worker.
                        self.global.notify();
                    }

                    // Bump the ticker.
                    let ticker = self.ticker.get();
                    self.ticker.set(ticker.wrapping_add(1));

                    // Flush slots to ensure fair task scheduling.
                    if ticker % 16 == 0 {
                        if let Err(err) = self.shard.flush() {
                            self.global.queue.push(err.into_inner()).unwrap();
                            self.global.notify();
                        }
                        self.local.flush().unwrap();
                    }

                    // Steal tasks from the global queue to ensure fair task scheduling.
                    if ticker % 64 == 0 {
                        self.shard.steal(&self.global.queue);
                    }

                    // Run the task.
                    if WORKER.set(self, || r.run()) {
                        // The task was woken while it was running, which means it got
                        // scheduled the moment running completed. Therefore, it is now inside
                        // the slot and would be the next task to run.
                        //
                        // Instead of re-running the task in the next iteration, let's flush
                        // the slot in order to give other tasks a chance to run.
                        //
                        // This is a necessary step to ensure task yielding works as expected.
                        // If a task wakes itself and returns `Poll::Pending`, we don't want it
                        // to run immediately after that because that'd defeat the whole
                        // purpose of yielding.
                        if let Err(err) = self.shard.flush() {
                            self.global.queue.push(err.into_inner()).unwrap();
                            self.global.notify();
                        }
                        self.local.flush().unwrap();
                    }

                    return true;
                }
            }
        }
    }
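
The body of search() is not shown here, so the following is only a sketch of what a work-stealing search typically looks like, under assumptions: the order (own queue, then global queue, then other workers), the steal-half policy, and the use of plain `VecDeque<u32>` values in place of concurrent queues of runnables are all illustrative choices, not smol's actual logic.

```rust
use std::collections::VecDeque;

// Finds the next task for a worker. Task IDs are plain integers in this sketch.
fn search(
    own: &mut VecDeque<u32>,
    global: &mut VecDeque<u32>,
    others: &mut [VecDeque<u32>],
) -> Option<u32> {
    // 1. Prefer tasks already in this worker's own queue.
    if let Some(task) = own.pop_front() {
        return Some(task);
    }
    // 2. Fall back to the shared global queue.
    if let Some(task) = global.pop_front() {
        return Some(task);
    }
    // 3. Steal from other workers: move about half of a victim's tasks over.
    for victim in others.iter_mut() {
        let count = (victim.len() + 1) / 2;
        own.extend(victim.drain(..count));
        if let Some(task) = own.pop_front() {
            return Some(task);
        }
    }
    // Nothing runnable anywhere: the caller would go to sleep.
    None
}
```

Stealing a batch instead of a single task means the thief does not have to come back to the same victim immediately, which keeps contention between workers low.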

parker.park()

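Parker implements the transition from the executor into the reactor.

park calls the Reactor to wait in epoll for I/O events or timers. The reactor is protected by a mutex, so with a multi-threaded executor only one executor thread holds the reactor at any given moment. An executor thread that fails to acquire the Reactor blocks on a condition variable and sleeps until it is unparked, for example as a result of an I/O event.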

pub(crate) struct Parker {
    key: Cell<Option<usize>>,
    unparker: Unparker,
}
pub(crate) struct Unparker {
    inner: Arc<Inner>,
}

fn park(&self, timeout: Option<Duration>) -> bool {
		...
		
        // Try to acquire the reactor here. If acquired, the lock is held for the rest of this call.
        let mut reactor_lock = Reactor::get().try_lock();
        let state = match reactor_lock {
            None => PARKED,
            Some(_) => POLLING,
        };
        let mut m = self.lock.lock().unwrap();

        match self.state.compare_exchange(EMPTY, state, SeqCst, SeqCst) {
            Ok(_) => {}
            // Consume this notification to avoid spurious wakeups in the next park.
            Err(NOTIFIED) => {
                // We must read `state` here, even though we know it will be `NOTIFIED`. This is
                // because `unpark` may have been called again since we read `NOTIFIED` in the
                // `compare_exchange` above. We must perform an acquire operation that synchronizes
                // with that `unpark` to observe any writes it made before the call to `unpark`. To
                // do that we must read from the write it made to `state`.
                let old = self.state.swap(EMPTY, SeqCst);
                assert_eq!(old, NOTIFIED, "park state changed unexpectedly");
                return true;
            }
            Err(n) => panic!("inconsistent park_timeout state: {}", n),
        }

        match timeout {
            None => {
                loop {
                    // Block the current thread on the conditional variable.
                    match &mut reactor_lock {
                        // An executor thread that did not get the lock blocks on this condition variable.
                        None => m = self.cvar.wait(m).unwrap(),
                        Some(reactor_lock) => {
                            drop(m);
                            //println!("lock on tid={:?}", std::thread::current().id());
                            reactor_lock.react(None).expect("failure while polling I/O");

                            m = self.lock.lock().unwrap();
                        }
                    }

                    match self.state.compare_exchange(NOTIFIED, EMPTY, SeqCst, SeqCst) {
                        Ok(_) => return true, // got a notification
                        Err(_) => {}          // spurious wakeup, go back to sleep
                    }
                }
            }
            Some(timeout) => {
                // Wait with a timeout, and if we spuriously wake up or otherwise wake up from a
                // notification we just want to unconditionally set `state` back to `EMPTY`, either
                // consuming a notification or un-flagging ourselves as parked.
                let _m = match reactor_lock.as_mut() {
                    None => self.cvar.wait_timeout(m, timeout).unwrap().0,
                    Some(reactor_lock) => {
                        drop(m);
                        let deadline = Instant::now() + timeout;
                        loop {
                            reactor_lock
                                .react(Some(deadline.saturating_duration_since(Instant::now())))
                                .expect("failure while polling I/O");

                            if Instant::now() >= deadline {
                                break;
                            }
                        }
                        self.lock.lock().unwrap()
                    }
                };

                match self.state.swap(EMPTY, SeqCst) {
                    NOTIFIED => true,          // got a notification
                    PARKED | POLLING => false, // no notification
                    n => panic!("inconsistent park_timeout state: {}", n),
                }
            }
        }
    }
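
The decision at the top of park, poll the reactor if its lock is free and otherwise sleep on the condition variable, comes down to a single try_lock. A minimal sketch of that decision follows; `Role` and `choose_role` are hypothetical names, and a `Mutex<()>` stands in for the reactor. Unlike the real code, the guard is released immediately rather than held while polling.

```rust
use std::sync::Mutex;

// What a parking thread ends up doing (mirrors the POLLING / PARKED states).
#[derive(Debug, PartialEq)]
enum Role {
    Polling,
    Parked,
}

// Decides the role of a parking thread without ever blocking on the reactor lock.
fn choose_role(reactor: &Mutex<()>) -> Role {
    match reactor.try_lock() {
        // Lock acquired: this thread would drive the reactor.
        Ok(_guard) => Role::Polling,
        // Another thread holds the reactor: this thread would wait on the condition variable.
        Err(_) => Role::Parked,
    }
}
```

Because try_lock never blocks, a thread that loses the race does not queue up behind the polling thread; it goes straight to sleep and waits to be unparked.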

Reactor.react

Handles timers and I/O events. When an I/O event occurs, the corresponding wakers are called.

/// Processes new events, blocking until the first event or the timeout.
    pub fn react(&mut self, timeout: Option<Duration>) -> io::Result<()> {
        // Fire timers.
        let next_timer = self.reactor.fire_timers();

        // compute the timeout for blocking on I/O events.
        let timeout = match (next_timer, timeout) {
            (None, None) => None,
            (Some(t), None) | (None, Some(t)) => Some(t),
            (Some(a), Some(b)) => Some(a.min(b)),
        };

        // Bump the ticker before polling I/O.
        let tick = self
            .reactor
            .ticker
            .fetch_add(1, Ordering::SeqCst)
            .wrapping_add(1);

        // Block on I/O events.
        match self.reactor.sys.wait(&mut self.events, timeout) {
            // No I/O events occurred.
            Ok(0) => {
                if timeout != Some(Duration::from_secs(0)) {
                    // The non-zero timeout was hit so fire ready timers.
                    self.reactor.fire_timers();
                }
                Ok(())
            }

            // At least one I/O event occurred.
            Ok(_) => {
                // Iterate over sources in the event list.
                let sources = self.reactor.sources.lock().unwrap();
                let mut ready = Vec::new();

                for ev in self.events.iter() {
                    // Check if there is a source in the table with this key.
                    if let Some(source) = sources.get(ev.key) {
                        let mut wakers = source.wakers.lock().unwrap();

                        // Wake readers if a readability event was emitted.
                        if ev.readable {
                            wakers.tick_readable = tick;
                            ready.append(&mut wakers.readers);
                        }

                        // Wake writers if a writability event was emitted.
                        if ev.writable {
                            wakers.tick_writable = tick;
                            ready.append(&mut wakers.writers);
                        }

                        // Re-register if there are still writers or
                        // readers. This can happen if e.g. we were
                        // previously interested in both readability and
                        // writability, but only one of them was emitted.
                        if !(wakers.writers.is_empty() && wakers.readers.is_empty()) {
                            self.reactor.sys.reregister(
                                source.raw,
                                source.key,
                                !wakers.readers.is_empty(),
                                !wakers.writers.is_empty(),
                            )?;
                        }
                    }
                }
                // Drop the lock before waking.
                drop(sources);
                // Wake up tasks waiting on I/O.
                for waker in ready {
                    waker.wake();
                }
                Ok(())
            }
			...
        }

Wake up

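When idle, one worker of the multi-threaded Executor blocks on the Reactor waiting for I/O, while the other workers block on the Parker condition variable. When an I/O event arrives, the worker holding the Reactor calls the corresponding Waker, which unparks the parked worker, and that worker resumes running the Task. This gives the Task a chance to poll its Future again.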

smol::parking::Inner::unpark parking.rs:268
smol::parking::Unparker::unpark parking.rs:111
smol::run::run::{{closure}} run.rs:110
async_task::waker_fn::Helper<F>::wake waker_fn.rs:32
core::task::wake::Waker::wake wake.rs:241
smol::reactor::ReactorLock::react reactor.rs:294
smol::parking::Inner::park parking.rs:203
smol::parking::Parker::park parking.rs:40
smol::run::run::{{closure}}::{{closure}} run.rs:125
scoped_tls::ScopedKey<T>::set lib.rs:137
smol::run::run::{{closure}} run.rs:116
smol::context::enter context.rs:8
smol::run::run run.rs:115
second::main::{{closure}} main.rs:110
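
The react code collects the ready wakers while holding the lock and only calls them after the lock is released. That pattern can be sketched in isolation as follows; `CountingWaker` and `wake_all` are hypothetical names for this example, and a plain `Mutex<Vec<Waker>>` stands in for the per-source waker lists.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

// Waker that counts how many times it was woken (stands in for unparking a worker).
struct CountingWaker(AtomicUsize);
impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

// Takes all registered wakers out of the list, releases the lock, then wakes them.
fn wake_all(wakers: &Mutex<Vec<Waker>>) -> usize {
    // The guard is a temporary, so the lock is released at the end of this statement.
    let ready: Vec<Waker> = std::mem::take(&mut *wakers.lock().unwrap());
    let count = ready.len();
    for waker in ready {
        // Waking may schedule a task that registers a new waker;
        // the lock is free at this point, so that cannot deadlock.
        waker.wake();
    }
    count
}
```

Waking outside the lock matters because a waker can run arbitrary scheduling code, including code that needs the same lock again.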

Reactor

TBD

Async IO

TBD

Timer

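Timers rely on the timeout of the Reactor's epoll wait. The Reactor stores all timers in a BTreeMap. Each time it enters epoll wait, it passes the time remaining until the next timer expires as the wait timeout. If an I/O event arrives first, the I/O is handled and the wait resumes with an updated timeout. Otherwise, the return of wait means that a timer has expired: the timer is removed and its waker is called.

A timer is essentially a Future. On its first poll it inserts itself into the Reactor and waits to be woken. The code is as follows: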

impl Future for Timer {
    type Output = Instant;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Check if the timer has already fired.
        if Instant::now() >= self.when {
            if let Some(id) = self.id.take() {
                // Deregister the timer from the reactor.
                Reactor::get().remove_timer(self.when, id);
            }
            Poll::Ready(self.when)
        } else {
            if self.id.is_none() {
                // Register the timer in the reactor.
                self.id = Some(Reactor::get().insert_timer(self.when, cx.waker()));
            }
            Poll::Pending
        }
    }
}

Timer Btree

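A BTreeMap, a sorted data structure keyed by the tuple (time, ID). The time is the instant at which the timer expires, and the ID is a unique value that increases in timer creation order.

BTreeMap::split_off(key) makes it easy to separate all expired timers from the rest of the BTreeMap.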

    /// An ordered map of registered timers.
    ///
    /// Timers are in the order in which they fire. The `usize` in this type is a timer ID used to
    /// distinguish timers that fire at the same time. The `Waker` represents the task awaiting the
    /// timer.
    timers: Mutex<BTreeMap<(Instant, usize), Waker>>,
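
The effect of split_off on such a map can be tried out on its own. In this sketch `take_expired` is a hypothetical helper and string labels stand in for the `Waker` values; the key layout `(Instant, usize)` is the one shown above.

```rust
use std::collections::BTreeMap;
use std::mem;
use std::time::{Duration, Instant};

// Removes and returns every timer whose deadline is strictly before `now`.
fn take_expired(
    timers: &mut BTreeMap<(Instant, usize), &'static str>,
    now: Instant,
) -> Vec<&'static str> {
    // split_off returns all entries with key >= (now, 0) and leaves the rest in place.
    let pending = timers.split_off(&(now, 0));
    // Swap: the map keeps the pending timers, and we take ownership of the expired ones.
    let ready = mem::replace(timers, pending);
    // Values come out in key order, i.e. in the order the timers fire.
    ready.into_values().collect()
}
```

Because the key is a tuple, `(now, 0)` sorts before every timer scheduled for exactly `now`, so those timers count as pending in this split and are picked up on a later pass.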

Reactor timer expiry handling

Expired timers are removed from the map, and their wakers are called one after another.

fn fire_timers(&self) -> Option<Duration> {
        let mut timers = self.timers.lock().unwrap();

        // Process timer operations, but no more than the queue capacity because otherwise we could
        // keep popping operations forever.
        for _ in 0..self.timer_ops.capacity().unwrap() {
            match self.timer_ops.pop() {
                Ok(TimerOp::Insert(when, id, waker)) => {
                    timers.insert((when, id), waker);
                }
                Ok(TimerOp::Remove(when, id)) => {
                    timers.remove(&(when, id));
                }
                Err(_) => break,
            }
        }

        let now = Instant::now();

        // Split timers into ready and pending timers.
        let pending = timers.split_off(&(now, 0));
        let ready = mem::replace(&mut *timers, pending);

        // Calculate the duration until the next event.
        let dur = if ready.is_empty() {
            // Duration until the next timer.
            timers
                .keys()
                .next()
                .map(|(when, _)| when.saturating_duration_since(now))
        } else {
            // Timers are about to fire right now.
            Some(Duration::from_secs(0))
        };

        // Drop the lock before waking.
        drop(timers);

        // Wake up tasks waiting on timers.
        for (_, waker) in ready {
            waker.wake();
        }
        dur
    }