NNG: initialization in nni_posix_pollq_create(nni_posix_pollq *pq)

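1. Basic concepts

epoll is an efficient I/O event notification mechanism provided by the Linux kernel. It monitors multiple file descriptors and reports which of them have pending I/O events. It scales better than the traditional select and poll because the set of monitored descriptors is registered with the kernel once instead of being passed in on every call, which makes it well suited to handling large numbers of concurrent connections.

An epoll instance is created by epoll_create or epoll_create1. It is a kernel object that tracks and manages events for a set of file descriptors. Creating the instance returns a file descriptor, which is used for all subsequent epoll operations.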

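2. File descriptors

Any file descriptor that supports polling can be monitored with epoll, for example sockets, pipes, and eventfd descriptors. Regular files and directories cannot: epoll_ctl rejects them with EPERM.

Use epoll_ctl to add such descriptors to an epoll instance so that events on them are monitored.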
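To see which descriptors epoll accepts, here is a minimal sketch. The helper names (try_epoll_add, probe_pipe, probe_regular_file) are made up for this illustration and are not part of NNG. It assumes Linux and a writable temporary directory for tmpfile().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: try to register fd for EPOLLIN on a fresh epoll
 * instance. Returns 0 on success, or the errno reported by the failing call. */
static int
try_epoll_add(int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int                epfd;
    int                rv = 0;

    assert(fd >= 0);
    if ((epfd = epoll_create1(EPOLL_CLOEXEC)) < 0) {
        return errno;
    }
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        rv = errno;
    }
    close(epfd);
    return rv;
}

/* Probe the read end of a pipe, which supports polling. */
static int
probe_pipe(void)
{
    int fds[2];
    int rv;

    if (pipe(fds) < 0) {
        return errno;
    }
    rv = try_epoll_add(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return rv;
}

/* Probe a regular file, which does not support polling. */
static int
probe_regular_file(void)
{
    FILE *fp = tmpfile();
    int   rv;

    if (fp == NULL) {
        return errno;
    }
    rv = try_epoll_add(fileno(fp));
    fclose(fp);
    return rv;
}
```

Calling probe_pipe() should report success, while probe_regular_file() should report EPERM, since the kernel refuses to register descriptors that have no poll support.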

3. Events

epoll can monitor several kinds of events, such as read events (EPOLLIN), write events (EPOLLOUT), and error events (EPOLLERR).

Each event is described by a struct epoll_event.

4. Basic epoll workflow

4.1 Creating an epoll instance

Create an epoll instance with epoll_create or epoll_create1. The returned file descriptor identifies the instance.

int  epfd = epoll_create1(EPOLL_CLOEXEC);
if(epfd < 0){
    perror("epoll_create1");
    return -1;
}

4.2 Adding, modifying, or removing file descriptors

Use epoll_ctl to add (EPOLL_CTL_ADD), modify (EPOLL_CTL_MOD), or remove (EPOLL_CTL_DEL) a monitored file descriptor.

struct epoll_event ev;
ev.events = EPOLLIN;  // watch for read events
ev.data.fd = fd;      // the file descriptor to monitor

if(epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0){
    perror("epoll_ctl");
    return -1;
}
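The three operations can be tried on a single descriptor. The following is only a sketch; the function name epoll_ctl_lifecycle is invented for this example, and a pipe stands in for whatever descriptor a real program would monitor.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Walk one descriptor through ADD, MOD and DEL.
 * Returns 0 if every step behaves as the comments describe, -1 otherwise. */
static int
epoll_ctl_lifecycle(void)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int                fds[2];
    int                epfd;
    int                ok = 1;

    if (pipe(fds) < 0) {
        return -1;
    }
    if ((epfd = epoll_create1(EPOLL_CLOEXEC)) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ev.data.fd = fds[0];

    /* The first ADD registers the descriptor. */
    ok &= (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == 0);

    /* Adding the same descriptor again fails with EEXIST. */
    ok &= (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1 &&
        errno == EEXIST);

    /* MOD replaces the event mask of an already registered descriptor. */
    ev.events = EPOLLIN | EPOLLET;
    ok &= (epoll_ctl(epfd, EPOLL_CTL_MOD, fds[0], &ev) == 0);

    /* DEL removes it; the event argument is ignored for this operation. */
    ok &= (epoll_ctl(epfd, EPOLL_CTL_DEL, fds[0], NULL) == 0);

    /* Deleting a descriptor that is not registered fails with ENOENT. */
    ok &= (epoll_ctl(epfd, EPOLL_CTL_DEL, fds[0], NULL) == -1 &&
        errno == ENOENT);

    close(epfd);
    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}
```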

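4.3 Waiting for events

Use epoll_wait to wait for events on the registered file descriptors. The call blocks until one or more events occur or the timeout expires.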

struct epoll_event events[MAX_EVENTS];
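// A timeout of -1 blocks indefinitely. With a finite timeout, epoll_wait
// returns 0 if none of the requested events occurred before it expired.
int nfds = epoll_wait(epfd, events, MAX_EVENTS, -1);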
if(nfds < 0){
    perror("epoll_wait");
    return -1;
}

for(int i=0; i < nfds; ++i){
    if(events[i].events & EPOLLIN){
        // handle the read event
    }

    if(events[i].events & EPOLLOUT){
        // handle the write event
    }

    // handle other events; an error may have occurred
}
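The fragments of section 4 can be put together into one small program piece. This is a sketch under assumptions: pipe_readiness is a made-up name, a pipe is used as the monitored descriptor, and the 10 ms timeout is arbitrary.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 8

/* Monitor the read end of a pipe and call epoll_wait twice: once while the
 * pipe is empty (finite timeout) and once after one byte has been written.
 * Stores both return values and returns 0 if the second wait reported
 * EPOLLIN for the pipe, -1 otherwise. */
static int
pipe_readiness(int *before, int *after)
{
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event events[MAX_EVENTS];
    int                fds[2];
    int                epfd;
    int                rv = -1;

    if (pipe(fds) < 0) {
        return -1;
    }
    if ((epfd = epoll_create1(EPOLL_CLOEXEC)) < 0) {
        goto out_pipe;
    }
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0) {
        goto out_epoll;
    }

    /* Nothing has been written yet, so the 10 ms timeout expires. */
    *before = epoll_wait(epfd, events, MAX_EVENTS, 10);

    /* One byte makes the read end readable. */
    if (write(fds[1], "x", 1) != 1) {
        goto out_epoll;
    }
    *after = epoll_wait(epfd, events, MAX_EVENTS, -1);
    if (*after == 1 && events[0].data.fd == fds[0] &&
        (events[0].events & EPOLLIN)) {
        rv = 0;
    }

out_epoll:
    close(epfd);
out_pipe:
    close(fds[0]);
    close(fds[1]);
    return rv;
}
```

The first wait should return 0 (timeout) and the second should return 1 with EPOLLIN set for the pipe's read end.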

5. Code analysis (main part)

// Definition of the initialization function
int
nni_init(void)
{
    int rv;
    if((rv = nni_plat_init(nni_init_helper)) != 0){
        nng_log_err("NNG-INIT", "NNG library initialization failed: %s", nng_strerror(rv));
    }
    return rv;
}
// The function discussed in this article is called from within nni_plat_init:
// Line 396
if((rv = nni_posix_pollq_sysinit()) != 0){
    //...
}

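nni_plat_init also sets the clock attribute of nni_cvattr. For background only, see the 优快云 blog post "nng协议分析记录--pthread_condattr_setclock 设置时钟属性" (NNG protocol analysis notes: setting the clock attribute with pthread_condattr_setclock).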

5.1 Function description

static int nni_posix_pollq_create(nni_posix_pollq *pq)

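This code creates a file descriptor and adds it to an epoll instance, so that the program can trigger an epoll event through that descriptor whenever it needs to wake itself up. The main steps are: create the event file descriptor, set its attributes, configure the epoll event, and add it to the epoll instance. The details follow: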

if((pq->epfd = epoll_create1(EPOLL_CLOEXEC)) < 0){
    return (nni_plat_errno(errno));
}

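This creates an epoll instance and stores its file descriptor in the epfd field of the pq struct. The EPOLL_CLOEXEC flag sets close-on-exec on the descriptor, so it is closed automatically when the process executes one of the exec functions. (Refer to the exec family of functions, exit..., for details.)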
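The effect of EPOLL_CLOEXEC can be checked by reading the descriptor flags back with fcntl. A minimal sketch; epoll_fd_is_cloexec is a name made up for this example:

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Create an epoll instance with the given flags and report whether the
 * resulting descriptor has FD_CLOEXEC set.
 * Returns 1 if set, 0 if not set, -1 on error. */
static int
epoll_fd_is_cloexec(int create_flags)
{
    int epfd = epoll_create1(create_flags);
    int fdflags;

    if (epfd < 0) {
        return -1;
    }
    fdflags = fcntl(epfd, F_GETFD);
    close(epfd);
    if (fdflags < 0) {
        return -1;
    }
    return (fdflags & FD_CLOEXEC) ? 1 : 0;
}
```

With EPOLL_CLOEXEC the flag is present from the moment the descriptor exists; with flags set to 0 it is absent and would have to be set afterwards with fcntl(F_SETFD).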

5.2 nni_posix_pollq_add_eventfd(pq)

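nni_posix_pollq_create(nni_posix_pollq *pq) calls nni_posix_pollq_add_eventfd(pq), which performs the following steps.

1. Declare a struct epoll_event ev.

2. Create the event file descriptor: eventfd creates a file descriptor fd with an initial counter value of 0 and the EFD_NONBLOCK (non-blocking) flag.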

if((fd = eventfd(0, EFD_NONBLOCK)) < 0){
    return (nni_plat_errno(errno));
}

3. Set the file descriptor attributes

(void) fcntl(fd, F_SETFD, FD_CLOEXEC);
(void) fcntl(fd, F_SETFL, O_NONBLOCK);

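The first fcntl call sets the FD_CLOEXEC descriptor flag, so the descriptor is closed when one of the exec() functions is executed.

The second fcntl call sets non-blocking mode once more. The descriptor is already non-blocking because of EFD_NONBLOCK, so this call is redundant but harmless.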
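As a side note, eventfd also accepts EFD_CLOEXEC, so both properties can be requested at creation time. The sketch below compares the two approaches. It is an alternative shown for illustration, not what the code in this walkthrough does, and the helper names are invented.

```c
#include <fcntl.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Returns 1 if fd is both close-on-exec and non-blocking, 0 if not,
 * -1 if the flags cannot be read. */
static int
fd_is_cloexec_nonblock(int fd)
{
    int fdflags = fcntl(fd, F_GETFD);
    int flflags = fcntl(fd, F_GETFL);

    if (fdflags < 0 || flflags < 0) {
        return -1;
    }
    return ((fdflags & FD_CLOEXEC) && (flflags & O_NONBLOCK)) ? 1 : 0;
}

/* Variant A: the sequence from the walkthrough (create, then fcntl). */
static int
make_eventfd_fcntl(void)
{
    int fd = eventfd(0, EFD_NONBLOCK);

    if (fd < 0) {
        return -1;
    }
    (void) fcntl(fd, F_SETFD, FD_CLOEXEC);
    (void) fcntl(fd, F_SETFL, O_NONBLOCK);
    return fd;
}

/* Variant B: request both flags when the descriptor is created. */
static int
make_eventfd_atomic(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}
```

Both variants end with the same flags. Variant B avoids the short window in which the descriptor exists without close-on-exec, which can matter if another thread forks and execs at that moment.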

4. Configure the epoll event

ev.events = EPOLLIN;
ev.data.ptr = 0;

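The event mask is set to EPOLLIN, meaning that read events are monitored. ev.data.ptr can carry user data; here it is set to 0 (NULL). The poll thread shown later checks for ev->data.ptr == NULL to recognize events that come from this wake-up descriptor.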

5. Add the event to the epoll instance

if(epoll_ctl(pq->epfd, EPOLL_CTL_ADD, fd, &ev) != 0){
    (void) close(fd);
    return (nni_plat_errno(errno));
}

6. Save the file descriptor and return success

The file descriptor fd is stored in the evfd field of the pq struct, and the function returns success.

pq->evfd  = fd;
return (0);

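To summarize: this code creates an event file descriptor and adds it to the epoll instance, so that the program can trigger an epoll event through that descriptor whenever it needs to wake itself up. The steps are creating the event file descriptor, setting its attributes, configuring the epoll event, and adding it to the epoll instance.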
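The whole self-wake mechanism fits into a few lines outside of NNG. The following is a sketch, not NNG's implementation: struct waker and the waker_* functions are hypothetical, and the eventfd flags are set at creation instead of through fcntl.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the epfd/evfd pair kept in the poll queue. */
struct waker {
    int epfd;
    int evfd;
};

/* Create the epoll instance and register the eventfd. Returns 0 or errno. */
static int
waker_init(struct waker *w)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int                err;

    if ((w->epfd = epoll_create1(EPOLL_CLOEXEC)) < 0) {
        return errno;
    }
    if ((w->evfd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC)) < 0) {
        err = errno;
        close(w->epfd);
        return err;
    }
    ev.data.ptr = NULL; /* NULL identifies the wake-up descriptor */
    if (epoll_ctl(w->epfd, EPOLL_CTL_ADD, w->evfd, &ev) != 0) {
        err = errno;
        close(w->evfd);
        close(w->epfd);
        return err;
    }
    return 0;
}

/* Wake the poller: adding 1 to the counter makes the eventfd readable. */
static int
waker_wake(struct waker *w)
{
    uint64_t one = 1;

    return (write(w->evfd, &one, sizeof(one)) == sizeof(one)) ? 0 : -1;
}

/* Wait up to timeout_ms. Returns 1 if the waker fired (the counter is
 * drained), 0 on timeout, -1 on error. */
static int
waker_poll(struct waker *w, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t           count;
    int                n = epoll_wait(w->epfd, &ev, 1, timeout_ms);

    if (n <= 0) {
        return n;
    }
    if (ev.data.ptr != NULL || !(ev.events & EPOLLIN)) {
        return -1;
    }
    /* Reading resets the counter to 0, so the descriptor stops being ready. */
    if (read(w->evfd, &count, sizeof(count)) != sizeof(count)) {
        return -1;
    }
    return 1;
}

static void
waker_fini(struct waker *w)
{
    close(w->evfd);
    close(w->epfd);
}
```

Several waker_wake calls before a poll collapse into a single wake-up, because the eventfd only accumulates a counter and one read drains it.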

6. Initialization with nni_thr_init(&pq->thr, nni_posix_poll_thr, pq)

First, here is the definition of the nni_posix_pollq struct.

struct nni_posix_pollq{
    nni_mtx  mtx;
    int      epfd;
    int      evfd;
    bool     close;
    nni_thr  thr;
    nni_list reapq;
};
// The member nni_thr thr is defined as follows:

struct nni_thr{
    nni_plat_thr   thr;
    nni_plat_mtx   mtx;
    nni_plat_cv    cv;
    nni_thr_func   fn;
    void *         arg;
    int            start;
    int            stop;
    int            done;
    int            init;
};

// This in turn embeds another layer, nni_plat_thr
struct nni_plat_thr{
    pthread_t  tid;
    void(*func)(void *);
    void *arg;
};

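The first argument of nni_thr_init is the thr member of the global variable nni_posix_global_pollq, i.e. the thr member in the struct definition above. The function initializes this thr member of the pq struct.

At the first level, nni_thr_init initializes pq->thr itself, starting with its basic members.

The first argument is &pq->thr,

the second argument is a function pointer (the function nni_posix_poll_thr),

and the third argument is pq.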

thr->done  = 0;
thr->start = 0;
thr->stop  = 0;
thr->fn    = fn; 
thr->arg   = arg;

After the first level of initialization the values are:

// nni_thr_init(&pq->thr, nni_posix_poll_thr, pq)
pq->thr.done    = 0;
pq->thr.start   = 0;
pq->thr.stop    = 0;
pq->thr.fn      = nni_posix_poll_thr;  // from thr->fn = fn
pq->thr.arg     = pq;

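If fn is NULL, nni_thr_init returns at this point. Otherwise pq->thr is now initialized, and pq->thr.thr is initialized next.

At the second level, nni_thr_init initializes the nni_plat_thr member of nni_thr, i.e. pq->thr.thr, by calling nni_plat_thr_init:

Three arguments are passed in:

1. &thr->thr, i.e. &pq->thr.thr, the object to initialize

2. the function pointer nni_thr_wrap

3. thr, i.e. &pq->thr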

// nni_plat_thr_init(&thr->thr, nni_thr_wrap, thr)
// The assignments resolve to:
thr->func = fn;    // ---->  pq->thr.thr.func = nni_thr_wrap;
thr->arg  = arg;   // ---->  pq->thr.thr.arg  = &pq->thr;

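To summarize, looking only at the assignments: once nni_plat_thr_init has run, and provided none of the function pointer arguments is NULL, the members of pq->thr hold the following values: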

pq->thr.done     = 0;
pq->thr.start    = 0;
pq->thr.stop     = 0;
pq->thr.fn       = nni_posix_poll_thr;
pq->thr.arg      = pq;
pq->thr.thr.func = nni_thr_wrap;
pq->thr.thr.arg  = &pq->thr;
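The two layers of function pointer and argument are easier to follow in a stripped-down model. The sketch below uses simplified, hypothetical versions of the structs (no real thread, mutex or condition variable) and calls the platform layer directly, just to show how one call travels through both layers.

```c
#include <stddef.h>

typedef void (*thr_func)(void *);

/* Simplified stand-in for nni_plat_thr: only the function and argument. */
struct plat_thr {
    void (*func)(void *);
    void *arg;
};

/* Simplified stand-in for nni_thr: embeds the platform layer. */
struct thr {
    struct plat_thr thr;
    thr_func        fn;
    void *          arg;
    int             start;
    int             stop;
    int             done;
};

/* Plays the role of nni_thr_wrap, without the wait on the condition
 * variable: receives the outer struct and calls the user function. */
static void
thr_wrap(void *arg)
{
    struct thr *t = arg;

    if (t->fn != NULL) {
        t->fn(t->arg);
    }
    t->done = 1;
}

/* Both levels of initialization in one place. */
static void
thr_init(struct thr *t, thr_func fn, void *arg)
{
    t->done     = 0;
    t->start    = 0;
    t->stop     = 0;
    t->fn       = fn;       /* level 1: user function and argument */
    t->arg      = arg;
    t->thr.func = thr_wrap; /* level 2: wrapper and outer struct */
    t->thr.arg  = t;
}

/* What the platform thread entry point does with its struct. */
static void
plat_thr_main(struct plat_thr *p)
{
    p->func(p->arg);
}

/* Example user function: counts how often it was called. */
static void
count_call(void *arg)
{
    ++*(int *) arg;
}
```

After thr_init(&t, count_call, &calls), a call to plat_thr_main(&t.thr) goes through thr_wrap and ends up in count_call(&calls), mirroring the chain from nni_plat_thr_main through nni_thr_wrap to nni_posix_poll_thr.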

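Next the thread is created, using the thread attributes that nni_init set up earlier. The thread ID is stored in pq->thr.thr.tid, the thread function is nni_plat_thr_main, and its argument is &pq->thr.thr.

nni_plat_thr_main calls the function pointer stored in the nni_plat_thr struct: thr->func(thr->arg).

The function actually executed is pq->thr.thr.func, i.e. nni_thr_wrap.

Its argument is thr->arg, i.e. the arg member of nni_plat_thr, which is &pq->thr.

Here is the definition of nni_thr_wrap: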

// The function executed here is pq->thr.thr.func().
// Its argument is &pq->thr. This may look confusing, but it follows from the type of the local variable below.
static void
nni_thr_wrap(void *arg)
{
	nni_thr *thr = arg;
	int      start;

	nni_plat_mtx_lock(&thr->mtx);
	while (((start = thr->start) == 0) && (thr->stop == 0)) {
		nni_plat_cv_wait(&thr->cv);
	}
	nni_plat_mtx_unlock(&thr->mtx);
	if ((start) && (thr->fn != NULL)) {
		thr->fn(thr->arg);
	}
	nni_plat_mtx_lock(&thr->mtx);
	thr->done = 1;
	nni_plat_cv_wake(&thr->cv);
	nni_plat_mtx_unlock(&thr->mtx);
}

As the code shows, the initial values are thr->start == 0 and thr->stop == 0, so the thread waits on the condition variable thr->cv until the while condition no longer holds.

while (((start = thr->start) == 0) && (thr->stop == 0)) {
        nni_plat_cv_wait(&thr->cv);
}

7. Setting the name of pq->thr

Omitted.

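8. Running nni_thr_run

nni_thr_run sets thr->start = 1 and wakes the thread waiting on the condition variable pq->thr.cv. The wait condition is now satisfied, so the thread calls pq->thr.fn(), i.e. nni_posix_poll_thr, with pq as its argument. From this point on the thread polls for read events on the epoll instance pq->epfd. Initialization is complete!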
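The pattern of creating a thread first and letting it run later can be reproduced with plain pthreads. This is a sketch under assumptions: struct gated_thr and the gated_* functions are made-up names, error handling is minimal, and the condition variable uses default attributes instead of the clock attribute NNG configures.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, reduced counterpart of nni_thr. */
struct gated_thr {
    pthread_t       tid;
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    void (*fn)(void *);
    void *arg;
    int   start;
    int   stop;
    int   done;
};

/* Thread entry: wait at the gate until start or stop is set, then run fn. */
static void *
gated_main(void *arg)
{
    struct gated_thr *t = arg;
    int               start;

    pthread_mutex_lock(&t->mtx);
    while (((start = t->start) == 0) && (t->stop == 0)) {
        pthread_cond_wait(&t->cv, &t->mtx);
    }
    pthread_mutex_unlock(&t->mtx);

    if (start && (t->fn != NULL)) {
        t->fn(t->arg);
    }

    pthread_mutex_lock(&t->mtx);
    t->done = 1;
    pthread_cond_broadcast(&t->cv);
    pthread_mutex_unlock(&t->mtx);
    return NULL;
}

/* Create the thread; it blocks at the gate. Returns 0 or an error number. */
static int
gated_init(struct gated_thr *t, void (*fn)(void *), void *arg)
{
    t->fn    = fn;
    t->arg   = arg;
    t->start = 0;
    t->stop  = 0;
    t->done  = 0;
    pthread_mutex_init(&t->mtx, NULL);
    pthread_cond_init(&t->cv, NULL);
    return pthread_create(&t->tid, NULL, gated_main, t);
}

/* Open the gate: set start under the lock and wake the waiting thread. */
static void
gated_run(struct gated_thr *t)
{
    pthread_mutex_lock(&t->mtx);
    t->start = 1;
    pthread_cond_broadcast(&t->cv);
    pthread_mutex_unlock(&t->mtx);
}

/* Wait for the thread to finish and release the synchronization objects. */
static void
gated_join(struct gated_thr *t)
{
    pthread_join(t->tid, NULL);
    pthread_cond_destroy(&t->cv);
    pthread_mutex_destroy(&t->mtx);
}

/* Example thread function: sets the int it is given to 1. */
static void
set_flag(void *arg)
{
    *(int *) arg = 1;
}
```

Between gated_init and gated_run the user function cannot have run, because the thread is parked in the while loop. Build with -pthread where the toolchain requires it.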

Here is the definition of nni_posix_poll_thr as well:

// The instance being polled here is the epoll instance created at the very beginning!
static void
nni_posix_poll_thr(void *arg)
{
	nni_posix_pollq *  pq = arg;
	struct epoll_event events[NNI_MAX_EPOLL_EVENTS];

	for (;;) {
		int  n;
		bool reap = false;

		n = epoll_wait(pq->epfd, events, NNI_MAX_EPOLL_EVENTS, -1);
		if ((n < 0) && (errno == EBADF)) {
			// Epoll fd closed, bail.
			return;
		}

		// dispatch events
		for (int i = 0; i < n; ++i) {
			const struct epoll_event *ev;

			ev = &events[i];
			// If the waker pipe was signaled, read from it.
			if ((ev->data.ptr == NULL) &&
			    (ev->events & (unsigned) POLLIN)) {
				uint64_t clear;
				if (read(pq->evfd, &clear, sizeof(clear)) !=
				    sizeof(clear)) {
					nni_panic("read from evfd incorrect!");
				}
				reap = true;
			} else {
				nni_posix_pfd *  pfd = ev->data.ptr;
				nni_posix_pfd_cb cb;
				void *           cbarg;
				unsigned         mask;

				mask = ev->events &
				    ((unsigned) EPOLLIN | (unsigned) EPOLLOUT |
				        (unsigned) EPOLLERR);

				nni_mtx_lock(&pfd->mtx);
				pfd->events &= ~mask;
				cb    = pfd->cb;
				cbarg = pfd->arg;
				nni_mtx_unlock(&pfd->mtx);

				// Execute the callback with lock released
				if (cb != NULL) {
					cb(pfd, mask, cbarg);
				}
			}
		}

		if (reap) {
			nni_posix_pollq_reap(pq);
			nni_mtx_lock(&pq->mtx);
			if (pq->close) {
				nni_mtx_unlock(&pq->mtx);
				return;
			}
			nni_mtx_unlock(&pq->mtx);
		}
	}
}

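Some implementation details are omitted here, such as thread attributes, signal handling, and the locking and unlocking around these operations.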
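The core of the poll thread is the decision based on data.ptr: NULL means the wake-up eventfd, anything else points at a per-descriptor record with a callback. A single iteration of such a loop might look like the sketch below. struct handler, dispatch_once, register_handler and record_mask are hypothetical names loosely modeled on the code above; locking and reaping are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_EVENTS 8

/* Hypothetical per-descriptor record, loosely modeled on nni_posix_pfd. */
struct handler {
    int fd;
    void (*cb)(struct handler *, unsigned events, void *arg);
    void *arg;
};

/* Register a handler; its address travels in data.ptr. */
static int
register_handler(int epfd, struct handler *h, unsigned events)
{
    struct epoll_event ev = { .events = events };

    ev.data.ptr = h;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, h->fd, &ev);
}

/* One iteration of the dispatch loop.
 * Returns 1 if the waker fired, 0 if not, -1 if epoll_wait failed. */
static int
dispatch_once(int epfd, int evfd, int timeout_ms)
{
    struct epoll_event events[MAX_EVENTS];
    int                woke = 0;
    int                n = epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);

    if (n < 0) {
        return -1;
    }
    for (int i = 0; i < n; ++i) {
        const struct epoll_event *ev = &events[i];

        if (ev->data.ptr == NULL) {
            /* Wake-up descriptor: drain the counter. */
            uint64_t clear;
            if (read(evfd, &clear, sizeof(clear)) == sizeof(clear)) {
                woke = 1;
            }
        } else {
            /* Ordinary descriptor: hand the event mask to its callback. */
            struct handler *h = ev->data.ptr;
            unsigned        mask =
                ev->events & (EPOLLIN | EPOLLOUT | EPOLLERR);

            if (h->cb != NULL) {
                h->cb(h, mask, h->arg);
            }
        }
    }
    return woke;
}

/* Example callback: stores the reported mask in the unsigned behind arg. */
static void
record_mask(struct handler *h, unsigned events, void *arg)
{
    (void) h;
    *(unsigned *) arg = events;
}
```

Storing a pointer in data.ptr lets the loop reach the callback without any lookup by descriptor number. The price is that one value, NULL here, has to be reserved for the waker.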

For the use of pthread_cond_wait and condition variables in general, see the 优快云 blog post "pthread_cond_broadcast和pthread_cond_wait使用" (using pthread_cond_broadcast and pthread_cond_wait).