端口初始化流程
如上所示给出了端口打开的简单流程图,下面以ixgbe驱动为例详细说明:
1. 注册设备驱动到“dev_driver_list”链表中
这个链表节点为:
/**
* A structure describing a device driver.
*/
struct rte_driver {
TAILQ_ENTRY(rte_driver) next; /**< Next in list. */
enum pmd_type type; /**< PMD Driver type */
const char *name; /**< Driver name. */
rte_dev_init_t *init; /**< Device init. function. */
rte_dev_uninit_t *uninit; /**< Device uninit. function. */
};
将ixgbe的这些信息注册到该链表中:
static struct rte_driver rte_ixgbe_driver = {
.type = PMD_PDEV,
.init = rte_ixgbe_pmd_init,
};
PMD_REGISTER_DRIVER(rte_ixgbe_driver);
PMD_REGISTER_DRIVER为dpdk定义的宏,使用了GNU C提供的“__attribute__(constructor)”机制,使得注册设备驱动的过程在main函数执行之前完成。
这样就有了设备驱动类型、设备驱动的初始化函数
2.扫描系统中的pci设备,并注册到“pci_device_list”中
链表节点为:
/**
* A structure describing a PCI device.
*/
struct rte_pci_device {
TAILQ_ENTRY(rte_pci_device) next; /**< Next probed PCI device. */
struct rte_pci_addr addr; /**< PCI location. */
struct rte_pci_id id; /**< PCI ID. */
struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE]; /**< PCI Memory Resource */
struct rte_intr_handle intr_handle; /**< Interrupt handle */
struct rte_pci_driver *driver; /**< Associated driver */
uint16_t max_vfs; /**< sriov enable if not zero */
int numa_node; /**< NUMA node connection */
struct rte_devargs *devargs; /**< Device user arguments */
enum rte_kernel_driver kdrv; /**< Kernel driver passthrough */
};
从系统中获取到PCI设备的相关信息后,记录到这样的一个结构体中。如何获取到这些信息:
在main函数的一开始,调用rte_eal_init()获取用户、系统的相关配置信息以及设置基础运行环境,其中包括调用rte_eal_pci_init()来扫描、获取系统中的CPI网卡信息;
首先,初始化pci_device_list链表,后面扫描的到的pci网卡设备信息会记录到这个链表中;
然后,调用rte_eal_pci_scan()扫描系统中的PCI网卡:遍历”/sys/bus/pci/devices”目录下的所有pci地址,逐个获取对应的pci地址、pci id、sriov使能时的vf个数、亲和的numa、设备地址空间、驱动类型等;
/*
* Scan the content of the PCI bus, and the devices in the devices list
*/
int rte_eal_pci_scan(void)
{
struct dirent *e;
DIR *dir;
char dirname[PATH_MAX];
uint16_t domain;
uint8_t bus, devid, function;
dir = opendir(SYSFS_PCI_DEVICES);
if (dir == NULL) {
RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
__func__, strerror(errno));
return -1;
}
while ((e = readdir(dir)) != NULL) {
if (e->d_name[0] == '.')
continue;
if (parse_pci_addr_format(e->d_name, sizeof(e->d_name), &domain,
&bus, &devid, &function) != 0)
continue;
snprintf(dirname, sizeof(dirname), "%s/%s", SYSFS_PCI_DEVICES,
e->d_name);
if (pci_scan_one(dirname, domain, bus, devid, function) < 0)
goto error;
}
closedir(dir);
return 0;
error:
closedir(dir);
return -1;
}
这样,扫描并记录了系统中所有的pci设备的相关信息,后面根据上面获取的这些设备信息以及前面注册的驱动信息,就可以完成具体网卡设备的初始化;
3、初始化注册的驱动
在rte_eal_init()函数中,后面会调用rte_eal_dev_init()来初始化前面注册的驱动“dev_driver_list”:分别调用注册的每款驱动的初始化函数,把每款驱动的一些信息记录到“pci_driver_list”链表中,链表节点为:
/**
* @internal
* The structure associated with a PMD Ethernet driver.
*
* Each Ethernet driver acts as a PCI driver and is represented by a generic
* *eth_driver* structure that holds:
*
* - An *rte_pci_driver* structure (which must be the first field).
*
* - The *eth_dev_init* function invoked for each matching PCI device.
*
* - The *eth_dev_uninit* function invoked for each matching PCI device.
*
* - The size of the private data to allocate for each matching device.
*/
struct eth_driver {
struct rte_pci_driver pci_drv; /**< The PMD is also a PCI driver. */
eth_dev_init_t eth_dev_init; /**< Device init function. */
eth_dev_uninit_t eth_dev_uninit; /**< Device uninit function. */
unsigned int dev_private_size; /**< Size of device private data. */
};
结构中记录设备的init、uinit、私有数据大小以及pci driver信息,而struct rte_pci_driver中的记录了驱动支持的网卡设备的verder id、device id信息,这个在后面具体的PCI网卡设备初始化时,会根据这些信息来匹配驱动:
/**
* A structure describing a PCI driver.
*/
struct rte_pci_driver {
TAILQ_ENTRY(rte_pci_driver) next; /**< Next in list. */
const char *name; /**< Driver name. */
pci_devinit_t *devinit; /**< Device init. function. */
pci_devuninit_t *devuninit; /**< Device uninit function. */
const struct rte_pci_id *id_table; /**< ID table, NULL terminated. */
uint32_t drv_flags; /**< Flags contolling handling of device. */
};
/**
* A structure describing an ID for a PCI driver. Each driver provides a
* table of these IDs for each device that it supports.
*/
struct rte_pci_id {
uint16_t vendor_id; /**< Vendor ID or PCI_ANY_ID. */
uint16_t device_id; /**< Device ID or PCI_ANY_ID. */
uint16_t subsystem_vendor_id; /**< Subsystem vendor ID or PCI_ANY_ID. */
uint16_t subsystem_device_id; /**< Subsystem device ID or PCI_ANY_ID. */
};
已ixgbe类型的网卡为例,注册的信息为rte_ixgbe_pmd:
static struct eth_driver rte_ixgbe_pmd = {
.pci_drv = {
.name = "rte_ixgbe_pmd",
.id_table = pci_id_ixgbe_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
RTE_PCI_DRV_DETACHABLE,
},
.eth_dev_init = eth_ixgbe_dev_init,
.eth_dev_uninit = eth_ixgbe_dev_uninit,
.dev_private_size = sizeof(struct ixgbe_adapter),
};
至此,注册的每款驱动的设备初始化,支持的设备等信息以及系统中所有的pci设备信息就已经都有了,分别记录在”pci_driver_list”和”pci_device_list”这两个全局的链表中,接下来就可以完成设备匹配驱动,别初始化设备了。
4、网卡设备初始化
rte_eal_init()函数接下来调用rte_eal_pci_probe()函数完成具体的设备的初始化
/*
* Scan the content of the PCI bus, and call the devinit() function for
* all registered drivers that have a matching entry in its id_table
* for discovered devices.
*/
int rte_eal_pci_probe(void)
{
struct rte_pci_device *dev = NULL;
struct rte_devargs *devargs;
int probe_all = 0;
int ret = 0;
/* 如果配置了白名单,只初始化白名单中的设备,否则所有支持的设备都初始化 */
if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
probe_all = 1;
TAILQ_FOREACH(dev, &pci_device_list, next) {
/* set devargs in PCI structure */
devargs = pci_devargs_lookup(dev);
if (devargs != NULL)
dev->devargs = devargs;
/* probe all or only whitelisted devices */
if (probe_all)
ret = pci_probe_all_drivers(dev);
else if (devargs != NULL &&
devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
ret = pci_probe_all_drivers(dev);
if (ret < 0)
rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
" cannot be used\n", dev->addr.domain, dev->addr.bus,
dev->addr.devid, dev->addr.function);
}
return 0;
}
rte_eal_pci_probe_one_driver()函数中,在probe一个具体的设备时,比较vendor id、device id,然后映射设备资源、调用驱动的设备初始化函数:
/*
* If vendor/device ID match, call the devinit() function of the
* driver.
*/
static int
rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *dev)
{
int ret;
const struct rte_pci_id *id_table;
for (id_table = dr->id_table; id_table->vendor_id != 0; id_table++) {
/* check if device's identifiers match the driver's ones */
... ...
if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING) {
/* map resources for devices that use igb_uio */
ret = rte_eal_pci_map_device(dev);
if (ret != 0)
return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
rte_eal_process_type() == RTE_PROC_PRIMARY) {
/* unbind current driver */
if (pci_unbind_kernel_driver(dev) < 0)
return -1;
}
/* call the driver devinit() function */
return dr->devinit(dr, dev);
}
/* return positive value if driver doesn't support this device */
return 1;
}
pci_uio_map_resource()函数为pci设备在虚拟地址空间映射pci资源,后续直接通过操作内存来操作pci设备;
驱动的设备初始化函数rte_eth_dev_init()主要是初始化dpdk驱动框架中,为每个设备分配资源以及资源的初始化:
/* 每个设备对应数组的一个成员,记录了设备相关的所有信息 */
struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
/* 端口相关的配置 */
struct rte_eth_dev_data *dat;
dpdk框架中,对端口的初始化操作已经基本完成,后面则是根据用户的设置,配置端口的收发包队列以及最终start端口,开始收发包:
a、rte_eth_dev_configure()函数完成端口配置:队列数配置、RSS、offload等等设置;
b、rte_eth_rx_queue_setup()、rte_eth_tx_queue_setup()函数分别设置端口的每个收发队列:ring空间申请、初始化等;
c、rte_eth_dev_start()函数:发送队列初始化buf填充,端口使能(具体可以参考代码或网卡芯片手册,均是相关寄存器设置);