Linux I/O Block: Submitting I/O Requests

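This article walks through how an I/O request in the Linux kernel is converted from a bio structure into a request structure, covering the definition of struct bio, the use of request queues, and the request merging strategy.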

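In the generic block layer, a bio describes a single I/O request. It records the information an I/O operation needs, such as the location of the data buffers, the starting sector on the block device, and whether the operation is a read or a write. struct bio is defined as follows: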

struct bio {
    sector_t bi_sector;             /* device address in 512 byte sectors */
    struct bio *bi_next;            /* request queue link */
    struct block_device *bi_bdev;
    unsigned long bi_flags;         /* status, command, etc */
    unsigned long bi_rw;            /* bottom bits READ/WRITE,
                                     * top bits priority
                                     */
    unsigned short bi_vcnt;         /* how many bio_vec's */
    unsigned short bi_idx;          /* current index into bvl_vec */
    /* Number of segments in this BIO after
     * physical address coalescing is performed.
     */
    unsigned int bi_phys_segments;
    unsigned int bi_size;           /* residual I/O count */
    /*
     * To keep track of the max segment size, we account for the
     * sizes of the first and last mergeable segments in this bio.
     */
    unsigned int bi_seg_front_size;
    unsigned int bi_seg_back_size;
    unsigned int bi_max_vecs;       /* max bvl_vecs we can hold */
    unsigned int bi_comp_cpu;       /* completion CPU */
    atomic_t bi_cnt;                /* pin count */
    struct bio_vec *bi_io_vec;      /* the actual vec list */
    bio_end_io_t *bi_end_io;
    void *bi_private;
#if defined(CONFIG_BLK_DEV_INTEGRITY)
    struct bio_integrity_payload *bi_integrity; /* data integrity */
#endif
    bio_destructor_t *bi_destructor; /* destructor */
    /*
     * We can inline a number of vecs at the end of the bio, to avoid
     * double allocations for a small number of bio_vecs. This member
     * MUST obviously be kept at the very end of the bio.
     */
    struct bio_vec bi_inline_vecs[0];
};

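bi_sector: the starting sector number of the I/O operation

bi_rw: the direction of the operation, read or write

bi_vcnt: the number of buffer vectors involved in the I/O operation; each buffer vector is described by [page, offset, len]

bi_idx: the index of the current buffer vector

bi_io_vec: the buffer vector array

The buffer vector is defined as: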

struct bio_vec {
    struct page *bv_page;
    unsigned int bv_len;
    unsigned int bv_offset;
};
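
To make the roles of bi_vcnt, bi_idx and bi_io_vec concrete, here is a minimal user-space sketch, not kernel code. The structures are cut-down stand-ins that model only the fields discussed above, and the helpers bio_remaining_bytes() and bio_end_sector_model() are hypothetical names introduced for illustration. The sketch also assumes that bi_sector refers to the position of the vector at bi_idx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel types; NOT the real definitions. */
typedef uint64_t sector_t;
struct page;                        /* opaque here */

struct bio_vec {
    struct page *bv_page;           /* page holding the data */
    unsigned int bv_len;            /* number of bytes in this vector */
    unsigned int bv_offset;         /* byte offset of the data within the page */
};

/* Only the fields described in the text are modeled. */
struct bio {
    sector_t bi_sector;             /* starting sector, in 512-byte units */
    unsigned short bi_vcnt;         /* number of entries in bi_io_vec */
    unsigned short bi_idx;          /* index of the current entry */
    struct bio_vec *bi_io_vec;      /* the buffer vector array */
};

/* Hypothetical helper: bytes covered by the vectors from bi_idx onward. */
static unsigned int bio_remaining_bytes(const struct bio *bio)
{
    unsigned int total = 0;

    assert(bio->bi_idx <= bio->bi_vcnt);
    for (unsigned short i = bio->bi_idx; i < bio->bi_vcnt; i++)
        total += bio->bi_io_vec[i].bv_len;
    return total;
}

/* Hypothetical helper: first sector past the end of the remaining data. */
static sector_t bio_end_sector_model(const struct bio *bio)
{
    return bio->bi_sector + (bio_remaining_bytes(bio) >> 9);
}
```

For a bio that starts at sector 100 with two vectors of 4096 and 512 bytes, bio_remaining_bytes() adds up to 4608 bytes, which is 9 sectors, so the data ends just before sector 109.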

struct request describes an I/O request submitted to a block device. bios are added to a request dynamically, so a single request often contains several adjacent bios.

struct request {
    struct list_head queuelist;
    struct call_single_data csd;
    int cpu;
    struct request_queue *q;
    unsigned int cmd_flags;
    enum rq_cmd_type_bits cmd_type;
    unsigned long atomic_flags;
    /* the following two fields are internal, NEVER access directly */
    sector_t __sector;              /* sector cursor */
    unsigned int __data_len;        /* total data len */
    struct bio *bio;
    struct bio *biotail;
    struct hlist_node hash;         /* merge hash */
    /*
     * The rb_node is only used inside the io scheduler, requests
     * are pruned when moved to the dispatch queue. So let the
     * completion_data share space with the rb_node.
     */
    union {
        struct rb_node rb_node;     /* sort/lookup */
        void *completion_data;
    };
    /*
     * two pointers are available for the IO schedulers, if they need
     * more they have to dynamically allocate it.
     */
    void *elevator_private;
    void *elevator_private2;
    struct gendisk *rq_disk;
    unsigned long start_time;
    /* Number of scatter-gather DMA addr+len pairs after
     * physical address coalescing is performed.
     */
    unsigned short nr_phys_segments;
    unsigned short ioprio;
    void *special;                  /* opaque pointer available for LLD use */
    char *buffer;                   /* kaddr of the current segment if available */
    int tag;
    int errors;
    int ref_count;
    /*
     * when request is used as a packet command carrier
     */
    unsigned short cmd_len;
    unsigned char __cmd[BLK_MAX_CDB];
    unsigned char *cmd;
    unsigned int extra_len;         /* length of alignment and padding */
    unsigned int sense_len;
    unsigned int resid_len;         /* residual count */
    void *sense;
    unsigned long deadline;
    struct list_head timeout_list;
    unsigned int timeout;
    int retries;
    /*
     * completion callback.
     */
    rq_end_io_fn *end_io;
    void *end_io_data;
    /* for bidi */
    struct request *next_rq;
};

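queuelist: the list element used to link the request into a request queue

q: points to the request queue the request belongs to

__sector: the starting sector number of the next bio to be transferred

__data_len: the number of bytes the request has to transfer

bio, biotail: maintain the list of bios inside the request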
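
The way adjacent bios accumulate in a request through the bio and biotail pointers can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel's merge code: the structures keep only the fields described above, the helper names rq_init_from_bio() and rq_try_back_merge() are hypothetical, and only the simplest case is shown, where a bio starts exactly at the sector where the request currently ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Simplified stand-ins; only the fields described in the text. */
struct bio {
    sector_t bi_sector;         /* starting sector, in 512-byte units */
    unsigned int bi_size;       /* bytes to transfer */
    struct bio *bi_next;        /* next bio in the same request */
};

struct request {
    sector_t __sector;          /* sector cursor */
    unsigned int __data_len;    /* total bytes in the request */
    struct bio *bio;            /* head of the bio list */
    struct bio *biotail;        /* tail of the bio list */
};

/* Hypothetical helper: start a request from a single bio. */
static void rq_init_from_bio(struct request *rq, struct bio *bio)
{
    bio->bi_next = NULL;
    rq->__sector = bio->bi_sector;
    rq->__data_len = bio->bi_size;
    rq->bio = rq->biotail = bio;
}

/*
 * Hypothetical helper: append the bio to the tail of the request when it
 * starts exactly where the request ends. Returns 1 on merge, 0 otherwise.
 */
static int rq_try_back_merge(struct request *rq, struct bio *bio)
{
    assert(rq->biotail != NULL);

    sector_t rq_end = rq->__sector + (rq->__data_len >> 9);
    if (bio->bi_sector != rq_end)
        return 0;               /* not adjacent: leave the request untouched */

    bio->bi_next = NULL;
    rq->biotail->bi_next = bio; /* link after the current tail */
    rq->biotail = bio;
    rq->__data_len += bio->bi_size;
    return 1;
}
```

Keeping a tail pointer makes the append constant-time, which is one plausible reason the structure carries both bio and biotail.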

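In the gendisk structure introduced earlier, we saw that each block device (or partition) has a corresponding request_queue structure. This structure holds requests and contains the methods for submitting requests and for I/O scheduling.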

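The main work of submitting a bio starts in generic_make_request(), so we use it as the entry point for analyzing the submission path. Each process's task_struct contains two variables, struct bio *bio_list and struct bio **bio_tail. generic_make_request() mainly uses these two variables to maintain the list of bios waiting to be submitted; the actual submission is done by __generic_make_request(), which generic_make_request() calls. __generic_make_request() in turn calls the make_request_fn function defined in the request_queue, that is, the device-specific submission function, to complete the remaining work.

This raises a problem. Most devices can simply set make_request_fn to the kernel's __make_request function, but some devices need their own make_request_fn, and a custom make_request_fn may call generic_make_request() recursively. Because the kernel stack is very limited, generic_make_request() uses a small trick to keep the recursion depth from exceeding one level.

Note that bio_tail is a pointer to a pointer. It is initially NULL. Once a bio has been added, bio_tail points to that bio's bi_next field (or to bio_list if every queued bio has already been submitted). In other words, every call to generic_make_request() other than the first, outermost one finds bio_tail non-NULL. When bio_tail is not NULL, the function only appends the bio to the list maintained by bio_list and bio_tail and returns immediately without calling __generic_make_request(), which prevents multi-level recursion.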
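
The recursion-flattening trick can be tried out in user space with the following sketch. It is a model written for illustration, not the kernel source: the globals bio_list and bio_tail stand in for the two task_struct fields, do_submit() stands in for __generic_make_request(), and the submit and child members, the depth counters, and the two sample make_request functions are hypothetical additions used only to observe the behavior.

```c
#include <assert.h>
#include <stddef.h>

struct bio;
typedef void (*make_request_fn)(struct bio *bio);

struct bio {
    struct bio *bi_next;     /* link in the pending list */
    make_request_fn submit;  /* stand-in for the queue's make_request_fn */
    struct bio *child;       /* hypothetical: bio a stacked driver resubmits */
};

/* Stand-ins for current->bio_list and current->bio_tail. */
static struct bio *bio_list;
static struct bio **bio_tail;

/* Instrumentation only: nesting depth of submissions and bios handled. */
static int depth, max_depth, handled;

static void generic_make_request_model(struct bio *bio);

/* Stand-in for __generic_make_request(): call the device's function. */
static void do_submit(struct bio *bio)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    handled++;
    bio->submit(bio);
    depth--;
}

static void generic_make_request_model(struct bio *bio)
{
    if (bio_tail) {
        /* A submission is already active: queue the bio and return. */
        *bio_tail = bio;
        bio->bi_next = NULL;
        bio_tail = &bio->bi_next;
        return;
    }

    assert(bio->bi_next == NULL);
    do {
        /* Detach the bio from the pending list before submitting it. */
        bio_list = bio->bi_next;
        if (bio->bi_next == NULL)
            bio_tail = &bio_list;   /* list is empty: tail points at head */
        else
            bio->bi_next = NULL;
        do_submit(bio);
        bio = bio_list;             /* pick up anything queued meanwhile */
    } while (bio);
    bio_tail = NULL;                /* mark the submission as inactive */
}

/* A plain device: handles the bio without resubmitting anything. */
static void plain_make_request(struct bio *bio)
{
    (void)bio;
}

/* A stacked device: resubmits its child through the generic entry point. */
static void stacked_make_request(struct bio *bio)
{
    if (bio->child)
        generic_make_request_model(bio->child);
}
```

With three bios chained as top, mid and leaf, where the first two use stacked_make_request(), a single call on the top bio handles all three while max_depth stays at 1: each nested call only queues its bio, and the outermost loop submits it after the previous make_request function has returned.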
