io engine

This post looks at the different I/O engines available under Linux, such as sync, psync, vsync, and libaio, and compares their performance characteristics and use cases. It also covers other related I/O mechanisms such as mmap and syslet-rw.
http://www.spinics.net/lists/fio/msg00819.html
 
On 2011-08-03 22:13, Martin Steigerwald wrote:
> Hi!
> 
> In order to understand I/O engines better, I like to summarize what I 
> think to know at the moment. Maybe this can be a starting point for some 
> additional documentation:
> 
> === sync, psync, vsync ===
> 
> - all these are using synchronous Linux (POSIX) system calls
> - these are used by regular applications
> - synchronous just refers to the system call interface: i.e. when the 
> system call returns to the application
> - as far as I understand, the call returns when the I/O request is 
> reported as completed
> - it does not imply synchronous I/O aka O_SYNC which is way slower and 
> enabled by sync=1
> - thus it does not guarantee that the I/O has been physically written to 
> the underlying device (see open(2))

All of above are correct.

> - thus it only guarantees that the I/O request has been dealt with? what 
> does this exactly mean?

For reads, the IO has been done by the device. For writes, it could just
be sitting in the page cache for later writeback.
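
To make that concrete, a minimal sketch (file name and error handling are illustrative, not from the thread): a buffered write() returns once the page cache holds the data, and only fsync() guarantees it has reached the device.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const char buf[] = "hello";

    /* returns as soon as the page cache holds the data; kernel
     * threads perform the actual writeback later */
    if (write(fd, buf, strlen(buf)) < 0)
        perror("write");

    /* blocks until the data has reached the backing device */
    if (fsync(fd) < 0)
        perror("fsync");

    close(fd);
    return 0;
}
```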

> - does it mean that this is I/O in the context of the process?

Not sure what you mean here. For reads, the IO always happens in the
context of the process. For buffered writes, it usually does not. The
process merely dirties the page, kernel threads will most often do the
actual writeback of the data.

> - it can be used with direct=1 to circumvent the pagecache

Right, and additionally direct=1 will make the writes sync as well. So
instead of just returning when it's in page cache, when a sync write
with direct=1 returns, the data has been received and acknowledged by
the backing device. That does not mean it's stable, it could just be
sitting in the drive write back cache.
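
A sketch of what direct=1 corresponds to at the syscall level (file name and the 4096-byte alignment are illustrative; the required alignment depends on the device):

```c
#define _GNU_SOURCE  /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires aligned buffers, sizes and offsets */
    void *buf;
    if (posix_memalign(&buf, 4096, 4096)) { close(fd); return 1; }
    memset(buf, 'x', 4096);

    /* returns once the device has acknowledged the write; the data
     * may still sit in the drive's volatile write-back cache */
    if (write(fd, buf, 4096) < 0)
        perror("write");

    free(buf);
    close(fd);
    return 0;
}
```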

> difference is the kind of system call used:
> - sync uses read/write, which read/write count bytes from/to a buffer. 
> Uses the current file offset, changeable via fseek (or lseek, I did not 
> find a manpage for fseek)

Fio uses file descriptors, not handles. So lseek() will be used to
position the file before each IO, unless the offset of the new IO is
identical to the current offset.

> - psync uses pread/pwrite which read/write count bytes from given offset
> - vsync uses readv/writev, which read/write multiple buffers of 
> given length in one call (struct iovec)
> 
> I am not sure on what performance difference to expect. I bet that 
> sync/psync should perform roughly the same.

For random IO, you save a lseek() syscall for each IO. Depending on your
IO rates, this may or may not be significant. It usually isn't. But if
you are doing hundreds of thousands of IOPS, then it could make a
difference.
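
The difference in syscall patterns, as a sketch (the helper names are mine, not fio's):

```c
#include <unistd.h>

/* ioengine=sync: position with lseek(), then read(); two syscalls
 * whenever the offset differs from the current file position */
ssize_t read_sync_style(int fd, void *buf, size_t len, off_t off)
{
    if (lseek(fd, off, SEEK_SET) < 0)
        return -1;
    return read(fd, buf, len);
}

/* ioengine=psync: a single pread() with an explicit offset; the
 * file position is left untouched */
ssize_t read_psync_style(int fd, void *buf, size_t len, off_t off)
{
    return pread(fd, buf, len, off);
}
```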

> === libaio ===
> 
> - this uses Linux asynchronous I/O calls[1]
> - it uses libaio for that
> - who else uses libaio? It seems to be applications that sit close to the 
> system:
> 
> martin@merkaba:~> apt-cache rdepends libaio1
> libaio1
> Reverse Depends:
>   fio
>   qemu-kvm
>   libdbd-oracle-perl
>   zfs-fuse
>   stressapptest
>   qemu-kvm
>   qemu-utils
>   qemu-system
>   multipath-tools
>   ltp-kernel-test
>   libaio1-dbg
>   libaio-dev
>   fio
>   drizzle
>   blktrace
> 
> - these calls allow applications to offload I/O calls to the background
> - according to [1] this is only supported for direct I/O
> - using anything else lets it fall back to synchronous call behavior
> - thus one sees this in combination with direct=1 in fio jobs
> - does this mean that this is I/O outside the context of the process?

aio assumes the identity of the process. aio is usually mostly used by
databases.

> Question:
> - what difference is between the following two other than the second one 
> seems to be more popular in example job files?
> 1) ioengine=sync + direct=1
> 2) ioengine=libaio + direct=1
> 
> Current answer: It is that fio can issue further I/Os while the Linux 
> kernel handles the I/O.

Yes
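
As an illustration (device path, block size and queue depth are placeholders), a sketch of a job file exercising both setups; iodepth is what lets libaio keep several I/Os in flight:

```ini
[global]
filename=/dev/sdb
rw=randread
bs=4k
direct=1
runtime=60
time_based=1

[sync-direct]
ioengine=sync

[libaio-direct]
stonewall
ioengine=libaio
iodepth=32
```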

> === other I/O engines relevant to Linux ===
> There seem to be some other I/O engines relevant to Linux and mass storage 
> I/O:
> 
> == mmap ==
> - maps files into memory and uses memcpy
> - used by quite a few applications
> - what else to note?

mmap'ed IO is quite widely used.
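
A minimal sketch of the pattern the mmap engine uses (file name is illustrative): map the file, then memcpy out of the mapping, letting page faults perform the actual device I/O.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("testfile", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return 1; }

    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* the first access to each page faults it in from the device */
    char buf[64];
    size_t n = (size_t)st.st_size < sizeof(buf) ? (size_t)st.st_size
                                                : sizeof(buf);
    memcpy(buf, map, n);

    munmap(map, st.st_size);
    close(fd);
    return 0;
}
```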

> == syslet-rw ==
> - makes regular read/write asynchronous
> - where is this used?
> - what else to note?

syslet-rw is an engine that was written to benchmark/test the syslet
async system call interface. It was never merged, so it has mostly
historic relevance now.

> Any others?

You should mention posixaio and net as well, might be interesting. And
splice is unique to Linux, would be good to cover.
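
For reference, a sketch of the syscall the splice engine is built on (file name is illustrative): splice() moves data between a file descriptor and a pipe inside the kernel, without copying it through user space.

```c
#define _GNU_SOURCE  /* for splice() */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("testfile", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    int pipefd[2];
    if (pipe(pipefd) < 0) { close(fd); return 1; }

    /* one side of each splice() must be a pipe:
     * file -> pipe, then pipe -> stdout */
    ssize_t n = splice(fd, NULL, pipefd[1], NULL, 4096, 0);
    if (n > 0)
        splice(pipefd[0], NULL, STDOUT_FILENO, NULL, n, 0);

    close(pipefd[0]);
    close(pipefd[1]);
    close(fd);
    return 0;
}
```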

> Is what I wrote correct so far?

Yep, good so far!

> I think I'd like to write something up about the different I/O concepts in 
> Linux, if such a document doesn't exist yet.

Might not be a bad idea :-)

-- 
Jens Axboe

### Linux I/O engines defined, with an example

A **Linux I/O engine** is the underlying mechanism through which an application issues I/O to storage. It determines how requests are submitted (synchronously or asynchronously) and how their completion is signalled. By behavior, engines fall into two groups:

1. **Synchronous engines**
   - e.g. the default `sync` engine (blocking `read`/`write`)
   - per-request cost: $$ \text{I/O latency} = T_{\text{queue}} + T_{\text{transfer}} $$
2. **Asynchronous engines**
   - e.g. `libaio` (Linux native asynchronous I/O)
   - let the application keep working while I/O is in flight

---

### What libaio does, and why

**Role**: libaio exposes the kernel's **native asynchronous I/O interface**. An application submits requests without blocking, and the kernel reports completions through an event queue. In contrast to POSIX AIO, which is emulated with user-space threads, libaio talks to the kernel directly through system calls and therefore performs better.

**Key advantages**:

- avoids blocking threads, raising throughput under high concurrency
- supports **direct I/O** (bypassing the page cache to read/write the disk directly)
- suits databases, high-performance storage systems, and other low-latency workloads

---

### libaio usage example

#### 1. Environment

Install the development library:

```bash
yum install libaio-devel  # CentOS
apt install libaio-dev    # Ubuntu
```

#### 2. Basic code flow

The original sketch left `fd` undefined and used an unaligned buffer; the corrected version below (the file name is a placeholder) opens the file with `O_DIRECT` and aligns the buffer as that flag requires. Compile with `gcc demo.c -laio`:

```c
#define _GNU_SOURCE  /* for O_DIRECT */
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* O_DIRECT requires aligned buffers, sizes and offsets */
    int fd = open("testfile", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    io_setup(128, &ctx);                  /* 128 = event queue depth */

    void *buffer;
    posix_memalign(&buffer, 4096, 4096);  /* aligned allocation */

    struct iocb cb;                       /* I/O control block */
    struct iocb *cbs[1] = { &cb };

    /* prepare an asynchronous read of 4096 bytes at offset 0 */
    io_prep_pread(&cb, fd, buffer, 4096, 0);

    io_submit(ctx, 1, cbs);               /* submit to the kernel */

    /* wait for exactly one completion event */
    struct io_event events[1];
    io_getevents(ctx, 1, 1, events, NULL);

    if (events[0].res > 0) {
        /* read succeeded; process the data */
    }

    io_destroy(ctx);                      /* tear down the context */
    free(buffer);
    close(fd);
    return 0;
}
```

#### 3. Key functions

| Function | Purpose |
|-------------------|------------------------------------------------------------|
| `io_setup()` | create an asynchronous I/O context |
| `io_prep_pread()` | prepare an async read (use `io_prep_pwrite()` for writes) |
| `io_submit()` | submit I/O requests to the kernel queue |
| `io_getevents()` | reap completed I/O events |

---

### Comparison with related techniques

| Engine type | Representative | Characteristics |
|------------------------|----------------|-----------------------------------------------------|
| Native async I/O | `libaio` | kernel-backed, high performance, needs direct I/O |
| User-space async I/O | POSIX AIO | thread-based emulation; portable but slower |
| Newer async framework | `io_uring` | zero-copy, highly scalable, Linux 5.1+ |

> **Note**: with `libaio`, open files with the `O_DIRECT` flag and allocate I/O buffers with proper alignment (e.g. via `posix_memalign`).

---

### Typical applications

1. database systems (e.g. MySQL's InnoDB engine)
2. high-performance storage servers (e.g. Ceph OSD)
3. low-latency trading systems
4. large-scale log processing