directio, dsync & sync, async IO

This article explores the different I/O models in the Sybase database: the difference between synchronous and asynchronous I/O, how the dsync flag ensures data is written synchronously to disk, and how direct I/O bypasses the file system cache for efficient data writes. It also discusses the mutual exclusivity of dsync and direct I/O.

http://sybaseblog.com/2010/04/20/directiodsync-sync-async-io/

I came across a very conceptual article about Sybase device I/O.

If you have any questions or suggestions, please let me know. Thanks.

Source: Sybase blogs, the web, sybooks, and the Sybase white paper on direct I/O.

ASYNCHRONOUS I/O:

  • Asynchronous I/O allows the process issuing the I/O to continue executing while the OS completes the request.
  • The dataserver process does not block on the request waiting for its completion.

SYNCHRONOUS I/O:

  • Synchronous I/O blocks the issuing process from continuing execution while the OS completes the request.
  • The process is blocked until the request completes. This generally results in poor throughput.

DSYNC:

  • In ASE 12.0, Sybase introduced the dsync flag, shorthand for “data synchronous.”
  • When the dsync setting is on, Adaptive Server opens a database device file using the UNIX dsync flag.
  • The dsync flag in ASE directly translates to the O_DSYNC open(2) flag. That is to say, when ASE opens a device that has the dsync flag set, ASE will pass O_DSYNC to open(2).
  • This flag tells the file system that a write to that file must pass through the cache and be written to disk before the write is considered complete.
  • In other words, for writes we throw away the cache efficiency and make sure the data goes to disk.
  • This way if the system crashes, everything that we thought had been written to disk has in fact been written to disk.

Is DSYNC “synchronous”?

  • No. Despite the name, dsync does not make the I/O synchronous in the blocking sense.
  • The synchronous/asynchronous distinction operates at a different level. With async I/O we are talking about the context in which the I/O is executed, i.e. whether the I/O blocks the caller or is completed in a different context.
  • With dsync we are talking about when the write() is considered complete.
  • These are not mutually exclusive: you can asynchronously perform a data-synchronous I/O.
  • The async portion is as always: the application issues an I/O and later polls (or is notified) for completion. The dsync portion means that the application won’t be told that the I/O has completed until the data has made it to disk.

DIRECT I/O:

  • Direct I/O is another kind of file system I/O, introduced in ASE 15.
  • In the direct I/O model the file system cache is completely bypassed.
  • Using direct I/O, writes are naturally safe because they bypass the cache and go straight to disk.
  • Direct I/O is very similar to raw I/O.
  • The main difference is that with direct I/O the structure of a file system still exists, easing manageability.
  • With raw I/O, no file system exists. The performance of raw and direct I/O should be very similar.

Also like to add :

  • The dsync and directio options are mutually exclusive; both cannot be turned on at the same time on a device. Both dsync and directio provide full recoverability.
  • If you have gone through the full article, the next question is: which would be better, a raw device or a file system device with direct I/O, given that both bypass the file system cache?