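Normally, when a program on Linux reads or writes a file, the I/O passes through the kernel's page cache. If you want to use a caching system of your own, you can open the file with the O_DIRECT flag. I/O on that file then goes directly to the disk and bypasses the kernel's page cache, so you can intercept the block data in the I/O path and route it into your own cache.
Requirements for using the O_DIRECT flag: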
O_DIRECT
Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user space buffers. The I/O is synchronous, i.e., at the completion of a read(2) or write(2), data is guaranteed to have been transferred. Under Linux 2.4 transfer sizes, and the alignment of user buffer and file offset must all be multiples of the logical block size of the file system.
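The three alignment conditions in the excerpt above (buffer address, transfer size, file offset) can be checked up front before a request is issued. Below is a minimal sketch in C. The helper name direct_io_request_ok and the block_size parameter are made up for illustration, and the block size is assumed to be a nonzero power of two such as 512 or 4096.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper (not part of any library): returns 1 when the buffer
 * address, the transfer size and the file offset are all multiples of
 * block_size, otherwise 0.
 * Assumption: block_size is a nonzero power of two (e.g. 512 or 4096). */
static int direct_io_request_ok(const void *buf, size_t count, off_t offset,
                                size_t block_size)
{
    uint64_t mask = (uint64_t)block_size - 1;

    if (((uint64_t)(uintptr_t)buf & mask) != 0)
        return 0;   /* buffer address is not block-aligned */
    if (((uint64_t)count & mask) != 0)
        return 0;   /* transfer size is not a multiple of the block size */
    if (((uint64_t)offset & mask) != 0)
        return 0;   /* file offset is not block-aligned */
    return 1;
}
```

A caller could run this check right before read(2) or write(2) on an O_DIRECT descriptor and fix up the request instead of waiting for the system call to reject it.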
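Key points:
When I/O data is staged in your own cache, both the size and the start address of the memory the cache allocates must be block-aligned. The I/O deals with the disk directly, data moves one block at a time, and there is no page cache layer left to take care of address mapping. Memory obtained from a plain malloc is not guaranteed to meet this requirement, and you end up in the awkward situation where the data cannot be written to disk correctly (as the write(2) excerpt in the references notes, the call fails with EINVAL when the buffer, count, or offset is not suitably aligned).
You can use valloc or memalign to allocate aligned memory; valloc returns page-aligned memory. The glibc manual pages describe both as obsolete, and posix_memalign is the portable alternative.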
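As a rough illustration of the aligned allocation described above, here is a minimal sketch in C. It uses posix_memalign rather than valloc or memalign, which is a substitution on my part. The helper names round_up and alloc_block_buffer are made up for this example, and the alignment is assumed to be a power of two that is also a multiple of sizeof(void *), for instance 512 or 4096.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Rounds len up to the next multiple of align (align must be nonzero). */
static size_t round_up(size_t len, size_t align)
{
    return (len + align - 1) / align * align;
}

/* Hypothetical helper: allocates a zero-filled buffer whose start address
 * and size are both multiples of align. The padded size is stored in
 * *out_len when out_len is not NULL. Returns NULL on failure.
 * Assumption: align is a power of two and a multiple of sizeof(void *). */
static void *alloc_block_buffer(size_t len, size_t align, size_t *out_len)
{
    void *buf = NULL;
    size_t padded;

    if (len == 0)
        len = 1;                      /* always hand back at least one block */
    padded = round_up(len, align);

    if (posix_memalign(&buf, align, padded) != 0)
        return NULL;
    memset(buf, 0, padded);           /* padding bytes are zero */
    if (out_len != NULL)
        *out_len = padded;
    return buf;                       /* release with free() */
}
```

For example, alloc_block_buffer(1000, 4096, &n) yields a 4096-byte buffer on a 4096-byte boundary, which satisfies both the size and the address requirement at once.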
References:
OPEN(2) Linux Programmer's Manual OPEN(2)
NAME
open, openat, creat - open and possibly create a file
O_DIRECT (since Linux 2.4.10)
Try to minimize cache effects of the I/O to and from this
file. In general this will degrade performance, but it is
useful in special situations, such as when applications do
their own caching. File I/O is done directly to/from user-
space buffers. The O_DIRECT flag on its own makes an effort
to transfer data synchronously, but does not give the
guarantees of the O_SYNC flag that data and necessary metadata
are transferred. To guarantee synchronous I/O, O_SYNC must be
used in addition to O_DIRECT. See NOTES below for further
discussion.
A semantically similar (but deprecated) interface for block
devices is described in raw(8).
O_DIRECTORY
If pathname is not a directory, cause the open to fail. This
flag was added in kernel version 2.1.126, to avoid denial-of-
service problems if opendir(3) is called on a FIFO or tape
device.
O_DSYNC
Write operations on the file will complete according to the
requirements of synchronized I/O data integrity completion.
By the time write(2) (and similar) return, the output data has
been transferred to the underlying hardware, along with any
file metadata that would be required to retrieve that data
(i.e., as though each write(2) was followed by a call to
fdatasync(2)). See NOTES below.
WRITE(2) Linux Programmer's Manual WRITE(2)
NAME
write - write to a file descriptor
EINVAL fd is attached to an object which is unsuitable for writing;
or the file was opened with the O_DIRECT flag, and either the
address specified in buf, the value specified in count, or the
file offset is not suitably aligned.
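Putting the excerpts together, the sketch below opens a file with O_DIRECT and writes a single block from an aligned buffer, so the EINVAL case described in write(2) should not be triggered. This is only an illustration under stated assumptions: the function name write_one_block_direct and the 4096-byte BLOCK_SIZE are mine, and the fallback to a normal buffered open when the filesystem rejects O_DIRECT is a convenience for the example, not something the manual page prescribes.

```c
#define _GNU_SOURCE            /* needed for O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096        /* assumed block size, not queried from the device */

/* Hypothetical helper: writes one zero-padded block containing data to the
 * start of the file at path. Returns the number of bytes written
 * (BLOCK_SIZE on success) or -1 with errno set. */
static ssize_t write_one_block_direct(const char *path, const void *data,
                                      size_t len)
{
    void *buf = NULL;
    ssize_t written;
    int saved_errno;
    int fd;

    if (len > BLOCK_SIZE) {
        errno = EINVAL;
        return -1;
    }

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL) {
        /* Some filesystems reject O_DIRECT at open time. Falling back to
         * buffered I/O is a choice made for this sketch only. */
        fd = open(path, O_WRONLY | O_CREAT, 0644);
    }
    if (fd < 0)
        return -1;

    /* Buffer address and size are both multiples of BLOCK_SIZE. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0) {
        close(fd);
        errno = ENOMEM;
        return -1;
    }
    memset(buf, 0, BLOCK_SIZE);
    memcpy(buf, data, len);

    /* File offset 0 is trivially block-aligned. */
    written = pwrite(fd, buf, BLOCK_SIZE, 0);
    saved_errno = errno;

    free(buf);
    close(fd);
    errno = saved_errno;
    return written;
}
```

If the same pwrite were issued from a buffer at an odd address, or with a count such as 1000, a filesystem that enforces the O_DIRECT constraints would be expected to return -1 with errno set to EINVAL, matching the excerpt above.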