read ahead memory

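This article introduces the readahead mechanism in Linux, which improves I/O performance by predicting how a process will read a file. When a process reads a file sequentially, the kernel preloads the following pages into the cache to reduce wait time.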

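To improve I/O performance and issue as few I/O requests as possible, Linux provides a read-ahead mechanism, aimed mainly at making sequential file reads more efficient. If Linux determines that a process is reading a file sequentially, it reads the file data the process is about to need ahead of time and keeps it in the cache.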

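The criteria are:

1. If this is the process's first read of the file, check whether it is reading the file's first page.

2. Otherwise, check whether the page being read is the one immediately after the page read last time.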

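For a process whose reads pass these checks, Linux creates two windows, each containing several pages: the current window and the ahead window. The current window holds the file pages the process is reading right now (it may also contain some read-ahead data); the ahead window holds the file data read in advance. Once the process has read every page in the current window, the ahead window becomes the current window and Linux requests the next ahead window.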

When the process's I/O no longer satisfies these conditions, read-ahead is temporarily disabled; once the conditions are met again, the mechanism is reactivated.
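The window logic described above can be sketched as a small user-space model. This is only an illustration of the two criteria and the window hand-off, not the kernel's implementation: `struct ra_state`, `ra_init`, `ra_read_page`, the fixed window size, and the `io_requests` counter are all hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the two-window scheme.
 * None of these names come from the kernel. */
struct ra_state {
    bool   active;       /* are the two windows currently valid? */
    bool   has_prev;     /* has this file been read before? */
    size_t prev_page;    /* index of the page read last time */
    size_t cur_start;    /* first page of the current window */
    size_t ahead_start;  /* first page of the ahead window */
    size_t win_pages;    /* pages per window (assumed fixed here) */
    size_t io_requests;  /* simulated read-ahead requests issued */
};

void ra_init(struct ra_state *ra, size_t win_pages)
{
    assert(win_pages > 0);
    ra->active = false;
    ra->has_prev = false;
    ra->prev_page = 0;
    ra->cur_start = 0;
    ra->ahead_start = 0;
    ra->win_pages = win_pages;
    ra->io_requests = 0;
}

/* Criterion 1: first read must hit page 0.
 * Criterion 2: later reads must hit the page after the previous one. */
static bool ra_is_sequential(const struct ra_state *ra, size_t page)
{
    if (!ra->has_prev)
        return page == 0;
    return page == ra->prev_page + 1;
}

/* Record a read of `page`; returns true if read-ahead is active afterwards. */
bool ra_read_page(struct ra_state *ra, size_t page)
{
    if (!ra_is_sequential(ra, page)) {
        /* Pattern broken: read-ahead is temporarily disabled. */
        ra->active = false;
    } else if (!ra->active) {
        /* Criteria met (again): create current and ahead windows. */
        ra->active = true;
        ra->cur_start = page;
        ra->ahead_start = page + ra->win_pages;
        ra->io_requests += 2;
    } else if (page >= ra->ahead_start) {
        /* Current window consumed: the ahead window becomes current,
         * and a new ahead window is requested. */
        ra->cur_start = ra->ahead_start;
        ra->ahead_start = ra->cur_start + ra->win_pages;
        ra->io_requests += 1;
    }
    ra->has_prev = true;
    ra->prev_page = page;
    return ra->active;
}
```

With a window of 4 pages, reading pages 0 through 4 in order sets up the windows once and then shifts them when page 4 is reached; a jump to page 10 turns read-ahead off, and the next sequential read (page 11) turns it back on.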

For the activity diagram, see Understanding the Linux Kernel, p. 647.

/*
 * The madvise(2) system call.
 *
 * Applications can use madvise() to advise the kernel how it should
 * handle paging I/O in this VM area. The idea is to help the kernel
 * use appropriate read-ahead and caching techniques. The information
 * provided is advisory only, and can be safely disregarded by the
 * kernel without affecting the correct operation of the application.
 *
 * behavior values:
 *  MADV_NORMAL - the default behavior is to read clusters. This
 *      results in some read-ahead and read-behind.
 *  MADV_RANDOM - the system should read the minimum amount of data
 *      on any access, since it is unlikely that the application
 *      will need more than what it asks for.
 *  MADV_SEQUENTIAL - pages in the given range will probably be accessed
 *      once, so they can be aggressively read ahead, and
 *      can be freed soon after they are accessed.
 *  MADV_WILLNEED - the application is notifying the system to read
 *      some pages ahead.
 *  MADV_DONTNEED - the application is finished with the given range,
 *      so the kernel can free resources associated with it.
 *  MADV_FREE - the application marks pages in the given range as lazy free,
 *      where actual purges are postponed until memory pressure happens.
 *  MADV_REMOVE - the application wants to free up the given range of
 *      pages and associated backing store.
 *  MADV_DONTFORK - omit this area from child's address space when forking:
 *      typically, to avoid COWing pages pinned by get_user_pages().
 *  MADV_DOFORK - cancel MADV_DONTFORK: no longer omit this area when forking.
 *  MADV_WIPEONFORK - present the child process with zero-filled memory in this
 *      range after a fork.
 *  MADV_KEEPONFORK - undo the effect of MADV_WIPEONFORK
 *  MADV_HWPOISON - trigger memory error handler as if the given memory range
 *      were corrupted by unrecoverable hardware memory failure.
 *  MADV_SOFT_OFFLINE - try to soft-offline the given range of memory.
 *  MADV_MERGEABLE - the application recommends that KSM try to merge pages in
 *      this area with pages of identical content from other such areas.
 *  MADV_UNMERGEABLE - cancel MADV_MERGEABLE: no longer merge pages with others.
 *  MADV_HUGEPAGE - the application wants to back the given range by transparent
 *      huge pages in the future. Existing pages might be coalesced and
 *      new pages might be allocated as THP.
 *  MADV_NOHUGEPAGE - mark the given range as not worth being backed by
 *      transparent huge pages so the existing pages will not be
 *      coalesced into THP and new pages will not be allocated as THP.
 *  MADV_COLLAPSE - synchronously coalesce pages into new THP.
 *  MADV_DONTDUMP - the application wants to prevent pages in the given range
 *      from being included in its core dump.
 *  MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump.
 *  MADV_COLD - the application is not expected to use this memory soon,
 *      deactivate pages in this range so that they can be reclaimed
 *      easily if memory pressure happens.
 *  MADV_PAGEOUT - the application is not expected to use this memory soon,
 *      page out the pages in this range immediately.
 *  MADV_POPULATE_READ - populate (prefault) page tables readable by
 *      triggering read faults if required
 *  MADV_POPULATE_WRITE - populate (prefault) page tables writable by
 *      triggering write faults if required
 *
 * return values:
 *  zero    - success
 *  -EINVAL - start + len < 0, start is not page-aligned,
 *      "behavior" is not a valid value, or application
 *      is attempting to release locked or shared pages,
 *      or the specified address range includes file, Huge TLB,
 *      MAP_SHARED or VMPFNMAP range.
 *  -ENOMEM - addresses in the specified range are not currently
 *      mapped, or are outside the AS of the process.
 *  -EIO    - an I/O error occurred while paging in data.
 *  -EBADF  - map exists, but area maps something that isn't a file.
 *  -EAGAIN - a kernel resource was temporarily unavailable.
 */
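As a rough illustration of how an application might pass one of these hints, the sketch below maps a file read-only, marks the mapping `MADV_SEQUENTIAL`, and then walks it front to back. The helper name `sum_file_sequential` is hypothetical and exists only for this example; `open`, `fstat`, `mmap`, `madvise`, and `munmap` are the standard calls. Because the hint is advisory only, the sketch ignores a failing `madvise()`.

```c
#define _DEFAULT_SOURCE   /* expose madvise() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sums every byte of the file at `path` through a
 * read-only mapping that is hinted as sequentially accessed.
 * Returns 0 on success, -1 on error (including an empty file, which
 * cannot be mapped with length 0). */
int sum_file_sequential(const char *path, unsigned long *sum_out)
{
    assert(path != NULL && sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    /* Advisory hint: the range will be read once, front to back.
     * The kernel may disregard it, so the result is not checked. */
    (void)madvise(p, len, MADV_SEQUENTIAL);

    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    munmap(p, len);
    *sum_out = sum;
    return 0;
}
```

Swapping `MADV_SEQUENTIAL` for `MADV_RANDOM` in the same spot would instead ask the kernel to read as little as possible per access, which suits lookups scattered across a large file.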