Chapter 9-03

Please indicate the source: http://blog.youkuaiyun.com/gaoxiangnumber1.

9.7 Case Study: The Intel Core i7/Linux Memory System

l  The current Core i7 implementations (and those for the foreseeable future) support a 48-bit (256 TB) virtual address space and a 52-bit (4 PB) physical address space, along with a compatibility mode that supports 32-bit (4 GB) virtual and physical address spaces.

l  The processor package includes four cores, a large L3 cache shared by all of the cores, and a DDR3 memory controller.

l  Each core contains a hierarchy of TLBs, a hierarchy of data and instruction caches, and a set of fast point-to-point links, based on the Intel QuickPath technology, for communicating directly with the other cores and the external I/O bridge.

l  The TLBs are virtually addressed and four-way set associative. The L1, L2, and L3 caches are physically addressed and eight-way set associative, with a block size of 64 bytes.

l  The page size can be configured at start-up time as either 4 KB or 4 MB. Linux uses 4-KB pages.

9.7.1 Core i7 Address Translation

l  The Core i7 uses a four-level page table hierarchy. Each process has its own private page table hierarchy. When a Linux process is running, the page tables associated with allocated pages are all memory-resident, although the Core i7 architecture allows these page tables to be swapped in and out. The CR3 control register points to the beginning of the level 1 (L1) page table. The value of CR3 is part of each process context, and is restored during each context switch.

l  Figure 9.23 shows the format of an entry in a level 1, level 2, or level 3 page table. When P = 1 (which is always the case with Linux), the address field contains a 40-bit physical page number (PPN) that points to the beginning of the appropriate page table. Notice that this imposes a 4 KB alignment requirement on page tables.

l  Figure 9.24 shows the format of an entry in a level 4 page table. When P = 1, the address field contains a 40-bit PPN that points to the base of some page in physical memory. Again, this imposes a 4 KB alignment requirement on physical pages.

l  The PTE has three permission bits that control access to the page.

²  The R/W bit determines whether the contents of a page are read/write or read-only.

²  The U/S bit, which determines whether the page can be accessed in user mode, protects code and data in the operating system kernel from user programs.

²  The XD (execute disable) bit, which was introduced in 64-bit systems, can be used to disable instruction fetches from individual memory pages. This is an important new feature that allows the operating system kernel to reduce the risk of buffer overflow attacks by restricting execution to the read-only text segment.

²  As the MMU translates each virtual address, it also updates two other bits that can be used by the kernel’s page fault handler. The MMU sets the A bit, which is known as a reference bit, each time a page is accessed. The kernel can use the reference bit to implement its page replacement algorithm. The MMU sets the D bit, or dirty bit, each time the page is written to. A page that has been modified is sometimes called a dirty page. The dirty bit tells the kernel whether or not it must write-back a victim page before it copies in a replacement page. The kernel can call a special kernel-mode instruction to clear the reference or dirty bits.
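
l  Below is a minimal C sketch of these PTE bits, assuming the standard x86-64 bit positions (P in bit 0, R/W in bit 1, U/S in bit 2, A in bit 5, D in bit 6, XD in bit 63) and a 40-bit PPN in bits 12-51; the macro and helper names are illustrative, not kernel APIs.

/* Illustrative sketch of the Core i7 / x86-64 PTE flag bits described above. */
#include <stdint.h>
#include <stdbool.h>

#define PTE_P   (1ULL << 0)   /* present: page (or page table) is in memory   */
#define PTE_RW  (1ULL << 1)   /* R/W: 1 = read/write, 0 = read-only           */
#define PTE_US  (1ULL << 2)   /* U/S: 1 = accessible in user mode             */
#define PTE_A   (1ULL << 5)   /* A: reference bit, set by MMU on each access  */
#define PTE_D   (1ULL << 6)   /* D: dirty bit, set by MMU on each write       */
#define PTE_XD  (1ULL << 63)  /* XD: execute disable                          */

/* The 40-bit physical page number occupies bits 12..51 of the entry. */
#define PTE_PPN(pte)  (((pte) >> 12) & 0xFFFFFFFFFFULL)

/* Would a write by user-mode code to a page with this PTE be allowed? */
static bool user_write_ok(uint64_t pte)
{
    return (pte & PTE_P) && (pte & PTE_RW) && (pte & PTE_US);
}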

l  Figure 9.25 shows how the Core i7 MMU uses the four levels of page tables to translate a virtual address to a physical address. The 36-bit VPN is partitioned into four 9-bit chunks, each of which is used as an offset into a page table. The CR3 register contains the physical address of the L1 page table. VPN 1 provides an offset to an L1 PTE, which contains the base address of the L2 page table. VPN 2 provides an offset to an L2 PTE, and so on.
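
l  In the same illustrative spirit, the sketch below shows how the 36-bit VPN is carved into four 9-bit indices and combined with the 12-bit VPO. The simulated physical memory and read_pte helper are hypothetical stand-ins, and the flag checks and A/D-bit updates that a real MMU performs are omitted.

#include <stdint.h>

#define VPO_BITS  12                      /* 4 KB pages                   */
#define VPN_BITS  9                       /* 512 entries per page table   */
#define PTE_PPN(pte)  (((pte) >> 12) & 0xFFFFFFFFFFULL)

/* Simulated physical memory, indexed by 8-byte words (illustrative only). */
extern uint64_t phys_mem[];

static uint64_t read_pte(uint64_t table_base_pa, unsigned index)
{
    return phys_mem[table_base_pa / 8 + index];
}

/* Translate a 48-bit virtual address to a physical address. */
uint64_t translate(uint64_t cr3, uint64_t va)
{
    uint64_t table_pa = cr3;              /* CR3 holds the L1 table's physical address */
    for (int level = 1; level <= 4; level++) {
        /* VPN k occupies bits [48 - 9k, 48 - 9k + 8] of the virtual address. */
        unsigned shift = VPO_BITS + (4 - level) * VPN_BITS;
        unsigned index = (va >> shift) & 0x1FF;
        uint64_t pte = read_pte(table_pa, index);
        table_pa = PTE_PPN(pte) << VPO_BITS;  /* next table, or the page itself at level 4 */
    }
    return table_pa | (va & ((1ULL << VPO_BITS) - 1));  /* concatenate PPN and VPO */
}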

9.7.2 Linux Virtual Memory System

l  Linux maintains a separate virtual address space for each process of the form shown in Figure 9.26.

l  The kernel virtual memory contains the code and data structures in the kernel.

l  Some regions of the kernel virtual memory are mapped to physical pages that are shared by all processes. For example, each process shares the kernel’s code and global data structures.

l  Linux also maps a set of contiguous virtual pages (equal in size to the total amount of DRAM in the system) to the corresponding set of contiguous physical pages. This provides the kernel with a convenient way to access any specific location in physical memory, for example, when it needs to access page tables, or to perform memory-mapped I/O operations on devices that are mapped to particular physical memory locations.

l  Other regions of kernel virtual memory contain data that differs for each process. Examples include page tables, the stack that the kernel uses when it is executing code in the context of the process, and various data structures that keep track of the current organization of the virtual address space.

Linux Virtual Memory Areas

l  Linux organizes the virtual memory as a collection of areas (also called segments).

l  An area is a contiguous chunk of existing (allocated) virtual memory whose pages are related in some way. For example, the code segment, data segment, heap, shared library segment, and user stack are all distinct areas.

l  Each existing virtual page is contained in some area, and any virtual page that is not part of some area does not exist and cannot be referenced by the process.

l  The notion of an area is important because it allows the virtual address space to have gaps. The kernel does not keep track of virtual pages that do not exist, and such pages do not consume any additional resources in memory, on disk, or in the kernel itself.

l  Figure 9.27 highlights the kernel data structures that keep track of the virtual memory areas in a process.

l  The kernel maintains a distinct task structure (task_struct in the source code) for each process in the system. The elements of the task structure either contain or point to all of the information that the kernel needs to run the process (e.g., the PID, pointer to the user stack, name of the executable object file, and program counter).

l  One of the entries in the task structure points to an mm_struct that characterizes the current state of the virtual memory. The two important fields are pgd, which points to the base of the level 1 table (the page global directory), and mmap, which points to a list of vm_area_structs (area structs), each of which characterizes an area of the current virtual address space. When the kernel runs this process, it stores pgd in the CR3 control register.

l  The area struct for a particular area contains the following fields:

²  vm_start: Points to the beginning of the area

²  vm_end: Points to the end of the area

²  vm_prot: Describes the read/write permissions for all of the pages contained in the area

²  vm_flags: Describes (among other things) whether the pages in the area are shared with other processes or private to this process

²  vm_next: Points to the next area struct in the list
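
l  A heavily simplified C sketch of these structures is shown below; the field names mirror the ones listed above, but the real kernel structs contain many more members and different types.

#include <stddef.h>

struct vm_area_struct {
    unsigned long vm_start;              /* first address of the area              */
    unsigned long vm_end;                /* address one past the end of the area   */
    unsigned long vm_prot;               /* read/write permissions for its pages   */
    unsigned long vm_flags;              /* shared with other processes or private */
    struct vm_area_struct *vm_next;      /* next area struct in the list           */
};

struct mm_struct {
    unsigned long *pgd;                  /* base of the level 1 page table (page global
                                            directory); loaded into CR3 when the process runs */
    struct vm_area_struct *mmap;         /* head of the list of area structs       */
};

struct task_struct {
    int pid;                             /* plus stack pointer, executable name, PC, ... */
    struct mm_struct *mm;                /* current state of this process's virtual memory */
};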

Linux Page Fault Exception Handling

l  Suppose the MMU triggers a page fault while trying to translate some virtual address A. The exception results in a transfer of control to the kernel’s page fault handler, which then performs the following steps:

Is virtual address A legal?

²  That is: does A lie within an area defined by some area struct? To answer this question, the fault handler searches the list of area structs, comparing A with the vm_start and vm_end in each area struct. If A is not legal, then the fault handler triggers a segmentation fault, which terminates the process. This situation is labeled “1” in Figure 9.28.

²  Because a process can create an arbitrary number of new virtual memory areas (using the mmap function), a sequential search of the list of area structs might be very costly. So Linux superimposes a tree on the list, using some fields that we have not shown, and performs the search on this tree.

Is the attempted memory access legal?

²  In other words, does the process have permission to read, write, or execute the pages in this area? For example, was the page fault the result of a store instruction trying to write to a read-only page in the .text segment? Is the page fault the result of a process running in user mode that is attempting to read a word from kernel virtual memory? If the attempted access is not legal, then the fault handler triggers a protection exception, which terminates the process. This situation is labeled “2” in Figure 9.28.

At this point, the kernel knows that the page fault resulted from a legal operation on a legal virtual address.

²  It handles the fault by selecting a victim page, swapping out the victim page if it is dirty, swapping in the new page, and updating the page table. When the page fault handler returns, the CPU restarts the faulting instruction, which sends A to the MMU again. This time, the MMU translates A normally, without generating a page fault.
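
l  The sketch below restates these three steps in C, reusing the simplified mm_struct and vm_area_struct from the earlier listing; segfault, protection_exception, swap_in_page, and the AREA_WRITE bit are hypothetical stand-ins for kernel machinery, not real kernel interfaces.

#include <stdbool.h>

#define AREA_WRITE 0x2                  /* illustrative permission bit in vm_prot */

void segfault(void);
void protection_exception(void);
void swap_in_page(struct mm_struct *mm, unsigned long addr);

void handle_page_fault(struct mm_struct *mm, unsigned long A, bool is_write)
{
    /* Step 1: is A legal, i.e. inside some area? (The real kernel searches a
       tree superimposed on this list rather than scanning it sequentially.) */
    struct vm_area_struct *vma = NULL;
    for (struct vm_area_struct *p = mm->mmap; p != NULL; p = p->vm_next) {
        if (p->vm_start <= A && A < p->vm_end) {
            vma = p;
            break;
        }
    }
    if (vma == NULL) {
        segfault();                     /* situation "1": terminate the process */
        return;
    }

    /* Step 2: is the attempted access allowed? (Writing a read-only page and a
       user-mode access to kernel memory are the two cases described above.) */
    if (is_write && !(vma->vm_prot & AREA_WRITE)) {
        protection_exception();         /* situation "2": terminate the process */
        return;
    }

    /* Step 3: a legal access to a legal address: choose a victim page, write it
       back if dirty, swap in the new page, update the page table, and let the
       CPU restart the faulting instruction. */
    swap_in_page(mm, A);
}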

9.8 Memory Mapping

l  Linux initializes the contents of a virtual memory area by associating it with an object on disk, a process known as memory mapping.

l  Areas can be mapped to one of two types of objects:

1.    Regular file in the Unix file system

²  An area can be mapped to a contiguous section of a regular disk file, such as an executable object file.

²  The file section is divided into page-sized pieces, with each piece containing the initial contents of a virtual page. Because of demand paging, none of these virtual pages is actually swapped into physical memory until the CPU first touches the page (i.e., issues a virtual address that falls within that page’s region of the address space). If the area is larger than the file section, then the area is padded with zeros.

2.    Anonymous file

²  An area can also be mapped to an anonymous file, created by the kernel, that contains all binary zeros.

²  The first time the CPU touches a virtual page in such an area, the kernel finds an appropriate victim page in physical memory, swaps out the victim page if it is dirty, overwrites the victim page with binary zeros, and updates the page table to mark the page as resident.

²  Notice that no data is actually transferred between disk and memory. For this reason, pages in areas that are mapped to anonymous files are sometimes called demand-zero pages.

l  In either case, once a virtual page is initialized, it is swapped back and forth between memory and a special swap file maintained by the kernel. The swap file is also known as the swap space or the swap area. At any point in time, the swap space bounds the total number of virtual pages that can be allocated by the currently running processes.
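
l  A small user-level illustration of the two kinds of backing objects, written with the mmap system call that Section 9.8.4 covers: one mapping backed by a regular file (the path /etc/hostname is just an example) and one backed by an anonymous, demand-zero object.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    /* 1. Map a contiguous section of a regular file. */
    int fd = open("/etc/hostname", O_RDONLY);
    if (fd < 0) { perror("open"); exit(1); }
    struct stat st;
    fstat(fd, &st);
    char *file_area = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (file_area == MAP_FAILED) { perror("mmap file"); exit(1); }

    /* 2. Map an anonymous object: the pages read as zeros and are created
          (demand-zero) only when first touched. */
    size_t len = 1 << 20;  /* 1 MB */
    char *anon_area = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (anon_area == MAP_FAILED) { perror("mmap anon"); exit(1); }

    printf("first byte of file: %c, first byte of anon area: %d\n",
           file_area[0], anon_area[0]);   /* anon_area[0] is 0 */

    munmap(file_area, st.st_size);
    munmap(anon_area, len);
    return 0;
}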

9.8.1 Shared Objects Revisited

l  The idea of memory mapping resulted from an insight that if the virtual memory system could be integrated into the conventional file system, then it could provide a simple and efficient way to load programs and data into memory.

l  The process abstraction promises to provide each process with its own private virtual address space that is protected from errant writes or reads by other processes. But many processes may have identical read-only .text areas. For example, each process that runs the Unix shell program tcsh has the same .text area. Further, many programs need to access identical copies of read-only run-time library code. For example, every C program requires functions from the standard C library such as printf. It would be wasteful for each process to keep duplicate copies of this commonly used code in physical memory.

l  Memory mapping provides us with a clean mechanism for controlling how objects are shared by multiple processes.

l  An object can be mapped into an area of virtual memory as either a shared object or a private object.

²  If a process maps a shared object into an area of its virtual address space, then any writes that the process makes to that area are visible to any other processes that have also mapped the shared object into their virtual memory. Further, the changes are also reflected in the original object on disk.

²  Changes made to an area mapped to a private object are not visible to other processes, and any writes that the process makes to the area are not reflected back to the object on disk.

l  A virtual memory area into which a shared object is mapped is often called a shared area. Similarly for a private area.
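
l  The following sketch shows the difference in C using mmap's MAP_SHARED and MAP_PRIVATE flags; the file name data.tmp is a throwaway used only for illustration. Only the write made through the shared mapping reaches the underlying file.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.tmp", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); exit(1); }
    if (write(fd, "AAAA", 4) != 4) { perror("write"); exit(1); }

    char *shared = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *priv   = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (shared == MAP_FAILED || priv == MAP_FAILED) { perror("mmap"); exit(1); }

    priv[0]   = 'P';   /* not written back to the file, not visible to others */
    shared[0] = 'S';   /* reflected in the file and visible to other mappers  */

    printf("shared: %.4s, private: %.4s\n", shared, priv);
    /* Reading the file itself now yields "SAAA": only the shared write
       reached the underlying object on disk. */

    munmap(shared, 4);
    munmap(priv, 4);
    close(fd);
    unlink("data.tmp");
    return 0;
}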

l  Suppose that process 1 maps a shared object into an area of its virtual memory, as shown in Figure 9.29(a). Now suppose that process 2 maps the same shared object into its address space (not necessarily at the same virtual address as process 1), as shown in Figure 9.29(b).

l  Since each object has a unique file name, the kernel can determine that process 1 has already mapped this object and can point the page table entries in process 2 to the appropriate physical pages.

l  The key point is that only a single copy of the shared object needs to be stored in physical memory, even though the object is mapped into multiple shared areas. For convenience, we have shown the physical pages as being contiguous, but of course this is not true in general.

l  Private objects are mapped into virtual memory using a technique known as copy-on-write. A private object begins life in the same way as a shared object, with only one copy of the private object stored in physical memory.

l  Part (a) of the figure shows a case where two processes have mapped a private object into different areas of their virtual memories but share the same physical copy of the object. For each process that maps the private object, the page table entries for the corresponding private area are flagged as read-only, and the area struct is flagged as private copy-on-write. So long as neither process attempts to write to its respective private area, they continue to share a single copy of the object in physical memory.

l  As soon as a process attempts to write to some page in the private area, the write triggers a protection fault. When the fault handler notices that the protection exception was caused by the process trying to write to a page in a private copy-on-write area, it creates a new copy of the page in physical memory, updates the page table entry to point to the new copy, and then restores write permissions to the page, as shown in (b).

l  When the fault handler returns, the CPU reexecutes the write, which now proceeds normally on the newly created page.

l  By deferring the copying of the pages in private objects until the last moment, copy-on-write makes the most efficient use of scarce physical memory.
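
l  As a minimal demonstration, the sketch below maps a private anonymous region and then forks (fork is revisited in Section 9.8.2): parent and child initially share the physical pages, the child's write triggers copy-on-write, and the parent still sees the original contents.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); exit(1); }
    strcpy(p, "original");

    pid_t pid = fork();
    if (pid == 0) {                      /* child: the write faults, the kernel     */
        strcpy(p, "child's copy");       /* copies the page, and only the child's   */
        _exit(0);                        /* page table points at the new copy       */
    }
    waitpid(pid, NULL, 0);
    printf("parent still sees: %s\n", p);   /* prints "original" */
    return 0;
}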

9.8.2 The fork Function Revisited

9.8.3 The execve Function Revisited

9.8.4 User-level Memory Mapping with the mmap Function

Please indicate the source: http://blog.youkuaiyun.com/gaoxiangnumber1.
