A Summary of Virtual Memory

This article introduces the fundamentals of virtual memory: how it allows programs to address more space than the physical memory actually installed, and how it provides memory isolation to improve system reliability. It also covers the virtual-to-physical address translation process, protection mechanisms, and the history of the technique.

Virtual memory is an addressing scheme implemented in hardware and software that allows non-contiguous memory to be addressed as if it were contiguous. The technique used by all current implementations provides two major capabilities to the system:

  1. Memory can be addressed that does not currently reside in main memory; the hardware and operating system load the required memory from auxiliary storage automatically, without the addressing program's knowledge, thus allowing a program to reference more memory than physically exists in the computer.
  2. In multitasking systems, total memory isolation, otherwise referred to as a discrete address space, can be provided to every task except the lowest-level operating system. This greatly increases reliability by confining program problems within a specific task and allowing unrelated tasks to continue processing.

Overview

Hardware provides two methods of addressing RAM: real and virtual. In real mode, the memory address register contains the integer that addresses a word or byte of RAM; memory is addressed sequentially, and adding to the address register moves the location being addressed forward by the amount added. In virtual mode, memory is divided into pages, usually 4,096 bytes long. These pages may reside in any available RAM location that can be addressed in virtual mode. The high-order bits of the memory address register form an index into page-mapping tables at specific starting locations in memory, and the table entries contain the starting real addresses of the corresponding pages. The low-order bits of the address register are an offset of 0 up to 4,095 (0 to the page size − 1) into the page ultimately referenced by resolving all the table references of page locations.
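To make the split concrete, here is a minimal sketch in C assuming the 4,096-byte page size described above; the address value and all names are illustrative only, not taken from any particular architecture:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE   4096u
#define OFFSET_BITS 12              /* log2(4096) */
#define OFFSET_MASK (PAGE_SIZE - 1)

int main(void)
{
    uint32_t vaddr = 0x00403A7Cu;           /* example virtual address */
    uint32_t page  = vaddr >> OFFSET_BITS;  /* high-order bits: index into the page tables */
    uint32_t off   = vaddr & OFFSET_MASK;   /* low-order bits: offset of 0..4095 into the page */

    printf("page %u, offset %u\n", page, off);
    return 0;
}
```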

The size of the tables is governed by the computer design and the size of RAM. All virtual addressing schemes require the page tables to start either at a fixed location or at one identified by a register. In a typical computer, the first table is an array of addresses of the start of the next array; certain high-order bits of the memory address register index into the first array. Depending on the design goals of the computer, each array entry can be any size the computer can address. The next group of bits indexes into the array resolved by the first index. This set of arrays of arrays can be repeated for as many bits as the memory address register contains. The number and size of the tables vary by architecture, but the end goal is the same: take the high-order bits of the virtual address in the memory address register and resolve them to a page-table entry that holds either the location of the page in real memory or a flag saying the page is not available.
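A minimal sketch of such a table walk, assuming a two-level scheme with a 10/10/12-bit split of a 32-bit address (as on classic 32-bit x86); the type, flag layout, and function name here are hypothetical:

```c
#include <stdint.h>

typedef uint32_t pt_entry_t;
#define PRESENT    0x1u            /* flag: the page is available */
#define FRAME_MASK 0xFFFFF000u     /* high bits: starting real address of the page */

/* Resolve a virtual address against a two-level page table rooted at
 * `root`. Returns the real address, or 0 to signal a page fault. */
uint32_t translate(const pt_entry_t *root, uint32_t vaddr)
{
    uint32_t i1  = (vaddr >> 22) & 0x3FFu;  /* index into the first array */
    uint32_t i2  = (vaddr >> 12) & 0x3FFu;  /* index into the second array */
    uint32_t off = vaddr & 0xFFFu;          /* offset within the page */

    pt_entry_t dir = root[i1];
    if (!(dir & PRESENT))
        return 0;                           /* would raise a page fault */

    const pt_entry_t *table = (const pt_entry_t *)(uintptr_t)(dir & FRAME_MASK);
    pt_entry_t pte = table[i2];
    if (!(pte & PRESENT))
        return 0;                           /* would raise a page fault */

    return (pte & FRAME_MASK) | off;        /* real page address plus offset */
}
```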

Paging

If a program references a memory location that resolves to a virtual page that is not available, the computer generates a page fault. The hardware passes control to an operating-system routine that loads the required page from auxiliary storage (e.g., a paging file on disk) and turns on the flag saying the page is available. The hardware then takes the start location of the page, adds in the offset from the low-order bits of the address register, and accesses the desired memory location.

All the work required to access the correct memory address is invisible to the application addressing the memory. If the page is in memory, the hardware resolves the address. If a page fault is generated, software in the operating system resolves the problem and passes control back to the application trying to access the memory location. This scheme is called paging.
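A hedged sketch of that fault-handling flow; find_free_frame and read_page_from_disk are hypothetical stand-ins for operating-system internals, not real kernel interfaces:

```c
#include <stdint.h>

typedef uint32_t pt_entry_t;
#define PRESENT 0x1u

/* hypothetical OS internals */
extern uint32_t find_free_frame(void);                           /* may evict another page */
extern void read_page_from_disk(uint32_t vpage, uint32_t frame); /* read from paging file */

void handle_page_fault(pt_entry_t *pte, uint32_t vpage)
{
    uint32_t frame = find_free_frame();
    read_page_from_disk(vpage, frame);          /* load the page from auxiliary storage */
    *pte = (frame << 12) | PRESENT;             /* turn on the "page is available" flag */
    /* On return, the hardware restarts the faulting access, which now
     * finds the page present and resolves the address normally. */
}
```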

Translating the memory addresses

To minimize the performance penalty of address translation, most modern CPUs include an on-chip Memory Management Unit, or MMU, and maintain a table of recently used virtual-to-physical translations, called a Translation Lookaside Buffer, or TLB. Addresses with entries in the TLB require no additional memory references (and therefore time) to translate. However, the TLB can only maintain a fixed number of mappings between virtual and physical addresses; when the needed translation is not resident in the TLB, action will have to be taken to load it in.
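As a toy model, a TLB can be pictured as a small fixed-size table of cached translations consulted before any page-table walk; the size and field names below are illustrative, and a real TLB is implemented in hardware rather than as a linear scan:

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64   /* fixed number of cached mappings */

struct tlb_entry {
    uint32_t vpage;      /* virtual page number */
    uint32_t frame;      /* physical frame number */
    bool     valid;
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* On a hit, fill *frame and return true: no extra memory references
 * are needed. On a miss, the caller must walk the page tables (in
 * hardware or software) and install the translation, typically
 * evicting an older entry. */
bool tlb_lookup(uint32_t vpage, uint32_t *frame)
{
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpage == vpage) {
            *frame = tlb[i].frame;
            return true;
        }
    }
    return false;
}
```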

On some processors, this is performed entirely in hardware; the MMU has to do additional memory references to load the required translations from the translation tables, but no other action is needed. In other processors, assistance from the operating system is needed; an exception is raised, and the operating system handles this exception by replacing one of the entries in the TLB with an entry from the primary translation table, and the instruction which made the original memory reference is restarted.

Protected memory

Hardware that supports virtual memory almost always supports memory protection mechanisms as well. The MMU may have the ability to vary its operation according to the type of memory reference (for read, write or execution), as well as the privilege mode of the CPU at the time the memory reference was made. This allows the operating system to protect its own code and data (such as the translation tables used for virtual memory) from corruption by an erroneous application program and to protect application programs from each other and (to some extent) from themselves (e.g. by preventing writes to areas of memory that contain code).
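A sketch of that per-page check; the flag layout is hypothetical, since real MMUs encode the read/write/execute and privilege bits in architecture-specific page-table entry formats:

```c
#include <stdbool.h>
#include <stdint.h>

#define PROT_READ   0x1u
#define PROT_WRITE  0x2u
#define PROT_EXEC   0x4u
#define PROT_KERNEL 0x8u   /* accessible only in privileged CPU mode */

bool access_allowed(uint32_t pte_flags, uint32_t requested, bool cpu_privileged)
{
    if ((pte_flags & PROT_KERNEL) && !cpu_privileged)
        return false;   /* application touching operating-system memory */
    return (pte_flags & requested) == requested;
}
```

For example, a write to a code page mapped read-and-execute fails: access_allowed(PROT_READ | PROT_EXEC, PROT_WRITE, false) returns false, and the MMU would raise a protection fault instead of completing the store.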

History

Before the development of the virtual memory technique, programmers in the 1940s and 1950s had to manage two-level storage (main memory or RAM, and secondary memory in the form of hard disks or, earlier, magnetic drums) directly, for example by using overlay techniques.

Virtual memory was developed in approximately 1959–1962, at the University of Manchester for the Atlas Computer, completed in 1962. However, Fritz-Rudolf Güntsch, one of Germany's pioneering computer scientists and later the developer of the Telefunken TR 440 mainframe, claims to have invented the concept in his doctoral dissertation Logischer Entwurf eines digitalen Rechengerätes mit mehreren asynchron laufenden Trommeln und automatischem Schnellspeicherbetrieb (Logic Concept of a Digital Computing Device with Multiple Asynchronous Drum Storage and Automatic Fast Memory Mode) in 1957.

In 1961, Burroughs released the B5000, the first commercial computer with virtual memory.

Like many technologies in the history of computing, virtual memory was not accepted without challenge. Before it could be regarded as a stable entity, many models, experiments, and theories had to be developed to overcome the numerous problems with virtual memory. Specialized hardware had to be developed that would take a "virtual" address and translate it into an actual physical address in memory (secondary or primary). Some worried that this process would be expensive, hard to build, and take too much processor power to do the address translation.

By 1969 the debate over virtual memory for commercial computers was over. An IBM research team, led by David Sayre, showed that the virtual memory overlay system worked consistently better than the best manually controlled systems.

Possibly the first minicomputer to introduce virtual memory was the Norwegian NORD-1. During the 1970s, other minicomputers, such as the VAX models running VMS, implemented virtual memory.

Virtual memory was introduced to the x86 architecture with the protected mode of the Intel 80286 processor. At first it was done with segment swapping, which became inefficient as segments grew larger. The Intel 80386 added support for paging, which operates underneath segmentation. The page fault exception could be chained with other exceptions without causing a double fault.

Windows example

Virtual memory has been a feature of Microsoft Windows since Windows 3.0 in 1990; it was introduced in an attempt to reduce the operating system's hardware requirements in response to the failures of Windows 1.0 and Windows 2.0. 386SPART.PAR or WIN386.SWP is a hidden file created by Windows 3.x for use as a virtual memory swap file. It is generally found in the root directory, but it may appear elsewhere (typically in the WINDOWS directory). Its size depends on how much swap space the system has set up under Control Panel - Enhanced under "Virtual Memory". If a user moves or deletes this file, Windows will blue screen the next time it is started with "The permanent swap file is corrupt" and will ask whether the user wants to delete the file (it asks this whether or not the file exists).

Windows 95 uses a similar file, and the controls for it are located under Control Panel - System - Performance tab - Virtual Memory. Windows automatically sets the page file to start at 1.5× the size of physical memory and to expand up to 3× physical memory if necessary; on a machine with 512 MB of RAM, for example, the page file starts at 768 MB and can grow to 1,536 MB. If a user runs memory-intensive applications on a system with little physical memory, it is preferable to manually set these sizes to values higher than the defaults.

In NT-based versions of Windows (such as Windows 2000 and Windows XP), the swap file is named pagefile.sys. The default location of the page file is in the root directory of the partition where Windows is installed. Windows can be configured to use free space on any available drives for page files.

Fragmentation of the Windows page file

Occasionally, when the page file is gradually expanded, it can become heavily fragmented and cause performance issues. The common advice for avoiding this problem is to set a single "locked" page file size so that Windows will not resize it. Others consider this problematic when a Windows application requests more memory than the total size of physical and virtual memory: the allocation fails, and as a result programs, including system processes, may crash. Supporters of this view note that the page file is rarely read or written in sequential order, so the performance advantage of a completely sequential page file is minimal. It is, however, generally agreed that a large page file allows use of memory-heavy applications, with no penalty beyond the extra disk space used.

Defragmenting the page file is also occasionally recommended to increase performance when a Windows system is chronically using much more memory than its total physical memory. In this case, while a defragmented page file can help slightly, performance concerns are much more effectively dealt with by adding more physical memory.

Swapping in the Linux and BSD operating systems

In the Linux and BSD operating systems, it is common to use a whole partition of a hard disk for swapping. Though it is still possible to use a swap file instead, a separate partition has traditionally been recommended because it rules out file system fragmentation, which would reduce performance. A separate swap partition can also be guaranteed to sit at the fastest location of the disk, generally the center cylinders between the inner and outer edges (except for disks with fixed heads). However, with the 2.6 Linux kernel, swap files are just as fast as swap partitions, so this recommendation no longer carries much weight on current Linux systems, and the flexibility of swap files can outweigh the advantages of partitions. Moreover, since modern high-capacity hard drives can remap physical sectors, there is no guarantee that a partition will be contiguous; and even if it were, keeping the swap data near the rest of the data reduces seek times when swapping is needed. The performance claims above therefore probably do not apply to modern Linux systems.

Linux supports using a virtually unlimited number of swapping devices, each of which can be assigned a priority. When the operating system needs to swap pages out of physical memory, it uses the highest-priority device with free space. If multiple devices are assigned the same priority, they are used in a fashion similar to RAID 0 arrangements. This gives increased performance as long as the devices can be accessed efficiently in parallel, so care should be taken when assigning priorities. For example, swap areas located on the same physical disk should not be used in parallel, but in order from fastest to slowest (i.e., the fastest having the highest priority), as sketched below.
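A minimal sketch of enabling one swap device with an explicit priority via the Linux-specific swapon(2) system call, assuming the area has already been prepared with mkswap and the caller has the necessary privileges; the device path is an example only:

```c
#include <stdio.h>
#include <sys/swap.h>   /* Linux-specific: swapon(2) and the SWAP_FLAG_* constants */

int main(void)
{
    int prio  = 10;      /* higher number = higher priority */
    int flags = SWAP_FLAG_PREFER |
                ((prio << SWAP_FLAG_PRIO_SHIFT) & SWAP_FLAG_PRIO_MASK);

    if (swapon("/dev/sdb1", flags) != 0) {   /* example device path */
        perror("swapon");
        return 1;
    }
    return 0;
}
```

This corresponds to what `swapon -p 10 /dev/sdb1` does from the command line.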

There have also been some successful attempts to use the memory on the graphics card for swapping on Linux, as modern graphics cards often have 128 or even 256 megabytes of RAM that normally goes unused except when playing games. Because video memory is significantly faster than hard disks, this method gives excellent swapping performance.

Recently, some experimental improvements to the 2.6 Linux kernel have been made by Con Kolivas and published in his popular CK patchset. The improvement, called "swap prefetch", pre-fetches previously swapped pages back into physical memory before they are actually needed, as long as the system is relatively idle (so as not to impair performance) and there is free physical memory to use. This gives several-orders-of-magnitude faster access to the affected pages when their owning process needs them, since by that point they are effectively no longer swapped out.
