Introduction
A cache stores a subset of the addresss space of RAM. An address space is the set of valid addresses. Thus, for eachaddress in cache, there is a corresponding address in RAM. Thissubset of addresses (and corresponding copy of data) changes overtime, based on the behavior of your program.Cache is used to keep the most commonly used sections of RAM in thecache, where it can be accessed quickly. This is necessary becauseCPU speeds increase much faster than speed of memory access. If wecould access RAM at 3 GHz, there wouldn't be any need for cache,because RAM could keep up. Because it can't keep up, we use cache.
What if we wanted more RAM than we had available. For example, wemight have 1 M of RAM, what if we wanted 10 M? How could we manage?
One way to extend the amount of memory accessible by a programis to use disk. Thus, we can use 10 Megs of disk space. At anytime, only 1 Meg resides in RAM.
In effect, RAM acts like cache for disk.
This idea of extending memory is called virtual memory.It's called "virtual" only because it's not RAM. It doesn'tmean it's fake.
The real problem with disk is that it's really, really slowto access. If registers can be accessed in 1 nanosecond,and cache in 5 ns and RAM in about 100 ns, then disk is accessedin fractions of seconds. It can be a million times slower toaccess disk than a register.
The advantage of disk is it's easy to get lots of disk spacefor a small cost.
Still, becaues disk is so slow to access, we want to avoidaccessing disk unnecessarily.
Uses of Virtual Memory
Virtual memory is an old concept. Before computers had cache,they had virtual memory. For a long time, virtual memory onlyappeared on mainframes. Personal computers in the 1980s did not usevirtual memory. In fact, many good ideas that were in common use inthe UNIX operating systems didn't appear until the mid 1990s inpersonal computer operating systems (pre-emptive multitaskingand virtual memory).Initially, virtual memory meant the idea of using diskto extend RAM. Programs wouldn't have to care whether thememory was "real" memory (i.e., RAM) or disk. The operating systemand hardware would figure that out.
Later on, virtual memory was used as a means of memoryprotection. Every program uses a range of addressed called theaddress space.
The assumption of operating systems developers is that anyuser program can not be trusted. User programs will try to destroythemselves, other user programs, and the operating system itself.That seems like such a negative view, however, it's how operating systemsare designed. It's not necessary that programs have to be deliberatelymalicious. Programs can be accidentally malicious (modify the dataof a pointer pointing to garbage memory).
Virtual memory can help there too. It can help preventprograms from interfering with other programs.
Occasionally, you want programs to cooperate, and sharememory. Virtual memory can also help in that respect.
How Virtual Memory Works
When a computer is running, many programs are simulataneouslysharing the CPU. Each running program, plus the data structuresneeded to manage it, is called a process.Each process is allocated an address space. This is a setof valid addresses that can be used. This address space canbe changed dynamically. For example, the program might requestadditional memory (from dynamic memory allocation) from theoperating system.
If a process tries to access an address that is not partof its address space, an error occurs, and the operating systemtakes over, usually killing the process (core dumps, etc).
How does virtual memory play a role? As you run a program,it generates addresses. Addresses are generated (for RISC machines)in one of three ways:
- A load instruction
- A store instruction
- Fetching an instruction
Each address generated by a program is considered virtual.It must be translated to a real physical address. Thus,address tranlation is occuring all the time. As you might imagine,this must be handled in hardware, if it's to be done efficiently.
You might think translating each address from virtual to physicalis a crazy idea, because of how slow it is. However, you getmemory protection from address translation, so it's worth thehardware needed to get memory protection.
Paging
In a cache, we fetched quantities called data blocks or cachelines. Those are typically somewhere between, say, 4 and 64 bytes.There is a corresponding terminology in virtual memory to acache line. It's called a page.
A page is a sequence of N bytes where N is a powerof 2.
These days, page sizes are at least 4K in size and maybe aslarge as 64 K or more.
Let's assume that we have 1M of RAM. RAM is also called physicalmemory. We can subdivide the RAM into 4K pages. Thus1M / 4K = 256 pages. Thus, our RAM has 256 physical pages, weachholding 4K.
Let's assume we have 10 M of disk. Thus, we have 2560 disk pages.
In principle, each program may have up to 4 G of address space.Thus, it can, in principle, access 220 virtualpages. In reality, many of those pages are considered invalid pages.
Page Tables
How is an address translated from virtual to physical?First, like the cache, we split up a 32 bit virtual addressinto a virtual page (which is like a tag) and a page offset.
If this looks a lot like a fully-associative cache, butwhose offset is much much larger, it's because that'sbasically what it is.
We must convert the virtual page number to a physical page number.In our example, the virtual page consists of 20 bits. A page table isa data structure which consists of 220 page tableentries (PTEs). Think of the page table as an array of page tableentries, indexed by the virtual page number.
The page table's index starts at 0, and ends at 220 -1.
Here's how it looks:

Suppose your program generates a virtual address. You'dextract bits B31-12 to get the virtual pagenumber. Use that as an index into the above page tableto access the page table entry (PTE).
Each PTE consists of a valid bit and a 20 bit physical page (it's20 bits, because we assume we have 1M of RAM, and 1M of RAM requires20 bits to access each byte). If the valid bit is 1, then the virtualpage is in RAM, and you can get the physical page from the PTE. Thisis called a page hit, and is basically the same as a cache hit.
If the valid bit is 0, the page is not in RAM, and the 20 bitphysical page is meaningless. This means, we must get the disk pagecorresponding to the virtual page from disk and place it into a pagein RAM. This is called a page fault.
Because disk access is slow, slow, slow, we want to minimizethe number of page faults.
In general, this is done by making RAM fully associative. Thatis, any disk page can go into any RAM page (disk, RAM, and virtualpages all have the same size).
In practice, some pages in RAM are reserved for the operatingsystem to make the OS run efficiently.
Translation
Suppose your program generated the following virtual address F0F0F0F0hex (which is 1111 0000 1111 0000 11110000 1111 0000 two). How would you translate this to aphysical address? First, you would split the address into a virtualpage, and a page offset (see below).Then, you'd see if the virtual page had a corresponding physicalpage in RAM using the page table. If the valid bit of the PTE is 1,then you'd translate the virtual page to a physical page, and appendthe page offset. That would give you a physical address in RAM.

Huge Page Tables
Page tables can be very large. If every virtual page was valid,our page table would be 220 X 21 bits. This isabout 3 Megs just for one program's page table. If there are manyprograms, there are many tables, each occupying a lot of memory.What's worse, the page tables we've been talking about areincomplete. If we have a page fault, we need to find the page ondisk. Where is it located? That information is kept in another pagetable, which is indexed by the virtual page (same as the page table wetalked about), and tells you where on disk to find it. Then, we haveto copy that page to RAM, and update the first page table. Thus,we need two page table tables!
These page tables are basically just data. Thus, they occupymemory as any data occupies memory. When we switch from one processto another, we need to load its page table in RAM for easy access.It's useful to keep it located in certain parts of RAM for just such apurpose. If RAM is suitable large, we can have several processes'page tables in RAM at the same time.
A page table register can hold the physical address of thepage table that's currently active to get quick access. Still, theseare large, and we may want to find ways to speed things up.
Inverted Page Tables
There are many schemes to reduce the size of a page table.One way is to use a hierarchy. Thus, we might have two layersof pages. Bits B31-22 might tell you thefirst layer, while B21-13 might tell you thesecond layer.Another idea is to use a kind of closed hash table. The hash table'ssize is based on the number of physical pages. The number of physicalpages is usually a lot smaller than the number of all virtual pagesput together.
A hash function takes a virtual page number as input, and producesan index into the hash table as the result. Each entry of thehash table consists of a virtual page number and a physical page number.You check to see if the virtual page number matched, and if so, thenyou use the physical page.
If it missed, then you must resolve the collision based on the hashtable. In practice, you may need the number of entries of the hashtable to be a few times larger than the number of physical pages,to avoid excessive collisions.
An inverted page table takes longer to access because you may havecollisions, but it takes up a lot less memory. It helps with pagehits. However, if you have a page fault, you still need a page tablethat maps virtual pages to disk pages, and that will be large.
Translation Lookaside Buffer (TLB)
What's the cost of address translation? For each virtual address,we must access the page table to find the PTE corresponding to thevirtual page. We look up the physical page from the PTE, andconstruct a physical address. Then, we access RAM at the physicaladdress. That's two memory acccess: one to access the PTE, one moreto access the data in RAM. Cache helps us cut down the amount of timeto access memory, but that's only if we have cache hits.The idea of a TLB is to create a special cache for translations.Here's one example of a TLB.

Each row in the TLB is like one slot of a cache. Assumewe have 64 rows. When you have a virtual address, you cansplit it into a virtual page and an offset.
In parallel, compare the virtual page to all of the entries of theTLB (say, 64). There should be, at most, one match. Just like afully associative cache, you want to check if the TLB entry is valid.
If a TLB hit occurs, replace the virtual page with a physical pageto create a physical address.
If there's a TLB miss, then it's still possible that the virtualpage resides in RAM. You must now look up the PTE (page table entry)to see if this is the case. If the PTE says the virtual page is inRAM, then you can update the TLB, so that it has a correct virtualto physical page translation.
The TLB is designed to only store a limited subset of virtualto physical page translation. It is really just a cache for thepage table, storing only the most frequently used translations.
The TLB can be kept small enough that it can be fully associative.However, some CPU designers make larger TLBs that are direct mappedor set associative.
Memory Protection
How does virtual addresses give us memory protection? Supposeyou work at a post office, which assigns post boxes to individuals.You have a person who comes in and says they want three post officeboxes: 100, 101, and 102.They insist on using those numbers. Another customer comes in,and insists on using those numbers too. How can you accomodateboth customers?
Basically, you cheat. You tell the first customer you have boxes100, 101, and 102, but you assign him boxes 200, 201, and 202.Similarly, you tell the second customer that you also have boxes100, 101, and 102, but you assign her boxes 320, 321, and 322.
Whenever customer 1 wants the mail in box 100, you translateit to box 200. Whenever customer 2 wants mail in box 100, youtranslate it to box 320. Thus, your two customers get to use thebox numbers they want, and through the magic of translation, theytwo customers avoid using each other's boxes.
If the post office wanted to reserve its own boxes for itsown use, it could reserve boxes 1 through 100 to itself, and neverassign those boxes, directly or indirectly to a customer. Evenif a customer wants box 50, they can be assigned box 150, safelyoutside the range of boxes reserved for the post office.
This same analogy applies to real programs. Each program canassume it uses the same set of 32 bit virtual addresses. We justmake sure that those virtual pages do not map to the same disk page,nor to the same physical page.
Who's job is it to assign the pages? It's the operating system.When a program starts up, it will want a certain range of addresses.The operating system creates a page table for the program, making surethe disk pages it uses do not conflict with the disk pages ofother programs.
Invalid Pages
Sometimes you don't really want a program to access all possible32 bit addresses. This helps reduce the total size of the pagetable. One way to prevent a user program from accessing invalidpages is make certain virtual pages entries invalid. Thus, anattempt to translate virtual to physical page will fail, and evenlooking up the virtual page on disk fails.Page Replacement Schemes
Like cache, you can have page replacement schemes based on FIFO,LRU, LFU, etc. In general, page replacement schemes can be moresophisticated because getting a page off disk is really slow, so youcan afford to take more time to make a better choice.Dirty Bit
In reality, caches usually don't have dirty bits. This meansthat you must always write back if a cache line is evicted.However, because disk access is slower, it makes sense to usedirty bits for pages. Thus, if a page hasn't been modified(maybe because it's read only), there's no reason to copy it backto disk. Just toss it out.
Cache
Virtual memory works with caching. Basically, once the virtualaddress is translated to a physical address, then the physical addressis passed to the cache, which checks to see if there is a cache hit.The Oblivious Programmer
As with cache, assembly lanugage programmers don't have toworry about virtual memory. They just see "memory". They don'tdo anything in particular whether there's virtual memory or not.Virtual memory is handled partly by hardware (translation mechanism)and partly by the operating system (sets up page table, handlespage faults, etc).This is good because at one point, programmers had to worryvery much about whether a chunk of memory resided on the diskor in RAM. Programmers had to spend a great deal of effortmanaging this, and it distracted them from coding. With virtualmemory, the management of disk as an extension of RAM is handledautomatically.
Shared Memory
You can share memory between two processes by mapping thevirtual page to the same disk page. When that disk pageis resident in physical memory, then both processes can accessthe same location.There may be issues of synchronization to handle, but that'sa topic that's best left to a course in operating systems.Suffice it to say that we do have a way to map virtualpages to the same disk page to allow for sharing.
Sharing is available when you want two processes to collaboratein a somewhat safe manner. Both processes, for the most part, havetheir own memory. They only share a small region between thetwo of them.
Terminology
Some definitions before we summarize:- A physical page is 2k consecutivebytes in memory. The page is "super word-aligned". That is, it'sfirst byte must be at an address divisible by 2k.
- A disk page is 2k consecutive bytesin disk (also, super word-aligned) Disks can be partitioned upnon-contiguously, so it doesn't have to be truly consecutive, as longas the CPU "thinks" it is. Each physical page should have a correspondingdisk page. However, there are usually far more disk pages thanphysical pages.
- A virtual page is 2k consecutivebytes, which is super word-aligned. This is the page theprogram thinks it is dealing with. This page can either bein RAM (physical memory) and/or disk.
Summary
Virtual memory serves two purposes. First, it allows us toextend the use of physical memory by using disk. Second, it allowsus to have memory protection, because each virtual address istranslated to a physical address.The two are somewhat orthogonal (independent) of each other. Forexample, we could use the disk as an extension of the address spaceaccessible to the programmer, without providing memory protection.Thus, a programmer might be able to access all disk pages, if theoperating system allowed it when setting up page tables.
It's also possible to have memory protection without any disks.Provided each process only needs a small number of pages, we couldallow all of those virtual pages to reside in RAM. For example,suppose each process has only 2 virtual pages. Then, since (inour example) RAM stores 256 pages, we could allow 128 processesto have virtual page 0 and 1, and they would not interferewith each other in virtual memory.