CS 2200 Intro to Systems and Networks
Project 3 - Virtual Memory Management

Introduction

In this project you will implement a virtual memory system using the simulator we will be providing. The project will walk you through the implementation step-by-step, asking questions about each step to make sure you understand what you have to do before you go off and write code.

The Assignment

This assignment shouldn't be extremely difficult. If you find yourself struggling with the code, step back and think about what you are supposed to be doing in theory. If you don't understand how the step should work, it will be impossible to write it out in code.

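The assignment is organized into 6 steps, as follows:

Step 1 - Split the Address
Step 2 - Address Translation
Step 3 - Computing the EMAT
Step 4 - Handling Page Faults
Step 5 - Improving Page Replacement
Step 6 - Adding a TLB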

Each step starts with some theoretical questions. You do not have to submit answers to these questions. They are there to guide you. If you can answer the questions without much trouble, you will probably have an easy time with this assignment. As was mentioned earlier, the code will be really hard to write if you aren't comfortable with the concepts first. The questions are designed to prepare you for the coding in each step, so do them first.

The Simulator

Code that you have to complete for this assignment can be found in the files in the directory student-src, while the simulator code can be found in the directory simulator-src. You probably shouldn't need to look at much (if any) of the simulator code aside from the files we tell you to modify. At the end of this file we have included an Appendix that discusses the simulator. If you are having trouble figuring out how to do something, reading it might be useful.

To recompile the simulator, we have provided a Makefile in the top level directory. This can be used to build the project by simply typing make. After you have built the program, you should find an executable named vm-sim in the top level directory. To run this program, you must specify a references file. The references file describes a series of memory references. The directory references contains four different potential references files for your use:

To run the simulator with a given references file, for example the basic file, issue the following command:

   ./vm-sim references/basic

There are several other command line options for the simulator that you can use to adjust the memory size, page size and TLB size. You don't need to use them, but you can experiment with different settings to see the effect that they have on the memory access time. The default settings are a memory size of 16 values, a page size of 2 values, and a TLB size of 4.

Step 1 - Split the Address [10 pts total]

In modern virtual memory systems, the program written by the user accesses memory using virtual addresses. The hardware and operating system work together to translate these addresses into physical addresses, which can be used to access physical memory. The first step of this process is to take the virtual address and split it into its component parts.

Remember that a virtual address consists of two parts. The high-order bits make up the virtual page number (VPN), and the low-order bits make up the offset.

Part 1a - Some Questions About Address-Splitting

On a certain machine, virtual addresses are made up of 16 bits. Each page on this machine has 2^8 addressable locations. Answer the following questions:

Question 1 - How many bits must the offset be?

Question 2 - Recalling that the virtual address is split into the offset and VPN, how many bits is the VPN?

Question 3 - What is the VPN given the address 0xBEEF?

Question 4 - What is the offset given the address 0xBEEF?

Part 1b - Correcting the Address-Splitting Macros

Look at file page-splitting.h. You should find two macros that are used by the simulator to split the address. They are named VADDR_PAGENUM and VADDR_OFFSET. They take a virtual address and return the page number or offset, respectively. They currently do not function correctly. Fix them so they properly return the virtual page number and offset.

[Hint 1: Use the global variable page_size to access the size of a page]

[Hint 2: While your first instinct is probably to do this using bitwise arithmetic, it may make the implementation difficult. Think about using modulus and integer division. It should be possible to implement each macro in approximately one line of code with either method.]
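The division-and-modulus idea from Hint 2 can be sketched as ordinary functions. This is only an illustration under stated assumptions: split_vpn and split_offset are hypothetical names, not the simulator's macros, and the page size is passed in as a parameter rather than read from the global page_size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. The simulator's real
 * macros are VADDR_PAGENUM and VADDR_OFFSET in page-splitting.h. */

/* Which page the address falls in: the high-order part. */
uint32_t split_vpn(uint32_t vaddr, uint32_t page_size) {
    assert(page_size > 0);
    return vaddr / page_size;
}

/* Position of the address within its page: the low-order part. */
uint32_t split_offset(uint32_t vaddr, uint32_t page_size) {
    assert(page_size > 0);
    return vaddr % page_size;
}
```

For example, with a page size of 256, split_vpn(0x1234, 256) gives 0x12 and split_offset(0x1234, 256) gives 0x34. Division and modulus also keep working if the page size is not a power of two, which is one reason they can be easier to get right than shifts and masks.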

Step 2 - Address Translation [20 pts total]

Now that we can split the address, we are ready to translate the virtual address into a physical address. This requires a page table to store the mapping between virtual addresses and physical addresses. In the simulator, a page table is represented as an array of the following structure:

typedef struct {
   pfn_t pfn;           /* Physical frame number */
   unsigned char valid; /* Valid 'bit' */
   unsigned char dirty; /* Dirty 'bit' */
   unsigned char used;  /* Used (aka accessed recently) */
} pte_t;

You'll notice that the VPN doesn't appear as part of the page table. This is because the page table is an array of these entries. The index into the array corresponds to the VPN. This allows the mapping from VPN to page table entry to be performed very easily.

Part 2a - Questions about Address Translation

Most of address translation deals with reading values from a page table. The table below is similar in organization to the array of pte_ts that is used in the simulator, although the size of the VPN and PFN have been reduced to simplify the table. Assume that the page size is the same as that used in the questions on address splitting (and therefore, that the way to split addresses is identical). Any VPNs not appearing in the table are invalid.

Page Table
VPN    PFN    valid  dirty  used
0xDE   0xBE   YES    NO     NO
0xF0   0xD0   YES    YES    YES
0xBE   0x42   YES    NO     NO
0x42   0x43   NO     NO     NO
...    ----   NO     NO     NO

Question 5 - What physical address is 0xDEAD mapped to?

Question 6 - What physical address is 0xF00D mapped to?

Question 7 - Is the address 0x42ED valid?

Part 2b - Implementing Address Translation

After finishing the questions in the previous section, you should have a pretty good idea about how the page table is used when translating addresses. Open up the file page-lookup.c. In this file you will find a function called pagetable_lookup. Modify this function to behave correctly. Note that you will have to check to make sure the entry is valid. If it isn't, you should increment the variable count_pagefaults and call the function pagefault_handler.

When implementing address translation, keep in mind that the global variable current_pagetable is a pointer to an array of page table entries (pte_t), indexed by VPN.

[Hint: You don't have to worry about marking the entry as dirty or used, or using the TLB. This is already done for you by the simulator. You do, however, need to notify the page fault handler whether or not a write is required.]

[Note: In the questions, you were asked to find the physical address. In the function, you have to find the PFN. These are slightly different!]
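The difference between a PFN and a physical address can be sketched as follows. This is a simplified illustration, not the simulator's code: xlate_pte_t, xlate_lookup and xlate_paddr are hypothetical names, the entry is cut down to two fields, and a miss is reported through the return value instead of calling pagefault_handler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced page table entry (the real one is pte_t). */
typedef struct {
    uint32_t pfn;        /* physical frame number */
    unsigned char valid; /* non-zero if the mapping is usable */
} xlate_pte_t;

/* Look up a VPN by indexing the table directly.
 * Returns 1 and stores the frame number on success, 0 on a page fault. */
int xlate_lookup(const xlate_pte_t *table, uint32_t vpn, uint32_t *pfn_out) {
    if (!table[vpn].valid) {
        return 0; /* the real code would count the fault and handle it */
    }
    *pfn_out = table[vpn].pfn;
    return 1;
}

/* A physical address is the frame number followed by the unchanged offset. */
uint32_t xlate_paddr(uint32_t pfn, uint32_t offset, uint32_t page_size) {
    assert(offset < page_size);
    return pfn * page_size + offset;
}
```

With a page size of 256, a frame number of 0x7A and an offset of 0x34 combine into the physical address 0x7A34. The lookup itself stops at the frame number; the offset is only reattached afterwards.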

Step 3 - Computing the EMAT [10 pts total]

Now that we are getting some results from our simulator, we are ready to perform some real computations with them. EMAT stands for Effective Memory Access Time. It is computed (quite simply) by adding up the time taken by each access and dividing the total access time by the number of accesses performed.

Part 3a - Questions about EMAT

Question 8 - Assuming a memory access time of 100ns, and that an average disk access takes 10ms, how long would it take to access memory 10 times, if 2 of those accesses resulted in page faults, and 4 of the accesses resulted in TLB misses?

[Hint: Don't forget to take into account the time it takes to access the page table.]

Question 9 - What would a general formula for the EMAT be, in terms of average disk access time and memory access time?

Part 3b - Automate EMAT Computation

Obviously, computing EMAT every time you run the simulator is quite tedious. Since it is generally better to make the computer do these kinds of computations for us, this step of the project asks you to fix the function compute_emat found in emat.c. This function takes no parameters, but it has access to the global statistics maintained in statistics.h, specifically:

Since we haven't yet implemented the TLB, there will be no TLB hits. After the TLB has been correctly implemented, make sure that your implementation of compute_emat still works correctly.

Your computation should use the constant values DISK_ACCESS_TIME and MEMORY_ACCESS_TIME which are defined in emat.c. For the purposes of this function, treat a TLB hit as taking no time when looking up the VPN in the page table.
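One way to read "total access time divided by number of accesses" is sketched below. The cost model is an assumption made for illustration: every access pays one memory access for the data, every TLB miss pays one extra memory access for the page table, every page fault pays one disk access, and a TLB hit adds nothing. The name emat_sketch and its parameters are hypothetical; the real compute_emat takes no parameters and reads the global statistics and the constants in emat.c.

```c
#include <assert.h>

/* Hypothetical sketch of an average-time computation.
 * All counts and times are passed in explicitly. */
double emat_sketch(unsigned long accesses, unsigned long tlb_misses,
                   unsigned long pagefaults,
                   double mem_time, double disk_time) {
    assert(accesses > 0); /* avoid dividing by zero */

    double total = accesses * mem_time     /* the data access itself */
                 + tlb_misses * mem_time   /* page table read on a TLB miss */
                 + pagefaults * disk_time; /* disk access on a page fault */

    return total / accesses;
}
```

Under these assumptions, 4 accesses with 2 TLB misses and 1 page fault, a memory time of 10 and a disk time of 1000 give (40 + 20 + 1000) / 4 = 265. If your answer to Question 9 charges faults or misses differently, adjust the terms to match it.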

Step 4 - Handling Page Faults [20 pts total]

What happens when the CPU encounters an invalid address in the page table? When this occurs, the OS should allocate a physical frame for the requested page, by either locating a free frame, or evicting a physical frame that is already in use. After this occurs, the OS should update the page table to reflect the new mapping, and then continue with the memory access.

What exactly can cause this? Well, when a program is initially started, none of the virtual addresses are valid. When a given address is first used, the OS will allocate a physical frame to store the information. If this keeps occurring, the OS might be trying to allocate a new physical frame, when there is no memory remaining. In this situation, the page fault handler will have to evict a physical frame. When it does this, it moves the information stored there to the hard disk, and then uses the recently cleared frame.
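The sequence described above (find a frame, evict if necessary, update the page table) can be sketched in a single-process toy model. Everything here is hypothetical and simplified: fault_pte_t, fault_frame_t and fault_handle are illustrative names, the victim is always frame 0 when no frame is free, and disk traffic is only counted rather than performed. The real handler in page-fault.c works with the simulator's own structures and with victim_pcb.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reduced structures for illustration only. */
typedef struct { uint32_t pfn; uint8_t valid; uint8_t dirty; } fault_pte_t;
typedef struct { uint8_t in_use; uint32_t vpn; } fault_frame_t;

/* Give page `vpn` a frame and return that frame's number.
 * *writebacks is incremented when a dirty page has to be saved. */
uint32_t fault_handle(fault_pte_t *pt, fault_frame_t *frames,
                      uint32_t nframes, uint32_t vpn, unsigned *writebacks) {
    assert(nframes > 0);

    uint32_t pfn = 0; /* placeholder victim if nothing is free */
    for (uint32_t i = 0; i < nframes; i++) {
        if (!frames[i].in_use) { pfn = i; break; } /* prefer a free frame */
    }

    if (frames[pfn].in_use) {
        /* Eviction: the old owner's mapping must stop being valid. */
        fault_pte_t *old = &pt[frames[pfn].vpn];
        if (old->dirty) {
            (*writebacks)++; /* a real system would write the page to disk */
        }
        old->valid = 0;
        old->dirty = 0;
    }

    /* Record the new mapping in both directions. */
    frames[pfn].in_use = 1;
    frames[pfn].vpn = vpn;
    pt[vpn].pfn = pfn;
    pt[vpn].valid = 1;
    pt[vpn].dirty = 0;
    return pfn;
}
```

The point of the sketch is the ordering: the victim's old mapping is invalidated before the frame is handed to the new page, so two valid entries never point at the same frame.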

Part 4a - Questions About Page Faults

Question 10 - If a page fault occurs, and there is an invalid frame in the physical memory, should you choose to evict a frame? When would this occur?

Question 11 - What is thrashing? How would a virtual memory system try to eliminate it?

Part 4b - Implementing the Page Fault Handler

Look in the file page-fault.c, and you will find a partially implemented page-fault handler. The FIX ME comments in there walk you through the changes you should have to make.

While working on this, keep in mind that each process has a page table. current_pagetable refers to the page table of the currently running process (which is the one that needs a new page). The page table of the process that owns the victim page can be found in the victim process control block (victim_pcb).

Step 5 - Improving Page Replacement [20 pts total]

Up to this point we haven't really worried about what happens when we need more pages than we have room to store in physical memory. In reality, the virtual memory system uses the hard drive to make it seem like there is considerably more memory than there really is. While it is really slow to use the hard drive in this manner, if page replacement is performed in an intelligent manner it is much better than just stopping the user's program.

Right now, the virtual memory system you have built uses a random page replacement algorithm that we provided for you. As was discussed above, an intelligent system would be much better than this. After asking some questions about page replacement, we will ask you to implement one of the algorithms we have discussed in class, and observe the impact that this can have upon EMAT.

The optimal page replacement algorithm would look into the future. Of all the physical frames, we should pick the one whose next use is furthest in the future. We know that we will not have to evict a page again until that page is accessed. Unfortunately, this algorithm requires us to be able to look into the future, which isn't something that a modern computer is capable of doing. Instead, we take advantage of temporal locality to justify the claim that if a page was used recently, it is likely to be used again in the very near future. While not optimal, the clock-sweep algorithm is a very reasonable algorithm, because it is easy to implement and has relatively decent results in practical use. For this project, you will be implementing the clock-sweep algorithm.

The basic idea of clock-sweep is that you mark each page when it is used. When you need to evict a page, you iterate (sweep) through memory examining each frame's marked bit. If the page is marked, you unmark the bit. When you encounter a page that isn't marked, that is the page you will evict. If you reach the end of memory the search wraps around (this means that if all pages are marked, we will unmark all the pages and then choose the page we started with for eviction). Normally, clock-sweep remembers the page it left off on, so that the next time it executes it continues on from there. For this assignment you will not have to do that. If you choose to do this, however, you'll earn brownie points.
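The sweep can be sketched over a bare array of marks. This is a minimal illustration, assuming a hypothetical clock_frame_t with only a used flag and a sweep that always starts at frame 0 (the simpler variant that does not remember where it left off). It is not the simulator's replacement function, which works on the simulator's own frame and page table structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame record holding only the mark bit. */
typedef struct {
    unsigned char used; /* non-zero if the page was used recently */
} clock_frame_t;

/* Return the index of the frame to evict, clearing marks along the way. */
size_t clock_sweep_sketch(clock_frame_t *frames, size_t nframes) {
    assert(nframes > 0);

    size_t idx = 0;
    while (frames[idx].used) {
        frames[idx].used = 0;       /* spend this page's second chance */
        idx = (idx + 1) % nframes;  /* wrap around at the end of memory */
    }
    return idx; /* first unmarked frame encountered */
}
```

The loop always terminates: after at most one full pass every mark has been cleared, so the sweep stops at the frame it started from. For marks {1, 1, 0, 1} the sketch returns index 2 and leaves the marks as {0, 0, 0, 1}.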

In Linux, the clock-sweep algorithm is sometimes also referred to as the second chance algorithm because it gives each page that has been marked as recently used a second chance at not being evicted.

Part 5a - Questions About Page Replacement

Answer the following questions about page replacement, given the following page table. Unlike in the previous question (or the question to come up soon), assume that any entry not listed is VALID and USED.

Page Table
VPN    PFN    valid  dirty  used
0xDE   0xBE   YES    NO     YES
0xF0   0xD0   YES    YES    YES
0xBE   0x42   YES    NO     NO
0xBC   0x43   YES    NO     NO
0x42   0xAB   YES    NO     YES
0x21   0x23   NO     NO     NO
...    ----   YES    NO     YES

Question 12 - What is the first page that should be used when we need to allocate a new page?

Question 13 - Assuming that we are now using the page we selected in question 12 and no pages have been marked as used since then, what is the next page we will select?

Question 14 - Again, assuming that we are using the pages selected in questions 12 and 13 and no pages have been marked as used since then, which page is selected?

Part 5b - Implementing a Page Replacement Policy

Now that you understand how page replacement works, open up the file page-replacement.c and change the replacement algorithm to be more intelligent. You should do two things. If there is an invalid page, simply use that one. If there are no invalid pages, perform a clock-sweep algorithm to decide which page you should evict.

Step 6 - Adding a TLB [20 pts total]

Accessing memory is slow. Virtual memory doesn't really help this, because page tables mean that we will have to access memory twice -- once to translate the virtual address to the physical address, and again to actually access the correct location. As useful as virtual memory is, if it caused every memory access to take twice as long (or longer, if the hard drive came into play), it would be an unacceptable cost for smaller programs that didn't need virtual memory. Luckily, there are ways of reducing the performance hit.

We obviously can't eliminate the actual memory access, but we can attack the page table lookup by adding a small piece of hardware that keeps a small buffer of past translations. If we can locate the virtual address in this buffer, we can bypass the page table lookup. This buffer is called the Translation Lookaside Buffer (TLB) because it provides an alternative means of performing the lookup for translation.

Part 6a - Questions about the TLB

The structure of the TLB is remarkably similar to the page table. The biggest difference is the fact that in the page table, the VPN serves as an index into an array. In the TLB, the VPN is simply another field stored in each entry. This is because the TLB is relatively small, and can't store every single entry. Use the TLB provided below to answer the questions. The TLB is only capable of holding four entries. As in the previous question, any entry not explicitly present in the page table is assumed to be invalid.

Page Table
VPN    PFN    valid  dirty  used
0xDE   0xBE   YES    NO     NO
0xF0   0xD0   YES    YES    YES
0xBE   0x42   YES    NO     NO
0xBC   0x43   YES    NO     NO
0x42   0xAB   YES    NO     NO
0x21   0x23   YES    NO     NO
...    ----   NO     NO     NO

TLB
VPN    PFN    valid  dirty  used
0xDE   0xBE   YES    NO     YES
0xF0   0xD0   YES    YES    YES
0x42   0xAB   YES    NO     YES
0x21   0x23   YES    NO     NO

Question 15 - What address does the virtual address 0xDEAD translate to? Is this translation found in the TLB or the page table?

Question 16 - What address does the virtual address 0xBE21 translate to? Is this translation found in the TLB or the page table?

Question 17 - When we look up the address 0xBC87, we miss the TLB. This requires us to evict an entry from the TLB. Which entry would we pick to evict, assuming we use a standard clock-sweep algorithm?

Part 6b - Adding a TLB

Open up the file tlb-lookup.c. Like the previous step, you will find a partially implemented function, with comments describing what you need to change. The code for the TLB can be found in the file tlb.c. Of special interest are the structure for TLB entries, and the pointer tlb which points to an array of TLB entries. Since there is no relationship between the index in the TLB and the content stored there, you will have to check every valid entry in the TLB before deciding that you were unable to find an entry. The structure of each entry is shown below:

typedef struct {
  vpn_t vpn;     /* Virtual page number */
  pfn_t pfn;     /* Physical frame number */
  uint8_t valid; /* Valid 'bit' */
  uint8_t dirty; /* Dirty 'bit' */
  uint8_t used;  /* Used (aka recently accessed) 'bit' */
} tlbe_t;

Pay attention to the comments, as they describe all of the different things you must do. When scanning through the array of TLB entries, it might be useful to know that the array has tlb_size entries in it.
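Because the TLB is not indexed by VPN, a lookup amounts to a linear scan. The sketch below shows that scan in isolation. It assumes a hypothetical tlbscan_entry_t that mirrors a few fields of tlbe_t and a hypothetical function tlbscan_find; the array and its length are passed in as parameters instead of using the globals tlb and tlb_size, and none of the other work described in the comments of tlb-lookup.c is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced TLB entry for illustration only. */
typedef struct {
    uint32_t vpn;  /* virtual page number stored in the entry */
    uint32_t pfn;  /* physical frame number it maps to */
    uint8_t valid; /* entry holds a usable translation */
    uint8_t used;  /* entry was hit recently */
} tlbscan_entry_t;

/* Scan every entry. Returns the index of the hit, or -1 on a miss. */
int tlbscan_find(tlbscan_entry_t *entries, size_t n, uint32_t vpn) {
    assert(entries != NULL);

    for (size_t i = 0; i < n; i++) {
        /* An invalid entry may hold a stale VPN, so check valid first. */
        if (entries[i].valid && entries[i].vpn == vpn) {
            entries[i].used = 1; /* remember the hit for replacement */
            return (int)i;
        }
    }
    return -1; /* not found: fall back to the page table */
}
```

Checking the valid flag before comparing VPNs matters: an invalid slot can still contain the VPN of a translation that was removed earlier, and matching on it would return a stale frame number.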

Turnin

When you are ready to turn in the project, upload everything in the student-src directory to T-Square. To make this easier on you, we have provided a target in the Makefile to produce a nice package for submission. Running the command make submit will produce a file named submit.zip, which will contain the files mentioned above. Upload the file to T-Square.

Appendix A - The Simulator Code

This section is intended to serve as a reference to the simulator code, if you are interested in how the simulator works, or need to figure out how to do something.

Simulator Files

Simulator files are located in the directory simulator-src. You shouldn't have to modify any of these, and if you feel a need to look at any of the simulator code, you should only need to look at the header files in there. To help you understand how the simulator is organized, the following list explains what each file in simulator-src does. For simplicity, I only list the filename, and not the extension, as the header file serves the same purpose as the source file.