CS 6210 Advanced Operating Systems
Spring 2003
Project IV: Evaluating Web Proxy Caches

Due: 11:59 p.m., April 24, 2003
(One minute before midnight on Thursday, April 24th.)
This project is to be completed in groups of 3-4 (the same groups as Project III).


Goal

Evaluate the performance of the web proxy server cache using the system already developed in Project 3. You will run low-level benchmarks of the system primitives defined in this project and construct example traces that contrast replacement schemes and page refresh mechanisms. The client is a web browser or another custom program that allows testing of your system. You will document the individual contributions of each team member in the project directory. Grading will be done on a team basis. One-person teams are not encouraged. Choose server and cache organizations that will yield the best results.

You will experiment with cache replacement policies. You will make measurements on "edhpc" and compare these with measurements on NETLAB.

Use the previous project as a reference if you forget what each operation means. The words update & invalidate have a special meaning, and the context was set by the previous project.

Information regarding NETLAB accounts and access will be announced soon. Go ahead and start the project on EDHPC while you familiarize yourself with NETLAB. The NETLAB contacts for this project are: 1) Dave Robinson, robinson@cc.gatech.edu; 2) Neil Bright, ncb@cc.gatech.edu; and 3) Matt Sanders, msanders@cc.gatech.edu.

Dave Robinson will check the newsgroups while the TA is away at a conference from April 17 to April 28, 2003. Please address your messages to each contact and cc the TA.


General Information

Resources

Details

Figure 1. Experimental Topologies. (T1) Clients are 0, 1, or 2 hops away from the primary cache. (T2) Clients are 0 or 1 hop away.
 

T1 and T2 are two topologies. Topology T1 has four clients (A, B, C, and D) at different distances from the primary cache. Only the primary cache is connected to the Internet. In both T1 and T2, exactly ONE client is connected to the primary cache. All the assumptions and definitions of the previous project apply.

SYSTEMS MEASUREMENT PART

For T1 and T2, measure

a) In T1, the cold-miss cost for the secondary caches of A, B, and C. Measure this when (i) the pages exist in the PRIMARY cache (no secondary cache along the route to the primary cache has a copy) and (ii) the pages do not exist in the PRIMARY cache and must be retrieved from the Internet.

b) In T2, the cold-miss cost for clients (two types). Measure this when (i) the pages exist in the PRIMARY cache (no secondary cache along the route to the primary cache has a copy) and (ii) the pages do not exist in the PRIMARY cache and must be retrieved from the Internet.

c) Invalidate message cost from the PRIMARY cache to the (i) farthest secondary cache, (ii) closest secondary cache. Invalidate has a special meaning with respect to web page types, defined in the previous project.

d) Update cost, if a whole page is shipped from the PRIMARY cache to the (i) closest secondary cache (ii) farthest secondary cache. Update has a special meaning with respect to web page types, and this operation is explained in the previous project.

e) Simulate web-page updates at the primary cache. For T1 and T2, what is the highest rate you can sustain for (i) the closest secondary cache and (ii) the farthest secondary cache? How many operations/second for a single web page can your system sustain? An operation is defined as invalidating a web page in the secondary cache, with the secondary cache automatically reloading it from the PRIMARY cache.

e) Simulate web-page updates at the primary cache. For T1 and T2, what is the highest rate you can sustain for (i) the closest secondary cache and (ii) the farthest secondary cache? How many operations/second for a single web page can your system sustain? An operation is defined as *updating* a web page in the secondary cache, where an update refers to the use of the update protocol defined in the previous project; i.e., the primary cache provides the page directly to the secondary cache. In contrast, invalidated pages are refreshed only when a client asks for a new copy on the next reference.

f) For T1 and T2, pick a client connected to a secondary cache. Generate 1000 requests for distinct 4K-size pages that miss in the parent secondary cache but are satisfied in (i) the nearest peer secondary cache and (ii) the most distant peer secondary cache. This measurement will exercise the inter-cache protocol. What is the cost of a request and its 4K page reply?

State the configuration of the caches for each part, i.e., size, replacement policy, data structures, etc.
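
As a starting point for the measurements above, here is a minimal timing-harness sketch in Python. It is not part of the assignment and assumes nothing about your proxy's interface: the `operation` callable, the function names, and the placeholder URL pattern are all hypothetical. You would supply an `operation` that performs one request, invalidate, or update against your own system.

```python
import statistics
import time


def measure_latency(operation, repetitions=100):
    """Call operation(i) once per repetition; return per-call times in seconds."""
    samples = []
    for i in range(repetitions):
        start = time.perf_counter()
        operation(i)  # e.g. fetch the i-th distinct page through a cache
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce raw samples to a few figures suitable for a report table."""
    return {
        "n": len(samples),
        "mean_ms": statistics.mean(samples) * 1e3,
        "median_ms": statistics.median(samples) * 1e3,
        "max_ms": max(samples) * 1e3,
    }


def sustained_rate(operation, duration_s=1.0):
    """Run operation(i) back to back for duration_s; return operations/second."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        operation(count)
        count += 1
    return count / (time.perf_counter() - start)


def distinct_page_urls(count=1000, base="http://example.invalid/page"):
    """Hypothetical generator of distinct URLs so that every request is a cold miss."""
    return [f"{base}{i}.html" for i in range(count)]
```

Passing the index `i` to `operation` lets each call touch a different page, which matters for the cold-miss and 1000-distinct-page measurements. Reporting the median alongside the mean helps when a few requests are slowed by unrelated network traffic.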

CREATIVITY PART

a) Define and generate a reference trace, if at all possible, that makes -

    (i) FIFO outperform LRU and RANDOM
    (ii) LRU outperform FIFO and RANDOM
    (iii) RANDOM outperform LRU and FIFO
    (iv) a configurable hybrid scheme that dynamically adapts based on the references

Consider only the primary cache and a client attached to the primary cache.
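
Before running traces against the real cache, it may help to check candidate traces offline. The following is a small simulator sketch, not the required implementation: the function name `hits`, the string-based trace format, and the fixed capacity are assumptions made for illustration.

```python
import random
from collections import OrderedDict


def hits(trace, capacity, policy, seed=0):
    """Count cache hits for a trace under policy 'FIFO', 'LRU', or 'RANDOM'."""
    rng = random.Random(seed)
    cache = OrderedDict()  # key order is eviction order, oldest first
    hit_count = 0
    for page in trace:
        if page in cache:
            hit_count += 1
            if policy == "LRU":
                cache.move_to_end(page)  # only LRU refreshes a page on a hit
            continue
        if len(cache) >= capacity:
            if policy == "RANDOM":
                del cache[rng.choice(list(cache))]
            else:
                cache.popitem(last=False)  # FIFO and LRU evict the front entry
        cache[page] = True
    return hit_count


if __name__ == "__main__":
    for trace in ("ABCADAEA", "ABCADB", "ABCD" * 100):
        label = trace if len(trace) <= 8 else trace[:8] + "..."
        print(label, {p: hits(trace, 3, p) for p in ("FIFO", "LRU", "RANDOM")})
```

With a capacity of 3 pages, the trace `ABCADAEA` gives LRU 3 hits and FIFO 2, because LRU keeps the repeatedly referenced page A. The trace `ABCADB` gives FIFO 2 hits and LRU 1, because the hit on A causes LRU to evict B just before B is referenced again. A cyclic trace over 4 pages gives both FIFO and LRU zero hits, while RANDOM will almost certainly score some; its exact count depends on the seed.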
 
 

b) Define and generate a trace or circumstance which makes -

    (i) UPDATE outperform INVALIDATE
    (ii) INVALIDATE outperform UPDATE
    (iii) configurable hybrid scheme that dynamically adapts based on references

Consider a primary cache and a directly attached secondary cache.
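
One way to reason about such traces is a simple cost model for a single page shared between the primary and one secondary cache. The sketch below is illustrative only. It assumes a 4096-byte page, a 64-byte invalidate message, a secondary that starts with a valid copy, and a primary that sends at most one invalidate per stale copy. None of these values come from the previous project, and the model ignores the page-type rules defined there.

```python
PAGE_BYTES = 4096      # assumed page size
INVALIDATE_BYTES = 64  # assumed size of one invalidate message


def cost(trace, protocol):
    """Return (bytes sent primary -> secondary, reads that had to wait).

    trace is a string of events: 'W' means the page changes at the primary,
    'R' means a client of the secondary cache reads the page.
    """
    if protocol not in ("UPDATE", "INVALIDATE"):
        raise ValueError(f"unknown protocol: {protocol}")
    valid = True  # the secondary is assumed to start with a valid copy
    total_bytes = 0
    stalled_reads = 0
    for event in trace:
        if event == "W":
            if protocol == "UPDATE":
                total_bytes += PAGE_BYTES        # primary pushes the whole page
            elif valid:
                total_bytes += INVALIDATE_BYTES  # one invalidate per stale copy
                valid = False
        elif event == "R" and not valid:
            total_bytes += PAGE_BYTES            # refetch on the next reference
            stalled_reads += 1
            valid = True
    return total_bytes, stalled_reads


if __name__ == "__main__":
    for trace in ("W" * 9 + "R", "WR" * 3):
        print(trace, cost(trace, "UPDATE"), cost(trace, "INVALIDATE"))
```

Under these assumptions, nine writes followed by one read cost UPDATE 36864 bytes and INVALIDATE 4160 bytes, so INVALIDATE wins when writes far outnumber reads. When every write is followed by a read, UPDATE sends 12288 bytes with no stalled reads, while INVALIDATE sends 12480 bytes and stalls all three reads.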

PHYSICAL PLATFORM

You will complete the "Systems Measurement Part" for EDHPC and NETLAB. Choose an appropriate physical machine allocation. In your report, describe the mapping from the virtual topologies (T1 and T2) to physical machines. Choose a mapping that will yield the best results. PRIMARY and SECONDARY caches must reside on separate machines as much as possible. Clients can exist on the same machine, but must still communicate with their respective secondary caches. You will compare and contrast measurements on EDHPC and NETLAB and account for their differences.

Generate and define a trace for the "Creativity Part" and make sure that it runs on EDHPC and NETLAB. Your page reference generator is simply the client attached to the secondary cache.

DEMONSTRATION

A demo schedule for Project 3 and Project 4 will be announced soon.


Due Date & Turn-In Process
When: April 24, 2003, before midnight. This is one minute prior to midnight on Thursday, April 24th. No late assignments will be accepted unless prior arrangements have been made.

Where: /net/hc280/class/cs6210/groups/<group_name>. Please create a README file in each group directory with the names of the group members. Email help@cc if you cannot create your group directory.

What: