From: rama@cc.gatech.edu (Kishore Ramachandran)
Newsgroups: git.cc.class.cs6210
Subject: Summary of Tornado paper - clustered objects
Date: 9 Oct 2005 20:12:26 -0400

Based on some questions from a student on the Tornado paper, I have
compiled the following summary, which should be useful to all.

All the Tornado structuring (with clustered objects) has to do with how
the multiprocessor microkernel is organized. You can have Unix-like
processes with multiple threads in them running above this microkernel.
The Tornado kernel provides the resources needed for such application
processes and threads (such as time on the processors, memory, etc.).

Here is a summary of clustered objects, used as the system structuring
principle in the Tornado microkernel:

A true shared object: As the name implies, there is only one object in
memory. All object references go to the same physical memory area. That
is, all representations of a true shared object point to the same
physical object (same physical memory space). If there is hardware cache
coherence, then updates to data structures in the shared object lead to
coherence traffic.

A replicated object: Each unique replica references a different physical
memory area. That is, each representation of a replicated object points
to a different physical object (occupying a different physical memory
space). Partial replication only means that a group of processors may
(truly) share a replica. For example, a group of processors in a cluster
may share an object. Read the second para of Sec 3.2 for an example use
of a replicated object for implementing a process object. Note that
since each replica occupies different physical memory, if there are
common fields in the object they have to be updated explicitly in
software by broadcasting updates to the representations on the different
processors.

A partitioned object: In this case each representation (as in the case
of a replicated object) references a different physical memory area.
By design, the representations share *no* common fields. Hence there is
no need for coherence among the representations of a clustered
partitioned object.

Putting them all together for a memory manager:

1) A process object is shared by the group of threads that comprise an
   application. The threads may be running on different processors
   (some on the same processor as well, depending on resource
   availability). Thus it makes sense to have full replication of the
   process object on all the processors on which the process has threads
   running. There may be some attributes of the process that are global
   to all the threads of a given application (for example, base
   priority). Thus the different representations of the process object
   may need explicit software updates to modify common fields (which
   occupy different physical memory locations in the different
   representations).

2) The paper gives reasons why a region object (which corresponds to a
   region of virtual memory) may lend itself to partial replication
   (since several threads may execute in the same region of virtual
   memory). For example, there may be one representation of the region
   object per cluster of processors in a NUMA-style architecture.

3) An FCM (file cache manager) is specific to each region and thus is a
   good candidate for a partitioned representation. It manages a group
   of physical page frames for backing the virtual pages in a region.
   The page frames managed by an FCM are disjoint from the page frames
   managed by other FCMs.

4) A COR (cached object representative) initiates the I/O between disk
   and DRAM when a page has to be swapped in or out. Since in a
   well-designed virtual memory system you expect disk activity for
   paging to be an infrequent event, it makes sense to have a single
   representation that is shared by all the processors.

--
Kishore Ramachandran
Professor, College of Computing
Georgia Tech, 801 Atlantic Dr.
Atlanta, Ga 30332
http://www.cc.gatech.edu/~rama
Ph: (404) 894-5136; FAX: (404) 385-6506
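
The representation policies summarized above (true shared, fully or
partially replicated, partitioned) can be sketched roughly as follows.
This is only an illustrative Python model, not Tornado code: the names
ClusteredObject, rep_for, broadcast, ProcessRep and FcmRep, the field
names, the 4-processor machine and the frame numbers are all assumptions
made up for the example. The one idea it tries to capture is that a
reference to a clustered object is resolved, per processor, to some
representation, and that the policy only decides how many processors
share each representation.

```python
class ClusteredObject:
    """Hypothetical model: maps a processor id to the representation serving it."""

    def __init__(self, num_cpus, cpus_per_rep, make_rep):
        # cpus_per_rep == num_cpus : true shared (one rep for everyone)
        # 1 < cpus_per_rep < num_cpus : partial replication (one rep per cluster)
        # cpus_per_rep == 1 : full replication or partitioning (one rep per cpu)
        self.cpus_per_rep = cpus_per_rep
        num_reps = -(-num_cpus // cpus_per_rep)  # ceiling division
        self.reps = [make_rep(i) for i in range(num_reps)]

    def rep_for(self, cpu):
        # Every access from `cpu` goes to that cpu's (or cluster's) rep.
        return self.reps[cpu // self.cpus_per_rep]

    def broadcast(self, field, value):
        # Explicit software update of a common field in every rep.
        # The hardware does not keep distinct replicas consistent for us.
        for rep in self.reps:
            setattr(rep, field, value)


class ProcessRep:
    """Replicated: base_priority is a common field, local_faults is per-rep."""

    def __init__(self, base_priority):
        self.base_priority = base_priority
        self.local_faults = 0


class FcmRep:
    """Partitioned: each rep owns a disjoint set of page frames."""

    def __init__(self, frames):
        self.frames = set(frames)


NUM_CPUS = 4  # assumed machine size for the example

# 1) Process object: full replication, common field updated by broadcast.
process = ClusteredObject(NUM_CPUS, 1, lambda i: ProcessRep(base_priority=10))
process.rep_for(2).local_faults += 1   # touches only cpu 2's replica
process.broadcast("base_priority", 5)  # must reach every replica

# 2) Region object: partial replication, one rep per 2-cpu cluster.
region = ClusteredObject(NUM_CPUS, 2, lambda i: {"cluster": i})

# 3) Partitioned object: reps share no fields, so nothing is ever broadcast.
fcm = ClusteredObject(NUM_CPUS, 1, lambda i: FcmRep(range(i * 8, (i + 1) * 8)))

# 4) COR: a single representation shared by all processors.
cor = ClusteredObject(NUM_CPUS, NUM_CPUS, lambda i: {"pending_io": []})
```

In this model the replicated process object is the only one that ever
needs broadcast(); the partitioned reps have nothing in common to keep
consistent, and the single shared rep relies on the hardware coherence
mentioned above rather than on software updates.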