You are to implement GTThreads -- a userlevel thread package. The goal of this project is to understand the performance of different scheduling algorithms. To do so you must implement a user level thread scheduler that can schedule multiple threads. The focus of this project is on scheduling algorithms and scheduling overheads and optimizations.
To start with you must implement a thread library and a simple scheduler. The scheduler MUST be a preemptive scheduler. As a baseline implement a basic round robin scheduler (assuming no hardware knowledge). You must measure the performance of two pieces of code using atleast 5 different quantum values.
In the second step you must implement different scheduling policies and compare their overheads using the same methods as Step 1. In this step you can assume that the library has knowledge of the underlying hardware. Compare the performance of each scheduling policy with the basic no-knowledge round robin scheduler. You must implement atleast the following scheduler policies
Measure the performance impact of using different scheduling policies on different hardware. Measure on a multiple processor machine. Your schedulers must be fair and there must be no starvation.
Each thread is assigned a time slice(its quantum)
for which it is allowed to run. A thread is preempted if it used up
its quantum. Preemption can be achieved by using an alarm signal as
a timer. You may use the system's virtual time instead of wall-clock
time, see setitimer(2).
The preemption period in microseconds should be specified
in a function, i.e., gtthread_init(period). Any program using your
GTThreads package must call the function first.
The GTThreads API should look something like this:
int gtthread_create(gtthread_t * thread,
void * (*start_routine)(void *),
void * arg);
int gtthread_join(gtthread_t thread, void **status);
void gtthread_exit(void * retval);
void gtthread_yield(void);
gtthread_t gtthread_self(void);
int gtthread_equal(gtthread_t t1, gtthread_t t2);
int gtthread_cancel(gtthread_t thread);
Next, you have to implement a mutex synchronization primitive to enable safe concurrent manipulation of shared data. The API for the mutex primitive again follows closely Pthread's:
int gtthread_mutex_init(gtthread_mutex_t * mutex);
int gtthread_mutex_lock(gtthread_mutex_t * mutex);
int gtthread_mutex_unlock(gtthread_mutex_t * mutex);
main(), counts as a thread just like any other. You can
get the gtthread_t for it with gtthread_self(), and it
can call or be the argument of any of the thread functions. If the
main thread gtthread_exit's, then the program should keep
going; if it return's from main() then it is
fine to exit the whole program.
Preemption can be achieved by using an alarm signal as a timer. You may use the system's virtual time instead of wall-clock time, see setitimer(2). Virtual time does not count CPU cycles spent on other processes and increments only when the process is running. The unix call setitimer() can be used to generate SIGVTALRM signals. You will need to establish a signal handler for the SIGVTALRM signal. Test your package with different scheduling periods to determine an appropriate value.
Two sets of calls for context switching are sigsetjmp/siglongjmp and
makecontext/getcontext/setcontext/swapcontext. We
suggest that you use sigsetjmp/siglongjmp since the makecontext etc. may be unsafe
in signals.
sigsetjmp() saves the current state
in a buffer and siglongjmp( ) then restores the state
that are stored in the buffer. When creating a thread,
first allocate memory to store the thread state. Then execute the sigsetjmp( ), and modify the program counter(PC) and
stack pointer(SP) to the values for the thread which is to be started.
If you choose to use the makecontext/setcontext/getcontext/swapcontext,
please report potential signal-safe problems of your scheduler if there's any.
Note that of these functions do not act like normal C functions; in fact,
that's the point of them. Manual pages for these Unix standard library routines are available on CoC Unix systems.
We also suggest that you familiarize yourself with the manual pages for
the Pthreads API functions corresponding to the ones you are asked to
implement. Manual pages for those are also available on CoC Unix systems.
Simply substitute the "gtthread_" prefix with a "pthread_" prefix, e.g.
"gtthread_create" becomes "pthread_create".
Work incrementally! First try to implement thread switching, before moving on to the nice library interface.
1) I am not sure I agree with the characterization that the library is going to be running at all. The way user level libraries work is by being part of a user application and scheduling the user threads within that process. So yes its a 1:n (or rahter a m:n where m is the number of processors).
2) Assume that such information is provided externally (an environment variable is best). Affinity for the processor can be assigned using different systems calls (for different os). For linux you would use sched_set_affinity() sched_get_affinity().
I'll put this up on the website also.3) You should know enough about both pieces of codes to not have done a code copy from an unauthorized source. The quality of the matrix multiply is not important to me but I wouldn't like it if someone copied code without giving due credit (and obtaining proper permission).
- Hide quoted text - On 8/29/06, Vishakha GuptaHi Hasan,
I had a few doubts:
1. I wanted to make sure I have understood the assignment right. So this library would be running as any other user program (would it be a 1:n thread library?). Its just that whenever my library gets scheduled, it will run the thread which was running when it got swapped out and if during this time quantum, the timer event happens, my SIGALRM handler will act as the thread scheduler and switch out the currently running thread and get another thread to run. Am I correct?
2. How can a user level scheduler get information about how many processors are there in the system and how would it actually make the threads run on those? Same goes in terms of multicore caches. Is there some system call which can help me do this? I can maybe try and make a m:n model for threads but still not be sure that the m processes that i fork to run the n threads will be scheduled on different processors. Could you please clarify this point.
3. Do we have to understand and code the different matrix multiplication algos (pardon my ignorance but I am not sure what SPMD is although it seems like single processor multiple data sort of thing) or can we download their code from net?