Graduate Research Assistantship positions for Ph.D. students
are available in the areas of parallel and multicore algorithms,
high-performance computing, computational science and engineering,
large-scale optimization problems, and the application areas of
computational biology and genomics. Several current projects are
described below. Additional new projects are anticipated by the next
Fall semester. Each research assistant will receive a competitive
stipend plus paid tuition.
Applicants should complete an official application for graduate
studies in either the "Computational Science and Engineering (College of Computing)" or the "Computer Science" graduate programs at Georgia Tech,
and select the Computational Science and Engineering track. APPLICATION DEADLINE: December 15.
The Georgia Tech graduate application is available
online at http://www.gradadmiss.gatech.edu/
On Page 1, question 14 (Program of Study), of the online application, click on "Search for Degree and Major", click on "Ph.D", and select "Computational Science and Engineering (College of Computing)" for Graduate Major, and select "GT-Atlanta" for the planned campus.
On Page 4 (Georgia Tech computer science application), question 1, select "High Performance Computing" as your first-choice area of interest.
In your statement on Page 4, please include this sentence:
"I wish to be considered for a Graduate Research Assistantship
under the direction of Professor David A. Bader."
Please email Prof. David A. Bader with your first and last name
once you have submitted your online application and received an
Order ID.
A team led by NVIDIA has been awarded a research grant of $25 million by the Defense Advanced Research Projects Agency (DARPA), the U.S. Defense Department's research and development arm, to address what the agency calls a "crisis in computing."
The four-year research contract, awarded under DARPA's Ubiquitous High Performance Computing (UHPC) program, covers work to develop GPU technologies required to build the new class of exascale supercomputers, which will be 1,000 times more powerful than today's fastest supercomputers. The team -- which also includes Cray Inc., Georgia Institute of Technology, Oak Ridge National Laboratory and five other top U.S. universities -- is being funded by DARPA to address the challenge that conventional computing architectures are reaching the practical limits of energy usage and will not meet the challenges of exascale computing. The research team plans to develop new software and hardware technology to dramatically increase computing performance, programmability and reliability.
PROJECT CHASM: Challenge Applications and Scalable Metrics (CHASM) for Ubiquitous High Performance Computing
Advanced computing is the backbone of the Department of Defense and of critical strategic importance to our nation's defense. All DoD sensors, platforms and missions depend heavily on computer systems. To meet the escalating demands for greater processing performance, it is imperative that future computer system designs be developed to support new generations of advanced DoD systems and enable new computing application code. Targeting this crucial need, the Defense Advanced Research Projects Agency (DARPA) has initiated the Ubiquitous High Performance Computing (UHPC) program to create an innovative, revolutionary new generation of computing systems that overcomes the limitations of the current evolutionary approach. Georgia Institute of Technology was selected to lead CHASM, an Applications, Benchmarks and Metrics team, for evaluating the UHPC systems under development.
PROJECT STING: Graph Analytics for Streaming Data on Emerging Platforms
The growth of graph-structured data sets is rapidly outpacing analysis
tools. Social networks like Facebook are growing quickly, adding an
average of 17 million users per month over the past year to a present
total of 300 million users with 45 million messages posted per day.
Communication systems like Twitter add 25 million messages per day
with rich context linking messages, users, and topics. Even such
“sedate” topics as protein analysis generate millions of updates per
year. Each of these graphs already stresses analysis tools for static,
unchanging graphs; simply repeating static analysis is insufficient
for current graph data. We are developing tools to analyze streaming,
dynamic graph data. These tools require adapting static analysis
algorithms and developing new dynamic algorithms. To implement these
algorithms efficiently, we are evaluating data structures and
programming techniques in emerging development platforms like X10 and
on new multithreaded hardware.
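To illustrate the difference between repeating a static analysis and updating a result as the graph changes, here is a minimal Python sketch. It is not the project's code: the class name, method names, and the choice of triangle counting as the analysis are assumptions made for illustration only. The streaming counter adjusts its total by looking only at the two endpoints of each changed edge, while the static function recounts the whole graph.

```python
from collections import defaultdict


class StreamingTriangleCounter:
    """Maintain the triangle count of an undirected graph under edge updates.

    Hypothetical illustration; vertices are assumed to be hashable labels.
    """

    def __init__(self):
        self.adj = defaultdict(set)  # vertex -> set of neighbours
        self.triangles = 0

    def insert_edge(self, u, v):
        # Ignore self-loops and edges that are already present.
        if u == v or v in self.adj[u]:
            return
        # Every common neighbour of u and v closes exactly one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete_edge(self, u, v):
        # Ignore self-loops and edges that are not present.
        if u == v or v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbour of u and v loses exactly one triangle.
        self.triangles -= len(self.adj[u] & self.adj[v])


def count_triangles_static(edges):
    """Recount from scratch: the full pass a streaming update avoids.

    Vertices are assumed to be mutually comparable (e.g. integers).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    total = 0
    for u in list(adj):
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                total += len(adj[u] & adj[v])
    # Each triangle was counted once from each of its three edges.
    return total // 3
```

Each update costs one set intersection over the two endpoint neighbourhoods, whereas the static recount touches every edge, which is the gap that widens as the graph and its update rate grow.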
PROJECT GTFOLD: Combinatorial and Computational Methods for the Analysis, Prediction, and Design of Viral RNA Structures
(Funded by NIH)
The Human Genome Project and related efforts have generated enormous
amounts of raw biological sequence data. However, understanding how
biological sequences encode structural and functional information
remains a fundamental scientific challenge. In particular,
understanding and manipulating the base pairing, or secondary
structure, of single-stranded RNA sequences is crucial to advancing
knowledge about diseases caused by RNA viruses. The prediction of the
correct secondary structures of large RNAs is one of the unsolved
challenges of computational molecular biology. We are developing and
extending a new scalable, parallel multicore RNA structure
prediction program called GTfold. GTfold is one to two orders of
magnitude faster than the de facto standard programs and achieves
comparable accuracy of prediction. GTfold now optimally folds 11
picornaviral RNA sequences ranging from 7100 to 8200 nucleotides in 8
minutes, compared with the two months required by a previous study.
With the paradigm shift to multicore chips and parallelism, we must
extend and optimize GTfold to continue gaining performance with each
new generation of systems.
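For readers new to secondary structure prediction, the following Python sketch shows the general shape of the dynamic programming involved. It is a deliberately simplified stand-in, not GTfold's method: it uses the classic Nussinov base-pair maximization recurrence rather than a thermodynamic energy model, and the function name, the allowed pair set, and the `min_loop` parameter are assumptions chosen for illustration.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Simplified illustration: counts pairs instead of minimizing free energy.
    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the best score for the subsequence seq[i..j] inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        # Cells with the same span depend only on shorter spans, so they
        # could be filled in parallel; this sketch fills them serially.
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: base j is left unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:  # case: base j pairs with k
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The table has on the order of n^2 cells and each cell scans up to n split points, so the work grows cubically with sequence length, which is why sequences of several thousand nucleotides benefit from parallel execution.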
PROJECT PETA-APPS: Petascale Simulation for Understanding Whole-Genome Evolution
(Funded by NSF PetaApps program)
The advent of high-throughput sequencing and the consequent reduction
in cost of sequencing have produced an explosion in the amount of
genomic data of all types. Making biological sense of this genomic
data requires high-performance computing methods and an evolutionary
perspective, whether you are trying to understand how new functional
genes arise, why genes are organized into chromosomes, how species are
connected through the Tree of Life, or why arrangements are subject to
change. We have developed GRAPPA over many years to be the most
accurate method for genome rearrangement analysis. GRAPPA is a
massively parallel, state-of-the-art, freely available, open-source
phylogeny reconstruction code that reconstructs evolutionary histories
from thousands of organelle genomes. To tackle the growing scale of
available data, GRAPPA is being extended with new petascale algorithms
to scale to million-way parallelism and handle multi-chromosome
nuclear genomes. Developing and deploying GRAPPA for petascale data
is an exciting opportunity for algorithm development with real-world
impact. (See also the CSE feature.)
PROJECT GALAXY: Dynamically Scaling Parallel Execution for Cloud-based Bioinformatics
(Funded by NIH)
Increasingly inexpensive high-throughput DNA sequencing holds great
promise for biomedical research, but delivering upon this promise is
challenging. Biomedical researchers are not experts in the complex
computational platforms necessary to tackle the volumes of data. We
address these problems by bringing together Galaxy, a system for
making complex computational analysis accessible and reproducible,
with “cloud computing”, an infrastructure model where computing
resources are purchased on demand as needed, making it possible for
investigators with no informatics expertise to perform data-intensive
analysis using cloud resources. The Galaxy tool model and execution
engine need to be extended to support dynamically scaled parallel execution
available in cloud resources. We are defining abstractions and
reusable components to ease integrating existing and future
tools. The landscape of analysis tools for next-generation sequencing (NGS) data is
changing rapidly along with the cloud resources available, so these
components must adapt quickly as new tools and best practices emerge.
PROJECT DOSA: Design Optimization Frameworks for High-Productivity Computing
(Funded by NSF)
High-performance computing (HPC) systems are taking a revolutionary
step forward with complex architectural designs that require
application programmers and compiler writers to perform the
challenging task of optimizing the computation in order to achieve
high performance. Realizing the gap between processor and memory
performance, several leading HPC vendors plan to incorporate into
their next-generation systems innovative architectural features that
alleviate this memory wall. These new architectural features include
hardware accelerators (e.g., reconfigurable logic such as FPGAs,
SIMD/vector processing units such as in IBM Cell, and graphics
processing units (GPUs)), adaptable general-purpose processors,
run-time performance advisors, capabilities for processing in the
memory subsystem, and power optimizations. With these innovations, the
multidimensional design space for optimizing applications is
huge. Software must be sensitive to data layout, cache parameters, and
data reuse, as well as dynamically changing resources, for highest
performance. Our research goal is to design a dynamic application
composition system that both provides a framework for optimizing
computational science and engineering applications and their
high-performance computing technologies and increases productivity.
PROJECT BURTON: Research Infrastructure for Multithreaded Computing Platforms
(Funded by NSF)
Computer scientists have long debated the merits of message-passing
versus shared-memory architectures for parallel systems. Message
passing with MPI on commodity (e.g., Linux) clusters dominates
high-performance computing today and has a strong infrastructure to
support development and research. The trend towards multicore
processors changes the situation. The major processor developers all
envision placing tens to hundreds of cores on a single die, each
running multiple threads. To take advantage of this, the CS community
must focus on how to develop efficient multithreaded programs in a
globally addressable memory space. Multithreaded computing needs to
quickly grow a support infrastructure comparable to that of MPI. As part of a
community of diverse groups of researchers with extensive experience
with shared-memory multithreading, we are developing the shared
infrastructure needed for multicore, multithreaded research and
development.
Future and ongoing interests
High-performance computing on manycore and multicore architectures
Rendering currently intractable problems feasible for researchers in bioinformatics, genomics, and other scientific areas through parallelism and advanced algorithms
Exploring trade-offs in performance, energy efficiency, and productivity in heterogeneous system architectures
Processing massive volumes of streaming data to provide low-latency analytic results