Autotuning

Rich Vuduc uses a familiar routine to introduce his students to the concept of parallel programming. On the first day of class, the assistant professor hands a pile of papers to a student. That student takes one and passes the bundle to the next person, and so on. The stack moves along until everyone in the room has a paper.

"It took a long time for those papers to make their way around the room," he tells them. "Now suppose I had split the stack in half and started at two different places. It would have gone twice as fast."

Within his lab in the College of Computing's Computational Science and Engineering division, Vuduc is helping to advance the science of parallel computing. The study of parallel computing is more than 40 years old, but until recently its use was largely restricted to a small number of programmers developing specialized applications.

For decades, computers relied on the sequential processing of single-core chips, but in recent years computer architects have discovered they were at or near the technology's performance limits. The emergence of multicore processors and the demand for more speed—which has never abated—are drawing new attention to parallel computing on a broader scale.

In parallel computing, a computational problem is broken up into smaller tasks, which are distributed among the several cores or microprocessors of an individual computer or network of computers. At the end, the products of these separate computations are combined in a single result.
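The split, distribute and combine pattern described above can be sketched in a few lines of Python. This is an illustrative example only, not code from Vuduc's lab: the function names `split_into_chunks` and `parallel_sum_of_squares` are hypothetical, and a thread pool from the standard library stands in for the cores of a real parallel machine.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_chunks):
    """Break the problem into roughly equal contiguous pieces."""
    chunk_size = max(1, -(-len(data) // num_chunks))  # ceiling division
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def sum_of_squares(chunk):
    """The smaller task that each worker performs on its own piece."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, num_workers=4):
    """Distribute the chunks among workers, then combine the partial results."""
    chunks = split_into_chunks(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    # Final step: merge the separate computations into a single result.
    return sum(partial_sums)
```

Calling `parallel_sum_of_squares(list(range(10)))` splits the ten numbers into four pieces, sums the squares of each piece separately and adds the partial sums together. In standard Python, threads will not actually speed up arithmetic like this; a process pool or a compiled language would be the usual choice for real workloads. The structure of the computation is the same either way.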

"[People are] hard-wired to do things sequentially, which makes learning how to program in parallel very hard," Vuduc says.

In addition, a program that runs well on a self-contained parallel system like a Macintosh laptop (which contains about 50 parallel processors) may not work well on a parallel configuration of several different kinds of computers, such as one might find at a supercomputing center, Vuduc says.

Autotuners mimic human programmers

At present, transitioning a sequential program to a parallel one requires an iterative, manual process called "tuning." Vuduc's research is aimed at automating the time-consuming procedure so that it becomes as easy to write efficient and portable parallel programs as it is to write sequential programs.

"The idea of autotuning is to develop technologies that let you write the program one way, and when you move the program to another machine, you let an autotuner figure out how to change the program to run on that machine as well," he explains. "The autotuner mimics what a human programmer does: It runs the program, finds the bottlenecks, makes changes and runs it again."
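The run, measure, change and rerun loop Vuduc describes can be illustrated with a deliberately simple sketch. It assumes far less than a real autotuner does: rather than finding bottlenecks and rewriting the program, it only tries a handful of values for one tunable setting and keeps the fastest. The names `blocked_sum` and `autotune`, and the choice of block size as the setting, are hypothetical and chosen only for illustration.

```python
import time


def blocked_sum(values, block_size):
    """The 'program' being tuned: sums the values one block at a time."""
    total = 0
    for start in range(0, len(values), block_size):
        total += sum(values[start:start + block_size])
    return total


def autotune(kernel, data, candidates, repeats=3):
    """Run the kernel with each candidate setting and keep the fastest one."""
    best_setting, best_time = None, float("inf")
    for setting in candidates:
        # Take the best of several runs to reduce timing noise.
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(data, setting)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_setting, best_time = setting, elapsed
    return best_setting, best_time
```

A call such as `autotune(blocked_sum, list(range(100_000)), [16, 256, 4096])` returns whichever block size ran fastest on the current machine, along with its time. The winner can differ from one computer to the next, which is the point: the program is written once, and the tuning step adapts it to the hardware it lands on.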

While this sounds straightforward, Vuduc says the design of anything approaching a universal autotuner is a long way off. For now, he and other researchers in the field are focusing on modifying programs used for high-end scientific computing in fields such as astronomy.

Besides modifying legacy programs, high-performance parallel computing will require new languages and models to analyze massive datasets. Vuduc and fellow CSE Assistant Professor Alex Gray are co-principal investigators in the development of a new programming model that enables rapid automatic implementation of customized parallel data analysis and data-mining tasks, all with minimal code writing.

Called the Tree-based High-Order Reduce (THOR) model, the project allows the user to specify certain mathematical parameters, from which THOR automatically produces scalable, autotuned, parallel programs.

These programs will provide for more sophisticated computation in areas that generate tremendous amounts of data, such as computer vision or voice recognition, and thus are well-suited to the power of parallel computing, Vuduc says.

"Climate modeling is another example," he says. "The interaction among all the things that we believe have some effect on climate is an incredibly complex process to simulate. So to understand how climate changes will develop over the next 50 years requires very detailed analysis of huge amounts of data."