Professor Alex Aiken delivered a School of Computer Science Distinguished Lecture on Wednesday, Jan. 31. His talk on programming models for large-scale parallel machines, "Legion: Programming Heterogeneous, Distributed Parallel Machines," packed the auditorium in the Klaus Advanced Computing Building.
Aiken is the chair of computer science at Stanford and an ACM fellow. His research focuses on programming languages. The talk highlighted a new parallel programming system, known as Legion, which works on heterogeneous, hierarchical, and distributed parallel machines.
Legion is aimed at today’s supercomputers, which are highly heterogeneous. A single machine can include central processing units (CPUs), graphics processing units (GPUs), field-programmable gate arrays, and more, all with different memory sizes and speeds. Yet because latency, the delay between an instruction to transfer data and the transfer itself, is fixed in these systems, computation often outpaces the ability to move data around. This can make it difficult to program these high-end machines effectively.
The primary goals in programming the current generation of supercomputers include:
- High performance
- Performance portability
Although these goals can be met today, they come at a great cost. It can take a decade of software investment to develop one application, yet the machines change every three to four years. Currently, it is the programmer’s responsibility to keep up with what can often be tedious and time-consuming work.
“The real problem is not describing the parallelism; the problem is really the data and the movement of the data within these machines,” Aiken said.
The goal of Legion is to shift some of this responsibility onto the programming system. When the organization of a program’s data can be matched easily to a machine’s memory, those performance goals become easier to meet.
Legion is a programming model and runtime system that describes hierarchical organizations for data and computation at an abstract level. A separate mapping interface lets programmers control how data and computation are placed onto the memories and processors of a machine.
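The separation described above can be illustrated with a short sketch. It is written in plain Python and does not use Legion's actual API; the names `Region`, `partition`, `round_robin_mapper`, `sum_task`, and the memory labels are hypothetical and exist only for illustration. The idea it tries to show is that the computation names the piece of data it needs, while a separate policy decides where that piece is placed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A named, contiguous slice of a larger data collection (hypothetical type)."""
    name: str
    start: int
    stop: int


def partition(n_items, n_parts):
    """Split the index range [0, n_items) into n_parts nearly equal regions."""
    base, extra = divmod(n_items, n_parts)
    regions, start = [], 0
    for i in range(n_parts):
        stop = start + base + (1 if i < extra else 0)
        regions.append(Region(f"r{i}", start, stop))
        start = stop
    return regions


def round_robin_mapper(regions, memories):
    """Placement policy, kept separate from the computation itself."""
    return {r.name: memories[i % len(memories)] for i, r in enumerate(regions)}


def sum_task(data, region):
    """The computation: it names the region it needs, not where that region lives."""
    return sum(data[region.start:region.stop])


data = list(range(10))
regions = partition(len(data), 4)

# Assumed memory labels; swapping this list or the mapper changes placement only.
placement = round_robin_mapper(regions, ["gpu0_mem", "gpu1_mem"])

partial_sums = [sum_task(data, r) for r in regions]
total = sum(partial_sums)
```

In this toy version, moving to a different machine would mean changing only the mapper or the list of memories, while `sum_task` stays the same. That is the kind of portability the talk describes, shown here in a much simplified form.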
The results are promising. According to Aiken, Legion is competitive on a single graph, runs 20 times faster on GPUs, and is much faster than CPU systems.
“You get a win on every single network we’ve tried just by doing partitioning and networking in a different way than is currently represented on other systems,” Aiken said.
The new programming model is already being used in the fields of chemistry and physics to model chemical reactions.
Ultimately, Aiken presents a new way of looking at programming. If a programmer is willing to put in the effort to organize data up front, it will improve efficiency overall. “It’s all about program productivity. How much effort are you willing to expend to get the desired level of performance?” Aiken said.