New Methods of Reordering Sparse Tensors Show Measurable Increases in Speed
A team of researchers, which includes several School of Computational Science and Engineering (CSE) faculty, has created two tensor reordering methods, BFS-MCS and Lexi-Order, that are proven to be more effective for storing and operating efficiently with sparse tensors than current state-of-the-art approaches.
On modern multicore central processing units, the reordering algorithm Lexi-Order obtains up to 4.14 times speedup on sequential HiCOO-MTTKRP – a sparse tensor format with one of the most computationally expensive kernels in sparse tensor computations – and 11.88 times speedup on its parallel counterpart.
For those new to tensors, a tensor is an algebraic object related to space that provides a natural and concise mathematical framework for formulating and solving problems in computational science. These objects are represented in the form of cubes, with the X and Y axis depths ranging according to the size of data entries they can hold.
Tensors are fed information in the form of numeric data from a variety of applications but, oftentimes, these entries mainly consist of zeroes. In these instances, the tensors are considered sparse and do not need to be stored or explicitly computed on.
The primary investigator of this research, Jiajia Li, is a staff scientist at the High Performance Computing (HPC) group of Pacific Northwest National Laboratory (PNNL) and recent doctoral graduate of CSE. Li has led research on sparse tensors in the past, including Supercomputing 2018 Best Student Paper Award winning work, HiCOO, which this new work builds upon.
“A sparse tensor is often a natural way to represent a multifactor or multi-relational dataset, and has found numerous applications in data analysis and mining for healthcare, natural language processing, machine learning, and social network analytics, among many others,” she said.
“This paper formalizes the problem of reordering a sparse tensor to improve the spatial and temporal locality of operations within it, and proposes two reordering algorithms for this problem.”
According to Li, there are a variety of data structures and techniques for storing and operating efficiently with sparse tensors. One technique is to reorder the tensor, which means relabeling the indices – and, therefore, reorganizing the nonzero structure of the tensor.
One form of reordering explored in prior work is to sort the input tensor which can improve locality, but simultaneously aggravates load balance.
She said, “We observe that reordering increases imbalance by as much as 6.7 times in practice. In this paper, we propose improvements to the data structure and corresponding algorithms that address this problem. A new partition-based superblock scheduling is also proposed for the HiCOO format to improve load balance.”
The team of investigators for this paper include Li, PNNL Team Lead Scientist Kevin Barker, French National Center for Scientific Research Scientist Bora Ucar, and CSE Professors Umit Catalyurek, Jimeng Sun, and Rich Vuduc.
The code is released as part of Parallel Tensor Infrastructure (ParTI!) and can be found here.