Distributed QR decomposition framework for training Support Vector Machines
Jyotikrishna Dass, V. N. S. Prithvi Sakuru, Vivek Sarin and Rabi N. Mahapatra
Texas A&M University, Texas A&M University, Texas A&M University, Texas A&M University

Support Vector Machines (SVM) belong to class of supervised machine learning algorithms with applications in classification and regression analysis. SVM training is modeled as a convex optimization problem that is computationally tedious and has large memory requirements. Specifically, it is a quadratic programming problem which scales rapidly with the training set size rather than the dimensionality of the feature space. In this work, we first present a novel QR decomposition framework (QRSVM) to efficiently model and solve a large scale SVM problem by capitalizing on low-rank representations of the full kernel matrix rather than solving the problem as a sequence of smaller sub-problems. The low-rank structure of the kernel matrix is leveraged to transform the dense matrix into one with a sparse and separable structure. The modified SVM problem requires significantly lesser memory and computation. Our approach scales linearly with the training set size which makes it applicable to large datasets. This motivates towards our another contribution; exploring a distributed QRSVM framework to solve large-scale SVM classification problems in parallel across a cluster of computing nodes. We also derive an optimal step size for fast convergence of the dual ascent method which is used to solve the quadratic programming problem.