Machine-Learning Based Performance Estimation for Distributed Parallel Applications in Virtualized Heterogeneous Clusters
Seontae Kim, Nguyen Pham, Woongki Baek and Young-Ri Choi

In a virtualized heterogeneous cluster, a distributed parallel application that runs concurrently in multiple virtual machines (VMs) can have its VMs placed in a huge number of possible ways. This paper investigates a performance estimation technique for distributed parallel applications in such clusters. We first analyze the effects of different VM configurations on the performance of various distributed parallel applications. We then present a machine-learning based performance model for a distributed parallel application. Using a heterogeneous cluster with two different types of nodes, we show that our machine-learning based models can estimate the runtimes of distributed parallel applications with modest error rates.