David A. Bader
IEEE Fellow
AAAS Fellow
Professor
College of Computing
Georgia Tech
Atlanta, GA 30332


 
 

 

Large Scale Complex Network Analysis using the Hybrid Combination of a MapReduce cluster and a Highly Multithreaded System

Complex networks capture interactions among entities in various application areas in a graph representation. Analyzing large-scale complex networks often answers important questions (for example, estimating the spread of epidemic diseases), but it also poses computational challenges, mainly because of the large volumes of data and the irregular structure of the graphs. In this paper, we aim to solve one such challenge: finding relationships in a subgraph extracted from the data. We solve this problem on three different platforms: a MapReduce cluster, a highly multithreaded system, and a hybrid system of the two. The MapReduce cluster and the highly multithreaded system each reveal limitations in solving this problem efficiently, whereas the hybrid system exploits the strengths of both in a synergistic way and solves the problem at hand. In particular, once the subgraph is extracted and loaded into memory, the hybrid system analyzes the subgraph five orders of magnitude faster than the MapReduce cluster.
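
The two-stage pattern described in the abstract (extract a subgraph, then analyze it in memory) can be illustrated with a small single-machine sketch. It is not the paper's implementation: the function names `extract_subgraph` and `hop_distance` are hypothetical, the "relationship" between two vertices is assumed here to be the breadth-first-search hop distance, and the toy edge list is made up for illustration. In the hybrid system the abstract describes, the extraction stage would run on the MapReduce cluster and the analysis stage on the highly multithreaded system.

```python
from collections import deque


def extract_subgraph(edges, keep):
    """Stage 1 (filter): keep only edges whose endpoints are both in `keep`.

    Returns an undirected adjacency map restricted to the kept vertices.
    """
    adj = {v: set() for v in keep}
    for u, v in edges:
        if u in keep and v in keep:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def hop_distance(adj, src, dst):
    """Stage 2 (in-memory analysis): BFS hop count from src to dst.

    Returns None if either vertex is absent or dst is unreachable.
    """
    if src not in adj or dst not in adj:
        return None
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None


# Toy data (assumed for illustration only).
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5), (5, 6)]
sub = extract_subgraph(edges, keep={1, 2, 3, 4})
print(hop_distance(sub, 1, 4))
```

In this toy example, vertex 5 is excluded from the subgraph, so the only remaining route from 1 to 4 runs through 2 and 3. The answer depends on which subgraph was extracted, which is why the extraction and analysis stages are kept separate.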

Publication History

Versions of this paper appeared as:
  1. S. Kang and D.A. Bader, "Large Scale Complex Network Analysis using the Hybrid Combination of a MapReduce cluster and a Highly Multithreaded System," 4th Workshop on Multithreaded Architectures and Applications (MTAAP), Atlanta, GA, April 23, 2010.



 
 

Last updated: February 21, 2010

 



