David A. Bader
IEEE Fellow
AAAS Fellow
Professor
College of Computing
Georgia Tech
Atlanta, GA 30332

Multithreaded Community Monitoring for Massive Streaming Graph Data

Analyzing static snapshots of massive, graph-structured data cannot keep pace with the growth of social networks, financial transactions, and other valuable data sources. Current state-of-the-art industrial methods analyze these streaming sources using only simple, aggregate metrics. There are few existing scalable algorithms for monitoring complex global quantities like decomposition into community structure. Using our framework STING, we present the first known parallel algorithm specifically for monitoring communities in this massive, streaming, graph-structured data. Our algorithm performs incremental re-agglomeration rather than starting from scratch after each batch of changes, reducing the problem's size to that of the change rather than the entire graph. We analyze our initial implementation's performance on multithreaded platforms for execution time and latency. On an Intel-based multithreaded platform, our algorithm handles up to 100 million updates per second on social networks with one to 30 million edges, providing a speed-up from 4x to 3700x over statically recomputing the decomposition after each batch of changes. Possibly because of our artificial graph generator, resulting communities' modularity varies little from the initial graph.
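
The incremental idea in the abstract (re-clustering only what a batch of changes touches) can be pictured with the simplified, sequential sketch below. It is an illustration under stated assumptions, not the STING implementation or the paper's parallel algorithm. It swaps in a greedy per-vertex placement based on the standard modularity gain for joining an isolated vertex to a community, and it assumes a simple, undirected, unweighted graph. The function name `update_communities`, its parameters, and the `("fresh", n)` labels are all hypothetical.

```python
from collections import defaultdict
from itertools import count

_fresh = count()  # source of new community labels (hypothetical helper)


def update_communities(adj, comm, comm_degree, insertions=(), deletions=()):
    """Apply one batch of edge changes, then re-place only the touched vertices.

    adj:         defaultdict(set), undirected adjacency (mutated in place)
    comm:        dict mapping vertex -> community label (mutated in place)
    comm_degree: defaultdict(int), community label -> sum of member degrees
    Returns the set of vertices whose placement was reconsidered.
    """
    touched = {x for edge in (*insertions, *deletions) for x in edge}

    # 1. Extract every touched vertex from its current community,
    #    using its degree from before the batch is applied.
    for u in touched:
        if u in comm:
            comm_degree[comm[u]] -= len(adj[u])
            del comm[u]

    # 2. Apply the batch to the graph itself.
    for u, v in insertions:
        adj[u].add(v)
        adj[v].add(u)
    for u, v in deletions:
        adj[u].discard(v)
        adj[v].discard(u)

    # Total edge count. Recomputed here for brevity; a real implementation
    # would maintain it incrementally so the work stays proportional to the batch.
    m = sum(len(nbrs) for nbrs in adj.values()) / 2

    # 3. Greedily re-place each extracted vertex.
    for u in sorted(touched):
        k_u = len(adj[u])
        links = defaultdict(int)  # edges from u into each neighboring community
        for v in adj[u]:
            if v in comm:
                links[comm[v]] += 1

        best, best_gain = None, 0.0
        for c, k_uc in links.items():
            # Modularity gain of merging the isolated vertex u into community c.
            gain = k_uc / m - k_u * comm_degree[c] / (2 * m * m)
            if gain > best_gain:
                best, best_gain = c, gain

        if best is None:  # no merge improves modularity: u stays alone
            best = ("fresh", next(_fresh))
        comm[u] = best
        comm_degree[best] += k_u

    return touched
```

Only the endpoints of changed edges are reconsidered, so the placement loop scales with the batch rather than with the whole graph, which is the property the abstract emphasizes.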

Publication History

Versions of this paper appeared as:
  1. J. Riedy and D.A. Bader, "Multithreaded Community Monitoring for Massive Streaming Graph Data," 7th Workshop on Multithreaded Architectures and Applications (MTAAP), Boston, MA, May 24, 2013.

Last updated: May 24, 2013

 



