Yang Zhou

School of Computer Science
College of Computing
Georgia Institute of Technology

3012, Klaus Advanced Computing Building
266 Ferst Drive, Atlanta, GA 30332-0765

Email: yzhou86 At gatech Dot edu


I am currently a Ph.D. student in the College of Computing at the Georgia Institute of Technology, under the supervision of Prof. Ling Liu. My research interests lie in the areas of big data algorithms and systems, including data mining, machine learning, parallel systems, distributed systems, databases, cloud computing, and bioinformatics.

Research Summary
My research develops solutions that span algorithms, systems, and applications: (1) big data mining and learning algorithms; (2) big data processing systems; (3) domain-driven knowledge discovery frameworks; and (4) cloud-based big data management and analytics systems.

My research contributions have been published in top venues in data mining (SIGKDD, ICDM, TKDD, DMKD), database systems (VLDB), high performance computing (HPDC, SC), networking (JSAC), and software engineering (ISSTA).

Selected Publications (Full List by Topic) (DBLP) (Google Scholar) (ResearchGate)
  • Yang Zhou, Ling Liu, Sangeetha Seshadri, and Lawrence Chiu. Analyzing Enterprise Storage Workloads with Graph Modeling and Clustering. JSAC 2016. [pdf]

  • Yang Zhou, Ling Liu, Kisung Lee, and Qi Zhang. GraphTwist: Fast Iterative Graph Computation with Two-tier Optimizations. VLDB 2015. [pdf]

  • Yang Zhou, Ling Liu, and David Buttler. Integrating Vertex-centric Clustering with Edge-centric Clustering for Meta Path Graph Analysis. SIGKDD 2015. [pdf]

  • Yang Zhou and Ling Liu. Social Influence Based Clustering and Optimization over Heterogeneous Information Networks. TKDD 2015. [pdf]

  • Yang Zhou, Ling Liu, Kisung Lee, Calton Pu, and Qi Zhang. Fast Iterative Graph Computation with Resource Aware Graph Parallel Abstractions. HPDC 2015 (one of only 19 full papers). [pdf]

  • Yang Zhou and Ling Liu. Activity-edge Centric Multi-label Classification for Mining Heterogeneous Information Networks. SIGKDD 2014. [pdf] [slides]

  • Yang Zhou and Ling Liu. Social Influence Based Clustering of Heterogeneous Information Networks. SIGKDD 2013. [pdf] [slides]

  • Yang Zhou and Ling Liu. Clustering Analysis in Large Graphs with Rich Attributes. In Dawn E. Holmes and Lakhmi C. Jain (Eds.), Data Mining: Foundations and Intelligent Paradigms, Volume 1: Clustering, Association and Classification, 2011. [link]

  • Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. Clustering Large Attributed Graphs: An Efficient Incremental Approach. ICDM 2010. [pdf] [slides]

  • Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. Graph Clustering Based on Structural/Attribute Similarities. VLDB 2009. [pdf] [slides]

Research Projects
  • GraphLens derives new insights into storage strategy planning and data placement guidance for enterprise storage systems.
  • GraphMap is a distributed graph processing system that maximizes access locality to speed up distributed graph computations.
  • VEPathCluster is a meta path graph clustering algorithm that tightly integrates vertex clustering and edge clustering so that each enhances the other.
  • GraphTwist is a two-tier parallel graph processing system that accelerates graph computations on a shared-memory multiprocessor computer.
  • AEClass is an algorithm for activity-edge centric multi-label classification of heterogeneous multigraphs with structure affinity and label vicinity.
  • SI-Cluster is a social influence based clustering algorithm for analyzing heterogeneous information networks with both self-influence and co-influence.
  • SPA is a customizable RDF data partitioning framework that supports efficient distributed SPARQL query processing of big RDF graph data in the cloud.
  • ServiceRank is an intelligent tool to facilitate cloud service discovery and acquisition during server configuration.
  • Top-K LEAP helps developers discover software bugs with their location and context information through discriminative graph mining.

Awards
  • KDD Rising Star, 2016.
  • Ram Kumar Fellowship, 2015.
  • VLDB Intel Free Registration, 2015.
  • SIGKDD Student Travel Award, 2015.
  • Georgia Tech Graduate Student Travel Award, 2014.
  • Winner of Invention Day Patent Marathon, IBM Almaden Research Center, 2013.
  • First Patent Application Invention Achievement Award, IBM, 2013.
  • ICWS Student Travel Award, 2013.
  • ICDM Student Travel Award, 2010.

Patents
  • Interactive Acquisition of Remote Services, US Patent: 20,140,337,010.
  • Complex Service Network Ranking and Clustering, Filed: YOR820120950.

Professional Services
  • PC member: ICDM 2017.
  • Invited reviewer: Computational Intelligence 2016, DMKD 2016, GigaScience 2017, JCS 2016, JPDC 2016, 2017, Machine Learning 2015, PLOS ONE 2017, TDSC 2014, 2015, 2016, TKDD 2015, 2016, 2017, TNSE 2017, TOIT 2016, TSC 2015, 2017, TWEB 2015, 2016, WWWJ 2016.

Teaching
  • Guest Lecturer, Large-scale Parallel and Distributed Graph Processing Frameworks, CS8803 Big Data Systems and Analytics, October 2015.
  • Guest Lecturer, Clustering Analysis in Heterogeneous Information Networks, CS4440 Emerging Database Technologies, April 2015.
  • Teaching Assistant, CS4675/CS6675 Advanced Internet Computing and Application Development, Spring 2014.
  • Teaching Assistant, CS4365/CS6365 Introduction to Enterprise Computing, Spring 2012.
  • Teaching Assistant, CS4400 Introduction to Database Systems, Fall 2011.
  • Teaching Assistant, CS4400/CS8803 Introduction to Database Systems, Spring 2011, Summer 2016.