Yang Zhou

School of Computer Science
College of Computing
Georgia Institute of Technology

3012, Klaus Advanced Computing Building
266 Ferst Drive, Atlanta, GA 30332-0765

Email: yzhou86 At gatech Dot edu


I am currently a Ph.D. student in the College of Computing at the Georgia Institute of Technology, under the supervision of Prof. Ling Liu. My research interests lie in the areas of big data algorithms and systems, including data mining, machine learning, parallel systems, distributed systems, databases, cloud computing, and bioinformatics.

Research Summary
My research develops solutions that span algorithms, systems, and applications: (1) big data mining and learning algorithms; (2) big data processing systems; (3) domain-driven knowledge discovery frameworks; and (4) cloud-based big data management and analytics systems.

My research has been published in top venues in data mining (SIGKDD, ICDM, TKDD, DMKD), database systems (VLDB), high performance computing (HPDC, SC), networking (JSAC), and software engineering (ISSTA).

Selected Publications (Full List by Topic) (DBLP) (Google Scholar) (ResearchGate)
  • Yang Zhou, Ling Liu, Sangeetha Seshadri, and Lawrence Chiu. Analyzing Enterprise Storage Workloads with Graph Modeling and Clustering. JSAC 2016. [pdf]

  • Yang Zhou, Ling Liu, Kisung Lee, and Qi Zhang. GraphTwist: Fast Iterative Graph Computation with Two-tier Optimizations. VLDB 2015. [pdf]

  • Yang Zhou, Ling Liu, and David Buttler. Integrating Vertex-centric Clustering with Edge-centric Clustering for Meta Path Graph Analysis. SIGKDD 2015. [pdf]

  • Yang Zhou and Ling Liu. Social Influence Based Clustering and Optimization over Heterogeneous Information Networks. TKDD 2015. [pdf]

  • Yang Zhou, Ling Liu, Kisung Lee, Calton Pu, and Qi Zhang. Fast Iterative Graph Computation with Resource Aware Graph Parallel Abstractions. HPDC 2015 (one of only 19 full papers). [pdf]

  • Yang Zhou and Ling Liu. Activity-edge Centric Multi-label Classification for Mining Heterogeneous Information Networks. SIGKDD 2014. [pdf] [slides]

  • Yang Zhou and Ling Liu. Social Influence Based Clustering of Heterogeneous Information Networks. SIGKDD 2013. [pdf] [slides]

  • Yang Zhou and Ling Liu. Clustering Analysis in Large Graphs with Rich Attributes. Dawn E. Holmes and Lakhmi C. Jain (Eds.), Data Mining: Foundations and Intelligent Paradigms: Volume 1: Clustering, Association and Classification, 2011. [link]

  • Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. Clustering Large Attributed Graphs: An Efficient Incremental Approach. ICDM 2010. [pdf] [slides]

  • Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. Graph Clustering Based on Structural/Attribute Similarities. VLDB 2009. [pdf] [slides]

Research Projects
  • GraphLens derives new insights into storage strategy planning and data placement guidance for enterprise storage systems.
  • GraphMap is a distributed graph processing system that maximizes access locality and speeds up distributed graph computations.
  • VEPathCluster is a meta path graph clustering algorithm that tightly integrates vertex clustering and edge clustering so that each enhances the other.
  • GraphTwist is a two-tier parallel graph processing system that accelerates graph computations on a shared-memory multiprocessor computer.
  • AEClass is an algorithm for activity-edge centric multi-label classification of heterogeneous multigraphs with structure affinity and label vicinity.
  • SI-Cluster is a social influence based clustering algorithm for analyzing heterogeneous information networks with both self-influence and co-influence.
  • SPA is a customizable RDF data partitioning framework that supports efficient distributed SPARQL query processing of big RDF graph data in the cloud.
  • ServiceRank is an intelligent tool to facilitate cloud service discovery and acquisition during server configuration.
  • Top-K LEAP helps developers discover software bugs with their location and context information through discriminative graph mining.

Awards
  • KDD Rising Star, 2016.
  • Ram Kumar Fellowship, 2015.
  • VLDB Intel Free Registration, 2015.
  • SIGKDD Student Travel Award, 2015.
  • Georgia Tech Graduate Student Travel Award, 2014.
  • Winner of Invention Day Patent Marathon, IBM Almaden Research Center, 2013.
  • First Patent Application Invention Achievement Award, IBM, 2013.
  • ICWS Student Travel Award, 2013.
  • ICDM Student Travel Award, 2010.

Patents
  • Interactive Acquisition of Remote Services, US Patent: 20,140,337,010.
  • Complex Service Network Ranking and Clustering, Filed: YOR820120950.

Professional Services
  • Invited reviewer for ACM Transactions on Internet Technology (TOIT).
  • Invited reviewer for ACM Transactions on Knowledge Discovery from Data (TKDD).
  • Invited reviewer for ACM Transactions on the Web (TWEB).
  • Invited reviewer for Computational Intelligence.
  • Invited reviewer for Data Mining and Knowledge Discovery (DMKD).
  • Invited reviewer for IEEE Transactions on Dependable and Secure Computing (TDSC).
  • Invited reviewer for IEEE Transactions on Services Computing (TSC).
  • Invited reviewer for Journal of Computational Science (JCS).
  • Invited reviewer for Journal of Parallel and Distributed Computing (JPDC).
  • Invited reviewer for Machine Learning.
  • Invited reviewer for World Wide Web Journal (WWWJ).

Teaching
  • Guest Lecturer, Large-scale Parallel and Distributed Graph Processing Frameworks, CS8803 Big Data Systems and Analytics, October 2015.
  • Guest Lecturer, Clustering Analysis in Heterogeneous Information Networks, CS4440 Emerging Database Technologies, April 2015.
  • Teaching Assistant, CS4675/CS6675 Advanced Internet Computing and Application Development, Spring 2014.
  • Teaching Assistant, CS4365/CS6365 Introduction to Enterprise Computing, Spring 2012.
  • Teaching Assistant, CS4400 Introduction to Database Systems, Fall 2011.
  • Teaching Assistant, CS4400/CS8803 Introduction to Database Systems, Spring 2011.