CSE 6240: Web Search and Text Mining

Georgia Tech / Spring 2021

This course introduces the fundamentals of web mining. The topics broadly span network science, text analysis, recommender systems, and social media analysis, with emphasis on both theoretical and empirical aspects. Students will learn machine learning techniques and data mining tools suited to extracting insights from large-scale web-based datasets.

  • Instructor: Prof. Srijan Kumar
  • TAs: Sandeep Soni, Sejoon Oh, Kezhen Zhang
  • Office Hours:
    • Srijan Kumar: Thursdays 10am-11am
    • Sandeep Soni: Wednesdays 10am-11am
    • Sejoon Oh: Tuesdays 11am-noon
    • Kezhen Zhang: Mondays 1pm-2pm
  • Lectures: Mondays and Wednesdays, 2:00 pm - 3:15 pm
  • Piazza: Enroll here. Students should use Piazza for all course-related queries.

Schedule and Slides

The schedule is subject to change. Reading materials will be posted periodically below.

All deadlines in this course are at 11:59 PM Eastern Time (23:59 ET).

Date | Description | Readings and Notes | Events | Deadlines
Jan 18 | No class - MLK Holiday | | |
Jan 20 | Introduction | | |
Jan 25 | Web Networks and Properties | Graph structure in the Web | | Project Teams due
Jan 27 | Random Graph Models | 1. Small world phenomenon; 2. Collective dynamics of ‘small-world’ networks | |
Feb 1 | Link Analysis (PageRank and HITS) | 1. Book chapter 'Link Analysis' from 'Introduction to Information Retrieval'; 2. The PageRank Citation Ranking: Bringing Order to the Web; 3. Authoritative Sources in a Hyperlinked Environment | |
Feb 3 | Personalized PageRank and Recommendations | 1. Random walk with restart: fast solutions and applications; 2. Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time | |
Feb 8 | Cascades, Contagion, and Epidemics | 1. Epidemiological Modeling of News and Rumors on Twitter | HW1 out | Project Proposal due
Feb 10 | Node Representation Learning | 1. DeepWalk: Online Learning of Social Representations; 2. node2vec: Scalable Feature Learning for Networks | |
Feb 15 | Graph Neural Networks | 1. Semi-Supervised Classification with Graph Convolutional Networks; 2. Blog post; 3. Inductive Representation Learning on Large Graphs | |
Feb 17 | Message Passing and Node Classification | 1. REV2: Fraudulent User Prediction in Rating Platforms | | HW1 due
Feb 22 | Belief Propagation and Applications | 1. NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks | HW2 out |
Feb 24 | Graph and Knowledge Graph Representation Learning | 1. Anonymous Walk Embedding; 2. Translating Embeddings for Modeling Multi-relational Data; 3. Learning entity and relation embeddings for knowledge graph completion | |
Mar 1 | Discrete-time Temporal Graph Representation Learning | 1. EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs; 2. DynGEM: Deep Embedding Method for Dynamic Graphs | |
Mar 3 | Continuous-time Temporal Graph Representation Learning | 1. DyRep: Learning Representations over Dynamic Graphs | | HW2 due
Mar 8 | Project Discussions and Feedback | | |
Mar 10 | Recommender Systems | 1. MMDS book chapter | |
Mar 15 | Recommender Systems II | 1. MMDS book chapter | |
Mar 17 | Deep Learning-Based Recommender Systems | 1. Deep Learning Based Recommender System: A Survey and New Perspectives; 2. Recurrent Recommender Networks; 3. Latent Cross: Making Use of Context in Recurrent Recommender Systems | |
Mar 22 | Deep Learning-Based Recommender Systems II | 1. Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks | HW3 out (moved) | Project Milestone due (moved)
Mar 24 | Spring Break - No class | | |
Mar 29 | Self reading - No class | | |
Mar 31 | Take home exam | | |
Apr 5 | Web Search and Information Retrieval | | HW4 out (cancelled) |
Apr 7 | Information Retrieval (continued) | | |
Apr 12 | Project team 1:1 meetings with instructor | | | HW3 due (moved)
Apr 14 | Project team 1:1 meetings with instructor | | |
Apr 19 | Advanced topics | | |
Apr 21 | Guest Lecture: Dr. Khalifeh Al Jadda, Home Depot | | |
Apr 26 | Project Final Reports are due | | |
Apr 28 | Project Presentations are due | | |
Apr 30 | Peer grades are due | | |

Project