Web Search and Text Mining

CS 8803 WST: Web Search and Text Mining

Fall 2007


Lecture: 3 hours, TR 3:05pm - 4:25pm, Howey (Physics) L3

Office hours: T 1:30pm - 2:30pm.

Instructor: Hongyuan Zha, Office: 1314 KACB, Phone: 404-385-1491


Course Description

We all have experience using search engines: typing a query into the search box and browsing the results to narrow down to the documents we need. In fact, Web search is the second most popular online activity, behind only email use, and it drives the business of some of the most successful Web companies, such as Google and Yahoo!. Commercial search engines are complicated engineering systems, and one purpose of this course is to take you behind the scenes to explore the scientific and algorithmic advances that make them possible. Besides issues in Web search, we will also explore other text mining methods, such as document clustering and classification, information extraction, and other ways to process and analyze free-text data, along with their use in bioscience and health information technology. We will start with very basic notions in information retrieval at a fairly slow pace, explain some of the fundamental algorithms, and eventually touch upon the research frontiers. The prerequisites for this course are modest: basic exposure to computing and algorithms, and basic knowledge of calculus, linear algebra, and statistics. The course will also have some programming assignments.

List of Topics


Class Policies

Grading

  1. Homework and Projects: 50%
  2. Participation and presentation: 50%

T-Square