CS 4495/7495
Computer Vision

Fall 2002
College of Computing 101
MWF 11:00 - 12:00 noon



This course will provide an introduction to computer vision, tailored for undergraduate and graduate students who are interested in conducting research in this field. It is the introductory vision course in the curriculum for Computational Perception and Robotics in the College of Computing. This course is designed to provide

Instructor

Jim Rehg
Email: rehg@cc.gatech.edu
Office: CoC Bldg (CCB) 253
Office hours: After class
Phone: 404-894-9105 (email preferred)

Teaching Assistant

Ping Wang
Email: pingwang@cc.gatech.edu
Office hours: CCB 259, TR 9:00-10:00 AM

Text

Computer Vision: A Modern Approach
by David Forsyth and Jean Ponce
Prentice Hall 2002

Prerequisites

Familiarity with linear algebra and statistics, as well as basic signal and image processing, will be essential. Equally important is familiarity with C++ programming. I will be providing some additional tutorial sessions on these subjects in the first few weeks of the semester. However, these tutorials will not be a substitute for adequate background in the prerequisite areas of study. Students without this background may enroll only with permission of the instructor, which will be given in special cases only.


Organization

Grades will be assessed as follows:

Problem Sets 47%
Midterm Exam 18%
Final Project 30%
Participation 5%

There will be a take-home mid-term exam that will not require any programming.

Problem Sets

There will be approximately 6 problem sets, each lasting one or two weeks. A number of them will require implementation of a vision algorithm or small vision system using the Microsoft Visual C++ programming environment. In particular, you will learn to use a toolkit known as Viper, which is a standard environment for vision programming here in the College.

Instructions on downloading the Viper toolkit and linking to the appropriate libraries are available here. Students who want to run Viper without mounting CoC file systems can download all of the required libraries (33 MB). There are 20 PCs on the first floor of CCB which are available for class use. 10 of them are in the States cluster, and 10 are in the UPL cluster right next door. Each machine has a sign saying that students in 4495/7495 have priority use. This means that students who are not in the class should surrender the machine to you if you need it.

Collaboration on problem sets is encouraged at the "white board interaction" level. That is, share ideas and technical conversation, but write your own code. A few problem sets may require you to work in teams of 2-3. I plan to grade and return problem sets promptly. As a result, I will require all problem sets to be turned in on time.

No late submissions will be accepted without prior permission of the instructor.

Instructions for turning in the problem sets:

  1. Make a zip file containing the required material. Name the file LastName_FirstName_PSN.zip (where you fill in your name and the problem set number appropriately).
  2. Copy this file to /net/hi21/cs7495/PSN (where N is the PS number)
  3. Enter that directory and execute 'chgrp projdisp LastName_FirstName_PSN.zip' and 'chmod 740 LastName_FirstName_PSN.zip'. This will allow us to read your file but prevent others in the class from reading it. Note that you cannot do this under Solaris and will have to telnet to a Linux box or an SGI machine. If you do not have CoC access then email the file to Ping Wang.
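As a rough illustration of the naming convention and permission bits in the steps above, here is a small Python sketch. It assumes that "PSN" means the letters PS followed by the problem set number, and the student name is made up. It only builds the strings and decodes the mode; it does not copy anything or run chgrp/chmod.

```python
import stat

def submission_name(last_name, first_name, ps_number):
    """Build the zip name LastName_FirstName_PSN.zip for problem set N."""
    return f"{last_name}_{first_name}_PS{ps_number}.zip"

def submission_dir(ps_number):
    """Directory from step 2, with N replaced by the problem set number."""
    return f"/net/hi21/cs7495/PS{ps_number}"

def describe_mode(octal_mode):
    """Render permission bits the way 'ls -l' shows them for a regular file."""
    return stat.filemode(stat.S_IFREG | octal_mode)

# Hypothetical student turning in problem set 2.
name = submission_name("Doe", "Jane", 2)
print(name, "->", submission_dir(2))
print(describe_mode(0o740))
```

Mode 740 gives the owner full access, gives the group (projdisp, after the chgrp) read access, and gives everyone else no access, which matches the intent stated in step 3.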

Grading

Undergrads and grads will be graded on separate curves. I will expect more from a graduate project than an undergraduate project.

There are a number of ways to earn participation credit (and even extra credit): participate in class, find bugs in the course text, and donate some data to my research project! (More about the latter in class.)


Resources

The following supplemental texts may be helpful:

We will have the following tutorials on Fridays from 4-5 pm (attendance is optional):

There is a class newsgroup git.cc.class.cs7495 which is titled "Computer Vision". Please post here with questions or comments on the problem sets that you want to share with the other students. I will send all class announcements by email.


Syllabus

  1. Introduction and administrative issues [8/19/02]
  2. Themes and Issues [8/21/02, 8/23/02]
  3. Stereo Matching
  4. Camera Models
  5. Photometry
  6. Image Representations: Filtering, Pyramids, and Edge Detection
  7. The role of statistics
  8. Depth and shape recovery
  9. Motion
  10. Object recognition

Problem Sets

PS 1: Out Aug 23; Due Sept 1: (Intro to Viper)  (2% of grade)

PS 2: Out Sept 10; Due Sept 24: (Stereo Matching)  (25% of grade)

PS 3: Out Oct 22; Due Oct 30: (Camera Calibration)  (20% of grade)

Midterm: Out Nov 4; Due Nov 5  (18% of grade)


Final Projects

Important Dates

The final project represents 30% of the course grade and should therefore represent a significant effort. I will expect more from graduate student projects than undergraduate ones.

Project Proposal

Please write 3 paragraphs describing your project. Your proposal should answer the following questions:

Project Report

Project pages must be created by midnight Friday Dec 5. The project page must be accessible from the class swiki (either as a link to a page on the CoC server or as a separate page on the swiki itself). We need your pages in place to determine the order of the project presentations.

The project pages must contain the final project reports by midnight Sunday Dec 8.

Your project report should address the following:

Project Presentation

Project presentations will be given in CCB 101 from 1:00 - 4:00 PM on Tuesday Dec 10 (this is a bit earlier than the final exam period for our class). The order of presentation will be posted on the Swiki in advance. I expect everyone to be present for all of the presentations. Presentations will last 15-20 minutes (the exact duration will be announced later). The duration of the presentations will be strictly enforced.

Your presentation should cover the material in your project report, but in a suitable form for a talk. It should briefly describe the problem and approach, present the results, and then present the analysis of the results. As with the report, this last part is the most important. It is fine if the presentation is also a web page, but please prepare a separate page from the report itself so that you can give a proper talk.

I encourage you to make your presentations accessible from the class swiki. Since the swiki does go down sometimes, I would suggest hosting your presentation on the CoC server and linking it from the swiki. In any event, you should bring a back-up copy of your presentation with you in case there is some problem with the network. I will be leaving town immediately after the presentations, so there will be no other opportunity to make them. The PC in the classroom will be used to host the presentations.

Some Final Project Choices

Here are some sample projects which I believe to be doable in the time available (some are perhaps more likely to be grad projects than undergrad ones). Of course it is also fine if you propose one of your own, perhaps based on your current research activity.

1. Color Constancy

Implement algorithm 6.1 from your text for determining the lightness of image patches. Exercises 6.8 and 6.9 are relevant here as well. Explore both the gray world and brightest patch approaches to establishing an absolute reference. Test your algorithm on Mondrian images as well as more complex scenes.
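To make the two reference assumptions concrete, here is a minimal Python sketch. It is not algorithm 6.1 from the text. It assumes lightness values have already been recovered up to an unknown multiplicative constant, and it only shows how each assumption fixes that constant. The function names, the patch values, and the assumed mean of 0.5 are all hypothetical.

```python
def brightest_patch_reference(relative_lightness):
    """Assume the brightest patch is white (lightness 1.0) and rescale."""
    peak = max(relative_lightness)
    return [v / peak for v in relative_lightness]

def gray_world_reference(relative_lightness, assumed_mean=0.5):
    """Assume the scene averages to a fixed gray and rescale to match it."""
    mean = sum(relative_lightness) / len(relative_lightness)
    return [v * assumed_mean / mean for v in relative_lightness]

# Made-up patch lightnesses, known only up to a common scale factor.
patches = [2.0, 4.0, 6.0, 8.0]
print(brightest_patch_reference(patches))
print(gray_world_reference(patches))
```

The two functions generally disagree on the same input, which is one way to see why the choice of reference matters when moving from Mondrian images to more complex scenes.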

2. Real-Time Tracking

The Viper environment includes some infrastructure for capturing live video from a Sony Elura camera which can be checked out of the CPL lab (talk to me about it). This project would explore some tracking algorithms using this live video as input. This is a good way to quickly get intuition for what works (and doesn't work). Start with a standard patch-based SSD tracker, as we will describe in class. Modify this to use particle filtering to improve the robustness of the tracker to noise and background clutter. I can provide references to this material to anyone who is interested.
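As a starting point, a patch-based SSD tracker can be sketched in a few lines of Python. This is a generic illustration, not the version presented in class and not Viper code. Frames are assumed to be grayscale images stored as lists of rows, and the function names and the search radius are hypothetical.

```python
def ssd(frame, template, top, left):
    """Sum of squared differences between the template and the frame patch at (top, left)."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = frame[top + r][left + c] - value
            total += diff * diff
    return total

def track_patch(frame, template, prev_top, prev_left, radius=2):
    """Search a window around the previous position for the lowest-SSD placement."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best = None
    for top in range(max(0, prev_top - radius), min(fh - th, prev_top + radius) + 1):
        for left in range(max(0, prev_left - radius), min(fw - tw, prev_left + radius) + 1):
            score = ssd(frame, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]

# Synthetic 6x6 frame: dark background with the target patch at row 3, column 2.
template = [[9, 8], [7, 6]]
frame = [[0] * 6 for _ in range(6)]
for r in range(2):
    for c in range(2):
        frame[3 + r][2 + c] = template[r][c]

# The target was at (2, 1) in the previous frame; the tracker re-locates it.
print(track_patch(frame, template, prev_top=2, prev_left=1))
```

A particle filter would replace the exhaustive window search with a set of weighted position hypotheses, for example by turning each SSD score into a likelihood.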

3. Loopy Stereo

In a recent paper by Shum et al., an existing Bayesian inference algorithm known as loopy propagation was applied to the problem of stereo matching. This project would extend your stereo system developed in PS #2 to incorporate the loopy propagation idea and evaluate how well it works. Contact me for more details.
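For intuition, here is a small Python sketch of loopy propagation on a pixel grid. It assumes a min-sum formulation with a Potts smoothness penalty and made-up matching costs; it does not reproduce the formulation in the paper. Each pixel holds a cost per candidate disparity, and neighboring pixels exchange messages for a fixed number of iterations. All names and parameters are hypothetical.

```python
def neighbors(r, c, h, w):
    """Yield the 4-connected neighbors of (r, c) inside an h x w grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            yield rr, cc

def loopy_min_sum(data_cost, smooth_weight=1.0, iterations=5):
    """Return a label (disparity index) per pixel after min-sum message passing."""
    h, w, labels = len(data_cost), len(data_cost[0]), len(data_cost[0][0])
    pixels = [(r, c) for r in range(h) for c in range(w)]
    msgs = {(p, q): [0.0] * labels for p in pixels for q in neighbors(p[0], p[1], h, w)}
    for _ in range(iterations):
        new_msgs = {}
        for (p, q) in msgs:
            # Cost at p plus every incoming message except the one from q.
            base = [data_cost[p[0]][p[1]][d]
                    + sum(msgs[(s, p)][d] for s in neighbors(p[0], p[1], h, w) if s != q)
                    for d in range(labels)]
            # Potts penalty: free if the labels agree, smooth_weight otherwise.
            out = [min(base[dp] + (0.0 if dp == dq else smooth_weight)
                       for dp in range(labels))
                   for dq in range(labels)]
            low = min(out)
            new_msgs[(p, q)] = [v - low for v in out]  # normalize to keep values bounded
        msgs = new_msgs
    result = []
    for r in range(h):
        row = []
        for c in range(w):
            belief = [data_cost[r][c][d]
                      + sum(msgs[(s, (r, c))][d] for s in neighbors(r, c, h, w))
                      for d in range(labels)]
            row.append(belief.index(min(belief)))
        result.append(row)
    return result

# 3x3 grid, two candidate disparities. Every pixel prefers label 1 except
# the center, whose (made-up) matching cost weakly prefers label 0.
costs = [[[1.0, 0.0] for _ in range(3)] for _ in range(3)]
costs[1][1] = [0.0, 0.4]
print(loopy_min_sum(costs))
```

With the smoothness term active, the neighbors outvote the noisy center pixel. Setting smooth_weight to 0.0 makes every message uninformative, so each pixel falls back to its own cheapest label.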

4. Object Recognition

Develop a program for detecting coins in images using the Hough transform. Follow the instructions. Extend your program to recognize dollar bills as well as coins (using the same edge-based technique). Extend your program further to reason about possible occlusions of coins by dollar bills.
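The voting idea behind a circular Hough transform can be sketched as follows in Python. This is a simplified illustration that assumes the coin radius is known in advance and that edge points have already been extracted; the linked instructions may set the problem up differently. The function names, image size, and angular resolution are hypothetical.

```python
import math

def hough_circle_centers(edge_points, radius, height, width, angle_steps=72):
    """Each edge point votes for every cell that lies `radius` away from it."""
    votes = [[0] * width for _ in range(height)]
    for (y, x) in edge_points:
        seen = set()  # let an edge point vote for a given cell at most once
        for k in range(angle_steps):
            theta = 2 * math.pi * k / angle_steps
            cy = int(round(y + radius * math.sin(theta)))
            cx = int(round(x + radius * math.cos(theta)))
            if 0 <= cy < height and 0 <= cx < width and (cy, cx) not in seen:
                seen.add((cy, cx))
                votes[cy][cx] += 1
    return votes

def best_center(votes):
    """Return the (row, column) of the accumulator cell with the most votes."""
    best = max((v, y, x) for y, row in enumerate(votes) for x, v in enumerate(row))
    return best[1], best[2]

# Synthetic "coin": edge points on a circle of radius 6 centered at (10, 12).
center_y, center_x, radius = 10, 12, 6
edges = {(round(center_y + radius * math.sin(math.radians(a))),
          round(center_x + radius * math.cos(math.radians(a))))
         for a in range(0, 360, 10)}
votes = hough_circle_centers(edges, radius, height=24, width=24)
print(best_center(votes))
```

Coins of unknown size would need a third accumulator dimension for the radius, and straight-edged objects such as bills would need a line parameterization instead.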

5. Texture Synthesis

Implement the texture synthesis procedure described in the paper by Heeger and Bergen. Validate your algorithm on the textures from their paper, and then explore the limitations of the algorithm. Grad students would be expected to also try using this framework for texture matching/discrimination.
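One building block of the Heeger and Bergen procedure is histogram matching, which the method applies to the image and to the subbands of its pyramid decomposition. The Python sketch below covers only that one step on flat lists of values, using a simple rank-based mapping. It omits the pyramid entirely, and the function name and sample values are hypothetical.

```python
def match_histogram(source, target):
    """Give `source` the value distribution of `target` while keeping source's rank order."""
    order = sorted(range(len(source)), key=lambda i: source[i])
    sorted_target = sorted(target)
    n, m = len(source), len(target)
    matched = [0] * n
    for rank, i in enumerate(order):
        # Map the rank within source to the corresponding quantile of target.
        matched[i] = sorted_target[min(m - 1, rank * m // n)]
    return matched

# Made-up values: a "noise" signal reshaped to take on the "texture" values.
noise = [0.9, 0.1, 0.5, 0.3]
texture = [10, 20, 30, 40]
print(match_histogram(noise, texture))
```

In a full implementation this step would alternate between the image and its pyramid subbands for several iterations, starting from a noise image.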