This course will provide an introduction to computer vision, tailored for undergraduate and graduate students who are interested in conducting research in this field. It is the introductory vision course in the curriculum for Computational Perception and Robotics in the College of Computing. This course is designed to provide
Jim Rehg
Email: rehg@cc.gatech.edu
Office: CoC Bldg (CCB) 253
Office hours: After class
Phone: 404-894-9105 (email preferred)
Ping Wang
Email: pingwang@cc.gatech.edu
Office hours: CCB 259, TR 9:00-10:00 AM
Computer Vision: A Modern Approach
by David Forsyth and Jean Ponce
Prentice Hall 2002
Familiarity with linear algebra and statistics, as well as basic signal and image processing, will be essential. Equally important is familiarity with C++ programming. I will provide some additional tutorial sessions on these subjects in the first few weeks of the semester. However, these tutorials will not be a substitute for adequate background in the prerequisite areas of study. Permission of the instructor in special cases only.
Grades will be assessed as follows:
| Problem Sets | 47% |
| Midterm Exam | 18% |
| Final Project | 30% |
| Participation | 5% |
There will be a take-home mid-term exam that will not require any programming.
There will be approximately 6 problem sets of one or two weeks' duration. A number of them will require implementation of a vision algorithm or small vision system using the Microsoft Visual C++ programming environment. In particular, you will learn to use a toolkit known as Viper, which is a standard environment for vision programming here in the College.
Instructions on downloading the Viper toolkit and linking to the appropriate libraries are available here. Students who want to run Viper without mounting CoC file systems can download (33 Mb) all of the required libraries. There are 20 PCs on the first floor of CCB that are available for class use. Ten of them are in the States cluster, and ten are in the UPL cluster right next door. Each machine has a sign saying that students in 4495/7495 have priority use. This means that students who are not in the class should surrender the machine to you if you need it.
Collaboration on problem sets is encouraged at the "white board interaction" level. That is, share ideas and technical conversation, but write your own code. A few problem sets may require you to work in teams of 2-3. I plan to grade and return problem sets promptly. As a result, I will require all problem sets to be turned in on time.
No late submissions will be accepted without prior permission of the instructor.
Instructions for turning in the problem sets:
Undergrads and grads will be graded on separate curves. I will expect more from a graduate project than an undergraduate project.
There are a number of ways to earn participation credit (and even extra credit): participate in class, find bugs in the course text, and donate some data to my research project (more about the latter in class).
The following supplemental texts may be helpful:
We will have the following tutorials on Fridays from 4-5 pm (attendance is optional):
There is a class newsgroup, git.cc.class.cs7495, titled "Computer Vision". Please post there with questions or comments on the problem sets that you want to share with the other students. I will send all class announcements by email.
PS 1: Out Aug 23; Due Sept 1 (Intro to Viper) (2% of grade)
PS 2: Out Sept 10; Due Sept 24 (Stereo Matching) (25% of grade)
PS 3: Out Oct 22; Due Oct 30 (Camera Calibration) (20% of grade)
Midterm: Out Nov 4; Due Nov 5 (18% of grade)
Important Dates
The final project counts for 30% of the course grade and should therefore reflect a significant effort. I will expect more from graduate student projects than undergraduate ones.
Project Proposal
Please write 3 paragraphs describing your project. Your proposal should answer the following questions:
Project Report
Project pages must be created by midnight Friday Dec 5. The project page must be accessible from the class swiki (either as a link to a page on the CoC server or as a separate page on the swiki itself). We need your pages in place to determine the order of the project presentations.
The project pages must contain the final project reports by midnight Sunday Dec 8.
Your project report should address the following:
Project Presentation
Project presentations will be given in CCB 101 from 1:00 to 4:00 PM on Tuesday Dec 10 (this is a bit earlier than the final exam period for our class). The order of presentation will be posted on the Swiki in advance. I expect everyone to be present for all of the presentations. Each presentation will last 15-20 minutes (the exact duration will be announced later). The time limit will be strictly enforced.
Your presentation should cover the material in your project report, but in a suitable form for a talk. It should briefly describe the problem and approach, present the results, and then present the analysis of the results. As with the report, this last part is the most important. It is fine if the presentation is also a web page, but please prepare a separate page from the report itself so that you can give a proper talk.
I encourage you to make your presentations accessible from the class swiki. Since the swiki does go down sometimes, I would suggest hosting your presentation on the CoC server and linking it from the swiki. In any event, you should bring a back-up copy of your presentation with you in case there is some problem with the network. I will be leaving town immediately after the presentations, so there will be no other opportunity to make them. The PC in the classroom will be used to host the presentations.
Some Final Project Choices
Here are some sample projects which I believe to be doable in the time available (some are perhaps more likely to be grad projects than undergrad ones). Of course it is also fine if you propose one of your own, perhaps based on your current research activity.
1. Color Constancy
Implement Algorithm 6.1 from your text for determining the lightness of image patches. Exercises 6.8 and 6.9 are relevant here as well. Explore both the gray world and brightest patch approaches to establishing an absolute reference. Test your algorithm on Mondrian images as well as more complex scenes.
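As a starting point for thinking about the two absolute-reference choices, here is a minimal C++ sketch. It is not Algorithm 6.1 from the text. It assumes the image is a flat list of linear RGB triples, and it simplifies "brightest patch" to the per-channel maximum over pixels. The `Rgb` type and the functions `grayWorldGains`, `brightestPatchGains`, and `applyGains` are hypothetical names, not part of Viper.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// One pixel as linear R, G, B intensities (hypothetical representation).
using Rgb = std::array<double, 3>;

// Gray world: assume the scene averages to gray, so choose per-channel
// gains that make every channel mean equal to the overall mean.
std::array<double, 3> grayWorldGains(const std::vector<Rgb>& image) {
    assert(!image.empty());
    std::array<double, 3> mean = {0.0, 0.0, 0.0};
    for (const Rgb& p : image)
        for (int c = 0; c < 3; ++c) mean[c] += p[c];
    for (double& m : mean) m /= static_cast<double>(image.size());
    const double gray = (mean[0] + mean[1] + mean[2]) / 3.0;
    std::array<double, 3> gain = {1.0, 1.0, 1.0};
    for (int c = 0; c < 3; ++c)
        if (mean[c] > 0.0) gain[c] = gray / mean[c];
    return gain;
}

// Brightest patch (simplified to single pixels): assume the largest
// response in each channel comes from a white surface, and choose gains
// that map that maximum to `white`.
std::array<double, 3> brightestPatchGains(const std::vector<Rgb>& image,
                                          double white = 1.0) {
    assert(!image.empty());
    std::array<double, 3> peak = {0.0, 0.0, 0.0};
    for (const Rgb& p : image)
        for (int c = 0; c < 3; ++c) peak[c] = std::max(peak[c], p[c]);
    std::array<double, 3> gain = {1.0, 1.0, 1.0};
    for (int c = 0; c < 3; ++c)
        if (peak[c] > 0.0) gain[c] = white / peak[c];
    return gain;
}

// Apply per-channel gains and return the corrected copy.
std::vector<Rgb> applyGains(const std::vector<Rgb>& image,
                            const std::array<double, 3>& gain) {
    std::vector<Rgb> out = image;
    for (Rgb& p : out)
        for (int c = 0; c < 3; ++c) p[c] *= gain[c];
    return out;
}
```

Comparing the output of `applyGains` under the two gain choices on the same image is one simple way to see where each assumption breaks down, for example a scene dominated by one color for gray world, or a specular highlight for brightest patch.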
2. Real-Time Tracking
The Viper environment includes some infrastructure for capturing live video from a Sony Elura camera which can be checked out of the CPL lab (talk to me about it). This project would explore some tracking algorithms using this live video as input. This is a good way to quickly get intuition for what works (and doesn't work). Start with a standard patch-based SSD tracker, as we will describe in class. Modify this to use particle filtering to improve the robustness of the tracker to noise and background clutter. I can provide references to this material to anyone who is interested.
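To make the starting point concrete, here is a minimal sketch of one step of a patch-based SSD tracker in C++. It assumes grayscale frames stored row-major as doubles and an exhaustive search in a small window around the previous position. The `Image` and `Match` types and the functions `patchSsd` and `trackSsd` are hypothetical and are not Viper's API; the tracker described in class may differ.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Grayscale image, row-major (hypothetical representation, not Viper's).
struct Image {
    int width;
    int height;
    std::vector<double> pixels;
    double at(int x, int y) const { return pixels[y * width + x]; }
};

// Best template position found in the current frame, with its SSD score.
struct Match {
    int x;
    int y;
    double ssd;
};

// Sum of squared differences between the template and the frame patch
// whose top-left corner is (x0, y0). The patch must lie inside the frame.
double patchSsd(const Image& frame, const Image& tmpl, int x0, int y0) {
    double sum = 0.0;
    for (int ty = 0; ty < tmpl.height; ++ty) {
        for (int tx = 0; tx < tmpl.width; ++tx) {
            const double d = frame.at(x0 + tx, y0 + ty) - tmpl.at(tx, ty);
            sum += d * d;
        }
    }
    return sum;
}

// One tracking step: search all offsets within `radius` pixels of the
// previous top-left position and keep the one with the smallest SSD.
Match trackSsd(const Image& frame, const Image& tmpl,
               int prevX, int prevY, int radius) {
    assert(tmpl.width <= frame.width && tmpl.height <= frame.height);
    Match best = {prevX, prevY, std::numeric_limits<double>::infinity()};
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            const int x = prevX + dx;
            const int y = prevY + dy;
            // Skip candidate patches that fall outside the frame.
            if (x < 0 || y < 0 || x + tmpl.width > frame.width ||
                y + tmpl.height > frame.height) {
                continue;
            }
            const double s = patchSsd(frame, tmpl, x, y);
            if (s < best.ssd) best = Match{x, y, s};
        }
    }
    return best;
}
```

Calling `trackSsd` once per frame, with the previous result as the new starting position, gives the basic tracker. A particle filter would replace the single best match with a set of weighted position hypotheses, for which the SSD score could serve as the basis of the weight.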
3. Loopy Stereo
In a recent paper by Shum et al., an existing Bayesian inference algorithm known as loopy propagation was applied to the problem of stereo matching. This project would extend your stereo system developed in PS #2 to incorporate the loopy propagation idea and evaluate how well it works. Contact me for more details.
4. Object Recognition
Develop a program for detecting coins in images using the Hough transform. Follow the instructions. Extend your program to recognize dollar bills as well as coins (using the same edge-based technique). Extend your program further to reason about possible occlusions of coins by dollar bills.
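For the coin-detection step, the voting idea behind a circle Hough transform can be sketched as follows. This is a minimal illustration under strong assumptions: edge points have already been extracted, the coin radius is known in pixels, and only the single strongest center is reported. The `Point` type and the functions `houghCircleVotes` and `strongestCenter` are hypothetical names and not part of Viper or the project instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Integer pixel coordinate (hypothetical helper type).
struct Point {
    int x;
    int y;
};

// Each edge point votes for every candidate center whose distance from it
// is within `tolerance` of the assumed `radius`. Returns a row-major
// accumulator of size width * height.
std::vector<int> houghCircleVotes(const std::vector<Point>& edges,
                                  int width, int height,
                                  double radius, double tolerance = 0.5) {
    assert(width > 0 && height > 0 && radius > 0.0);
    std::vector<int> acc(static_cast<std::size_t>(width) * height, 0);
    const int reach = static_cast<int>(std::ceil(radius + tolerance));
    for (const Point& e : edges) {
        const int yLo = std::max(0, e.y - reach);
        const int yHi = std::min(height - 1, e.y + reach);
        const int xLo = std::max(0, e.x - reach);
        const int xHi = std::min(width - 1, e.x + reach);
        for (int cy = yLo; cy <= yHi; ++cy) {
            for (int cx = xLo; cx <= xHi; ++cx) {
                const double d = std::hypot(static_cast<double>(cx - e.x),
                                            static_cast<double>(cy - e.y));
                if (std::fabs(d - radius) <= tolerance) ++acc[cy * width + cx];
            }
        }
    }
    return acc;
}

// Location of the accumulator cell with the most votes
// (the first such cell in row-major order if there is a tie).
Point strongestCenter(const std::vector<int>& acc, int width) {
    assert(!acc.empty() && width > 0);
    const int idx = static_cast<int>(
        std::max_element(acc.begin(), acc.end()) - acc.begin());
    return Point{idx % width, idx / width};
}
```

Handling several coins of different sizes would mean repeating the vote over a range of radii and keeping all accumulator peaks above a threshold, rather than just the single maximum returned here.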
5. Texture Synthesis
Implement the texture synthesis procedure described in the paper by Heeger and Bergen. Validate your algorithm on the textures from their paper, and then explore the limitations of the algorithm. Grad students would be expected to also try using this framework for texture matching/discrimination.