CS 4495 / 6476 Computer Vision

Fall 2015, MW 4:35 to 5:55, Van Leer W200
Instructor: James Hays

GTAs: Grady Williams, Brian Goldfain, Patsorn Sangkloy, Nam Vo, and Sen "Henry" Hu.
Undergraduate TAs: Carden Bagwell and Zhengyang Wu

Computer Vision, art by kirkh.deviantart.com

Course Description

This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. We'll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.

This offering of CS4495/6476 will emphasize the core vision tasks of scene understanding and recognition. We will train and evaluate classifiers to recognize various visual phenomena.

The difference between the undergraduate version of the class (CS4495) and the graduate version (CS6476) will be the requirements on the projects. In particular, more challenging extensions of the projects will be extra credit for CS4495 but required for CS6476.

The Advanced Computer Vision course (CS7476) will build on this course and deal with advanced and research related topics in Computer Vision, including Machine Learning, Graphics, and Robotics topics that impact Computer Vision.

Learning Objectives

Upon completion of this course, students will:

1. Become familiar with both the theoretical and practical aspects of computing with images, building on image processing approaches.
2. Describe the foundation of image formation and image analysis.
3. Become familiar with the theoretical foundations of the major technical approaches involved in computer-vision-based image analysis.
4. Understand the basics of measurement and robust detection of features in images.
5. Describe various methods used for registration, alignment, and matching in images.
6. Understand the basics of 2D and 3D Computer Vision.
7. Gain exposure to advanced concepts leading to object and scene categorization from images.
8. Be able to connect issues from Computer Vision to Human Vision.
9. Develop practical skills that are necessary for building computer vision applications.

Prerequisites

No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:

Data structures: You'll be writing code that builds representations of images, features, and geometric constructions.
Programming: A good working knowledge of programming environments that support image and video analysis. All lecture code and project starter code will be in MATLAB. Students are strongly encouraged to use MATLAB, and the TAs will support questions about MATLAB. If you've never used MATLAB, that is OK.
Math: Linear algebra, vector calculus, and probability.

Grading

Your final grade will be made up of:

80%: 5 programming projects
20%: 2 written quizzes

You will lose 10% for each day a project is late. However, you have three "late days" for the whole course. The first 24 hours after the due date and time count as one late day, up to 48 hours counts as two, and up to 72 hours counts as three. Late days will not be reflected in the initial grade reports for your assignments, but they will be factored in and distributed at the end of the semester so that you get the most points possible.
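As a rough illustration of the late policy, the sketch below counts late days and tries every way of spending the three free late days to find the best total. It is not the staff's actual grading script. It assumes 100-point projects, a flat 10-point penalty per late day, and scores that never drop below zero; none of these details are stated in the policy. The function names are hypothetical.

```python
import math
from itertools import combinations_with_replacement

PENALTY_PER_DAY = 10.0  # points lost per late day, assuming 100-point projects
FREE_LATE_DAYS = 3      # free late days for the whole course


def days_late(hours_after_deadline):
    """Count every started 24-hour period after the deadline as one late day."""
    if hours_after_deadline <= 0:
        return 0
    return math.ceil(hours_after_deadline / 24)


def penalized(score, late_days):
    """Apply the per-day penalty, never dropping below zero (an assumption)."""
    return max(0.0, score - PENALTY_PER_DAY * late_days)


def best_scores(raw_scores, hours_late):
    """Try every way to spend the free late days and keep the best total."""
    late = [days_late(h) for h in hours_late]
    best = [penalized(s, d) for s, d in zip(raw_scores, late)]
    for used in range(1, FREE_LATE_DAYS + 1):
        # Each pick is the index of a project that gets one late day waived.
        for picks in combinations_with_replacement(range(len(late)), used):
            remaining = list(late)
            for i in picks:
                remaining[i] -= 1
            if min(remaining) < 0:
                continue  # cannot waive more days than a project was late
            scores = [penalized(s, d) for s, d in zip(raw_scores, remaining)]
            if sum(scores) > sum(best):
                best = scores
    return best
```

For example, with two projects submitted 30 and 50 hours late (2 and 3 late days), the three free late days cover all but two of the five late days, so 20 points are lost in total under these assumptions.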

Graduate Credit

If you are enrolled in the graduate section, CS 6476, you will be expected to do additional work on each project. Each project will list several extra credit opportunities, and CS 6476 students will be required to do at least 10 points' worth of extra credit (for which you will not receive extra credit unless you do more than 10 points' worth).
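The rule above can be sketched as a small calculation for a single project. This is only an illustration with a hypothetical function name. It assumes that only points beyond the required 10 count as a bonus for CS 6476 students, and it does not model what happens when a graduate student completes fewer than 10 points, since that is not specified here.

```python
REQUIRED_EXTRA_POINTS = 10  # extra-credit points CS 6476 students must complete


def bonus_points(extra_points_earned, is_graduate):
    """Return the extra-credit points that actually count as a bonus."""
    if is_graduate:
        # Only the points beyond the required 10 count as extra credit.
        return max(0, extra_points_earned - REQUIRED_EXTRA_POINTS)
    # For CS 4495 students, every extra-credit point is a bonus.
    return extra_points_earned
```

Under this reading, a CS 6476 student who completes 15 points of extensions earns 5 bonus points, while a CS 4495 student who completes 8 points earns all 8.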

Academic Integrity

Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code and Student Code of Conduct, available online at www.honor.gatech.edu. For quizzes, no supporting materials are allowed (notes, calculators, phones, etc.).

Learning Accommodations

If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the ADAPTS office (www.adapts.gatech.edu).

Important Links:

Piazza for CS 4495 / 6476. This should be your first stop for questions and announcements.
t-square.gatech.edu.
MATLAB Tutorial

Contact Info and Office Hours:

If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff.

James: hays[at]gatech.edu
GTA Grady Williams: gradyrw[at]gatech.edu
GTA Brian Goldfain: bgoldfain3[at]gatech.edu
GTA Patsorn Sangkloy: patsorn_sangkloy[at]gatech.edu
GTA Nam Vo: namvo[at]gatech.edu
GTA Sen "Henry" Hu: henryhu[at]gatech.edu
UTA Carden Bagwell: cardenb[at]gatech.edu
UTA Zhengyang Wu: wzy930712[at]gatech.edu

Office Hours

James, Monday and Wednesday, 2:00-3:00, CCB 222 or 315.
Grady and Brian, Tuesday 2 to 6pm in CoC picnic area.

Tentative Assignments
Image Filtering and Hybrid images
Local Feature Matching
Scene Recognition with Bag of Words
Face Detection with a Sliding Window
Boundary Detection with Sketch Tokens

It is strongly recommended that all projects be completed in MATLAB. All starter code will be provided for MATLAB. Students may implement projects through other means, but it will generally be more difficult.

Textbook

Readings will be assigned in "Computer Vision: Algorithms and Applications" by Richard Szeliski. The book is available for free online or available for purchase.

Tentative Syllabus

Class Date	Topic	Slides	Reading	Projects
Mon, Aug 17	Introduction to computer vision	.ppt, .pdf	Szeliski 1
Image Formation and Filtering (Szeliski chapters 2 and 3)
Wed, Aug 19	Light and Color, Cameras and Optics	.ppt, .pdf	Szeliski 2.2 and 2.1, especially 2.1
Mon, Aug 24	Cameras and Optics continued, Image Filtering	.ppt, .pdf	Szeliski 3.2	Project 1 out
Wed, Aug 26	Image filtering	.ppt, .pdf	Szeliski 3.2
	Thinking in frequency	.ppt, .pdf	Szeliski 3.4
	Image pyramids and applications	.ppt, .pdf	Szeliski 3.5.2 and 8.1.1
Feature Detection and Matching
	Edge detection	.ppt, .pdf	Szeliski 4.2
	Interest points and corners	.ppt, .pdf	Szeliski 4.1.1	Project 1 due
	Local image features	.ppt, .pdf	Szeliski 4.1.2	Project 2 out
	Feature matching and Hough transform	.ppt, .pdf	Szeliski 4.1.3 and 4.3.2
	Model fitting and RANSAC	.ppt, .pdf	Szeliski 6.1
Multiple Views and Motion
	Stereo	.ppt, .pdf	Szeliski 11
	Epipolar Geometry and Structure from Motion	.ppt, .pdf	Szeliski 7
	Feature Tracking and Optical Flow	.ppt, .pdf	Szeliski 8.1 and 8.4
Machine Learning Crash Course
	Machine learning intro and clustering	.ppt, .pdf	Szeliski 5.3
	Machine learning: clustering continued	.ppt, .pdf	Szeliski 5.3	Project 2 due
	Machine learning: classification	.ppt, .pdf		Project 3 out
	No classes
	Quiz 1
Recognition
	Recognition overview and bag of features	.ppt, .pdf	Szeliski 14
	Large-scale instance recognition	.ppt, .pdf	Szeliski 14.3.2
	Detection with sliding windows: Viola-Jones	.ppt, .pdf	Szeliski 14.1
	Detection continued and Quiz 1 discussion	See above	Szeliski 14.2
	Scene recognition with SUN database	.ppt, .pdf
	Mixture of Gaussians and advanced feature encoding	.ppt, .pdf		Project 3 due
	Modern object detection	.ppt, .pdf	Szeliski 14.1
	Internet scale vision, pt 1	.ppt, .pdf	Szeliski 14.5	Project 4 out
	Internet scale vision, pt 2	.ppt, .pdf
	Guest lecture	Project page
	Human computation and crowdsourcing	.ppt, .pdf
	Attributes and more crowdsourcing	.ppt, .pdf
	Sketch Recognition and more crowdsourcing	.ppt, .pdf
	Modern boundary detection and Pb	.ppt, .pdf	Szeliski 4.2	Project 4 due
	Modern boundary detection and sketch tokens	.ppt, .pdf, gPb, Sketch Tokens	Szeliski 4.2
	Guest lecture
	Project 5 introduction	.ppt, .pdf	Szeliski 5.5	Project 5 out
	Context and Spatial Layout	.ppt, .pdf
	Context and Scene parsing	.ppt, .pdf
	Quiz 2
	Exam Period - not used			Project 5 due

Acknowledgements

The materials from this class rely significantly on slides prepared by other instructors, especially Derek Hoiem and Svetlana Lazebnik. Each slide set and assignment contains acknowledgements. Feel free to use these slides for academic or research purposes, but please maintain all acknowledgements.