CS 6476 Computer Vision
Fall 2018, MW 4:30 to 5:45, Clough 152
TAs: Cusuh Ham (head TA), Min-Hung (Steve) Chen, Sean Foley, Jianan Gao, John Lambert, Amit Raj, Sainandan Ramakrishnan, Dilara Soylu, Vijay Upadhya
Instructor: James Hays
Course Description
This course provides an introduction to computer vision, including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification, and scene understanding. We'll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.
The Advanced Computer Vision course (CS 7476) in spring (not offered in 2019) will build on this course and deal with advanced and research-related topics in Computer Vision, including Machine Learning, Graphics, and Robotics topics that impact Computer Vision.
Learning Objectives
Upon completion of this course, students should be able to:
1. Recognize and describe both the theoretical and practical aspects of computing with images. Connect issues from Computer Vision to Human Vision.
2. Describe the foundation of image formation and image analysis. Understand the basics of 2D and 3D Computer Vision.
3. Become familiar with the major technical approaches involved in computer vision. Describe various methods used for registration, alignment, and matching in images.
4. Get an exposure to advanced concepts leading to object and scene categorization from images.
5. Build computer vision applications.
Prerequisites
No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:
- Data structures: You'll be writing code that builds representations of images, features, and geometric constructions.
- Programming: Projects are to be completed and graded in Python. All project starter code will be in Python. TAs will answer questions about Python. If you've never used Python, that is OK as long as you have programming experience.
- Math: Linear algebra, vector calculus, and probability. Linear algebra is the most important, and students who have not taken a linear algebra course have struggled in the past.
Grading
Your final grade will be made up of:
- 80%: 6 programming projects
- 20%: 2 written quizzes
These late days are intended to cover unexpected clustering of due dates, travel commitments, interviews, hackathons, etc. Don't ask for extensions to due dates because we are already giving you a pool of late days to manage yourself.
Academic Integrity
Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code and Student Code of Conduct, available online at www.honor.gatech.edu. For quizzes, no supporting materials are allowed (notes, calculators, phones, etc.).
You are expected to implement the core components of each project on your own, but the extra credit opportunities often build on third-party data sets or code. That's fine. Feel free to include results built on other software, as long as you make clear in your hand-in that it is not your own work.
You should not view or edit anyone else's code. You should not post code to Piazza, except for starter code / helper code that isn't related to the core project.
Learning Accommodations
If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the ADAPTS office (www.adapts.gatech.edu).
- Piazza for CS 6476. This should be your first stop for questions and announcements.
- canvas.gatech.edu will be used to hand in assignments.
Contact Info and Office Hours:
If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff.
- James: hays[at]gatech.edu
- Cusuh Ham: cusuh[at]gatech.edu
- Vijay Upadhya: vupadhya6[at]gatech.edu
- Dilara Soylu: fdsoylu[at]gatech.edu
- Sainandan Ramakrishnan: sainandancv[at]gatech.edu
- Amit Raj: amit.raj[at]gatech.edu
- John Lambert: johnlambert[at]gatech.edu
- Jianan Gao: jianan[at]gatech.edu
- Sean Foley: seanremy[at]gatech.edu
- Min-Hung (Steve) Chen: cmhungsteve[at]gatech.edu
- James, Monday and Wednesday, 1 to 2 (CCB 315).
- TA hours: TBD.
|Image Filtering and Hybrid images|
|Local Feature Matching|
|Camera Calibration and Fundamental Matrix Estimation with RANSAC|
|Scene Recognition with Bag of Words|
|Face Detection with a Sliding Window|
Textbook
Readings will be assigned in "Computer Vision: Algorithms and Applications" by Richard Szeliski. The book is available for free online or available for purchase.
|Mon, Aug 20||Introduction to computer vision||pdf, pptx||Szeliski 1|
|Wed, Aug 22||Cameras and Optics||pdf, pptx||Szeliski 2.1, especially 2.1.5||Project 1 out|
|Mon, Aug 27||Light and Color and Image Filtering||pdf, pptx||Szeliski 2.2 and 2.3|
|Wed, Aug 29||Thinking in Frequency||pdf, pptx||Szeliski 3.2 and 3.5.2 and 8.1.1 and 4.2|
|Mon, Sept 3||No classes, Institute holiday||Project 1 due|
|Wed, Sept 5||Interest points and corners||pdf, pptx||Szeliski 4.1.1 and 4.1.2||Project 2 out|
|Mon, Sept 10||Guest Lecture: Frank Dellaert|
|Wed, Sept 12||Local image features||pdf, pptx||Szeliski 4.1.3 and 4.3.2|
|Mon, Sept 17||Model fitting, Hough Transform||pdf, pptx||Szeliski 6.1 and 2.1|
|Wed, Sept 19||RANSAC and transformations||pdf, pptx|
|Mon, Sept 24||Stereo intro and Camera Calibration||pdf, pptx||Szeliski 11 and 6.2.1||Project 2 due|
|Wed, Sept 26||Epipolar Geometry and Structure from Motion||pdf, pptx||Szeliski 7||Project 3 out|
|Mon, Oct 1||Stereo Correspondence and Optical Flow||pdf, pptx||Szeliski 11 and 8.4|
|Wed, Oct 3||Quiz 1|
|Mon, Oct 8||No classes, Institute holiday|
|Wed, Oct 10||Machine learning crash course and recognition overview||pdf, pptx||Szeliski 5.3 and 14|
|Mon, Oct 15||Recognition and Bag of Words||pdf, pptx||Szeliski 14.3.2||Project 4 out|
|Wed, Oct 17||Large-scale retrieval: Spatial Verification, TF-IDF, Query Expansion, feature encoding||pdf, pptx|
|Mon, Oct 22||Large-scale category recognition and advanced feature encoding||pdf, pptx|
|Wed, Oct 24||Detection with sliding windows: Viola Jones||pdf, pptx||Szeliski 14.1|
|Mon, Oct 29||Detection with sliding windows: Dalal Triggs and Pascal VOC||pdf, pptx|
|Wed, Oct 31||No classes|
|Mon, Nov 5||Big Data||pdf, pptx||Szeliski 14.5 and 4.2|
|Wed, Nov 7||Crowdsourcing and Human Computation||pdf, pptx|
|Mon, Nov 12||Neural Network Basics and Convolutional Networks||pdf, pptx|
|Wed, Nov 14||Object Detectors Emerge in Deep Scene CNNs and Deeper Deep Architectures||pdf, pptx|
|Mon, Nov 19||Structured Output from Deep Networks||pdf, pptx|
|Wed, Nov 21||No classes, Institute holiday|
|Mon, Nov 26||"Unsupervised" Learning and Colorization||pdf, pptx|
|Wed, Nov 28||Quiz 2|
|Mon, Dec 3||(optional) Ongoing research presentations from PhD students.|
|Wed, Dec 5||No classes, reading period|
|Final Exam Period - not used|