CS 4495 / 6476 Computer Vision
Fall 2015, MW 4:35 to 5:55, Van Leer W200
Instructor: James Hays
GTAs: Grady Williams, Brian Goldfain, Patsorn Sangkloy, Nam Vo, and Sen "Henry" Hu.
Undergraduate TAs: Carden Bagwell and Zhengyang Wu
Course Description
This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. We'll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.
This offering of CS4495/6476 will emphasize the core vision tasks of scene understanding and recognition. We will train and evaluate classifiers to recognize various visual phenomena.
The difference between the undergraduate version of the class (CS4495) and the graduate version (CS6476) will be the requirements on the projects. In particular, more challenging extensions of the projects will be extra credit for CS4495 but required for CS6476.
The Advanced Computer Vision course (CS7476) will build on this course and deal with advanced and research related topics in Computer Vision, including Machine Learning, Graphics, and Robotics topics that impact Computer Vision.
Learning Objectives
Upon completion of this course, students will:
- 1. Become familiar with both the theoretical and practical aspects of computing with images, building on image processing approaches.
- 2. Describe the foundation of image formation and image analysis.
- 3. Become familiar with theoretical foundations of the major technical approaches involved in computer vision based image analysis.
- 4. Understand basics of measurements and robust detection of features in images.
- 5. Describe various methods used for registration, alignment, and matching in images.
- 6. Understand the basics of 2D and 3D Computer Vision.
- 7. Get an exposure to advanced concepts leading to object and scene categorization from images.
- 8. Be able to connect issues from Computer Vision to Human Vision.
- 9. Develop practical skills that are necessary for building computer vision applications.
Prerequisites
No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:
- Data structures: You'll be writing code that builds representations of images, features, and geometric constructions.
- Programming: A good working knowledge of programming environments that support image and video analysis. All lecture code and project starter code will be in MATLAB. Students are strongly encouraged to use MATLAB, and the TAs will support questions about MATLAB. If you've never used MATLAB, that is OK.
- Math: Linear algebra, vector calculus, and probability.
Grading
Your final grade will be made up of:
- 80%: 5 programming projects
- 20%: 2 written quizzes
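As a rough illustration of the weighting above, the following Python sketch combines project and quiz scores into a final grade. It assumes, and the syllabus does not state this, that every score is on a 0-100 scale, that the five projects count equally, and that the two quizzes count equally. The function name `final_grade` is hypothetical.

```python
def final_grade(project_scores, quiz_scores):
    """Combine scores using the 80% projects / 20% quizzes split.

    Assumption (not from the syllabus): each score is on a 0-100 scale,
    and items within a category are weighted equally.
    """
    if len(project_scores) != 5 or len(quiz_scores) != 2:
        raise ValueError("expected 5 project scores and 2 quiz scores")

    # Average each category, then apply the category weights.
    project_avg = sum(project_scores) / len(project_scores)
    quiz_avg = sum(quiz_scores) / len(quiz_scores)
    return 0.8 * project_avg + 0.2 * quiz_avg


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(final_grade([100, 100, 100, 100, 100], [50, 50]))
```

Under these assumptions, the projects dominate the result: a perfect project record with 50% on both quizzes still lands at 90.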
Graduate Credit
If you are enrolled in the graduate section, CS 6476, you will be expected to do additional work on each project. Each project will list several extra credit opportunities, and CS 6476 students will be required to do at least 10 points' worth of them. Those first 10 points do not earn extra credit; only work beyond 10 points does.
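The extra credit rule for the two sections can be sketched as follows. This is only an illustration: the function name `bonus_points` is hypothetical, and the sketch does not model what happens when a CS 6476 student completes fewer than the required 10 points, since the syllabus does not specify that.

```python
REQUIRED_GRAD_POINTS = 10  # from the syllabus: CS 6476 must complete at least 10 points


def bonus_points(extra_points_earned, is_graduate):
    """Return how many extra credit points actually count as a bonus.

    Assumption: any shortfall below the 10 required points for CS 6476
    is handled elsewhere; this sketch only reports the bonus.
    """
    if is_graduate:
        # The first 10 points are required work, so only the excess counts.
        return max(0, extra_points_earned - REQUIRED_GRAD_POINTS)
    # For CS 4495, every extra credit point counts as a bonus.
    return extra_points_earned
```

For example, 15 points of extensions would count as 15 bonus points in CS 4495 but only 5 in CS 6476.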
Academic Integrity
Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code and Student Code of Conduct, available online at www.honor.gatech.edu. For quizzes, no supporting materials are allowed (notes, calculators, phones, etc.).
Learning Accommodations
If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the ADAPTS office (www.adapts.gatech.edu).
- Piazza for CS 4495 / 6476. This should be your first stop for questions and announcements.
- Matlab Tutorial
Contact Info and Office Hours:
If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff.
- James: hays[at]gatech.edu
- GTA Grady Williams: gradyrw[at]gatech.edu
- GTA Brian Goldfain: bgoldfain3[at]gatech.edu
- GTA Patsorn Sangkloy: patsorn_sangkloy[at]gatech.edu
- GTA Nam Vo: namvo[at]gatech.edu
- GTA Sen "Henry" Hu: henryhu[at]gatech.edu
- UTA Carden Bagwell: cardenb[at]gatech.edu
- UTA Zhengyang Wu: wzy930712[at]gatech.edu
- James, Monday and Wednesday, 2:00-3:00, CCB 222 or 315.
- Grady and Brian, Tuesday 2 to 6pm in CoC picnic area.
|Image Filtering and Hybrid images|
|Local Feature Matching|
|Scene Recognition with Bag of Words|
|Face Detection with a Sliding Window|
|Boundary Detection with Sketch Tokens|
Textbook
Readings will be assigned in "Computer Vision: Algorithms and Applications" by Richard Szeliski. The book is available for free online or available for purchase.
|Date|Topic|Slides|Reading|Projects|
|---|---|---|---|---|
|Mon, Aug 17|Introduction to computer vision|.ppt, .pdf|Szeliski 1| |
|Wed, Aug 19|Light and Color, Cameras and Optics|.ppt, .pdf|Szeliski 2.2 and 2.1, especially 2.1| |
|Mon, Aug 24|Cameras and Optics continued, Image Filtering|.ppt, .pdf|Szeliski 3.2|Project 1 out|
|Wed, Aug 26|Image filtering|.ppt, .pdf|Szeliski 3.2| |
| |Thinking in frequency|.ppt, .pdf|Szeliski 3.4| |
| |Image pyramids and applications|.ppt, .pdf|Szeliski 3.5.2 and 8.1.1| |
| |Edge detection|.ppt, .pdf|Szeliski 4.2| |
| |Interest points and corners|.ppt, .pdf|Szeliski 4.1.1|Project 1 due|
| |Local image features|.ppt, .pdf|Szeliski 4.1.2|Project 2 out|
| |Feature matching and Hough transform|.ppt, .pdf|Szeliski 4.1.3 and 4.3.2| |
| |Model fitting and RANSAC|.ppt, .pdf|Szeliski 6.1| |
| |Stereo|.ppt, .pdf|Szeliski 11| |
| |Epipolar Geometry and Structure from Motion|.ppt, .pdf|Szeliski 7| |
| |Feature Tracking and Optical Flow|.ppt, .pdf|Szeliski 8.1 and 8.4| |
| |Machine learning intro and clustering|.ppt, .pdf|Szeliski 5.3| |
| |Machine learning: clustering continued|.ppt, .pdf|Szeliski 5.3|Project 2 due|
| |Machine learning: classification|.ppt, .pdf| |Project 3 out|
| |Recognition overview and bag of features|.ppt, .pdf|Szeliski 14| |
| |Large-scale instance recognition|.ppt, .pdf|Szeliski 14.3.2| |
| |Detection with sliding windows: Viola Jones|.ppt, .pdf|Szeliski 14.1| |
| |Detection continued and Quiz 1 discussion|See above|Szeliski 14.2| |
| |Scene recognition with SUN database|.ppt, .pdf| | |
| |Mixture of Gaussians and advanced feature encoding|.ppt, .pdf| |Project 3 due|
| |Modern object detection|.ppt, .pdf|Szeliski 14.1| |
| |Internet scale vision, pt 1|.ppt, .pdf|Szeliski 14.5|Project 4 out|
| |Internet scale vision, pt 2|.ppt, .pdf| | |
| |Guest lecture|Project page| | |
| |Human computation and crowdsourcing|.ppt, .pdf| | |
| |Attributes and more crowdsourcing|.ppt, .pdf| | |
| |Sketch Recognition and more crowdsourcing|.ppt, .pdf| | |
| |Modern boundary detection and Pb|.ppt, .pdf|Szeliski 4.2|Project 4 due|
| |Modern boundary detection and sketch tokens|.ppt, .pdf, gPb, Sketch Tokens|Szeliski 4.2| |
| |Project 5 introduction|.ppt, .pdf|Szeliski 5.5|Project 5 out|
| |Context and Spatial Layout|.ppt, .pdf| | |
| |Context and Scene parsing|.ppt, .pdf| | |
| |Exam Period - not used| | |Project 5 due|