CS 7641 Machine Learning
CSE/ISYE 6740 Computational Data Analysis
Lecture Time: Tuesday and Thursday, 1:35 - 2:55pm (starting Aug 19)
Lecture Location: Clough 152
Machine learning studies the question "how can we build computer programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you.
The course is designed to answer the most fundamental questions about machine learning: How can we conceptually organize the large collection of available methods? What are the most important methods to know about, and why? How can we answer the question 'is this method better than that one' with some theoretical guidance or for a specific dataset of interest? What can we say about the errors our method will make on future data? What's the 'right' objective function? What does it mean to be statistically rigorous? Should I be a Bayesian? What computer science ideas can make ML methods tractable on modern large or complex datasets? What are the open questions?
This course is designed to give students a thorough grounding in the concepts, methods and algorithms needed to do research and applications in machine learning. The course covers topics from machine learning, classical statistics, data mining, Bayesian statistics and information theory. Students entering the class with a pre-existing working knowledge of probability, statistics, linear algebra and algorithms will be at an advantage.
If you are not prepared for a mathematically rigorous and intensive machine learning class, I suggest you take Introductory Machine Learning (CS 4641) or Data and Visual Analytics (CSE 6242). If you already have extensive experience in machine learning or have taken online courses in machine learning, I suggest you take a more theory-oriented class: Advanced Machine Learning (ML 8803) or Machine Learning Theory (CS 7545).
- Pattern Recognition and Machine Learning, Chris Bishop
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, Jerome Friedman
The requirements of this course consist of participating in lectures, taking the midterm and final exams, and completing 4 assignments. The most important thing for us is that by the end of this class students understand the basic methodologies in machine learning and are able to use them to solve real problems of modest complexity. The grading breakdown is as follows:
- Homework (4 assignments, 50%)
- Midterm exam (20%)
- Final exam (20%)
- Background test (4%)
- Participation (6%)
- If you score below 40% on the background test, the class may be too difficult for you, and you should consider taking it another time when you are better prepared. If you stay, you will lose the 4% credit.
- If you score between 40% and 79% and decide to take the class, you must attend the mandatory recitation session to get the 4% credit.
- If you score above 79%, you automatically get the 4% credit and are not required to attend the recitation session.
- Homework must be submitted before the deadline set in T-Square. Submissions after the deadline receive zero credit.
- No late submissions will be accepted through email, and we do not guarantee replies to such emails.
- We strongly encourage you to use LaTeX for your submission. We will give 10 extra credits for submissions typed in LaTeX or a word processor, since we understand that typing takes longer. Unreadable handwriting is subject to zero credit.
- Any kind of academic misconduct is subject to an F grade as well as a report to the Dean of Students. All answers and code must be your own work. If you refer to any material, it must be properly cited.
The exams will be held in class and will be open book and open notes. No electronic devices will be allowed.
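As a rough illustration of the grading arithmetic, the weights and the background-test rule above can be sketched in Python. This is only a sketch under stated assumptions, not the course's official grade calculation: the function names are hypothetical, each component is assumed to be reported as a percentage from 0 to 100, a score of exactly 40 or 79 is assumed to fall in the middle band, and the LaTeX extra credit is left out because the source does not say how it is applied.

```python
# Weights (percent of the final grade) taken from the grading breakdown.
WEIGHTS = {
    "homework": 50,
    "midterm": 20,
    "final": 20,
    "background_test": 4,
    "participation": 6,
}


def background_credit(test_score, attended_recitation):
    """Return the background-test credit in grade points (0 or 4).

    Assumption: scores of exactly 40 and 79 belong to the middle band.
    """
    full_credit = float(WEIGHTS["background_test"])
    if test_score < 40:
        # Below 40%: the credit is lost if the student stays in the class.
        return 0.0
    if test_score <= 79:
        # 40-79%: the credit requires attending the recitation session.
        return full_credit if attended_recitation else 0.0
    # Above 79%: the credit is automatic.
    return full_credit


def course_total(homework, midterm, final, participation,
                 test_score, attended_recitation):
    """Combine component percentages (0-100) into a course total (0-100)."""
    weighted = (
        WEIGHTS["homework"] * homework
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final"] * final
        + WEIGHTS["participation"] * participation
    ) / 100
    return weighted + background_credit(test_score, attended_recitation)


if __name__ == "__main__":
    # Hypothetical student: 80% homework, 70% midterm, 90% final,
    # full participation, 60% on the background test, attended recitation.
    print(course_total(80, 70, 90, 100, 60, True))
```

Under these assumptions, the background test acts as a pass/fail 4-point component rather than a scaled score, which is why it is handled separately from the weighted sum.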
Le Song, Klaus 1340, Office Hours: Fri 4-5pm
TA office hours:
Session 1 (Joonseok Lee & Shuang Li): Mon 2-3pm, KACB 1315
Session 2 (Bo Xie & Amir Afsharinejad & Kaushik Patnaik): Wed 2-3pm, KACB 1315
We encourage you to use the Piazza discussion forum for course discussion. Note that the forum is mainly for peer discussion among students. If you have a question for the instructor or TAs, please email them directly at firstname.lastname@example.org.
Syllabus and Schedule
|Date||Lecture & Topics||Readings & Useful Links|
|Introduction and Backgrounds|
|Unsupervised Machine Learning Techniques (Data Exploration)|
|Supervised Machine Learning Techniques (Predictive Models)|
|*** Thu 10/9, Midterm Review ***|
|*** 10/11-10/14, Fall 2014 Student Recess ***|
|*** 10/16, Midterm Exam (in class, tentative) ***|
|Advanced topics (Complex Models)|
|*** 11/27-11/28 Thanksgiving Break ***|
|*** 12/2 Class Review ***|
|*** 12/11 Final Exam (2:50 - 5:40pm, same classroom) ***|
Multivariate Gaussians. notes
Review of linear algebra by Zico Kolter. notes
Matrix Cookbook. notes
MATLAB/Python cheat sheet. notes