CSE6740/CS7641/ISYE6740
Machine Learning I: Computational Data Analysis
Fall 2013
Lecture Time
Tuesday and Thursday 1:35-2:55pm in Instructional Center 1234 (starting Aug 20)
Course Description
Machine learning studies the question "how can we build computer programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you.
The course is designed to answer the most fundamental questions about machine learning: How can we conceptually organize the large collection of available methods? What are the most important methods to know about, and why? How can we answer the question 'is this method better than that one' using asymptotic theory? How can we answer the question 'is this method better than that one' for a specific dataset of interest? What can we say about the errors our method will make on future data? What's the 'right' objective function? What does it mean to be statistically rigorous? Should I be a Bayesian? What computer science ideas can make ML methods tractable on modern large or complex datasets? What are the open questions?
This course is designed to give students a thorough grounding in the concepts, methods and algorithms needed to do research and applications in machine learning. The course covers topics from machine learning, classical statistics, data mining, Bayesian statistics and information theory. Students entering the class with a preexisting working knowledge of probability, statistics and algorithms will be at an advantage.
If you are not prepared for a mathematically rigorous and intensive machine learning class, I suggest you take Introductory Machine Learning (CS 4641) or Data and Visual Analytics (CSE 6242). If you already have extensive experience in machine learning or have taken some online courses in machine learning, I suggest you take a more theory-oriented class: Advanced Machine Learning (ML 8803) or Machine Learning Theory (CS 7545).
Textbooks
 Pattern Recognition and Machine Learning, Chris Bishop
 The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, Jerome Friedman.
 Machine Learning, Tom Mitchell
Grading
The requirements of this course consist of participating in lectures, midterm and final exams, and 6 problem sets. The most important thing for us is that by the end of this class students understand the basic methodologies in machine learning and are able to use them to solve real problems of modest complexity. The grading breakdown is the following:
 Homework (6 assignments, 60%)
 Midterm exam (20%)
 Final exam (20%)
Late Homework Policy
Late homework is not accepted: homework turned in after the deadline is worth zero credit. You must turn in all of the homeworks, even if for zero credit, in order to pass the course.
Exams
The exams will be open book and open notes. Internet usage will not be allowed.
People
Instructor: Le Song, Klaus 1340, Office Hours: The half hour right after each lecture
Guest Lecturer: Bistra Dilkina
TA Office Hours: TBD
TA: Joonseok Lee, Klaus 1315, 12pm Monday
TA: Zhen Wang, Klaus 1315, 12pm Monday
TA: Nan Du, Klaus 1315, 2-3pm Friday
TA contact email: mlcda2013@gmail.com
Class Assistant: Mimi Haley, Klaus 1321
Mailing List
Discussion forum: https://groups.google.com/d/forum/cse6740fall2013
Mailing list: cse6740fall2013@googlegroups.com
Syllabus and Schedule
Date  Lecture & Topics  Readings & Useful Links  Handouts 

Tue 8/20 

Slides  
Unsupervised Machine Learning Techniques (Data Exploration)  
Thu 8/22 

Slides Codes 

Tue 8/27 

Slides Codes 

Thu 8/29 

Slides Codes 

Tue 9/3 

Slides Codes 

Thu 9/5 

Slides Codes 

Tue 9/10 

Slides Codes 

Thu 9/12 

Slides Codes 

Supervised Machine Learning Techniques (Predictive Models)  
Tue 9/17 

Slides Codes 

Thu 9/19 

Slides  
Tue 9/24 

Slides Assignment 3 

Thu 9/26 

Slides  
Tue 10/1 

Slides  
*** Thu 10/3, Midterm Review ***  
Tue 10/8

Slides Assignment 4 

Thu 10/10

Slides  
*** 10/12-10/15, Fall 2013 Student Recess ***
*** 10/17, Midterm Exam ***  
Tue 10/22 


Slides 
Thu 10/24 


Slides 
Tue 10/29 


Slides 
Advanced topics (Complex Models)  
Thu 10/31 


Slides 
Tue 11/5 


Slides Assignment 5 
Thu 11/7 

Slides  
Tue 11/12 

Slides  
Thu 11/14 

Slides  
Tue 11/19 

Slides Assignment 6 

Thu 11/21 

Slides  
Tue 11/26 

Slides  
*** 11/28-11/29 Thanksgiving Break ***
*** 12/3 Class Review ***
*** 12/9 Final Exam ***
Additional Materials:
Basic probability and statistics. notes1, notes2, notes3
Multivariate Gaussians. notes
Review of linear algebra by Zico Kolter. notes
Matrix Cookbook. notes
Matlab Python cheatsheet. notes