Date 
Topic 
Subtopics 
Reading 
Assignments 
Tue 8/24 
Data Mining and Analysis Overview I 
Examples of ML; tasks of ML; course logistics 
Chapter 1 Lecture 1 

Thu 8/26 
Data Mining and Analysis Overview II 
Parametric vs. nonparametric; parts of ML; generalization and over/underfitting; cross-validation 
Chapter 2, 7.10, 13.3 Lecture 2 

Tue 8/31 
Basic Concepts of Probability and Statistics I 
Distributions; manipulating probabilities; statistics 
none (lecture only) Lecture 3 
HW1 out 
Thu 9/2 
Basic Concepts of Probability and Statistics II 
Parametric density estimation; maximum likelihood; generative classification 
none (lecture only) Lecture 4 

Tue 9/7 
Basic Concepts of Probability and Statistics III 
Density estimation task; mixture of Gaussians; optimization; EM algorithm 
Sections 6.8, 8.5, 10.10 intro, 10.10.1, 12.2.3 Lecture 5 
HW1 due; HW2 out 
Thu 9/9 
Basic Concepts of Probability and Statistics IV 
Nonparametric estimation; histogram; kernel density estimation; bias-variance tradeoff 
Sections 2.9, 6.1, 6.2 Lecture 6 

Tue 9/14 
Supervised Learning I 
Kernel discriminant analysis; kernel regression; temporalization 
Sections 6.3, 6.6, 6.8 Lecture 7 
HW2 due; HW3 out 
Thu 9/16 
Supervised Learning II 
Linear regression; ridge regression and lasso; regularization; neural networks 
Sections 3.1, 3.4, 11.1-11.8, 11.9, 4.5 Lecture 8 

Tue 9/21 
Supervised Learning III 
Logistic regression; linear support vector machine 
Sections 4.1, 4.2, 4.3, 4.4, 12.1, 12.2 Lecture 9 
HW3 due; take-home midterm out 
Thu 9/23 
Supervised Learning IV 
Kernel trick; kernels; kernelized support vector machine 
Sections 12.3.1-12.3.4, 5.8 Lecture 10 

Tue 9/28 
Supervised Learning V 
Nearest-neighbor; decision trees 
Sections 9.1, 9.2 Lecture 11 
Take-home midterm due; HW4 out 
Thu 9/30 
Above Learning I 
Bootstrap; bagging 
Sections 8.2, 8.7 Lecture 12 

Tue 10/5 
Above Learning II 
Random forests; stacking; boosting 
Sections 15.1-15.4, 8.8, 10.1-10.6, 10.11 Lecture 13 
HW4 due; HW5 out 
Thu 10/7 
Above Learning III 
Feature selection; cross-validation with feature selection 
Sections 3.3, 3.6, 10.13, 7.10 Lecture 14 

Tue 10/12 
Unsupervised Learning I 
Clustering; k-means; how to choose k 
Sections 14.1, 14.3.1-14.3.8, 14.3.10-14.3.11 Lecture 15 
HW5 due; HW6 out 
Thu 10/14 
Unsupervised Learning II 
Constrained clustering; hierarchical clustering; mean-shift; biclustering 
Section 14.3.12 Lecture 16 

Tue 10/19 
Fall recess 



Thu 10/21 
Unsupervised Learning III 
Association rules 
Section 14.2 Lecture 17 
HW6 due; HW7 out 
Tue 10/26 
Unsupervised Learning IV 
Dimension reduction; principal component analysis 
Section 14.5.1 Lecture 18 

Thu 10/28 
Unsupervised Learning V 
Independent component analysis; multidimensional scaling; manifold learning 
Sections 14.7-14.9 Lecture 19 
HW7 due; HW8 out 
Tue 11/2 
Practical Issues and Validation I 
Asymptotic distributions; statistical inequalities; confidence bands 
none (lecture only) Lecture 20 

Thu 11/4 
Practical Issues and Validation II 
Computation: fast sums and searches; multidimensional trees 
none (lecture only) Lecture 21 
HW8 due; HW9 out 
Tue 11/9 
Practical Issues and Validation III 
Computation: unconstrained optimization; constrained optimization 
none (lecture only) Lecture 22 

Tue 11/16 
Practical Issues and Validation IV 
Comparing learners; hypothesis testing 
none (lecture only) Lecture 23 
HW9 due; HW10 out 
Tue 11/23 
Practical Issues and Validation V 
Data issues: types of data (structured, non-vector, compressed); outliers and robustness; corrupted, noisy, expensive, and heterogeneous data 
none (lecture only) Lecture 24 
HW10 due 
Thu 11/25 
Holiday 



Tue 11/30 
Practical Issues and Validation VI 
Visualizing and presenting data 
none (lecture only) Lecture 25 

Thu 12/2 
Practical Issues and Validation VII 
The entire data analysis process; styles of methods; when to use which methods; things I didn't teach you 
none (lecture only) Lecture 26 

Tue 12/7 
Review Session 



Thu 12/16 
Final Exam 


Exam: 2:50-5:40pm 