**Recommended (but not required) textbooks:**

- *An Introduction to Computational Learning Theory*, by M. Kearns and U. Vazirani.
- *A Probabilistic Theory of Pattern Recognition*, by L. Devroye, L. Györfi, and G. Lugosi.

- 08/19: Introduction. The consistency model. See also Chapter 1 in the Kearns-Vazirani book.
- 08/21: The PAC model for passive supervised learning. See also Chapter 1 in the Kearns-Vazirani book.
- 08/26: The PAC model for passive supervised learning (cont'd). See also Chapter 1 in the Kearns-Vazirani book.
- 08/28: The Perceptron algorithm for learning linear separators.
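The Perceptron update from the 08/28 lecture fits in a few lines. The mistake-driven rule w ← w + y·x is the standard one; the pass limit `max_passes` and the halting check are choices made for this illustration, not part of the course materials:

```python
def perceptron(examples, max_passes=100):
    """Illustrative Perceptron for linear separators through the origin.

    examples: list of (x, y) with x a tuple of floats and y in {-1, +1}.
    """
    w = [0.0] * len(examples[0][0])
    for _ in range(max_passes):
        mistakes = 0
        for x, y in examples:
            # Predict with the sign of the inner product <w, x>;
            # a non-positive margin y * <w, x> counts as a mistake.
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                # Mistake-driven update: move w toward y * x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # consistent with every example: done
            return w
    return w
```

On a linearly separable sample such as `[((1.0, 1.0), 1), ((2.0, 0.0), 1), ((-1.0, -1.0), -1), ((0.0, -2.0), -1)]`, the loop terminates as soon as a full pass produces no mistakes, matching the mistake-bound analysis from lecture.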
- 09/05: The mistake bound model. See also: An Online Learning Survey by Avrim Blum (Section 3 in particular).
- 09/09: The Winnow algorithm. See also: the original Winnow paper, and the Kakade and Tewari lecture notes.
- 09/11: Tail inequalities. Simple sample complexity results for the agnostic case.
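The Winnow algorithm from the 09/09 lecture replaces the Perceptron's additive update with a multiplicative one. The sketch below follows the Winnow1-style promote/eliminate variant for Boolean inputs; the threshold n and the fixed number of passes are assumptions made for this example:

```python
def winnow(examples, n, passes=5):
    """Illustrative Winnow1 for inputs x in {0,1}^n and labels y in {0,1}.

    Predict 1 iff <w, x> >= n; update weights multiplicatively on mistakes.
    """
    w = [1.0] * n
    for _ in range(passes):
        for x, y in examples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= n else 0
            if pred == 0 and y == 1:
                # Promotion: double the weights of the active coordinates.
                w = [wi * 2 if xi else wi for wi, xi in zip(w, x)]
            elif pred == 1 and y == 0:
                # Elimination: zero out the active coordinates.
                w = [0.0 if xi else wi for wi, xi in zip(w, x)]
    return w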
- 09/16: The Vapnik-Chervonenkis dimension. Sauer's lemma. See also Chapter 3 in the Kearns-Vazirani book.
- 09/18: Sample complexity results for infinite hypothesis spaces (cont'd). See also Chapter 3 in the Kearns-Vazirani book.
- 09/23: Sample complexity lower bounds for passive supervised learning. See also Chapter 3.6 in the Kearns-Vazirani book.
- 09/25: Sample complexity results for infinite hypothesis spaces (cont'd).
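Sauer's lemma from the 09/16 lecture gives a concrete, computable cap on the number of labelings a class of VC dimension d can realize on m points; a minimal sketch:

```python
from math import comb

def sauer_bound(d, m):
    """Sauer's lemma: a class of VC dimension d induces at most
    sum_{i=0}^{d} C(m, i) distinct labelings on any set of m points."""
    return sum(comb(m, i) for i in range(d + 1))
```

For m <= d the bound equals 2^m (every labeling is possible), while for m >> d it grows only as O(m^d); that polynomial growth is what drives the sample complexity results for infinite hypothesis spaces in the following lectures.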
- 09/30: Boosting: weak learning and strong learning. See Avrim Blum's lecture notes. See also Chapter 4 in the Kearns-Vazirani book.
- 10/02: AdaBoost. See Rob Schapire's lecture notes. See also: A Short Introduction to Boosting, by Yoav Freund and Rob Schapire.
- 10/07: AdaBoost. Generalization error bounds: naive and margins-based.
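The AdaBoost loop from the 10/02 lecture can be sketched with one-dimensional threshold stumps standing in for the weak learner; the stump class, the round count, and the early-stop handling of a zero-error stump are assumptions made for this example:

```python
import math

def adaboost(xs, ys, rounds=10):
    """Illustrative AdaBoost on 1-D data with threshold stumps
    h(x) = s * sgn(x - theta) as weak hypotheses."""
    m = len(xs)
    D = [1.0 / m] * m                      # distribution over examples
    ensemble = []                          # list of (alpha, theta, s)
    for _ in range(rounds):
        # Weak learner: exhaustively pick the stump of lowest weighted error.
        best = None
        for theta in xs:
            for s in (+1, -1):
                err = sum(Di for Di, x, y in zip(D, xs, ys)
                          if s * (1 if x >= theta else -1) != y)
                if best is None or err < best[0]:
                    best = (err, theta, s)
        err, theta, s = best
        if err == 0.0:                     # perfect stump: stop early
            ensemble.append((10.0, theta, s))
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, theta, s))
        # Reweight: mistakes gain probability mass, correct examples lose it.
        D = [Di * math.exp(-alpha * y * s * (1 if x >= theta else -1))
             for Di, x, y in zip(D, xs, ys)]
        Z = sum(D)
        D = [Di / Z for Di in D]
    def H(x):
        vote = sum(a * s * (1 if x >= t else -1) for a, t, s in ensemble)
        return 1 if vote >= 0 else -1
    return H
```

The exponential reweighting and the choice alpha_t = (1/2) ln((1 - err_t)/err_t) are the standard AdaBoost quantities whose product-of-Z_t analysis yields the training error bound discussed in lecture.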
- 10/09: Kernel methods.
- 10/16: Kernel methods (cont'd). Properties of legal kernel functions. More examples of kernels. See also: Kernels as Features: On Kernels, Margins, and Low-dimensional Mappings, Machine Learning Journal 2006.
- 10/21: Support Vector Machines. See Andrew Ng's notes. See also Stephen Boyd's lectures on Convex Optimization.
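One way the 10/16 lecture's "legal kernel" property is made concrete is by exhibiting an explicit feature map whose inner product reproduces the kernel. A minimal sketch for the degree-2 polynomial kernel; the particular kernel and the constant c are choices made for this example:

```python
import math

def poly_kernel(x, z, c=1.0):
    """Degree-2 polynomial kernel K(x, z) = (<x, z> + c)^2."""
    return (sum(xi * zi for xi, zi in zip(x, z)) + c) ** 2

def poly_features(x, c=1.0):
    """Explicit feature map phi with <phi(x), phi(z)> = (<x, z> + c)^2:
    all ordered products x_i x_j, scaled linear terms, and a constant."""
    n = len(x)
    feats = [x[i] * x[j] for i in range(n) for j in range(n)]
    feats += [math.sqrt(2 * c) * x[i] for i in range(n)]
    feats.append(c)
    return feats
```

Expanding the inner product of the features gives (<x, z>)^2 + 2c<x, z> + c^2 = (<x, z> + c)^2, so the kernel is legal: it is an inner product in a (here, explicitly constructed) feature space, which is the defining property from lecture.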
- 10/23: Generalization bounds based on the Rademacher complexity.
- 10/28: Generalization bounds based on the Rademacher complexity (cont'd). See: Introduction to Statistical Learning Theory, by O. Bousquet, S. Boucheron, and G. Lugosi.
- 10/30: Semi-supervised learning. See also: the Encyclopedia of Machine Learning entry on Semi-Supervised Learning, by X. Zhu.
- 11/04: Theoretical foundations for semi-supervised learning. See also: A Discriminative Model for Semi-Supervised Learning, JACM 2010.
- 11/06: Active learning. For the CAL and A^2 algorithms, as well as the disagreement coefficient analysis, see Shai Shalev-Shwartz's notes.
- 11/11: Computationally efficient active learning with nearly optimal label complexity. See also: Active and Passive Learning of Linear Separators under Log-concave Distributions, COLT 2013.
- 11/13: Online Learning, Combining Expert Advice, and Regret Minimization. The Weighted Majority algorithm. See also: the original Weighted Majority paper, by N. Littlestone and M. Warmuth.
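The deterministic Weighted Majority algorithm from the 11/13 lecture also fits in a few lines; the penalty factor beta = 1/2, the {0,1} prediction space, and the tie-breaking toward 1 are choices fixed here for illustration:

```python
def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Illustrative deterministic Weighted Majority over {0,1} predictions.

    expert_preds: one tuple of expert predictions per round.
    outcomes:     the revealed correct label for each round.
    Returns (number of algorithm mistakes, final expert weights).
    """
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):
        # Predict with the weighted majority vote of the experts.
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        # Penalize every expert that was wrong on this round.
        w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return mistakes, w
```

The analysis from lecture bounds the algorithm's mistakes by O(log n + m*), where m* is the mistake count of the best of the n experts: each algorithm mistake shrinks the total weight by a constant factor, while the best expert's weight shrinks by at most beta^{m*}.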

**Additional texts and resources:**

- PASCAL video lectures.
- *Kernel Methods for Pattern Analysis*, by N. Cristianini and J. Shawe-Taylor. Cambridge University Press, 2004.
- *An Introduction to Support Vector Machines (and other kernel-based learning methods)*, by N. Cristianini and J. Shawe-Taylor. Cambridge University Press, 2000.
- *Neural Network Learning: Theoretical Foundations*, by M. Anthony and P. Bartlett. Cambridge University Press, 1999.
- *Statistical Learning Theory*, by V. Vapnik. Wiley, 1998.
- *Foundations of Machine Learning*, by M. Mohri, A. Rostamizadeh, and A. Talwalkar. MIT Press, 2012.
- *Learning Theory*, by A. Tewari and P. L. Bartlett. E-Reference Signal Processing, 2013.