Machine Learning Theory (Weizmann course, Fall 2006-7)
In theory, this course will cover models and algorithms for machine learning. What classes of functions are learnable and what algorithms should be used? Where possible, we will try to prove appealing properties of algorithms that are also useful in practice. This course will also demystify many buzzwords such as: PAC learning, VC dimension, Boosting, Fourier Techniques, Statistical Queries, Parity with Noise, Decision Tree learning, Support Vector Machines, and the Multi-Armed Bandit Problem.
Note: Another course being offered this semester is Probabilistic Graphical Models by Eran Segal. His course is disjoint from this course and would be excellent to take in conjunction.
Location: Weizmann Institute, Ziskind Room 1
Time: Tue. 14:00-16:00, Oct. 31, 2006-Feb.
6, 2007
Instructor: Adam Tauman Kalai.
Grader: Ariel Gabizon.
This course is partly based on an excellent Machine Learning Theory course taught by Avrim Blum at CMU.
Mailing list
The course mailing list is here. You can join or simply read the messages.
Recommended textbooks
Michael Kearns and Umesh Vazirani. An introduction to computational learning theory.
Tom Mitchell. Machine Learning. (More applied)
Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The elements of statistical learning. (More statistics-oriented)
Homework
Homework should be turned in either in class or may be put in the MACHINE LEARNING course mail box, which is on the second floor of Ziskind.
Problem set 1 is now available. Due Nov. 28, 2006.
Problem set 1 hint is now available.
Problem set 2 is now available. Due Dec. 14, 2006.
Problem set 3, theory or implementation is now available. You have your choice. Due Jan. 11, 2007.
Problem set 4 is now available. Due Feb. 8, 2007. (Clarifications added 8/2/07.)
Lectures
This schedule is certain to change.
Introduction and online learning
Oct 31. Introduction. Learning an OR via the WINNOW algorithm, and online adaptation. An interesting article, The discipline of machine learning by Tom Mitchell.
Nov 7. The Weighted Majority and Perceptron algorithms.
An interesting related paper is the following:
Smoothed Analysis of the Perceptron Algorithm for Linear Programming by Avrim Blum and John Dunagan. Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, (SODA 2002), pp. 905-914, 2002.
Nov 14. Holiday.
Batch learning
Nov 21. Batch learning, online to batch conversion, and Occams razor. See also the handout on tail inequalities from Avrim Blum. Slides are available here.
Nov 28. VC dimension. Slides are available.
Dec 5. No class.
See interesting related lecture on Sunday, Dec. 10.
Dec 12. Boosting: AdaBoost. Here is a survey of boosting, and some excellent slides by Schapire.
Dec 19. Decision trees and Noisy/Real boosting. Slides are available. Note, course meets in Ziskind 261.
Dec 26. Completion of real-valued boosting and decision regression graphs,
and additive models. Last weeks notes plus this weeks slides cover this week.
Jan 2. Class canceled.
Jan 9. Statistical queries and learning parity with noise.
Jan 16. Support vector machines and fourier methods. We will use slides by Martin Law.
Jan 23. SVMs, Fourier, and Agnostic Learning.
Other models of learning
Jan 30. Online optimization. See paper by Zinkevich. For extension to the bandit setting, see this paper.
Feb 6. Learning in repeated games. See Regret in the On-line Decision Problem by Foster and Vohra for a great introduction. To see the specific reduction from external to internal regret I was talking about, see From External to Internal Regret by Avrim Blum and Yishay Mansour.