# CS 4641 Machine Learning

## Summer 2012 Syllabus

Last updated on 2012-07-25 at 09:58.

# Instructor

Chris Simpkins

chris.simpkins@gatech.edu

http://www.cc.gatech.edu/~simpkins/

# Schedule and Office Hours

• Classes: Monday and Wednesday at 100-1145 in CoC 101
• Office Hours: Mondays and Wednesdays after class, or other times by appointment. I am always available by email.

# Prerequisites

• Official: CS 1331 (Programming 1) at the undergraduate semester level, with a minimum grade of C
• Unofficial: Basic probability and statistics, basic linear algebra

# Course Description

CS 4641 is an introductory survey of modern machine learning. Machine learning is an active and growing field that would require many courses to cover completely. This course aims for the middle of the spectrum between theory and practice. We will learn the concepts behind several machine learning algorithms without going deeply into the mathematics, and we will gain practical experience applying them. We will consider pattern recognition and artificial intelligence perspectives, making the course valuable to students interested in data science, engineering, and intelligent agent applications.

Required:

Recommended:

# Projects

Projects give you practical experience applying machine learning algorithms to real-world data. Each project is designed to highlight specific conceptual and practical issues to develop your intuition about how to use machine learning for your own problems.

• Project1: Supervised Learning (KNN, Decision Trees, Neural Networks)
• Project2: Unsupervised and Supervised Learning (Clustering, Dimensionality Reduction, SVMs, Boosting)
• Project3: Hidden Markov Models and Reinforcement Learning

Conceptual understanding and practical application will be graded equally:

• Projects: 50%
• Exams: 50%
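
The equal weighting above amounts to a simple weighted average. Here is a minimal sketch in Scala, assuming each category is a plain mean of its scores on a 0-100 scale. The syllabus does not say how individual projects or exams are combined within a category, and `GradeWeights`, `mean`, and `courseAverage` are hypothetical names used only for illustration.

```scala
object GradeWeights {
  // Weights taken from the syllabus: projects 50%, exams 50%.
  val ProjectWeight = 0.5
  val ExamWeight = 0.5

  // Assumption: each category is a plain mean of its scores (0-100 scale).
  def mean(scores: Seq[Double]): Double = {
    require(scores.nonEmpty, "need at least one score")
    scores.sum / scores.size
  }

  // Weighted course average across the two categories.
  def courseAverage(projectScores: Seq[Double], examScores: Seq[Double]): Double =
    ProjectWeight * mean(projectScores) + ExamWeight * mean(examScores)
}
```

For example, project scores of 80, 90, and 100 with exam scores of 60 and 80 give a course average of 80 under these assumptions.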

Final letter grades will be determined by clustering the distribution of course averages. This clustering will be as generous as possible consistent with applicable policy. Guaranteed grades will be determined by:

```scala
val letterGrade = courseAverage match {
  case avg if avg >= 90 => 'A
  case avg if avg >= 80 => 'B
  case avg if avg >= 70 => 'C
  case avg if avg >= 60 => 'D
  case _ => 'F
}
```

Note that in the past, the actual cutoffs for each letter grade have been much lower than the guaranteed cutoffs listed above. For example, on the last midterm exam a score of 30 was a B. It's the distribution that matters, not the raw score.
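
The syllabus does not specify which clustering method is used, so the following is only one way to picture the idea. It is a rough sketch that assumes grade groups are separated at the largest gaps between sorted course averages. `GradeClustering`, `gapClusters`, and the group count `k` are hypothetical and not part of the course policy.

```scala
object GradeClustering {
  // Split averages into at most k groups by cutting at the k-1 largest
  // gaps between neighboring values in descending order.
  // Assumption: this gap heuristic stands in for whatever clustering
  // the instructor actually applies.
  def gapClusters(averages: Seq[Double], k: Int): Seq[Seq[Double]] = {
    require(averages.nonEmpty, "need at least one average")
    require(k >= 1, "k must be positive")
    val sorted = averages.toVector.sortWith(_ > _) // highest average first

    // Gap between the value at index i and the next lower value.
    def gap(i: Int): Double = sorted(i) - sorted(i + 1)

    // Indices after which a new group starts: the k-1 widest gaps.
    val cutAfter: Set[Int] =
      (0 until (sorted.size - 1)).sortWith((i, j) => gap(i) > gap(j)).take(k - 1).toSet

    // Walk the sorted values, opening a new group after each cut index.
    sorted.indices.foldLeft(Vector(Vector.empty[Double])) { (groups, i) =>
      val extended = groups.init :+ (groups.last :+ sorted(i))
      if (cutAfter(i)) extended :+ Vector.empty[Double] else extended
    }
  }
}
```

With averages of 92, 88, 87, 71, 69, and 45 and `k = 3`, the two widest gaps (24 and 16 points) separate the groups 92/88/87, 71/69, and 45. Each group would then share a letter grade regardless of where the raw scores fall relative to the guaranteed cutoffs.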

# Resources

Class resources, including all assignments, will be maintained on this open-access web site. Projects will be submitted and grades will be maintained in T-Square.

• Machine Learning at GT
• Weka - Machine learning software you'll be using for some of your projects.
• UCI Machine Learning Repository - An online repository of data sets that can be used for machine learning experiments.
• Introduction to Probability by Bertsekas and Tsitsiklis. The first chapter on probability is available free on the book's web site and is all the background you need for Bayesian decision theory. The next three chapters on distributions and random variables are useful background for parametric methods and unsupervised learning. I bought a copy of this book for my own self-study refresher.
• I don't have a favorite linear algebra refresher. I used a Strang textbook when I put a gun to my head and enrolled in MATH 4305 Linear Algebra years ago, and it was fine. As with probability and statistics, you only need conceptual understanding for this course. As long as you understand what a matrix decomposition is, what a projection is, and the general concepts of lines, planes, and vector spaces, you're fine.

# Lectures and Assignments

• Readings should be completed before class.
• Homework should be started after the class in which it is assigned and completed before the next class.
• Assignments must be submitted on T-Square by 23:59:59 on the due date.
• Future schedule may change.

| Date | Topics | Assignments |
|------|--------|-------------|
| 2012-05-14 | Introduction | HW: I2ML 1.1, 6, 7, 8, 9 |
| 2012-05-16 | Supervised Learning | HW: I2ML 2.1, 2, 4, 5, 7, 8 |
| 2012-05-21 | Bayesian Decision Theory; Nonparametric Methods | HW: I2ML 3.1, 2, 5; 8.2, 3, 4 |
| 2012-05-23 | Decision Trees; ML Experiments | HW: I2ML 9.1, 2, 4; Assignment: Project1 |
| 2012-05-28 | Memorial Day Holiday | |
| 2012-05-30 | Linear Discrimination; Multilayer Perceptrons | HW: 10.1, 7-9; 11.1-3 |
| 2012-06-04 | Online Review | Work on Project 1 |
| 2012-06-06 | Online Office Hours | Work on Project 1 |
| 2012-06-11 | Exam 1 | Due 2012-06-11: Project1 |
| 2012-06-13 | Exam 1 and Project 1 Review | Withdrawal deadline: 2012-06-15 |
| 2012-06-18 | Parametric & Multivariate | Reading: I2ML 4, 5 |
| 2012-06-20 | Dimensionality Reduction | Reading: I2ML 6, PCA Primer |
| 2012-06-27 | Kernel Machines | Assignment: Project2 |
| 2012-07-02 | Combining Learners | Reading: I2ML 17 |
| 2012-07-04 | Independence Day Holiday | |
| 2012-07-06 | Exam 2 Review Session 15:00 in CoC 101 | |
| 2012-07-09 | Exam 2 | Reading: I2ML 4-7, 12-13, 17 |
| 2012-07-16 | Reinforcement Learning | Due 2012-07-16: Project2 |
| 2012-07-18 | Reinforcement Learning | Assignment: Project3 |
| 2012-07-23 | Hidden Markov Models | Reading: I2ML 15 |
| 2012-07-25 | Course Review | |