CS 4650: Natural Language Processing
Georgia Tech (Spring 2021)
This undergraduate-level course provides an introduction to modern natural language processing using machine learning approaches. Content includes linguistics fundamentals (syntax, semantics, distributional properties of language), machine learning models (classifiers, sequence taggers, deep learning models), key algorithms for inference, and applications to a range of problems. In-person attendance is not required for the course.
- Class Meets (Eastern Time)
- Mondays and Wednesdays, 3:30-4:45pm, BlueJeans
- Piazza (discussion, announcements, Q&A)
- Canvas (quizzes, etc.)
- Gradescope (homework submission, grading)
- Online Office Hours (Eastern Time)
- Jingfeng Yang: Thursday 10:00-11:00pm
- Kaige Xie: Wednesday 6:00-7:00pm
- Sarmishta Velury: Monday 6:00-7:00pm
While natural language processing is super cool, it requires the use of many modern machine learning algorithms and involves a lot of math and programming. To be successful in the class, on the math side, you should feel comfortable with probability, linear algebra, and calculus. For example, some of the lectures will involve partial derivatives and the multivariable chain rule. If you are not yet familiar with calculus, the best course of action would be to take Math 2550, Math 2551, or Math 2561 first, then come back in a later semester to take CS 4650. On the programming side, assignments will be in Python; you should understand basic computer science concepts (e.g., recursion), data structures (e.g., trees, graphs), and key algorithms (e.g., search, sorting). The official prerequisite for CS 4650 is CS 3510/3511, “Design and Analysis of Algorithms.” Ideally, you have also taken CS 3600, the intro-level “Machine Learning” class.
Subject to change as the term progresses.
First day of classes (no class meeting)
Background Test Out, Background Test Template, Due on Jan 21
MLK National Holiday
Slides, HW1 Out, HW1 Template (coming soon)
Slides (coming soon)
- 5% Background Test (individual)
This background test is designed to help you determine whether you have enough math and programming background to succeed in this class.
- 55% Homework Assignments (individual)
- Homework 1: 15% (written + 1st programming)
- Homework 2: 10% (written)
- Homework 3: 15% (written + 2nd programming)
- Homework 4: 15% (written + 3rd programming)
- 20% Final Project (group of 1-3) or the 4th programming homework (individual)
The final project is an open-ended assignment, with the goal of gaining experience applying the techniques presented in class to real-world datasets. Students should work in groups of 1-3 (groups of 2-3 are preferred, but working alone is possible). It is a good idea to discuss your planned project with the instructor to get feedback. The final project report should be 4 pages. The report should describe the problem you are solving, the data you are using, the technique you are proposing, and the baseline you are comparing against.
Alternatively, you may choose to complete the 4th programming homework individually, instead of the group final project.
- 10% Quizzes
- 10% Participation
You will receive credit for engaging in class discussion and for asking and answering homework-related questions on the Piazza discussion board.
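As a quick sanity check on the breakdown above, the weights can be combined as in the following minimal sketch. It assumes each component is scored on a 0-100 scale and that the total is a plain weighted sum; the syllabus does not state rounding rules or letter-grade cutoffs, so none are modeled. The names `WEIGHTS` and `weighted_total` and the example scores are hypothetical, chosen only for illustration.

```python
# Weights (percent of the final grade) copied from the grading breakdown.
# The dictionary keys are hypothetical labels, not official names.
WEIGHTS = {
    "background_test": 5,
    "homework_1": 15,
    "homework_2": 10,
    "homework_3": 15,
    "homework_4": 15,
    "final_project_or_4th_programming_hw": 20,
    "quizzes": 10,
    "participation": 10,
}

# The components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def weighted_total(scores):
    """Combine per-component scores (each 0-100) into a 0-100 course total.

    Assumption: a plain weighted sum, with no rounding or curving.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student: 90 on everything except full participation credit.
example_scores = {name: 90 for name in WEIGHTS}
example_scores["participation"] = 100
print(weighted_total(example_scores))  # 91.0
```

Under these assumptions, the four homework assignments together contribute 55 points of the total, matching the "55% Homework Assignments" line.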
Students may submit at most 2 homework assignments late, each by up to 3 days. Each late day extends the deadline by 24 hours. Using late days will not affect your grade. However, homework submitted late after all late days have been used will receive no credit. If you run into technical issues with submission, please email your homework to the instructor. Late submissions will not be accepted for the final project or quizzes.
No late penalties apply for medical reasons or emergencies. Please see the GT Catalog for rules about contacting the Office of the Dean of Students.
The class is full. Can I still get in?
Sorry. The course admins in CoC control this process. Please talk to them.
I am graduating this Fall and I need this class to complete my degree requirements. What should I do?
Talk to the advisor or graduate coordinator for your academic program. They are keeping track of your degree requirements and will work with you if you need a specific course.
I have a question. What is the best way to reach the course staff?
Registered students: your first point of contact is Piazza (so that other students may benefit from your questions and our answers). If you have a personal matter, email the instructor.
- (J+M) Jurafsky and Martin, Speech and Language Processing, 3rd edition (Dec 2020 draft)
- (E) Jacob Eisenstein, Natural Language Processing (2018)