NLP: CS4650/7650
MW 3:30-4:45pm, Hybrid
Course Information
This course gives an overview of modern data-driven techniques for natural language processing. The course moves from shallow bag-of-words models to richer structural representations of how words interact to create meaning. At each level, we will discuss the salient linguistic phenomena and the most successful computational models. Along the way we will cover machine learning techniques that are especially relevant to natural language processing.
Slides, materials, and project information for this iteration of the NLP course are borrowed from Jacob Eisenstein, Yulia Tsvetkov and Robert Frederking at CMU, Dan Jurafsky at Stanford, David Bamman at UC Berkeley, Noah Smith at UW, and Kai-Wei Chang at UCLA.
Teaching Assistants
 Class Meets
 Mondays and Wednesdays, 3:30-4:45pm
 Piazza
 piazza.com/gatech/fall2020/cs7650cs4650
 Staff Mailing List
 cs46507650f20staff@googlegroups.com
 Online Office Hours (Eastern Time)
 Nihal Singh: Monday 2-3 pm.
 Kaige Xie: Tuesday 6-7 pm.
 Jingfeng Yang: Tuesday 10-11 pm.
 Sandeep Soni: Wednesday 10-11 am.
 Yuval Pinter: Thursday 10-11 am.
 Haard Shah: Friday 2-3 pm.
Schedule
Note: this schedule is tentative and subject to change.
Date  Topic  Materials and Deadlines  Optional Reading
Aug 17  Introduction, Review  Slides
Aug 19  Text Classification (1)  Slides
Aug 24  Text Classification (2)  Slides, HW1 Out, HW1 Template
Aug 26  Language Modeling (1)  Slides
Aug 31  Language Modeling (2)  Slides
Sep 2  Deep Learning Basics  Slides, Deep Learning, PyTorch, HW2 Out, HW2 Template; HW1 Due
Sep 7  Holiday: Labor Day
Sep 9  Word Embedding (1)  Slides
Sep 14  Word Embedding (2)  Slides; Survey Team Sign Up Due, Sign Up Form
Sep 16  Word Embedding (3)  Slides, HW3 Out, HW3 Template; HW2 Due
Sep 21  Sequence Labeling (1)  Slides
Sep 23  Sequence Labeling (2)  Slides
Sep 28  Constituency Parsing (1)  Slides
Sep 30  Constituency Parsing (2)  Slides; HW3 Due
Oct 5  In-person Touchpoint  HW4 Out, HW4 Code
Oct 7  Parsing  Slides; Survey Proposal Due
Oct 12  Midterm Review  Slides
Oct 14  Midterm
Oct 19  Question Answering  Slides; HW4 Due
Oct 21  Machine Translation  Slides, HW5 Out, HW5 Template, HW5 Code Template; Midterm Due
Oct 26  Alan Ritter's Guest Lecture: Machine Reading with Less Supervision  Slides
Oct 28  Information Extraction  Slides
Nov 2  Dialogue Systems  Slides
Nov 4  Dialogue Systems + Review  Slides; HW5 Due
Nov 9  Computational Social Science  Slides
Nov 11  Xu Wei's Guest Lecture  Slides; Survey Report Due
Nov 16  Review  Slides
Nov 18  Computational Ethics  Slides
Grading
 55% Homework Assignments
 Homework 1: 11%
 Homework 2: 11%
 Homework 3: 11%
 Homework 4: 11%
 Homework 5: 11%
 15% Midterm Exam (Take-home Exam)
 20% Project/Survey
 Survey Proposal: Due Oct 7th, 11:59pm ET
 Survey Report: Due Nov 11th, 11:59pm ET
 Survey Final: Due Nov 22nd, 11:59pm ET
 10% Quiz
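As a rough illustration of how the weights above combine, the following Python sketch computes a final percentage. It assumes each component is scored on a 0-100 scale and combined linearly with no curve; the function name `final_grade` and the component keys are hypothetical, and how the 20% survey weight is split across proposal, report, and final is not stated here, so it is treated as a single score.

```python
# Weights (percent of final grade) copied from the grading breakdown.
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "hw1": 11, "hw2": 11, "hw3": 11, "hw4": 11, "hw5": 11,
    "midterm": 15,
    "survey": 20,   # assumption: one combined score for the whole survey
    "quiz": 10,
}


def final_grade(scores):
    """Return the weighted final percentage.

    `scores` maps every key in WEIGHTS to a score in [0, 100].
    Assumes a plain weighted average with no curve or rounding.
    """
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
```

For example, a student with full marks everywhere except a zero on one homework would end at 89 under these assumptions.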
Policies
Late Policies:
Each student will have a total of five late days to use when turning in homework assignments; each late day extends the deadline by 24 hours. There are no restrictions on how the late days can be used (e.g., all 5 could be used on one homework). Using late days will not affect your grade. However, homework submitted late after all late days have been used will receive no credit.
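The late-day accounting can be sketched in Python as follows. This is only an illustration under assumptions the policy does not spell out: a partial day late is charged as a full late day, homework is processed in submission order, and a submission that exceeds the remaining budget receives no credit without consuming any late days. The function name `apply_late_days` is hypothetical.

```python
import math

TOTAL_LATE_DAYS = 5  # from the late policy


def apply_late_days(hours_late_per_hw):
    """Return (credit_flags, late_days_remaining).

    `hours_late_per_hw` lists how many hours past the deadline each
    homework was submitted (0 or negative means on time).
    Assumption: any fraction of a 24-hour period costs one full late day.
    """
    remaining = TOTAL_LATE_DAYS
    credit = []
    for hours in hours_late_per_hw:
        needed = math.ceil(hours / 24) if hours > 0 else 0
        if needed <= remaining:
            remaining -= needed
            credit.append(True)
        else:
            # Assumption: no credit, and no late days are deducted.
            credit.append(False)
    return credit, remaining
```

Under these assumptions, submitting one homework 30 hours late uses two late days, leaving three for the rest of the term.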
Class Policies:
Attendance will not be taken, but you are responsible for knowing what happens in every class. The instructor will try to post slides and notes online, and to share announcements, but there are no guarantees. So if you cannot attend class, make sure you check in with someone who was there.
Prerequisites
The official prerequisite for CS 4650 is CS 3510/3511, “Design and Analysis of Algorithms.” This prerequisite is essential because understanding natural language processing algorithms requires familiarity with dynamic programming, as well as automata and formal language theory: finite-state and context-free languages, NP-completeness, etc. While course prerequisites are not enforced for graduate students, prior exposure to analysis of algorithms is very strongly recommended.
Furthermore, this course assumes:
 Good coding ability, corresponding to at least a third- or fourth-year undergraduate CS major. Assignments will be in Python.
 Background in basic probability, linear algebra, and calculus.
 Familiarity with machine learning is helpful but not assumed. Of particular relevance are linear classifiers: perceptron, naive Bayes, and logistic regression.
People sometimes want to take the course without having all of these prerequisites. Frequent cases are:
 Junior CS students with strong programming skills but limited theoretical and mathematical background,
 Non-CS students with strong mathematical background but limited programming experience.
Students in the first group suffer in the exam and don’t understand the lectures, and students in the second group suffer in the problem sets. My advice is to get the background material first, and then take this course.
FAQs

The class is full. Can I still get in?
Sorry. The course admins in CoC control this process. Please talk to them.

I am graduating this Fall and I need this class to complete my degree requirements. What should I do?
Talk to the advisor or graduate coordinator for your academic program. They are keeping track of your degree requirements and will work with you if you need a specific course.

I have a question. What is the best way to reach the course staff?
Registered students – your first point of contact is Piazza (so that other students may benefit from your questions and our answers). If you have a personal matter, email us at the class mailing list cs46507650f20staff@googlegroups.com