CS 4344/7344 - Winter 1998

CS 4344/7344 - Natural Language Understanding by Computer


Instructor: Kurt Eiselt
Office: College of Computing 112
Electronic Mail: eiselt@cc.gatech.edu
Phone: 404-894-8386
Office Hours: TBA

Instructor: Jen Holbrook
Office: GCATT 142
Electronic Mail: holbrook@cc.gatech.edu
Phone: don't bother, she's never there
Office Hours: Wednesdays 9:15am to 10:15am in the picnic area

Teaching Assistant: Lyman Taylor
Office: ???
Electronic Mail: lyman@cc.gatech.edu
Phone: 404-894-????
Office Hours: TBA

Newsgroup: git.cc.class.4344

Course Description: Of all the different phenomena that we recognize as intelligent behavior, the ability to express an infinite range of concepts with a finite set of symbols is unique to humans. It is not surprising, then, that when we consider what abilities we would like to see in intelligent machines, we often think of the ability to understand natural language. CS 4344/7344 offers the student an opportunity to explore in some depth the topic of natural language understanding by computer, a major subfield of artificial intelligence (AI) research. In this course, the student is exposed to theories about the components of a language, the kinds of knowledge a computer must have in order to extract the explicit meaning of texts or utterances, and how a computer might be made to infer meaning that is not explicitly represented in those same texts or utterances. Some of the issues raised will be specific to language understanding, but many of the theoretical problems encountered in this course are common to a number of other subfields of AI. Besides exposing the student to the theoretical aspects of natural language understanding, CS 4344/7344 also gives the student a chance to implement a working natural language understanding system from scratch. This helps the student appreciate the difficulties of turning possibly vague theories of artificial intelligence into working AI systems.

Prerequisites: Everyone in this course should have passed CS 3361 (or an equivalent course), and have a solid conceptual understanding of the material covered there. Anyone enrolled in CS 4344 should also be comfortable with Common LISP programming; if you don't already know LISP, this isn't a very good place to learn. If you're enrolled in CS 7344 and intend to work on the assignments for CS 4344, you too should have a good grasp of LISP programming.

Required Texts: Natural Language Understanding (second edition), by James Allen (Benjamin/Cummings, 1995).

Recommended Texts: Any good, comprehensive Common LISP manual. Common LISP: The Language (second edition) by Guy L. Steele (Digital Press) may be especially useful, and is regarded as biblical by serious LISP hackers. The contents of this book are available via the World Wide Web.

Course Requirements and Grading: All students, undergraduate and graduate alike, will take three exams in this course. The two midterm exams will each count for 15% of your grade, and the final exam will count for 30%. (Occasionally we encounter students who are at a disadvantage in exams because of a reading or other disability. If you have been diagnosed as having such a disability, please let us know so that we can arrange for a more appropriate method of evaluation for you.) For those students enrolled in CS 4344, there will be three programming assignments, each worth 10% of your grade, and two non-programming assignments, each worth 5%. If you're taking CS 4344 on a pass/fail (or satisfactory/unsatisfactory) basis, rest assured that you won't pass unless you do well on all the exams and homework assignments, so you might as well take it for a letter grade. Students enrolled in CS 7344 may either complete the same assignments as the students in CS 4344 or pursue a term project of their own choosing, subject to the instructors' approval. The term project will be worth 40% of the grade, the same share as the five assignments it replaces, and it will be substantial, so it just might be easier to do the programming assignments.
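
To make the weights above concrete, here is a minimal sketch in Python that checks each grading scheme adds up to 100% and combines component scores into a weighted course percentage. The component names and the helper course_percentage are hypothetical, chosen only for illustration. Since grades in this course are assigned relative to other students, the result is a weighted percentage and not a letter grade.

```python
# Weights (in percent) taken from the grading description above.
# The dictionary keys are hypothetical labels, not official names.
WEIGHTS_ASSIGNMENTS = {
    "midterm_1": 15,
    "midterm_2": 15,
    "final_exam": 30,
    "program_1": 10,
    "program_2": 10,
    "program_3": 10,
    "written_1": 5,
    "written_2": 5,
}

# CS 7344 option: the term project replaces all five assignments.
WEIGHTS_TERM_PROJECT = {
    "midterm_1": 15,
    "midterm_2": 15,
    "final_exam": 30,
    "term_project": 40,
}


def course_percentage(scores, weights):
    """Combine component scores (each 0-100) into a weighted percentage."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted components")
    return sum(weights[name] * scores[name] for name in weights) / 100


# Both schemes should account for the whole grade.
assert sum(WEIGHTS_ASSIGNMENTS.values()) == 100
assert sum(WEIGHTS_TERM_PROJECT.values()) == 100
```

For instance, a CS 7344 student with 80 on every exam and 90 on the term project would call course_percentage with those four scores and WEIGHTS_TERM_PROJECT.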

Late Assignments: Assignments are due by the day and time specified in the assignment description. Late assignments will lose 25% of their value for each day they are late, including weekends and holidays. For example, if an assignment is due on Monday and you don't turn it in until Wednesday, it will be graded as if it had been turned in on time, but your score will then be reduced by 50%. If you wait until Friday, you will receive no credit for the assignment. Computer downtime is not a mitigating circumstance, so start your assignments early.
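
The late penalty arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name late_score is hypothetical, and it assumes lateness is counted in whole calendar days, which the policy above does not spell out for partial days.

```python
def late_score(raw_score, days_late):
    """Apply the 25%-per-day late penalty to a score graded as if on time.

    Assumes days_late is a whole number of calendar days, counting
    weekends and holidays. The penalty is capped at 100%.
    """
    if days_late < 0:
        raise ValueError("days_late must not be negative")
    penalty = min(1.0, 0.25 * days_late)
    return raw_score * (1.0 - penalty)


# Due Monday, turned in Wednesday: two days late, score reduced by 50%.
print(late_score(80, 2))  # 40.0
# Due Monday, turned in Friday: four days late, no credit.
print(late_score(80, 4))  # 0.0
```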

Computing: All programming assignments are to be done in Common LISP. Common LISP is available on Georgia Tech's Macintoshes in several clusters on campus, on PCs in the College of Computing, and on Sun workstations in the College of Computing and in the Rich Building. The College of Computing holds site licenses for Macintosh Common LISP from Digitool and LispWorks for Windows from Harlequin; these licenses allow the College to distribute this software to its students. Talk to Kurt about how to get a copy of either of these LISP systems.

Academic Misconduct: Because you are being graded relative to other students in this course and not on an arbitrary, predetermined scale, any student's attempt to increase his or her grade through dishonest means will unfairly decrease the grades of other honest students. The homework assignments in this course are not intended to be collaborative exercises, but on the other hand we don't want to discourage discussion between students about ideas pertaining to natural language understanding. So here's how things work in this course: if you incorporate into your homework assignments ideas that did not originate with you, or did not come from the obvious sources--your instructors, teaching assistant, textbooks, lectures, or supplementary reading materials provided in this course--you must give credit to your sources. In other words, if you submit homework which is not entirely the result of your own efforts, you must explain which parts are due directly or indirectly to an outside source, and who or what that outside source is. Failure to do so constitutes plagiarism, and plagiarism carries severe penalties. Of course, there is to be no collaboration whatsoever during exams. If you haven't already done so, you should take the time to become familiar with Georgia Tech's definition of academic misconduct and the policies and procedures pertaining to academic misconduct. This information can be found in the 1997-99 general catalog on pages 364-371.

Tentative Course Schedule:

Last revised: March 19, 1998