Word spotting: An interface to facilitate searching audio


Sponsor Gregory Abowd
abowd@cc.gatech.edu
380 CRB and 240 CCB
Area HCI, Software Engineering and Future Computing Environments

Problem
In the summer of 1997, the Classroom 2000 project developed the capability to search the audio of a lecture. This functionality will be provided in two ways. The purpose of this project is to create some sample interfaces that allow a student to search the audio of one or more classes.

The audio search system was developed by Peter Cardillo, a graduate student in ECE. The system uses a Hidden Markov Model to convert an audio file (in WAV format) into a sequence of timestamped phonemes. A phoneme dictionary converts each search word into its corresponding phoneme representation. That representation is then used to search the audio, returning a series of ranked hits. In the final stage, the system delivers a pointer to a stream-based audio player (such as RealAudio), which plays the audio from the point where the word was spotted. Here is a demonstration of the word spotting system with a simple interface.
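
To make the search step concrete, the following is a minimal sketch of how a word might be looked up in a phoneme dictionary and matched against a timestamped phoneme sequence. It is not Peter Cardillo's implementation: the dictionary entries, the phoneme symbols, the function name spot_word, and the simple fraction-of-matching-phonemes score are all assumptions made for illustration. The actual system scores matches with a Hidden Markov Model.

```python
from typing import List, Tuple

# Hypothetical toy phoneme dictionary; the entries and symbols are illustrative only.
PHONEME_DICT = {
    "markov": ["M", "AA", "R", "K", "AO", "V"],
    "model": ["M", "AA", "D", "AH", "L"],
}

# One recognized phoneme: (start time in seconds, phoneme symbol).
Phoneme = Tuple[float, str]


def spot_word(word: str, stream: List[Phoneme],
              max_hits: int = 3) -> List[Tuple[float, float]]:
    """Return up to max_hits (score, start_time) pairs, best score first.

    The score is the fraction of phonemes in a sliding window that match
    the word's dictionary entry (an assumed stand-in for an HMM score).
    """
    target = PHONEME_DICT[word.lower()]
    n = len(target)
    hits = []
    for i in range(len(stream) - n + 1):
        window = stream[i:i + n]
        matches = sum(1 for (_, p), t in zip(window, target) if p == t)
        score = matches / n
        if score > 0.5:  # keep only windows where most phonemes agree
            hits.append((score, window[0][0]))
    # Rank by descending score, breaking ties by earlier start time.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits[:max_hits]


# Made-up recognizer output for a short stretch of lecture audio.
stream = [
    (0.0, "DH"), (0.1, "AH"),
    (0.2, "M"), (0.3, "AA"), (0.4, "R"), (0.5, "K"), (0.6, "AO"), (0.7, "V"),
    (0.8, "M"), (0.9, "AA"), (1.0, "D"), (1.1, "AH"), (1.2, "L"),
]

hits = spot_word("markov", stream)
```

The start time of each hit is the kind of pointer that would be handed to the audio player so that playback begins where the word was spotted.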

In this project, you are to build on this simple interface to allow for the following new features:



Background
Deliverables
Evaluation
If all of the above deliverables are provided, you will receive full credit. If you are unable to complete all parts, you will need to meet with Dr. Abowd to discuss evaluation.