Gerhard Sagerer, University of Bielefeld
Abstract:
Human-computer interaction using means of communication that are
natural to humans, such as speech or gestures, is still a challenging
task. The machine should be able to process acoustic and visual input
and react adequately by producing speech output, by manipulating
objects, or by other actions in the environment. Within
a research project we follow the paradigm of cognitive human-machine
interaction. We combine the perceived input data with a priori stored
and acquired knowledge about objects, language, actions, and plans.
Three domains will be discussed. The cooperative construction of a
toy airplane from parts of a wooden construction kit for children
will be presented in detail. The other applications are interaction
with a mobile robot and information retrieval for image databases.
The structure of the systems will be outlined. In particular, the
problem of fusing the understanding of spoken instructions with the
visual perception of the environment will be addressed. The components
that make up the speech and vision modalities, and their interaction,
will be described. The architecture follows a hybrid and distributed
organization. It copes with uncertainty and errors in perception
results and world models as well as with different time scales and
overlapping capabilities of the various modules. In addition to
different hierarchies, competing algorithms are provided for a single
level within a given hierarchy. Complex objects are modeled with
respect to various aspects, such as a top-down decomposition into
parts, arrangements of important points detected in a data-driven
manner and their classification, or contours and local features. In
such a network of processing paths, competing modules, and top-down
as well as bottom-up data flow, the scoring of intermediate results,
together with their combination and estimation, is an important
issue. Learning of object descriptions and categories is based on
information acquired during dialogs using speech and gesture.
Biosketch
Gerhard Sagerer received the diploma and the Ph.D. (Dr.-Ing.) degree
in computer science from the University of Erlangen-Nürnberg, Erlangen,
Germany, in 1980 and 1985, respectively. In 1990 he received the
venia legendi (Habilitation) in computer science from the Faculty
of Technology of this university. From 1980 to 1990 he was with
the research group for pattern recognition (Institut für Informatik,
Mustererkennung) at the University of Erlangen-Nürnberg, Erlangen,
Germany. Since 1990 he has been a professor of computer science at the
University of Bielefeld, Germany, and head of the research group
for Applied Computer Science (Angewandte Informatik). During 1991-1993
he was a member of the academic senate of the university. During
1993-1995 and 1997-2001 he was dean of the Faculty
of Technology of the university. In 1995 he was chairman of the
annual conference of the German Society for Pattern Recognition.
He is on the Scientific Board of the German Section of Computer
Scientists for Peace and Social Responsibility (Forum InformatikerInnen
für Frieden und gesellschaftliche Verantwortung), FIFF.
His fields of research are image and speech understanding including
artificial intelligence techniques and the application of pattern
understanding methods to natural science domains. He is the author,
coauthor, or editor of several books and technical articles. Dr.
Sagerer is a member of the German Computer Society (GI), the European
Society for Signal Processing (EURASIP) and the Institute of Electrical
and Electronics Engineers (IEEE).