Appears in A. Ram & K. Moorman (editors), Understanding Language Understanding: Computational Models of Reading, MIT Press.
 

Towards a theory of reading and understanding

Ashwin Ram
Kenneth Moorman
College of Computing
Georgia Institute of Technology
 




 
 

Motivations

The human ability to understand and use language remains one of the unsolved mysteries of modern science. Language is one of the crucial aspects of human intelligence; in fact, some have argued that it is the central aspect (e.g., Fodor, 1975; Johnson, 1987; Lakoff & Johnson, 1980; Whorf, 1956; Wittgenstein, 1968). Although the human language processing system has been studied extensively by researchers from a number of perspectives, including technical, social, and psychological perspectives, it is still unclear how humans process language and even what a scientific theory or explanation of this ability might look like.

In this volume, we focus on one of the tasks that the human language processing system is responsible for: reading. By reading we mean the task that takes as its input a body of text in a natural language [1] and produces as its output an understanding of that text. An obvious question to be addressed is the nature of this understanding: what it is, how it is represented, for what and how it is used, and how it might be measured. Another important question is the nature of the task itself: how it is carried out, what its constituent tasks are, and how we (as researchers) might describe this task and how it works. Implicit in this approach is the assumption that a theory of reading must account not only for what reading produces as its result (an understanding of the given text) but also for how exactly reading works such that it can produce the said result from the given text. In other words, we seek an explanatory theory or model of the reading process and not simply a descriptive account.

Our goal is to address the problem of reading comprehension: processing and understanding a natural language text, narrative, or story. This constrains our endeavor in two ways. First, an account of reading must explain how the reader can understand text, that is, understand the situations described in the text, explain who did what to whom, and how, and why, and construct a coherent interpretation of the text that "makes sense." A theory which focuses, for example, only on syntactic parsing of sentences is, by this metric, not a theory of reading comprehension or text understanding, although it might certainly be an important piece of a complete theory. The second constraint is that an account of reading must explain how the reader can understand "real" natural language texts: narratives, stories, newspaper articles, dialogs, advertisements, and so on. This rules out models which focus only on the processing of single sentences taken out of context or of small researcher-constructed "stories." Although such models are certainly important in that they provide crucial stepping stones towards the "big picture" and may even be a piece of the complete theory of reading, they do not by themselves constitute a satisfactory account of the human reading capability. Methodologically, of course, researchers must often concentrate on narrower subtasks of the reading process (such as syntactic parsing, or explanation construction, or belief modeling) and/or on a narrower range of textual inputs (such as individual sentences, or short newspaper articles, or simple question-and-answer scenarios); the point is that the eventual goal of the endeavor that has come to be known as natural language processing (NLP) is to produce a theory of reading comprehension "in the large."
 
 

Assumptions

What might a theory of reading look like? We make two assumptions in this volume. First, a scientific understanding of how agents read is best expressed in terms of a functional-computational-representational model of the reading process. [2] By functional we mean that the process will be defined in terms of its inputs and outputs and that it may be decomposed into one or more interactive subtasks and further sub-subtasks which, in turn, will be defined in terms of their inputs and outputs as well as their interactions with each other. Once defined, the theory will also explain how exactly each subtask works such that it can perform its function of transforming its inputs into its outputs. By computational we mean that this transformation will be described using an information-processing or computational model, that is, an explanatory, step-by-step account of how exactly the reading system (human or machine) can derive the required outputs from the given inputs. By convention, this account will be written down using the language of computer algorithms and implemented using a computer program that can be executed to provide evidence that the model does, in fact, do what is claimed of it. This requirement forces the theory to be described precisely and provides a means for experimentation; these and other benefits of the "computational psychology" or "cognitive modeling" approach will be discussed below. Finally, by representational we mean that the reading process is expected to make use of extensive background knowledge in order to understand a text and produce as its output some description of the information conveyed by the text; both the background knowledge and the output description will be represented in some manner inside the reading system. The form, content, and organization of these representations are as much a research issue as is the process that utilizes and produces them.
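
As a rough illustration of what the three terms mean in practice, the following sketch lays out a toy reading system in Python. It is not a model from this volume, and every name in it (Understanding, segment, link_background, read) is hypothetical. The subtasks are deliberately trivial and run one after another, whereas the subtasks of a real model would interact; the point is only to show a process defined by its inputs and outputs, a step-by-step procedure, and explicit representations of background knowledge and of the result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Understanding:
    """Representational part: the output description of what the text conveys."""
    sentences: List[str] = field(default_factory=list)
    known_terms: List[str] = field(default_factory=list)


# Functional part: every subtask has the same declared inputs and output.
Subtask = Callable[[str, Dict[str, str], Understanding], Understanding]


def segment(text: str, background: Dict[str, str], state: Understanding) -> Understanding:
    """Toy subtask: split the text into sentences on periods."""
    state.sentences = [s.strip() for s in text.split(".") if s.strip()]
    return state


def link_background(text: str, background: Dict[str, str], state: Understanding) -> Understanding:
    """Toy subtask: record which background-knowledge terms the text mentions."""
    lowered = text.lower()
    state.known_terms = [term for term in background if term in lowered]
    return state


def read(text: str, background: Dict[str, str], subtasks: List[Subtask]) -> Understanding:
    """Computational part: a step-by-step procedure from input text to output."""
    state = Understanding()
    for subtask in subtasks:
        state = subtask(text, background, state)
    return state


if __name__ == "__main__":
    # Hypothetical background knowledge, keyed by lowercase term.
    knowledge = {"christmas": "a holiday", "dollar": "a unit of currency"}
    result = read("Della counted it. The next day would be Christmas.",
                  knowledge, [segment, link_background])
    print(result)
```

Because the sketch is executable, its claims can be checked by running it, which is the role the text assigns to an implemented model.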

The second assumption underlying this volume is that inasmuch as a theory of reading is concerned with accounting for the human ability to read, it is important that the functions, processes, and representations postulated by the theory, and the behaviors exhibited by the model, be cognitively plausible and justified to the extent possible through psychological experimentation. Where it is not possible to obtain detailed psychological data to verify or refute fine-grained assumptions of a theory, these assumptions may be justified in teleological terms (for example, computational, functional, ecological, evolutionary, or philosophical arguments for why a subsystem works the way it does) or at least via a sufficiency argument that demonstrates that the proposed model is able to produce the behaviors that are being accounted for (see, for example, Ram & Jones, 1995). This demonstration is facilitated by the presence of an executable computer model.
 
 

The modeling approach

Before we visit the reading task in more detail, let us discuss the computational modeling approach that we will take to address this task. It is probably the case that many of the models described in this volume will seem too limited to be actually "reading" the texts which they are given in the full sense of the word. What purpose, then, do such models serve? In the computational modeling approach, the model itself is not the end of the research cycle; instead, the model is used as a tool by the researcher in order to refine the overarching theory behind it. As Margaret Boden expressed it (Boden, 1986):
...artificial intelligence is the use of programs as tools in the study of intelligent processes, tools that help in the discovery of the thinking-procedures and epistemological structures employed by intelligent creatures.
As a tool, then, what power does the computational model give to the intelligence researcher? Boden suggests a set of what she calls Lovelace questions which explore the "usefulness" of computer modeling with respect to the study of creativity (see Boden, 1991). These questions are easily adapted to be applicable to the study of reading as well.

First, can a computational model ever perform in a way such that it appears to read and understand? The answer to this is "yes," as many of the models depicted in this volume will show, albeit perhaps in a manner or domain that is narrower than the full human reading capacity can handle. However, this is an uninteresting question. After all, ELIZA (Weizenbaum, 1966) appeared to comprehend quite a bit using nothing more than simple pattern matching, substitution, and a human willingness to believe. Of course, the ways in which we tend to measure the appearance of cognitive ability are now more strict, but even then most of the models here will at least appear to be performing some aspect of reading.
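
To make the point about pattern matching and substitution concrete, here is a minimal ELIZA-style sketch in Python. It is not Weizenbaum's program or script: the rules, the pronoun table, and the function names are hypothetical and chosen only for illustration. Each rule pairs a pattern with a reply template, and the matched fragment is echoed back with its pronouns swapped.

```python
import re

# Hypothetical rules for illustration; these are not Weizenbaum's original script.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps applied to the captured fragment before it is echoed back.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first- and second-person words in the captured fragment."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(utterance: str) -> str:
    """Return the reply for the first matching rule, or a stock reply."""
    cleaned = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I need a holiday.", "I am worried about my exams", "The weather is nice"]:
        print(respond(line))
```

Nothing in this sketch represents what a holiday or an exam is; whatever comprehension it seems to show is supplied by the person reading its replies.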

If the first question is not that interesting, a reasonable follow-up might be: Can a computational model ever really be able to read and understand material? Unfortunately, it is not clear exactly how one can distinguish "true" comprehension from the mere appearance of comprehension; thus, this question is best left to computational philosophers.

If the appearance of comprehension is uninteresting and the reality of comprehension is beyond the scope of this volume, where does that leave us? The issue is not whether an implemented computer program can actually read and understand text but whether building such programs is a reasonable way to approach the problem of producing an explanatory theory of reading and understanding. The third Lovelace question, therefore, is the one we will concentrate on: Can computational models help us understand how human reading is possible? We believe the answer to be "yes" for a number of reasons:

Reading is a large, complicated, and ill-defined cognitive behavior, and one that is extremely difficult to capture theoretically. However, for the above reasons, computational modeling is a promising approach towards this problem. Even if implemented models are still primitive with respect to human performance, the endeavor of theorizing about, building, evaluating, and revising these models can add significantly to our knowledge of the human reading capacity.
 
 

The tasks of reading

A theory of reading, as we have defined it, must deal with a wide range of issues and account for a wide range of behaviors and capabilities. Consider the following example (Henry, 1986), which is the first paragraph of a longer story:
One dollar and eighty-seven cents. That was all. And sixty cents of it was in pennies. Pennies saved one and two at a time by bulldozing the grocer and the vegetable man and the butcher until one's cheeks burned with the silent imputation of parsimony that such close dealing implied. Three times Della counted it. One dollar and eighty-seven cents. And the next day would be Christmas.
Some of the pieces of this puzzle include:

The chapters in this volume span this range of tasks that reading research has been concerned with. We begin with Rapaport and Shapiro's discussion (Chapter 2) of cognitive models of reading, and the relationship between cognition and fiction. They explore the epistemological questions of how a cognitive agent could represent fictional entities and their properties, and reason about such entities, and their relationship with non-fictional entities, during the course of reading a story. Following this, Mahesh, Eiselt, and Holbrook (Chapter 3) discuss psycholinguistic issues in sentence processing, focusing in particular on how multiple types of information, such as syntactic and semantic information, can be integrated while understanding a sentence. They present a computational model that can resolve ambiguous interpretations of a sentence and recover from conclusions that turn out to be erroneous. Next, Domeshek, Jones, and Ram (Chapter 4) discuss issues of form, content, and organization in knowledge representation. They discuss how a reader can represent the meaning of a text as well as the inferential knowledge that is required to understand the text. Wharton and Lange (Chapter 5) discuss how a reader's episodic memory might be organized and deployed to provide support for the reader's inferential processes. They argue that the process by which some text is understood should be integrated with the process by which it is used to recall relevant information from memory, and present a computational model of the combined process. Langston, Trabasso, and Magliano (Chapter 6) further the discussion of inference, presenting a model of text comprehension along with psychological data supporting their model. They explore the differences between on-line processing during text comprehension and off-line processing after the text has been read.

Following these chapters, we turn our attention to issues of contextualization of the reading processes in the structure of the text as well as the overarching tasks that the reader is engaged in. Meyer (Chapter 7) discusses how the reader can use the structure of the text to support the comprehension of that text. Different genres of text are read in different ways because the individual characteristics of the readers interact with the individual characteristics of the texts and of the authors of those texts. Ram (Chapter 8) discusses the influence of the reader's learning goals on the manner and depth to which the text is processed. He presents a model of reading as an active process in which the reader subjectively processes the text while seeking information, creating hypotheses, asking questions, and pursuing interesting ideas.

We then move on to discuss issues of learning and creativity. Peterson and Billman (Chapter 9) present a model that explains how a reader handles linguistic novelty. They present a computational model that can read and interpret sentences containing novel verbs using underlying semantic information about the language. Moorman and Ram (Chapter 10) discuss a model of creative understanding which enables a reader to comprehend texts that contain novel concepts. They show how a reader can creatively understand novel concepts in a science fiction story using analogical reasoning and problem reformulation supported by a principled representation of knowledge. Cox and Ram (Chapter 11) discuss parallels between reading and learning, arguing that there are many similarities between these two tasks: identification of interesting input, elaboration of input concepts, determination of the agent's goals, and determination and execution of the strategies to be used to process the input in pursuit of those goals.

While this volume is primarily concerned with functional-computational-representational models of reading, be they symbolic or distributed (e.g., connectionist) models, Riloff (Chapter 12) presents a number of alternative recent approaches which, while they share much with the previous models, deviate from many of the assumptions underlying these models. She argues that information extraction approaches, concerned with identifying and extracting specific types of information from text rather than in-depth knowledge-intensive analysis of text, can provide significant leverage in story understanding. Gerrig (Chapter 13) discusses what human reading is really like, and provides several directions which future research on reading will need to pursue. He describes the reader's experience of being transported into the narrative world of a text and mentally participating in that narrative world during the reading process. Finally, Fletcher (Chapter 14) concludes with his perspective on the endeavor of building computational models of reading, such as those presented in this volume, arguing that it is productive to invest resources and intellectual energy in this enterprise.
 

References

Boden, 1991
M.A. Boden. The Creative Mind: Myths and Mechanisms. Basic Books, Inc., New York, 1991.
Boden, 1986
M.A. Boden. Artificial Intelligence and Natural Man. Basic Books, Inc., New York, second edition, 1986.
Cohen, 1995
P.R. Cohen. Empirical Methods for Artificial Intelligence. MIT Press, Cambridge, MA, 1995.
Fodor, 1975
J.A. Fodor. The Language of Thought. Thomas Y. Crowell, New York, 1975.
Henry, 1986
O. Henry. The gift of the magi. In Paul J. Horowitz, editor, Collected Stories of O. Henry. Avenel Books, New York, 1986.
Hintzman, 1991
D.L. Hintzman. Why are formal models useful in psychology? In William E. Hockley and Stephen Lewandowsky, editors, Relating Theory and Data: Essays on Human Memory in Honor of Bennet B. Murdock. Lawrence Erlbaum Associates, Publishers, Hillsdale, NJ, 1991.
Johnson, 1987
M. Johnson. The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. University of Chicago Press, Chicago, IL, 1987.
Lakoff & Johnson, 1980
G. Lakoff and M. Johnson. Metaphors We Live By. University of Chicago Press, Chicago, IL, 1980.
Ram & Jones, 1995
A. Ram and E. Jones. Foundations of foundations of artificial intelligence. Philosophical Psychology, 8(2):193-199, 1995.
Weizenbaum, 1966
J. Weizenbaum. ELIZA—A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9:36-45, 1966.
Whorf, 1956
B. L. Whorf. Science and linguistics. In J. B. Carroll, editor, Language, Thought, and Reality. MIT Press, Cambridge, MA, 1956.
Wittgenstein, 1968
L. Wittgenstein. Philosophical Investigations. Macmillan, New York, 1968. Translated by G. E. M. Anscombe.

Footnotes

[1] ...text in a natural language
A natural language is a language that has evolved through use in a social system (for example, English, Spanish, French, or Hindi) as opposed to one that has been designed by people for a specific purpose (for example, Fortran or Java). Languages which are engineered but evolve through social action (for instance, Esperanto, American Sign Language, and Klingon) are also examples of natural languages.
[2] ...functional-computational-representational model of the reading process
This does not imply that all research into reading or natural language processing must necessarily involve computational modeling; on the contrary, a range of psychological, social, and computational research is needed to work towards the common goal of producing a detailed functional-computational-representational model of reading.