Cognitive Modeling and Intelligent Tutoring

Anderson, J.R., Boyle, C.B., Corbett, A.T., & Lewis, M.W. (1990). Cognitive modeling and intelligent tutoring. In Artificial Intelligence, Clancy (Ed.), Elsevier, 7-49.

Prepared by Jonah Lunken 1/17/94


 Two goals in building intelligent tutors: (p.7)
 1. develop systems for automating education
 2. explore epistemological issues concerning the nature of the knowledge 
 that is being tutored and how that knowledge can be learned. 
 
 ACT* 
 - theory that made claims about the organization and acquisition of 
 complex cognitive skills
 to interpret the student's behavior ACT* constructs:
 1. performance models of how students actually execute the skills that are 
 to be tutored
 - set of correct and incorrect rules for performing the skills in question
 - is used in a paradigm we call model tracing
 - in this paradigm the student's responses are compared to the rules in 
 the model in an attempt to follow in real time the cognitive states that 
 the student goes through in solving a problem
 
 2. learning models of how these skills are acquired
 - a set of assumptions about how the student's state changes after each 
 step in solving a problem
 - employs knowledge tracing to track changes in the student's 
 knowledge across problems 
 The information that results from knowledge tracing can be used to 
 1. disambiguate alternative interpretations in model tracing
 2. select problems to optimize learning
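
 The first use can be sketched as follows. This is only an illustration: it 
 assumes knowledge tracing yields a per-rule estimate of how likely the 
 student is to apply each rule. The rule names, the numbers, and the 
 function rank_interpretations are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def rank_interpretations(candidates: List[str], p_apply: Dict[str, float]) -> List[str]:
    """Order candidate rules from most to least plausible for this student."""
    # Rules with no estimate yet get a neutral 0.5 (an assumption of this sketch).
    return sorted(candidates, key=lambda rule: p_apply.get(rule, 0.5), reverse=True)


# Hypothetical estimates produced by knowledge tracing over earlier problems.
p_apply = {"correct-rule-A": 0.9, "buggy-rule-B": 0.2}

# Both rules would produce the same observable step, so model tracing
# alone cannot tell them apart; the estimates break the tie.
ranked = rank_interpretations(["buggy-rule-B", "correct-rule-A"], p_apply)
print(ranked[0])
```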
 
 Application (p.8)
 - LISP programming
 - high-school geometry
 - algebraic manipulations and word problems
 
 Why these domains
 - involve the acquisition of well-defined skills
 - can catch students at the point where they are just beginning to learn 
 the skill
  
 
 Section 1:  Cognitive Theory (p. 8)
 - not necessarily what is in the tutor
 - theory that forms the basis of the tutors
 - * if the mind functions according to our theory, then the tutors should 
 prove to optimize the learning process
 
 Predictions from our cognitive theory
 - used in the tutor
 - influenced the tutor code
 - tutor code is just a derivation of the theory
 
 Distinction between declarative and procedural knowledge (p.9)
 Declarative knowledge 
 - can be encoded quickly and without commitment to how it will be used
  - instructions or reading text
 - "What"
 
 Procedural knowledge 
 - can only be acquired through the use of the declarative knowledge, often 
 after trial and error practice
 - embodies knowledge in a highly efficient and use-specific way
 - by-product of the interpretative use of declarative knowledge
 - "How"
 
 Knowledge compilation 
 - is the learning process which creates the procedural knowledge
 
 1.1 Procedural knowledge:  Productions (p. 9)
 Procedural knowledge is represented by a set of production rules that 
 define the skill
 Goal in tutoring is to create experiences that will cause students to 
 acquire the production rules which would be possessed by the competent 
 problem solver
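
 A production rule can be pictured as an IF-THEN pair over the current goal 
 and problem state. The sketch below is a minimal illustration in Python, 
 not the representation used in the tutors; the Production class, the 
 fire_first helper, and the algebra rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]


@dataclass
class Production:
    name: str
    condition: Callable[[State], bool]  # IF part: tests the goal and problem state
    action: Callable[[State], State]    # THEN part: returns the updated state


def fire_first(rules: List[Production], state: State) -> Optional[State]:
    """Apply the first rule whose condition matches, or return None."""
    for rule in rules:
        if rule.condition(state):
            return rule.action(state)
    return None


# Hypothetical rule: IF the goal is to solve a*x = b THEN write x = b/a.
solve_linear = Production(
    name="divide-both-sides",
    condition=lambda s: s.get("goal") == "solve" and s.get("a", 0) != 0,
    action=lambda s: {**s, "goal": "done", "x": s["b"] / s["a"]},
)

result = fire_first([solve_linear], {"goal": "solve", "a": 2, "b": 6})
print(result["x"])
```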
 
 see examples p. 9-11
 - goal decomposition (p. 9)
 - forward and backward inference (p. 10)
 - grain size - level at which to model the student (p. 11)
 
 level of decomposition
 - upper level skills
 - communicate the ideal problem-solving structure of that domain
 
 1.2 Declarative knowledge:  PUPS structure (p.12)
 Knowledge is initially encoded declaratively in what we have come to call 
 PUPS structure
 At first these structures are used by weak problem-solving productions
 As a result of this activity, the knowledge is converted into specific 
 production form
 
 PUPS structures 
 - are basically schema-like structures which are distinguished by the fact 
 that they have certain special slots which prove critical to their 
 interpretive application in problem solving.
 Include:
 - function slot
 - serves to indicate the function of the entity represented by the 
 structure
 - form slot
 - indicates its form or physical appearance
 - precondition slot
 - states any preconditions that must be satisfied for that form to achieve 
 that function
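
 The three slots listed above can be pictured as fields of a record. This 
 is a minimal sketch under stated assumptions: the class name 
 PupsStructure, the helper applicable, and the LISP-flavored example values 
 are hypothetical and are not the paper's own notation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PupsStructure:
    name: str
    function: str                 # what the entity is for
    form: str                     # its form or physical appearance
    preconditions: List[str] = field(default_factory=list)  # must hold for form to achieve function


def applicable(structure: PupsStructure, known_facts: Set[str]) -> bool:
    """True when every precondition of the structure is among the known facts."""
    return all(p in known_facts for p in structure.preconditions)


# Hypothetical example: a worked LISP expression stored as a declarative structure.
example = PupsStructure(
    name="example-1",
    function="add 2 and 3",
    form="(+ 2 3)",
    preconditions=["arguments are numbers"],
)

print(applicable(example, {"arguments are numbers"}))
```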
 
 Points:
 A critical issue for learning is correct interpretation of the 
 instructions
 One problem with virtually all instructional material is that it omits 
 many things that the student needs to know in order to perform the task, 
 and the student is left to figure them out by trial and error 
 experimentation
 The ideal student model provides a cognitive analysis of what the student 
 really needs to know
 Instructions can be designed to communicate all the information in the 
 ideal model
 In communicating unfamiliar material there is the inevitable difficulty of 
 the student being weak on the key concepts
 One important role of the tutor is to monitor these errors of 
 misunderstanding and correct them as they show up in the performance 
 of a task
 
 1.2.1 Interpretive use of declarative knowledge (p.14)
 Assume that declarative PUPS structures are deposited in memory as the 
 product of language comprehension.
 Important that the necessary structures get encoded correctly
 - but this is not the end state of the learning process
 - these structures do not directly lead to performance
 - necessary to interpret them to get performance
 Interpretation is highly demanding cognitively
 - major cause of slips in performance
 - create productions like the ones in the ideal model which will 
 automatically apply the knowledge
 Double loop of inefficiency
 Outer loop - search through the operations the student knows to find an 
 appropriate next step
 Inner loop - involves the analogical application of a declarative 
 PUPS-structure representation of an operation to the problem at hand in 
 order to produce a response.
 
 1.2.2 Analogy (p. 14)
 A major way in which students solve problems involving concepts is by 
 analogy
 
 1.3 Knowledge compilation (p.15)
 analogical reasoning is not optimal for problem solving
 - it is costly to compute the mapping
 - it will only work when there is an example at hand
 
 Knowledge compilation - first
 - tries to analyze the essence of the analogical solution and
 - generate a production rule that can produce the solution at will.
 How?
 - by looking at the problem state before and after generating the 
 analogical solutions and 
 - creating a production rule that maps one onto the other
 - essential to know what was critical in the before situation and what was 
 critical in the solution. 
 Second
 - eliminate some of the blind search that characterizes early problem 
 solving
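
 The first step (turning one analogical solution into a reusable rule) can 
 be illustrated with a toy generalizer. It is a sketch only, not the ACT* 
 mechanism: it assumes the caller already knows which slots of the "before" 
 state were critical and should become variables, and it assumes those slots 
 hold distinct values. The function compile_production and the slot names 
 are hypothetical.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, str]


def compile_production(before: State, solution: List[str],
                       variable_slots: List[str]) -> Callable[[State], Optional[List[str]]]:
    """Build a rule mapping states like `before` onto solutions like `solution`."""
    # Tokens of the solution that came from a variable slot become slot references.
    value_to_slot = {before[slot]: slot for slot in variable_slots}
    template = [("slot", value_to_slot[tok]) if tok in value_to_slot else ("const", tok)
                for tok in solution]
    # All remaining slots (e.g. the goal) become the rule's condition.
    fixed = {k: v for k, v in before.items() if k not in variable_slots}

    def production(state: State) -> Optional[List[str]]:
        if any(state.get(k) != v for k, v in fixed.items()):
            return None  # condition does not match
        return [state[val] if kind == "slot" else val for kind, val in template]

    return production


# One solved example: the goal "add 2 and 3" was answered with (+ 2 3).
rule = compile_production({"goal": "add", "arg1": "2", "arg2": "3"},
                          ["(", "+", "2", "3", ")"],
                          variable_slots=["arg1", "arg2"])

print(rule({"goal": "add", "arg1": "7", "arg2": "8"}))
```

 Once compiled, the rule produces the solution without an example at hand 
 and without recomputing the analogical mapping.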
 
 1.4 Strengthening (p.17)
 - simple strengthening of declarative and procedural knowledge with use
 - as knowledge becomes strengthened it comes to be applied more 
 rapidly and reliably
 - ample empirical evidence for this learning process though the nature is 
 in dispute
 Strengthening for tutoring concerns the introduction of new knowledge
 - as execution of acquired knowledge becomes more proficient 
 - more capacity is left to properly process (and acquire) new knowledge
 
 1.5 Other learning mechanisms? (p.17)
 
 
 Section 2:  Converting Theory to Tutoring:  Model-tracing  (p. 18)
 Review:  learning in this theory involves
 1) acquisition of new declarative knowledge by the processing of 
 experience through existing productions (e.g. for language 
 comprehension)
 2) application of declarative knowledge to new situations (i.e. situations 
 for which productions do not exist) by means of analogy and pure search
 3) compilation of domain-specific productions
 4) strengthening of declarative and procedural knowledge
 
 Are these assumptions sufficient to account for all knowledge acquisition?
 How to test?
 tutor is the methodology for testing the theory
 Success of tutor 
 - in post testing
 - total learning time
 - one test of theory
 Are detailed analyses of the student's interaction with the tutor in 
 accord with theoretical predictions?
 
 Model tracing
 - mapping of the underlying theory to tutoring methodology
 Performance model
 - how a student's knowledge state will map onto performance on a 
 particular problem
 - can be used to interpret the student's performance on a particular 
 problem
 - instructions to address confusions and to keep the student on the 
 correct solution path
 
 Learning model
 - specifies how the student's knowledge state will change as a result of 
 problem-solving experiences
 - can be used to trace the student's knowledge state over time
 - problems and accompanying instructions are selected to practice the 
 student on productions that are diagnosed as weak or missing in the 
 student's knowledge state
 - given this structure of the learning situation we trust the automatic 
 mechanisms in (1) - (4) above to move the student forward on an optimal 
 learning trajectory
 
 2.1 The LISP tutor - example (p.19)
 
 2.2 The geometry tutor - example (p. 21)
 
 2.3 The algebra tutor - example (p.25)
 
 2.4 Summary of the tutors (p.29)
 Underlying each tutor is an ideal model of how students should solve the 
 respective problems and a model of how students err.
 
 Error model
 used to recognize and remediate errors
 
 Ideal model
 is used to guide students along a correct solution path if necessary
 
 Generic model
 - combination of the ideal and error model
 - defines the model-tracing methodology
 
 Tutor
 - traces out the path the student tries to take through the generic model 
 and insists that the student stay on a correct path.
 - highly interactive interface that lets the student know when they have 
 deviated from the ideal solution and where the deviation has occurred
 - instructions are highly procedural
 
 2.5 Evaluating the model-tracing methodology (p.30)
 - lack of empirical feedback on tutor and proposed mechanism's success
 - some systematic tests of the effectiveness of the tutor
 
 2.5.1 LISP (p.30)
 - time to learn was less with the tutor in advanced sessions
 - post-tests showed a statistically significant improvement for tutored 
 students in advanced lessons
 
 2.5.2 Geometry
 - statistically significant improvement from pre-test to post-test
 - statistically significant difference between control and tutored 
 students
 
 2.5.3 Algebra
 - produces learning (students learn when using it)
 
 
 Section 3:  Implementing the Model-Tracing Methodology (p.33)
 Prerequisite to implementing a model-tracing tutor 
 - create production rules that will be involved in the tracing
 - and an adequate set of buggy rules to account for the errors
 
 Tutor design - three largely independent modules
 Student module
 - trace the student's behavior through its nondeterministic set of 
 production rules
 
 Pedagogical module
 - embodies the rules for interacting with the student, for problem 
 selection and for updating the student model
 - controls the interaction among the three modules
 
 Interface
 - responsible for interacting with the student
 
 3.1 The student module (p.34)
 - deliver to the pedagogical module an interpretation of a piece of 
 behavior
 - this is delivered in terms of the various sequences of production rules 
 that might have produced that piece of behavior
 
 Methodology
 run the nondeterministic student model forward and see what paths 
 produce matching behavior
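
 The methodology above can be sketched in a few lines: each rule (correct 
 or buggy) predicts a response for the current state, and the tracer 
 returns every rule whose prediction matches what the student did. This is 
 a minimal illustration, not the tutors' implementation; the Rule type, 
 trace_step, and the two algebra rules are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    name: str
    buggy: bool
    produce: Callable[[Dict[str, int]], str]  # response this rule predicts


# Hypothetical rules for solving a*x = b.
RULES = [
    Rule("divide-both-sides", False, lambda s: f"x = {s['b'] // s['a']}"),
    Rule("subtract-coefficient", True, lambda s: f"x = {s['b'] - s['a']}"),
]


def trace_step(rules: List[Rule], state: Dict[str, int], response: str) -> List[str]:
    """Return the names of all rules whose predicted output matches the response."""
    return [r.name for r in rules if r.produce(state) == response]


print(trace_step(RULES, {"a": 2, "b": 6}, "x = 3"))  # matches the correct rule only
print(trace_step(RULES, {"a": 2, "b": 6}, "x = 4"))  # matches the buggy rule only
print(trace_step(RULES, {"a": 2, "b": 4}, "x = 2"))  # both rules match: ambiguous
```

 In the last call the correct and the buggy rule predict the same response, 
 so the behavior alone does not identify which rule the student applied.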
 
 3.1.1 Nondeterminacy (p.34)
 - major source of problems in implementing the model-tracing 
 methodology
 - occurs whenever multiple productions in the student module produce the 
 same output
 - a special case occurs when productions produce no overt output (i.e. 
 mental calculation)
 - set of potential paths can explode exponentially
 - the potential for effectively tutoring these steps weakens as the 
 distance between the mental mistake and the feedback on that decision 
 grows
 - difficult to design an interface which can trace planning in a way that 
 does not put an undue burden on the student
 - misunderstandings and slips can often produce the identical behavior
 
 3.1.2 Production system efficiency (p. 35)
 - real time diagnosis
 
 Inherent computational problems of production systems are exacerbated 
 in tutoring because:
 1) grain size is often smaller than is necessary in an expert system and 
 the production patterns required to expose the source of the student's 
 confusions are often considerable
 2) the system has to consider enough productions at any point to be able 
 to recognize all next steps that a student might produce
 3) often it is not clear which of a number of solution paths a student is on
 
 Pattern matcher
 - decide how much detail of the actual problem should be represented
 
 Computational cost associated with implementing such a production 
 system
 - space as well as a time dimension
 Efficiency issues impact on the range of topics we handle
 1) problems tend to become more costly as they become larger
  - production system working memory tends to increase
  - nondeterminism increases too
 2) advanced topics are limited according to their computational burden
 3) actual tutoring interactions become limited by the need to reduce 
 nondeterminacy
 
 3.2 Compiling the model tracing (p.37) 
 - look at all possible sequences of productions that can be generated in 
 any of our models
 - generate the problem space beforehand and just use the student's 
 behavior during problem solving to trace through this precompiled 
 problem space
 Other advantages to having the complete problem space compiled in 
 advance of the actual tutoring session
 - easy for the tutor to look ahead and see where a step in the problem 
 solution will lead
 If the tutor recommended dead-end steps, just because the ideal model 
 makes them, the student would quickly lose faith in the tutor
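
 A precompiled problem space can be pictured as a graph from states to 
 steps. The sketch below is an assumption-laden illustration (the SPACE 
 graph, its state and step names, and all three functions are hypothetical): 
 tracing is a graph walk, and look-ahead filters out steps that lead to 
 dead ends.

```python
from typing import Dict, List, Set

Graph = Dict[str, Dict[str, str]]  # state -> {step: next state}

# Hypothetical problem space generated before the tutoring session.
SPACE: Graph = {
    "start": {"step-a": "s1", "step-b": "dead"},
    "s1": {"step-c": "goal"},
    "dead": {},
    "goal": {},
}


def solvable_states(space: Graph, goal: str) -> Set[str]:
    """States from which the goal is still reachable."""
    ok = {goal}
    changed = True
    while changed:  # repeat until no new solvable state is found
        changed = False
        for state, moves in space.items():
            if state not in ok and any(nxt in ok for nxt in moves.values()):
                ok.add(state)
                changed = True
    return ok


def recommend(space: Graph, state: str, goal: str) -> List[str]:
    """Steps from `state` that do not lead into a dead end."""
    ok = solvable_states(space, goal)
    return sorted(step for step, nxt in space[state].items() if nxt in ok)


def trace(space: Graph, start: str, steps: List[str]) -> str:
    """Follow the student's steps through the precompiled space."""
    state = start
    for step in steps:
        if step not in space[state]:
            raise ValueError(f"unrecognized step {step!r} in state {state!r}")
        state = space[state][step]
    return state


print(recommend(SPACE, "start", "goal"))
print(trace(SPACE, "start", ["step-a", "step-c"]))
```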
 
 3.3 The pedagogical module (p.37)
 - decoupling of the pedagogical strategy from the domain knowledge
 - relates student model and interface
 - does not require any domain expertise
 
 Concerned with:
 1) what productions can apply in the student model
 2) what responses the student generates, and whether these responses 
 match what the production would generate
 3) what tutorial dialogue templates to use
 
 Strategy
 - optimal tutoring strategy will be domain-free
 - current - separate
 - conflicting considerations as to what the optimal features of the common 
 tutoring strategy should be
 
 3.3.1 Immediacy of feedback
 - stay on correct path
 - immediately flags errors
 - minimize problems of indeterminacy
 
 Reasons
 1) feedback on an error is effective to the degree that it is given in 
 close proximity to the error
   - easier for the student to analyze the mental state that led to the 
 error and make appropriate correction
 2) immediate feedback makes learning more efficient because it avoids 
 long episodes in which the student stumbles through incorrect solutions
 3) tends to avoid the extreme frustration that builds up as the student 
 struggles unsuccessfully in an error state
 
 Problems with immediate feedback
 a) must be carefully designed to force the student to think
   - forced to calculate the correct answer rather than just being given 
 the answer
   - generate the answer rather than copy the answer from the feedback
 b) self-correction is preferable when it would happen spontaneously
   - people tend to remember better what they generated themselves
 c) students can find immediate correction annoying
   - especially experienced students
   - novice programmers generally liked the immediate feedback
 d) difficult to explain why a student's choice is wrong at the point at 
 which the error is first manifested because there is not enough context
 
 Variations
 - feedback after "complete" expressions
 - some opportunity for self-correction
 
 3.3.2 Sensitivity to student history (p. 39)
 - only student model is a generic student model
 - generic model is a composite of all correct and incorrect moves that a 
 student can make
 - if a student makes an error the tutor gives the same feedback 
 independent of the history
 
 Theoretically justified
 - theory does not expect individual differences
 - all people learn in basically the same way
 
 Not derived from theory
 - past history of use with a rule implies nothing about interpretation of 
 a current error
 
 3.3.3 Problem sequence
 Mastery model 
 - controls selection of problems to present
 - maintain assessment of the student's performance on various rules
 - have knowledge of what problems exercise what rules
 Tutor will not let the student move on to problems involving new rules 
 until
 - student is above threshold of competence on the current rules
 - demonstrates mastery
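
 The mastery model described above can be sketched as a per-rule estimate 
 plus a threshold gate. Everything specific here is an assumption made for 
 illustration: the threshold value, the simple moving-average update (it is 
 not the paper's assessment formula), and the problem and rule names.

```python
from typing import Dict, List, Optional

MASTERY_THRESHOLD = 0.9  # ad hoc value chosen for this sketch


def update(estimate: float, correct: bool, rate: float = 0.3) -> float:
    """Move the estimate toward 1 after a correct use, toward 0 after an error."""
    target = 1.0 if correct else 0.0
    return estimate + rate * (target - estimate)


def next_problem(problems: Dict[str, List[str]], mastery: Dict[str, float],
                 current_rules: List[str]) -> Optional[str]:
    """Pick the problem exercising the most weak rules; None means move on."""
    weak = [r for r in current_rules if mastery.get(r, 0.0) < MASTERY_THRESHOLD]
    if not weak:
        return None  # all current rules mastered: new rules may be introduced
    return max(problems, key=lambda p: sum(r in weak for r in problems[p]))


# Hypothetical knowledge of which problems exercise which rules.
problems = {"p1": ["r1"], "p2": ["r1", "r2"], "p3": ["r3"]}
mastery = {"r1": 0.95, "r2": 0.4}

print(next_problem(problems, mastery, ["r1", "r2"]))
```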
 
 Why mastery?
 - optimal learning load
 - don't want to overburden the learner
 
 What is the mastery level?
 - current level is set ad hoc
 - need to investigate optimal level
 
 3.3.4 Declarative instruction
 - in text or classroom
 
 How should it be structured?
 - analogy
 - mapping into problem solution
 - learn first superficially
 
 3.4 The interface (p.41)
 - design of the interface can make or break the effectiveness of the tutor
 - tutor must be evaluating the current work of the student (in real time)
 - provide feedback on the point the student is fixated on (it is currently 
 the most relevant to the student)
 - syntax is not the focus of the learning; to minimize syntactic 
 difficulty the system has a real-time parser to flag errors and to prompt 
 formatting
 - visual or graphical interfaces lend greater understanding to some 
 material
 - make verbal communications as brief and as understandable as 
 possible
 - a facility to bring up the problem statement at any point in time, and 
 when there is room on the screen, the problem statement is now 
 automatically displayed
 - the all-visual medium of the tutor is a disadvantage (esp. compared to 
 human tutors) in that the student must remove their eyes from the 
 problem in order to read the instructions
 
 Important points:
 a) it is important to have a system that makes it clear to a student where 
 he or she is in the problem solution and where their errors are
 b) it is important to minimize working memory and processing load 
 involved in the problem solving
 
 Desired properties:
 - easy to learn and use
 - its learnability is enhanced if it is as congruent with past experience 
 as possible
 - structure that is as congruent as possible with the problem structure
 - actions should be as internally consistent as possible
 
 4 Conclusion (p.43)
 To what degree does the tutor experience confirm the theory?
 - students do seem to learn from the tutors
  - they took cognitive models of information processing, embedded them 
 in instructional systems, and nothing fell apart
  - better than standard classroom instruction
 *- interactions generated from group principles, not individually 
 developed
 *- some evidence of student learning can be gained from the speed with 
 which students type specific words or groups of words, which corresponds 
 with the firing of learned productions
 - knowledge acquired does seem to have the expected range of 
 applications
 - students are able to apply new combinations of rules to solve new 
 problems, as long as the contextual heuristics that recommend the 
 application of these rules are ones they have already encountered
 - however, if the student is to solve a problem for which they know all 
 the rules but which requires applying a new contextual heuristic, the 
 student experiences difficulty
Jonah Lunken
lunken@winhitc.atlantaga.NCR.COM