LEARNING %---------------------------------------------------------------------------- What is learning? What does it mean to learn? Learning covers a wide range of phenomena, from SKILL REFINEMENT (e.g., learning to ride a bicycle) to KNOWLEDGE ACQUISITION. Knowledge acquisition, in turn, can be of several types: 1. Rote learning - storage of computed information. 2. Taking advice from others. (Advice may need to be operationalized.) 3. Learning from problem solving experiences - remembering experiences and generalizing from them. (May add efficiency but not new knowledge.) 4. Learning from examples. (May or may not involve a teacher.) 5. Learning by experimentation and discovery. (Decreasing burden on teacher, increasing burden on learner.) %---------------------------------------------------------------------------- Learning: The ability to change through experience to better one's behavior. 1. Acquisition of new knowledge that enhances reasoning capabilities by allowing it to do something it couldn't do before ("knowledge level learning"). 2. Reorganization of existing knowledge (e.g., compilation) that, although it doesn't enhance the reasoning capabilities in the above sense, makes the system able to do its task better or more efficiently ("speedup learning"). 3. Acquisition or modification of reasoning methods ("meta-learning"). %---------------------------------------------------------------------------- Learning task: A learning task can be characterized by the (input) SOURCE and the (output) TARGET structures. Possible target structures include whatever would aid a reasoner to do its job (better). For example, - a new association, a modified association - a new plan, a modified plan - a new criterion for recognizing an object - a new word definition - a new value for a slot in a schema, a new slot, a new schema - a new case, a modified case - a new index - a new model, a modified model --- --- --- %---------------------------------------------------------------------------- Basic phases in learning: 1. Tracking down what needs to be learned (credit and blame assignment). 2. Proposing the new knowledge using some learning algorithm or method. 3. Debugging the proposed knowledge and integrating it with the old knowledge. The "credit (or blame) assignment" problem: Identifying the faulty knowledge structures when an error is made. The "deciding what to learn" problem: Identifying what needs to be learned ("formulating learning goals"). The "deciding how to learn" problem: Identifying appropriate learning strategies (in a multistrategy learning system). Most current systems use only one learning strategy. The "indexing" problem: Indexing the new knowledge back into memory such that it is retrieved in appropriate situations in the figure. %---------------------------------------------------------------------------- Types of Learning Situations Supervised Learning: an oracle provides the agent with input (e.g., examples) and/or feedback (on success/failure) Unsupervised Learning: the agent receives input and feedback solely from autonomous interaction with the enviornment. %---------------------------------------------------------------------------- Inductive learning: Learning from examples Similarity-based learning, Inductive learning, Induction, Concept learning, Category formation, Learning class descriptions from examples. Given: Observational statements (facts) A tentative inductive assertion Background knowledge Find: An inductive assertion (hypothesis) that satisfies the background knowledge and tautologically implies the observational statements. %---------------------------------------------------------------------------- Basic performance task: CLASSIFICATION Given a rule: In SITUATION do ACTION Need to be able to recognize SITUATION, i.e., classify input into the categories that define SITUATION. Basic idea: Build an initial hypothesis If a rule applied but it shouldn't have: specialize it If a rule did not apply but it should have: generalize it e.g.: Variabilization: replacing a variable with a constant ISA Generalization: climbing the ISA hierarchy Disjunct Addition: adding a new disjunct (Conjunct Addition -> specialization) Univeralization: introducing universal quantifiers Issues: What to apply the specialization/generalization rules to? SITUATION IDENTIFICATION PROBLEM: figuring out which features of the situation to ignore in revising rules e.g., how does Winston's program know not to generalize the time of day? how does it know to represent the predicate "touch"? What representations to use? DESCRIPTION PROBLEM: figuring out the representations that make learning easier e.g., (samples/positive instances, negative instances, queries) magic numbers: 7, 60, 70, 546, 627 base 9: 7 66 77 666 766 not magic: 364, 9, 626, 708 base 9: 444 10 765 866 is this magic: 628? 69? 71? base 9: 767? 76? 78? can't represent the right generalization until the right representation is used Which rules to revise? CREDIT ASSIGNMENT PROBLEM: figuring out what the program needs to know or why it made a mistake, and which rules are responsible What operators to allow? DISJUNCTIVE CONCEPT PROBLEM: If you allow disjunctions, you get overfitting. If you don't, you can't learn disjunctive concepts (at best, you can learn a generalization that includes other things as well). (LEX's solution: Multiple version spaces) How to match the new example to the current hypothesis? MATCHING PROBLEM: Which parts in the input description match which parts of the hypothesis? Even worse for specialization: algorithm must determine there is no way to do the match. e.g., match: (large red cube) with: (small red sphere), (large green cube), (small blue cube) What is there are more than one ways to match subparts? e.g., posts of an arch %---------------------------------------------------------------------------- Winston's learning program (Winston 1972) Task: To derive descriptions of complex objects (e.g., house, arch) based on a series of examples presented by a teacher. Main features: - Learns a "target concept" (the concept that the teacher wants it to learn) by induction. - Proposes and then refines new knowledge (a series of "models" of the target concept). - Attached to a vision program that recognizes primitive objects (e.g., blocks, wedges, spheres). - Teacher presents a series of examples that can be: - SAMPLES: An instance of the target concept. - NEAR MISSES: An example that is not an instance of the target concept, but is "nearly so"; it differs only in one "feature" or "aspect". Example: /\ Sample: (BRICK A) / \ (WEDGE B) / B \ (SUPPORTS A B) ------ ------ Model: (BRICK ?x) | A | (WEDGE ?y) | | (SUPPORTS ?x ?y) ------ /\ ------ Near miss: (BRICK AA) / \ | | (WEDGE BB) / BB \ | AA | ------ ------ Refined model: (BRICK ?x) (WEDGE ?y) (MUST (SUPPORTS ?x ?y)) ------ Sample: (BRICK AAA) | | (BRICK BBB) | BBB| (SUPPORTS AAA BBB) ------ ------ Refined model: (BRICK ?x) | AAA| (PRISM ?y) | | (MUST (SUPPORTS ?x ?y)) ------ /\ Near miss: (WEDGE AAAA) / \ (WEDGE BBBB) /BBBB\ (SUPPORTS AAAA BBBB) ------ /\ Refined model: (MUST-BE-A (BRICK ?x)) / \ (PRISM ?y) /AAAA\ (MUST (SUPPORTS ?x ?y)) ------ Algorithm: - Start with initial model. Now, for each subsequent example: - Match example to model to find differences. - If example is a near miss, then specialize the model. Specialization heuristics: - REQUIRE-LINK: If the evolving model has a link in a place where a near miss does not, convert the model's link to a MUST form. - FORBID-LINK: If the near miss has a link in a place where the evolving model does not, add a MUST-NOT link to the model. - If example is a sample, then generalize the model. Generalization heuristics: - CLIMB-TREE: If something in the evolving model corresponds to something different in the sample, replace the model's clause with a MUST-BE-A link pointing to the most specific shared class connecting the things in the model and sample. - ENLARGE-SET: If something in the evolving model corresponds to a different thing in the sample, replace the model's clause with a MUST-BE-A link pointing to a new class composed of the union of the classes of the two things. - DROP-LINK: If the things that are different in the model and the sample form an exhaustive set, or when the model has a link that is not in the sample, delete the link from the model. - CLOSE-INTERVAL: When the number or interval in the model corresponds to a number in a sample, replace the number (or interval) by an interval spanning the model's number and the sample's number (or an enlarged interval that includes the sample's number, respectively). Contributions: - Showed incremental learning (learning in small steps). - Pointed out the importance of training sequences. - Pointed out the importance of good, consistent representations, without which the difference matching could not be done. Drawbacks: - Learning method requires one-to-one correspondence. - Requires that examples be presented in the correct order. - Disregards the fact that in the real world there are usually many features of objects and scenes. A FOCUS MECHANISM is necessary to distinguish features that are SALIENT. - Negative examples must be "near misses" (differ from the current model in only one way). %---------------------------------------------------------------------------- Learning by problem-solving Mitchell's version spaces/Candidate elimination algorithm: Generalize the ideas in Winston's arch program Least commitment - version space pruned as little as possible Exhaustive breadth-first search throgh the version space (Winston is depth-first) Example: LEX (Mitchell 1982) acquires and refines problem-solving heuristics in the domain of symbolic integration. A form of learning from examples, but less directed than Winston's program. Goal: To devise methods by which heuristic problem-solvers improve their expertise through experience, by: - generating selected problems in the domain - solving them using available heuristics - analyzing their solutions to learn better heuristics Given: - Initial state involving integral - Goal state without integrals - Operators and preconditions (about 40 operators) - Initially, LEX has no heuristics for applicability of operators, but it has a language for representing heuristics. Heuristics take the following form: IF current problem state S matches applicability condition P THEN apply operator O with variable binding B. Language for describing applicability conditions are based on a grammar for algebraic expressions. This grammar is fixed, which restricts the range of heuristics that can be learned. Heuristics for operators represent subclasses of problem states to which the operator SHOULD be applied. Compare these with preconditions which represent when the operator CAN be applied. Example: /-\ | \ x transc(x) dx --> OP2 (integration by parts) \ with u = x | and dv = transc(x) dx \-/ Modules in LEX: - PROBLEM SOLVER - Outputs a detailed trace of the search performed - CRITIC - Analyzes the trace and outputs a set of positive and negative training instances - POSITIVE INSTANCES: Desirable search steps - NEGATIVE INSTANCES: Undesirable search steps - GENERALIZER - Proposes and refines heuristics to improve performance - PROBLEM GENERATOR - Attempts to generate practice problems for the problem solver to work on that will be informative, i.e., that will lead to proposing or refining heuristics, but yet are solvable using current heuristics PROBLEM GENERATOR / ^ practice / \ partially learned problem / \ heuristics v \ PROBLEM SOLVER GENERALIZER \ ^ problem \ / training solving \ / instances trace v / CRITIC Example: Problem generator proposes int(3x cos(x) dx) as an interesting problem to solve. (This was done by hand in the actual implementation of LEX.) Problem solver solves this problem using search. Existing heuristics are used to constrain the search. int(3x cos(x) dx) / \ / \ OP2 with u=3x, dv=cos(x)dx / \ ... 3x sin(x) - int(3 sin(x) dx) / \ OP1 / \ / \ 3x sin(x) - 3 int(sin(x) dx) ... | | OP5 | 3x sin(x) - +3 cos(x) + C Problem solving trace (the above search tree) is passed on to the critic. Critic notices which decisions led to the correct result and which ones didn't pan out, and suggests possible training instances to try out. For example: IF int(3x cos(x) dx) THEN apply OP2 with u = 3x, dv = cos(x) dx Generalizer creates a proposed heuristic or refines an old one. Version spaces: Tentative heuristics are represented using VERSION SPACES (most general and most specific forms). For example: S: IF int(3x cos(x) dx) THEN apply OP2 with u = 3x G: IF int(f1(x) f2(x) dx) THEN apply OP2 with u = f1(x) These represent the most specific heuristic and the most general heuristic one might learn. The actual heuristic it should learn is probably in between somewhere (it lies in the "version space"). Refinement occurs by generalizing S or specializing G, until the heuristic converges (hopefully) to one that works well. Thus the VERSION SPACE represents all the alternative PLAUSIBLE DESCRIPTIONS of the heuristic. A plausible description is one that is applicable to all known positive examples and no known negative example. Algorithm: CANDIDATE ELIMINATION Given: - A representation language - A set of positive and negative examples expressed in that language Compute: A concept description that is consistent with all the positive examples and none of the negative examples. Method: - Initialize G to contain one element: the null description (all features are variables). - Initialize S to contain one element: the first positive example. - Accept a new training example. - If it is a positive example: - Remove from G any descriptions that do not cover the example. - Update the S set to contain the most specific set of descriptions in the version space that cover the example and the current elements of the S set (i.e., generalize the elements of S as little as possible so that they cover the new training example). - If it is a negative example: - Remove from S any descriptions that cover the example. - Update the G set to contain the most general set of descriptions in the version space that do not cover the example (i.e., specialize the elements of G as little as possible so that the negative example is no longer covered by any of the elements of G). - If S and G are both singleton sets, then: - if they are identical, output their value and halt. - if they are different, the training cases were inconsistent. Output this result and halt. - else continue accepting new training examples. Example (from Rich & Knight): Learning the concept of "Japanese economy car" Features: Origin, Manufacturer, Color, Decade, Type POSITIVE EXAMPLE: (Japan, Honda, Blue, 1980, Economy) Initialize G to singleton set that includes everything Initialize S to singleton set that includes first positive example G = {(?, ?, ?, ?, ?)} S = {(Japan, Honda, Blue, 1980, Economy)} NEGATIVE EXAMPLE: (Japan, Toyota, Green, 1970, Sports) Specialize G to exclude negative example G = {(?, Honda, ?, ?, ?), (?, ?, Blue, ?, ?) (?, ?, ?, 1980, ?) (?, ?, ?, ?, Economy)} S = {(Japan, Honda, Blue, 1980, Economy)} POSITIVE EXAMPLE: (Japan, Toyota, Blue, 1990, Economy) Prune G to exclude descriptions inconsistent with positive example Generalize S to include positive example G = {(?, ?, Blue, ?, ?) (?, ?, ?, ?, Economy)} S = {(Japan, ?, Blue, ?, Economy)} NEGATIVE EXAMPLE: (USA, Chrysler, Red, 1980, Economy) Specialize G to exclude negative example (but staying within version space, i.e., staying consistent with S) G = {(Japan, ?, Blue, ?, ?) (Japan, ?, ?, ?, Economy)} S = {(Japan, ?, Blue, ?, Economy)} POSITIVE EXAMPLE: (Japan, Honda, White, 1980, Economy) Prune G to exclude descriptions inconsistent with positive example Generalize S to include positive example G = {(Japan, ?, ?, ?, Economy)} S = {(Japan, ?, ?, ?, Economy)} S = G, both singleton => done! %---------------------------------------------------------------------------- Copyright (c) Ashwin Ram, 1990-93 Assistant Professor, College of Computing Georgia Institute of Technology, Atlanta, Georgia 30332-0280 E-mail: %----------------------------------------------------------------------------