LOGIC and DEDUCTION Symbolic logic is a language that is well-suited for representing factual knowledge about the world and making deductive inferences based on this knowledge. %---------------------------------------------------------------------------- TOPICS Symbolic Logic: Notation Axioms Rules of inference Theorem proving Zeroth Order Logic: Propositional Calculus WFF (Well formed formula) Constants, variables, functions, predicates Connectives Modus ponens Semantic interpretation First Order Logic: Predicate Calculus Quantifiers Scope Unification Resolution Theorem Proving %---------------------------------------------------------------------------- Originally proposed as the underlying paradigm of intelligence (controversial). What's it good for? Representing knowledge about the world. Representing inference rules. A formal system to underly all intelligence? Criticism: Systems built around FOPC weren't very intelligent. People don't work that way. (Does this matter?) %---------------------------------------------------------------------------- PRIMITIVES Constants (upper case) Variables (lower case) Stands for constants Functions (lower case) Maps its argument(s) to constant Returns a single valued constant Predicates (upper case) Truth valuable (maps to T or F) given a semantic interpretation Example: MARRIED (father(JOHN), mother(JOHN)) ---> a WFF in FOPC ^^^^^^^ ^^^^^^ ^^^^ ^^^^^^ | | | | | | constant | | | | | ------functions----- | predicate (true or false) Can have multiple arguments. Examples: RED(block1) LIVES(tom,house1) T CAN-FLY(bird) LIVES(jane,house1) F TREE(giraffe) LIVES(jane,house2) F %---------------------------------------------------------------------------- SEMANTIC INTERPRETATION The assignment of a ``meaning'' for all symbols. Each symbol must stand for ``some thing'' in the given real-world domain. Once a semantic interpretation is provided, all WFFs containing no variables (and some that do) are truth-valuable. Not possible without the semantics. Example: Semantics for above example: Logic English words describing symbols the real world domain ------- ------------------------ JOHN means (stands for) John Doe, 13 years old, lives at ... father returns the father of its argument (e.g., FRED for JOHN) mother returns the mother of its argument (e.g., MAUD for JOHN) MARRIED is T iff its arguments are married. Thus MARRIED(father(JOHN),mother(JOHN)) is T iff FRED and MAUD are married. %---------------------------------------------------------------------------- Well formed formula (WFF): Informally, an expression is a WFF if it is legitimate, i.e., has a predicate whose arguments are constants, variables, or functions, and those functions also have constants, variables or other functions as arguments (but no predicates). Note: Predicates are NOT functions. They don't "return" values -- can't use their truth values as arguments to other functions or predicates. %---------------------------------------------------------------------------- CONNECTIVES and conjunction or disjunction not negation implies implication iff equivalence If p, q are WFFs, then the WFF: p and q is T iff both p and q are T p or q is T iff either p or q or both are T p implies q is equivalent to (q or (not p)); also written (if p q) not p is T iff p if F p iff q is T iff both p and q are T or both p and q are F Note: If x is a mammal, then x is an animal. mammal(cow) => animal(cow) is T mammal(snake) => animal(snake) is T (not anomalous; p => q is always T if q is T, regardless of the truth value of p) Note: "and", "if" (in (if p q)), etc. are NOT English! Don't confuse the everyday meanings of these terms with their formal meanings in logic. %---------------------------------------------------------------------------- TRUTH TABLES p and q | F T ------------------ etc. F | F F T | F T %---------------------------------------------------------------------------- PRECEDENCE RULES not highest and,or implies equivalence lowest Example: Given: barks(dog) T fur(dog) T fur(emu) F bird(emu) T not fur(emu) /\ fur(dog) => bird(emu) \/ barks(dog) | F T T T ------- | --------------- T | T ---------------- | T | ---------------------------- T %---------------------------------------------------------------------------- LAWS OF TRANSFORMATION The power of this representation arises from the fact that the syntax of WFFs can be formally manipulated and the semantics (meaning) of the corresponding WFFs stays "correct" as intended. The "formal manipulation" is done according to a set of transformation laws. not(not(A)) == A A or B == not(A) => B DeMorgan's Laws not (A and B) == not(A) or not(B) not (A or B) == not(A) and not(B) Commutative Law A and B == B and A A or B == B or A Associative Law (A and B) and C == A and (B and C) (A or B) or C == A or (B or C) Distributive Law A and (B or C) == (A and B) or (A and C) A or (B and C) == (A or B) and (A or C) Contrapositive Law A => B == not(B) => not(A) %---------------------------------------------------------------------------- PROPOSITIONAL CALCULUS: Simplest form of logic. No variables or quantifiers, deals specifically with propositions about things. PREDICATE CALCULUS: Adds variables and quantifiers. %---------------------------------------------------------------------------- QUANTIFIERS A quantified WFF is also a WFF (and therefore truth-valuable). Universal quantification: for-all for-all(x) p(x) Existential quantification: there-exists exists(x) p(x) | variable Examples: for-all(x) ELEPHANT(x) => GRAY(x) "All elephants are gray" exists(x) BLUE(x) "There exists something blue" SCOPE OF A QUANTIFIER: That part of the following expression over which it applies. Parentheses may be used to delimit scope and to disambiguate ambiguous expressions. Examples of WFFs: exists(x) {for-all(y) [P(x,y) and Q(y,x) => R(x)]} not for-all(q){exists(x) [P(x) or R(q)]} not P(A,g(A,B)) Note: These WFFs just exist (and are T or F); they don't get evaluated by a process (unlike LISP). These are not WFFs: not f(A) Can't negate a function f(A) The "top level" must be truth-valuable f(P(A)) Functions don't take predicates as arguments Q(f(A),(P(B) => Q(C))) WFFs are not arguments to WFFs A or not implies (C) Does not obey syntax rules for connectives %---------------------------------------------------------------------------- TAUTOLOGY: Always T (regardless of truth values applied) Examples: (x => (y and z)) == ((x => y) and (x => z)) x == x or y FALLACY: Always F Examples: not (x and y) == not (not(y) and not(x)) %---------------------------------------------------------------------------- Ok, now that we have all this formalism, what do we do with it? Representing knowledge. Making inferences. %---------------------------------------------------------------------------- REPRESENTING KNOWLEDGE Example: Representing facts. "Tom owns Rover." OWNS(tom,rover) Example: Representing rules. "Every boy owns a dog." for-all(x) [ BOY(x) => exists(y)(DOG(y) and OWNS(x,y)) ] More Examples: "Every city has a dog catcher who has been bitten by every dog in town." for-all(x) [ CITY(x) => exists(y) [ DOG-CATCHER(y,x) and for-all(z) [ DOG(z) and LIVES(z,x) => BITTEN(y,z) ] ] ] "There is one and exactly one thing that is blue." exists(x) [ BLUE(x) and for-all(y) [ BLUE(y) => EQ(x,y) ] ] %---------------------------------------------------------------------------- REPRESENTATIONAL POWER OF LOGIC The database may consist of a set of WFFs which represent what the system believes. Example of the use of logic for representing natural language: "Jack caught a ball." (catch-object JACK-21 BLOCK-32) --> internal representation | | | (formula inside the machine) predicate constants The internal representation must Remove referential ambiguity (which Jack?) Remove word sense ambiguity (catch a ball vs. catch a cold) Representations in predictae calculus are UNAMBIGUOUS, but NOT CANONICAL. There may be more than one representation that mean the same thing. Two WFFs mean the same thing if they can be transformed into one another using some set of rules. Example: p => q == q or not(p) not(not(p) == p COMPLETENESS: Predicate calculus is known to be complete. CONSISTENCY: Predicate calculus is known to be consistent. %---------------------------------------------------------------------------- DEDUCTIVE INFERENCE Done using a set of inference rules. If original WFFs are true, then WFFs derived using the rules of inference are guaranteed to be true. This is known as DEDUCTIVE INFERENCE: Conclusions are logically correct. If the premises are true, the conclusion is guaranteed to be true. %---------------------------------------------------------------------------- RULES OF INFERENCE: UNIVERSAL SPECIALIZATION (INSTANTIATION, for-all ELIMINATION): Given: for-all(x) WFF(x) ----------------- Infer/Assert: WFF(X) Example: Given: for-all(x) P(x) for-all(x) (CANARY(x) => YELLOW(x)) --------------- ----------------------------------- Infer/Assert: P(A) CANARY(TWEETY) => YELLOW(TWEETY) MODUS PONENS: Given: p, p => q MAN(socrates), MAN(x) => MORTAL(x) ------------ ---------------------------------- Infer/Assert: q MORTAL(socrates) (Note: In this example, we used universal instantiation to specialize MAN(x) => MORTAL(x) to MAN(SOCRATES) => MORTAL (SOCRATES), and then modus ponens to infer MORTAL (SOCRATES). Note also we left off the "for-all(x)"; this is only convenient shorthand.) In other words: (p and (p => q)) => q MODUS TOLENS: Given: p => q, not(q) --------------- Infer/Assert: not(p) %---------------------------------------------------------------------------- CHAINS OF INFERENCE: SISTER(mary), SISTER(x) => GIRL(x) ---------------------------------- GIRL(mary) GIRL(mary), GIRL(x) => PERSON(x) -------------------------------- PERSON(mary) AXIOMS: The set of WFFs that we start out with (given). THEOREMS: The set of WFFs that we infer from the axioms (may be infinite even if axioms are finite.) FORWARD CHAINING: From known facts, infer whatever else may be true by matching facts to l.h.s. of rules and inferring the r.h.s. Repeated application of modus ponens. BACKWARD CHAINING: From goal, infer what may lead to the goal by matching goal to r.h.s. of rules and trying to prove the l.h.s. as a subgoal. Stop when known fact is reached. Note: This inferring is different from the IMPLEMENTATION of this inferring in a computer. The efficiency of the implementation may depend on the order of application of rules, etc. But we are only asserting logical inferences without thinking about how they will be computed by a machine. The practical problem is that of SEARCH -- which rules to use, which inferences to make? How to avoid combinatorial explosion? %---------------------------------------------------------------------------- UNIFICATION Inferencing requires "unification", i.e., matching. E.g., must match MAN(SOCRATES) to MAN(x) in above example before we can infer MORTAL(SOCRATES). Unification requires generating a substitution list (in our example, (SOCRATES/x)), meaning that the term SOCRATES must be substituted for the variable x. If unification is successful, the inference can be used. FREE AND BOUND VARIABLES: Consider: for-all(x) exists(y) [ P(x,z) => Q(y) ] The variables x, y are BOUND in the WFF (within the scope of the quantifiers.) The variable z is FREE in the WFF. SUBSTITUTION LIST: A list of pairs of terms ti and variables vi: (t1/v1 t2/v2 ...) Example: (A/z g(L)/y ...) A/z means substitute the term A for the variable z. Note: We don't allow f(x)/x or (x/y y/x) because these are CIRCULAR. Two expressions are UNIFIABLE if there is a substitution that makes the expressions identical. Example: Q(f(x),z,A) and Q(a,z,y) are unifiable under the substitution (f(x)/a A/y) The substitution results in the unification instance Q(f(x),z,A) MGU: In general, there may be several possible substitution lists that unify the expressions. The MOST GENERAL UNIFIER (MGU) places the fewest restrictions on the variable bindings. Example: Q(x) and Q(y) unify to Q(A) under the substitution (A/x A/y), but this is not the MGU. The mgu here is (x/y) or (y/x). Note that the mgu may not be unique. Example: Q(A) and Q(x) have the mgu (A/x), which unifies them to Q(A). Some more examples: Literal pairs mgu instance P(x) P(A) P(A) P( f(x,g(A,y)),g(A,y) ) P( f(x,g(A,y)), g(A,y) ) P( f(x,z), z ) P(x) not unifiable Q(x) R(x,y,A) not unifiable R(x,y,B) Q(x,y,A) Q(B,y,A) Q(B,y,z) R(x) not unifiable R(f(x)) Why do we need UNIFICATION? Because in order to use logic to do inference, you have to unify what you know (or are trying to prove) with the facts and rules in the system's database. One way to do inference is to use RESOLUTION. %---------------------------------------------------------------------------- RESOLUTION THEOREM PROVING An inference method that is: COMPLETE finds all the theorems from a set of axioms RECURSIVELY ENUMERABLE can prove that a sentence is a theorem BUT NOT RECURSIVE can't always prove that a sentence is not a theorem In other words, resolution will always stop with a proof if you start with a theorem, but may never terminate if you start with a non-theorem. BASIC INFERENCE RULE: (E1 \/ p2 \/ p3 \/ ...), (not(E1) \/ q2 \/ q3 \/ ...) ----------------------------------------------------- (p2 \/ p3 \/ ... \/ q2 \/ q3 \/ ...) From two clause-form expressions containing a complementary pair E1 and not(E1), eliminate the complementary. Example: Given: RED(CHALK) or YELLOW(BOARD) is true not(RED(CHALK)) or GREEN(BOARD) is true Conclude: YELLOW(BOARD) or GREEN(BOARD) is true (and we don't know about RED(CHALK) which could be either true or false.) Resolution is the repeated application of this rule. Note that both terms being combined are disjunctions of literals and negated literals (only). %---------------------------------------------------------------------------- CLAUSE FORM To use resolution, we must have our system of WFFs in a special form known as CLAUSE FORM: NEGATION only of single items each subexpression is a DISJUNCTION the entire expression is a CONJUNCTION of these (In order words, a conjunction of disjunctions, also known as CONJUNCTIVE NORMAL FORM.) To convert something to clause form, we must: 1. Eliminate implications. p => q == not(p) or q Example: for-all(x) [ BIRD(x) => CAN-FLY(x) ] -> for-all(x) [ not(BIRD(x)) or CAN-FLY(x) ] 2. Reduce scope of negation symbols. not(p and q) == not(p) or not(q) not(p or q) == not(p) and not(q) not(not(p)) == p not(for-all(x) P(x)) == exists(x) not(P(x)) not(exists(x) P(x)) == for-all(x) not(P(x)) Example: not ( for-all(y) [ not(Q(x,y)) or P(y) ] -> exists(y) [ not [ not(Q(x,y)) or P(y) ] ] -> exists(y) [ not(not(Q(x,y))) and not(P(y)) ] -> exists(y) [ Q(x,y) and not(P(y)) ] 3. Standardize variables. Rename each variable so that it is unique. Example: for-all(x) [ P(x) => exists(x) Q(x) ] -> for-all(x) [ P(x) => exists(y) Q(y) ] 4. Remove existential quantifiers (Skolemization). for-all(y) [ exists(x) P(x,y) ] -> for-all(y) [ P(g(y),y ] Intuitively, for each value of y there is a particular x, perhaps depending on y, for which P(x,y) is true. So y is just some function g of x. (g is known as the SKOLEM FUNCTION.) Example: "Every boy owns a dog." for-all(x) [ BOY(x) => exists(y) [ DOG(y) and OWNS(x,y)] ] -> for-all(x) exists(y) [ BOY(x) => [ DOG(y) and OWNS(x,y)] ] -> for-all(x) [ BOY(x) => DOG(dog-of(x)) and OWNS(x,dog-of(x))] 5. Move all universal quantifiers to the left (Prenex form). Since only universal quantifiers are left and all variable names are unique, this is easy. Example: for-all(x) [ P(x) or for-all(y) [ Q(x,y) ] ] -> for-all(x) for-all(y) [ P(x) or Q(x,y) ] 6. Rewrite in conjunctive normal form. Move disjunctions down to the literals. Example: A or (B and C) == (A or B) and (A or C) 7. Eliminate universal quantifiers. Since there are only universal quantifiers, we need not write them. 8. Eliminate /\ symbols (separate into clauses). Example: (A or B) and (A or C) gives us two clauses: (A or B) (A or C) 9. Rename variables. Make sure the same variable does not appear in more than one clause. %---------------------------------------------------------------------------- RESOLUTION ALGORITHM Now we can use resolution (finally!) 1. Negate theorem to be proved (and put it in clause form). 2. Add this to the set of axioms (which are also in clause form). 3. Resolve repeatedly until: (a) Empty clause ("false" or "contradiction") derived => original theorem was true. (b) No more resolvable clauses => original theorem was false. %---------------------------------------------------------------------------- RESOLUTION EXAMPLE 1. Whoever can read is literate. for-all(x) [ R(x) => L(x) ] 2. Dolphins are not literate. for-all(x) [ D(x) => not(L(x)) ] 3. Some dolphins are intelligent. exists(x) [ D(x) and I(x) ] To prove: 4. Some who are intelligent can't read. exists(x) [ I(x) and not(R(x)) ] In clause form: 1. not(R(x) or L(x) 2. not(D(y) or not(L(y)) 3. D(A) I(A) (A is a Skolem constant.) Negation of goal: 4'. not (exists(x) [ I(x) and not(R(x)) ] -> for-all(x) [ not( I(x) and not (R(x)) ) ] -> for-all(x) [ not(I(x)) or not(not(R(x))) ] -> for-all(x) [ not(I(x)) or R(x) ] -> not(I(x)) or R(x) Proof: 3 and 4' resolve to give: 5. R(A) 5 and 1 resolve to give: 6. L(A) 6 and 2 resolve to give: 7. not(D(A)) 3 and 7 resolve to give: 8. false So the original theorem was true. %---------------------------------------------------------------------------- PROBLEMS WITH RESOLUTION THEOREM PROVING AND LOGIC Slow. Combinatorial explosion - how to select which clauses to resolve? May never stop if original theorem if false (semi-decidable). Some knowledge may not be representable in logic (e.g., vision). Appears to be cognitively implausible (except for narrow, specialized tasks). %---------------------------------------------------------------------------- CONCLUSION? Logic allows mechanized inference of logically correct formulas from pre-existing formulas. You can assert facts and rules about the world, and the computer will automatically make the correct inferences (although it may be inefficient). BUT: Logic does not tell you about how the predicates, constants, variables, functions, are to be selected. Also need a CONTENT theory. %---------------------------------------------------------------------------- Copyright (c) Ashwin Ram, 1990-93 Assistant Professor, College of Computing Georgia Institute of Technology, Atlanta, Georgia 30332-0280 E-mail: