CS621: Artificial Intelligence
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 37: Prolog cut and backtracking; start of Fuzzy Logic
A Typical Prolog program
compute_length([], 0).
compute_length([_Head|Tail], Length) :-
    compute_length(Tail, Tail_length),
    Length is Tail_length + 1.
High level explanation: The length of a list is 1 plus the length of the tail of the list, where the tail is obtained by removing the first element of the list. The length of the empty list is 0. This is a declarative description of the computation.
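As a small usage sketch, the query below shows how the recursion unwinds on a three-element list. The predicate is written in lower case here because an identifier starting with a capital letter is a variable in Prolog; the definition is repeated so that the block loads on its own.

```prolog
% Length of a list, defined recursively on the tail.
compute_length([], 0).
compute_length([_Head|Tail], Length) :-
    compute_length(Tail, Tail_length),
    Length is Tail_length + 1.

% ?- compute_length([a, b, c], N).
% The calls nest down to the empty list and the additions happen on the way back:
%   compute_length([a,b,c], N)   N  is N1 + 1
%   compute_length([b,c],   N1)  N1 is N2 + 1
%   compute_length([c],     N2)  N2 is N3 + 1
%   compute_length([],      N3)  N3 = 0
% N = 3.
```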
Prolog Program Flow, Backtracking and Cut
Controlling the program flow
Prolog’s computation
• Depth First Search
• Pursues a goal till the end
• Conditional AND: failure of any goal prevents the goals after it in the conjunction from being tried.
• Conditional OR: satisfaction of any alternative prevents the remaining alternatives from being evaluated.
Control flow (top level)
Given
g :- a, b, c.   (1)
g :- d, e, f.   (2)
If Prolog cannot satisfy (1), control automatically falls through to (2).
Control Flow within a rule
Taking (1), g :- a, b, c.
If a succeeds, Prolog will try to satisfy b; if b succeeds, c will be tried.
For ANDed goals, control flows forward towards the ‘.’ iff the current goal is true.
For ORed goals, control flows forward towards the ‘.’ iff the current goal evaluates to false.
What happens on failure • REDO the immediately preceding goal.
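The control-flow rules above can be tried out with a minimal sketch. The facts for a to f are assumptions chosen so that clause (1) fails at b; colour/1, bright/1 and pick/1 are hypothetical names added only to show REDO.

```prolog
% Facts chosen so that rule (1) fails and rule (2) succeeds.
a.
b :- fail.      % b can never be proved
c.
d.
e.
f.

g :- a, b, c.   % (1) a succeeds, b fails, so the whole clause fails
g :- d, e, f.   % (2) tried next; d, e and f all succeed

% REDO: when bright(X) fails, Prolog goes back to the immediately
% preceding goal colour(X) and asks it for its next solution.
colour(red).
colour(green).
colour(blue).
bright(green).
bright(blue).

pick(X) :- colour(X), bright(X).

% ?- g.
% true.
% ?- pick(X).
% X = green ;
% X = blue.
```

In the pick/1 query, colour(red) is found first, bright(red) fails, and the REDO of colour/1 yields green.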
Fundamental Principle of prolog programming • Always place the more general rule AFTER a specific rule.
CUT • Cut tells the system that IF YOU HAVE COME THIS FAR DO NOT BACKTRACK EVEN IF YOU FAIL SUBSEQUENTLY. ‘CUT’ WRITTEN AS ‘!’ ALWAYS SUCCEEDS.
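For example: belongs_to predicate
(1) belongs_to(E, [E|T]).
(2) belongs_to(E, [E1|T]) :- belongs_to(E, T).
What about
(3) belongs_to(E, [E|T]) :- !.
(4) belongs_to(E, [E1|T]) :- belongs_to(E, T).
Once (3) has succeeded, (4) will not be tried on backtracking! (4) is still used when the head of (3) does not match, i.e., when E is not the first element of the list.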
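The effect of the cut can be seen by collecting all solutions of both versions. This is a sketch: the name belongs_to_once is a hypothetical label for the cut version so that both definitions can be loaded together.

```prolog
% Version without cut: clauses (1) and (2).
belongs_to(E, [E|_]).
belongs_to(E, [_|T]) :- belongs_to(E, T).

% Version with cut: clauses (3) and (4), renamed for the comparison.
belongs_to_once(E, [E|_]) :- !.
belongs_to_once(E, [_|T]) :- belongs_to_once(E, T).

% ?- findall(X, belongs_to(X, [a, b, c]), L).
% L = [a, b, c].
% ?- findall(X, belongs_to_once(X, [a, b, c]), L).
% L = [a].
% ?- belongs_to_once(b, [a, b, c]).
% true.
```

Without the cut, backtracking enumerates every element; with the cut, the first match commits and the recursive clause is never retried. The last query shows that the recursive clause is still reached when the first clause's head does not match.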
Fail
• The predicate fail always fails.
• The combination of cut and fail is used to produce negation.
• Since the LHS of the neck cannot contain any operator, A → ¬B (B is false whenever A holds) is implemented as B :- A, !, fail.
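A minimal sketch of the cut-fail combination as a general negation predicate follows. The names my_not/1 and bird/1 are hypothetical and used only for illustration; most Prolog systems provide the same behaviour as the built-in \+.

```prolog
% my_not(Goal) succeeds exactly when Goal cannot be proved.
my_not(Goal) :- call(Goal), !, fail.   % Goal proved: commit, then fail
my_not(_).                             % reached only if Goal failed

% A hypothetical fact to query against.
bird(tweety).

% ?- my_not(bird(tweety)).
% false.
% ?- my_not(bird(rex)).
% true.
```

If the cut were removed, the second clause would be tried after the first one failed, and my_not would succeed for every goal.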
Predicate Calculus • Introduction through an example (Zohar Manna, 1974): • Problem: A, B and C belong to the Himalayan club. Every member in the club is either a mountain climber or a skier or both. A likes whatever B dislikes and dislikes whatever B likes. A likes rain and snow. No mountain climber likes rain. Every skier likes snow. Is there a member who is a mountain climber and not a skier? • Given knowledge has: • Facts • Rules
A wrong Prolog program!
member(a).
member(b).
member(c).
mc(X) ; sk(X) :- member(X).    /* X is a mountain climber or skier or both if X is a member; operators are NOT allowed in the head of a Horn clause; hence wrong */
like(X, snow) :- sk(X).        /* all skiers like snow */
\+ like(X, rain) :- mc(X).     /* no mountain climber likes rain; \+ is the not operator; negation by failure; wrong clause */
\+ like(a, X) :- like(b, X).   /* a dislikes whatever b likes; wrong for the same reason */
like(a, X) :- \+ like(b, X).   /* a likes whatever b dislikes */
like(a, rain).
like(a, snow).
?- member(X), mc(X), \+ sk(X).
How to represent in Prolog: Ram dislikes whatever Shyam likes and likes whatever Shyam dislikes
• Prolog uses Horn clauses, so no operator, including negation, is allowed in the head of any rule.
• However, Prolog's machinery of backtracking (which tries to prove the goal along a different path), the ! (cut) operator (which cuts out backtracking) and fail (which always fails) can be used to solve the above problem.
Solution
• like(ram, X) :- like(shyam, X), !, fail.
• like(ram, X).
• Justification:
• Suppose it is asserted in the database that
• like(shyam, apple).
• Now on query
• ?- like(ram, apple).
Solution (contd.)
• like(shyam, apple) will get satisfied, control will move forward, ! will get executed, and fail will get executed, returning No for like(ram, apple).
• The system will NOT backtrack, since ! was executed before, and so will NOT try the next clause like(ram, X).
Solution (contd.)
• Suppose, now, it is asserted that
• like(shyam, apple) :- fail. /* i.e., Shyam does not like apple */
• Now on query
• ?- like(ram, apple).
• like(shyam, apple) will fail.
• The system WILL backtrack and try the next clause like(ram, X), which succeeds with X bound to apple, so like(ram, apple) is returned as TRUE.
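Both cases of the justification can be checked with a small sketch. The atom mango is a hypothetical item that Shyam is not known to like; absence of a fact plays the same role as the clause with body fail.

```prolog
% Shyam likes apple; nothing is asserted about mango.
like(shyam, apple).

% Ram dislikes whatever Shyam likes ...
like(ram, X) :- like(shyam, X), !, fail.
% ... and likes everything else.
like(ram, _).

% ?- like(ram, apple).
% false.
% ?- like(ram, mango).
% true.
```

One limitation of this encoding, worth keeping in mind, is that it is meant for queries with a bound second argument: the open query ?- like(ram, X). finds like(shyam, apple), commits at the cut and fails, so it does not enumerate what Ram likes.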
Prolog’s way of making and breaking a list
Problem: to remove duplicates from a list
rem_dup([], []).
rem_dup([H|T], L) :- member(H, T), !, rem_dup(T, L).
rem_dup([H|T], [H|L1]) :- rem_dup(T, L1).
Note: The cut ! in the second clause is needed: once member(H,T) has succeeded, the 3rd clause should not be tried, even if rem_dup(T,L) fails or another solution is requested, which Prolog would otherwise do on backtracking.
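To see what the cut prevents, the sketch below loads the predicate next to a copy without the cut and collects all solutions of each. The name rem_dup_nocut is hypothetical, and member/2 is assumed to be available as a built-in or autoloaded library predicate, as it is in common Prolog systems.

```prolog
% With the cut, as in the lecture.
rem_dup([], []).
rem_dup([H|T], L) :- member(H, T), !, rem_dup(T, L).
rem_dup([H|T], [H|L1]) :- rem_dup(T, L1).

% Same clauses without the cut, for comparison only.
rem_dup_nocut([], []).
rem_dup_nocut([H|T], L) :- member(H, T), rem_dup_nocut(T, L).
rem_dup_nocut([H|T], [H|L1]) :- rem_dup_nocut(T, L1).

% ?- rem_dup([a, b, a, c, b], L).
% L = [a, c, b].
% ?- findall(L, rem_dup([a, a], L), Ls).
% Ls = [[a]].
% ?- findall(L, rem_dup_nocut([a, a], L), Ls).
% Ls = [[a], [a, a]].
```

Without the cut, backtracking falls into the third clause and also returns the list with its duplicate still present. Note that this definition keeps the last occurrence of each element.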
1. Cross Lingual Information Retrieval
The goal is to study the techniques of information retrieval when the language of the query is different from the language of the web page. After covering basics of information retrieval (crawling, indexing, ranking etc.), the student is expected to study the complexities of cross lingual search (see http://clef-campaign.org). Considerable work on this has happened at IIT Bombay which leads the national project on CLIR (http://www.clia.iitb.ac.in/clia-beta-ext; the site may not be up always).
2. Semantic Role Labeling (UNL)
Sentences have inherent structure in terms of agent, object, instrument, time, place and such relationships between words. When extracted, such information finds usage in a large number of applications like machine translation, entailment, question answering, information extraction etc. We will study semantic roles in the form of Universal Networking Language (UNL; http://www.undl.org) and the ongoing work on UNL extraction spanning the last several years (visit http://www.cse.iitb.ac.in/~pb, click on publications and see publications on “Semantics”).
3. Empirical Methods in NLP-ML
This is a foundational topic proposing to investigate mathematical and statistical principles and methods cutting across different areas of NLP and Machine Learning. Examples of these are Spectral Analysis, Expectation Maximization, Graphical Models, Lexical Semantic Association Techniques and so on. Additional topics of study will be NLP-suitable probability distributions, e.g., Dirichlet distribution. Specific NLP and ML applications will always be kept in focus. Browse proceedings of EMNLP conference (http://www.isca-students.org/emnlp_2009) and CoNLL conference (http://www.cnts.ua.ac.be/conll/).
4. Text Entailment
This studies the methodologies involved in deciding if a piece of text (H) is inferable from another (e.g., “France beat Brazil in the FIFA World Cup semi final” entails “Brazil bowed out of the World Cup”). Grounded on predicate calculus, there has been considerable work recently on using machine learning techniques for entailment (http://pascallin.ecs.soton.ac.uk/Challenges/RTE/). A number of students have worked with me on this topic and the use of UNL in entailment.
5. Foundation of Artificial Intelligence
We will study work by Minsky, Newell and Simon, McCarthy, Penrose, Dreyfus, Hofstadter, Marr, Chomsky and so on, including Indian thoughts on intelligence and consciousness.
6. Machine Translation, Principles and Paradigms
Starting with Interlingua Based Machine Translation using UNL, we have made inroads in Statistical Machine Translation (http://www.cse.iitb.ac.in/~pb/papers/acl09-smt.pdf). This topic will study many different approaches to machine translation which is a key problem (along with CLIR) for a multilingual country like India. We are members of large national projects on English-to-Indian-Language and Indian-language-to-Indian-Language machine translation.
7. Shallow Parsing for Indian Languages
This is ongoing critical work attempting to build common platforms for morphology, POS tag and chunk processing for Indian Languages, especially Hindi and Marathi. Talk to seniors Mugdha Bapat (mbapat@cse) and Harshada Gune (harshada@cse).
Uncertainty Studies
[Figure: taxonomy of uncertainty study, with the nodes Information Theory based, Probability Based, Fuzzy Logic Based, Qualitative Reasoning, Entropy Centric Algos, Probabilistic Reasoning, Markov Processes & Graphical Models, Bayesian Belief Network]
To-play-or-not-to-play-tennis data vs. Climatic-Condition from Ross Quinlan’s paper on ID3 (1986), C4.5 (1993)
[Figure: decision tree for the play-tennis data]
Outlook
  Sunny → Humidity
    High → No
    Low → Yes
  Cloudy → Yes
  Rain → Windy
    T → No
    F → Yes
Rule Base
R1: If outlook is sunny and humidity is high, then Decision is No.
R2: If outlook is sunny and humidity is low, then Decision is Yes.
R3: If outlook is cloudy, then Decision is Yes.
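Such a crisp rule base maps directly onto Prolog facts. The sketch below encodes only R1 to R3 as listed; the predicate name decision/3 and its argument order (outlook, humidity, decision) are assumptions made for illustration.

```prolog
% decision(Outlook, Humidity, Decision)
decision(sunny,  high, no).    % R1
decision(sunny,  low,  yes).   % R2
decision(cloudy, _,    yes).   % R3: humidity is irrelevant when cloudy

% ?- decision(sunny, high, D).
% D = no.
% ?- decision(cloudy, low, D).
% D = yes.
```

A query for an outlook that no listed rule covers, such as rain, simply fails with these three facts.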
Fuzzy Logic tries to capture the human ability of reasoning with imprecise information
• Models human reasoning
• Works with imprecise statements such as: in a process control situation, “If the temperature is moderate and the pressure is high, then turn the knob slightly right”
• The rules have “Linguistic Variables”, typically adjectives qualified by adverbs (the adverbs are hedges).
Underlying Theory: Theory of Fuzzy Sets
There is an intimate connection between logic and set theory.
Given any set ‘S’ and an element ‘e’, there is a very natural predicate μS(e), called the belongingness predicate.
The predicate is such that
μS(e) = 1, iff e ∈ S
μS(e) = 0, otherwise
For example, for S = {1, 2, 3, 4}, μS(1) = 1 and μS(5) = 0.
A predicate P(x) also defines a set naturally: S = {x | P(x) is true}
For example, even(x) defines S = {x | x is even}
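A minimal sketch of the crisp belongingness predicate, with the set given as a list. The name crisp_mu/3 is hypothetical, and member/2 is assumed to be available as usual.

```prolog
% crisp_mu(+Set, +Element, -Mu): Mu is 1 if Element is in Set, else 0.
% The if-then-else keeps the predicate correct even when Mu is already bound.
crisp_mu(Set, E, Mu) :-
    (   member(E, Set)
    ->  Mu = 1
    ;   Mu = 0
    ).

% ?- crisp_mu([1, 2, 3, 4], 1, Mu).
% Mu = 1.
% ?- crisp_mu([1, 2, 3, 4], 5, Mu).
% Mu = 0.
```

An if-then-else is used here rather than a cut in the first of two clauses, because the two-clause version would wrongly accept a query such as crisp_mu([1,2,3,4], 1, 0).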
Fuzzy Set Theory (contd.)
Fuzzy set theory starts by questioning a fundamental assumption of set theory, viz., that the value of the belongingness predicate μ is either 0 or 1.
Instead, in fuzzy theory it is assumed that μS(e) ∈ [0, 1].
Fuzzy set theory is a generalization of classical set theory, which is also called Crisp Set Theory.
In real life belongingness is a fuzzy concept.
Example: Let T = set of “tall” people
μT(Ram) = 1.0
μT(Shyam) = 0.2
Shyam belongs to T with degree 0.2.
Linguistic Variables
Fuzzy sets are named by Linguistic Variables (typically adjectives).
Underlying the LV is a numerical quantity, e.g. for ‘tall’ (LV), ‘height’ is the numerical quantity.
The profile of an LV is the plot of its membership value against the underlying quantity.
[Figure: profile of μtall(h) against height h (vertical axis from 0 to 1, horizontal axis from 0 to 6); the point h = 4.5, μtall = 0.4 is marked]
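A profile can be written down as a piecewise function of the underlying quantity. The sketch below assumes a linear ramp between two breakpoints, 3.5 and 6.0; these values and the linear shape are assumptions, chosen only so that μtall(4.5) comes out as 0.4, the one point that is readable from the slide.

```prolog
% mu_tall(+Height, -Mu): assumed piecewise-linear profile for 'tall'.
mu_tall(H, 0.0) :- H =< 3.5.                         % not tall at all
mu_tall(H, Mu)  :- H > 3.5, H < 6.0,
                   Mu is (H - 3.5) / 2.5.            % linear ramp
mu_tall(H, 1.0) :- H >= 6.0.                         % fully tall

% ?- mu_tall(4.5, Mu).
% Mu = 0.4.
```

The three guards are mutually exclusive, so no cut is needed and each height has exactly one membership value.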
Example Profiles
[Figure: profiles of μpoor(w) and μrich(w), each plotted against wealth w]
Example Profiles
[Figure: two profiles of μA(x) plotted against x: a profile representing moderate (e.g. moderately rich) and a profile representing extreme]
Concept of Hedge
Hedge is an intensifier
Example: LV = tall, LV1 = very tall, LV2 = somewhat tall
‘very’ operation: μvery tall(x) = (μtall(x))²
‘somewhat’ operation: μsomewhat tall(x) = √(μtall(x))
[Figure: profiles of tall, somewhat tall and very tall, plotted as membership (0 to 1) against h]
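The two hedge operations act on a membership value, so they can be sketched independently of any particular profile. The predicate names very/2 and somewhat/2 are hypothetical.

```prolog
% very(+Mu, -MuVery): squaring lowers every membership value below 1.
very(Mu, MuVery) :- MuVery is Mu * Mu.

% somewhat(+Mu, -MuSomewhat): the square root raises every value below 1.
somewhat(Mu, MuSomewhat) :- MuSomewhat is sqrt(Mu).

% ?- very(0.5, V).
% V = 0.25.
% ?- somewhat(0.25, S).
% S = 0.5.
```

For a membership value strictly between 0 and 1, ‘very’ gives a smaller value and ‘somewhat’ a larger one, while 0 and 1 are left unchanged.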
Representation of Fuzzy sets
Let U = {x1, x2, …, xn}, |U| = n.
The various sets composed of elements from U are represented as points on and inside the n-dimensional hypercube. The crisp sets are the corners of the hypercube.
A fuzzy set A is represented by the point (μA(x1), μA(x2), …, μA(xn)) in the n-dimensional space.
Example: for U = {x1, x2}, with μA(x1) = 0.3 and μA(x2) = 0.4, A is the point (0.3, 0.4).
[Figure: unit square with axes x1 and x2, corners (0,0) = Φ, (1,0), (0,1), (1,1), and the point A(0.3, 0.4)]
Degree of fuzziness
The centre of the hypercube is the “most fuzzy” set. Fuzziness decreases as one nears the corners.
Measure of fuzziness: called the entropy of a fuzzy set
Entropy of a fuzzy set S: E(S) = d(S, nearest corner) / d(S, farthest corner)
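[Figure: unit square with axes x1 and x2, corners (0,0), (1,0), (0,1), (1,1), centre (0.5,0.5), and a fuzzy set A with its distances d(A, nearest) and d(A, farthest) to the nearest and farthest corners]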
Definition
Distance between two fuzzy sets (L1 norm): d(A, B) = Σi |μA(xi) − μB(xi)|
Let C = fuzzy set represented by the centre point (0.5, 0.5)
d(C, nearest) = |0.5 − 1.0| + |0.5 − 0.0| = 1 = d(C, farthest)
=> E(C) = 1
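The distance and the entropy can be worked through in a short sketch, with a fuzzy set represented as the list of its membership values. The predicate names are hypothetical, and ties at 0.5 are resolved towards 1 for the nearest corner, which is an arbitrary choice since every corner is equidistant from the centre.

```prolog
% l1_distance(+A, +B, -D): sum of |a_i - b_i| over two equal-length lists.
l1_distance([], [], 0).
l1_distance([X|Xs], [Y|Ys], D) :-
    l1_distance(Xs, Ys, D0),
    D is D0 + abs(X - Y).

% nearest_corner(+A, -Corner): round each membership value to 0 or 1.
nearest_corner([], []).
nearest_corner([M|Ms], [C|Cs]) :-
    ( M >= 0.5 -> C = 1 ; C = 0 ),
    nearest_corner(Ms, Cs).

% farthest_corner(+A, -Corner): the corner opposite the nearest one.
farthest_corner([], []).
farthest_corner([M|Ms], [C|Cs]) :-
    ( M >= 0.5 -> C = 0 ; C = 1 ),
    farthest_corner(Ms, Cs).

% entropy(+A, -E): ratio of the two distances; A must be non-empty.
entropy(A, E) :-
    nearest_corner(A, N),
    farthest_corner(A, F),
    l1_distance(A, N, DN),
    l1_distance(A, F, DF),
    E is DN / DF.

% ?- entropy([0.5, 0.5], E).   % centre of the square: E is 1
% ?- entropy([1, 0], E).       % a crisp set (a corner): E is 0
% ?- entropy([0.3, 0.4], E).   % E is 0.7 / 1.3, roughly 0.54
```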
Definition
Cardinality of a fuzzy set [generalization of cardinality of classical sets]
Union, Intersection, complementation, subsethood
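The formulas for these operations did not survive on the slide. The sketch below therefore uses the standard definitions from fuzzy set theory as an assumption about what is intended: cardinality as the sum of the membership values, union as the elementwise max, intersection as the elementwise min, and complement as 1 − μ. Fuzzy sets are lists of membership values over the same universe, and all predicate names are hypothetical.

```prolog
% cardinality(+A, -C): sum of membership values (assumed definition).
cardinality([], 0).
cardinality([M|Ms], C) :- cardinality(Ms, C0), C is C0 + M.

% f_union(+A, +B, -U): elementwise max (assumed definition).
f_union([], [], []).
f_union([A|As], [B|Bs], [U|Us]) :- U is max(A, B), f_union(As, Bs, Us).

% f_intersection(+A, +B, -I): elementwise min (assumed definition).
f_intersection([], [], []).
f_intersection([A|As], [B|Bs], [I|Is]) :- I is min(A, B), f_intersection(As, Bs, Is).

% f_complement(+A, -C): 1 minus each membership value (assumed definition).
f_complement([], []).
f_complement([A|As], [C|Cs]) :- C is 1 - A, f_complement(As, Cs).

% ?- f_union([0.3, 0.4], [0.5, 0.1], U).
% U = [0.5, 0.4].
% ?- f_intersection([0.3, 0.4], [0.5, 0.1], I).
% I = [0.3, 0.1].
```

On crisp sets, where every value is 0 or 1, these reduce to the classical cardinality, union, intersection and complement.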
Note on definition by extension and intension
S1 = {xi | xi mod 2 = 0} (intension)
S2 = {0, 2, 4, 6, 8, 10, …} (extension)
How to define subsethood?
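The two styles of definition look different in code as well. In this sketch in_s1/1 tests the defining property (intension), while in_s2/1 looks the element up in an explicit list (extension). Both names are hypothetical, and the list is necessarily a finite prefix of the set.

```prolog
% Intension: membership is decided by the defining property.
in_s1(X) :- integer(X), 0 is X mod 2.

% Extension: membership is decided by looking up an explicit listing.
% Only the listed prefix of the set can be represented this way.
in_s2(X) :- member(X, [0, 2, 4, 6, 8, 10]).

% ?- in_s1(12).
% true.
% ?- in_s2(12).
% false.
```

The second query fails only because 12 lies beyond the listed prefix, which is the practical limit of defining an infinite set by extension.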