Review of Schank’s Scripts: consist of a set of slots.

Review of Schank’s Scripts: a script consists of a set of slots. Associated with each slot may be information about the kinds of values it may contain, as well as default values. Scripts have causal structure: events are connected to earlier events that make them possible, and to later events they enable. Headers of scripts indicate when a script should be activated. Scripts are related to the concept of Frames (Minsky), which came earlier and was meant for more static structures (e.g. a room). Scripts are more like a big verb dictionary, Frames more like one for nouns.
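
As a rough illustration of slots, defaults, headers and ordered events, a script can be sketched as a small data structure. This is only a sketch: the class names, the restaurant slots and the header words are all assumptions made up for the example, not Schank’s own notation.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    kind: str               # kind of value the slot may hold, e.g. "person"
    default: object = None  # value assumed when the story does not say


@dataclass
class Script:
    name: str
    headers: set   # words whose presence activates the script (assumed mechanism)
    slots: list
    events: list   # ordered: each event enables the next one

    def triggered_by(self, text):
        """True if any header word occurs in the text."""
        return bool(self.headers & set(text.lower().split()))

    def instantiate(self, **known):
        """Fill slots from known values, falling back on defaults."""
        return {s.name: known.get(s.name, s.default) for s in self.slots}


# Hypothetical restaurant script
restaurant = Script(
    name="restaurant",
    headers={"restaurant", "waiter", "menu"},
    slots=[
        Slot("customer", "person"),
        Slot("server", "person", default="a waiter"),
        Slot("food", "dish", default="a meal"),
    ],
    events=["enter", "order", "eat", "pay", "leave"],
)

story = "John went to a restaurant and ordered lobster"
activated = restaurant.triggered_by(story)
filled = restaurant.instantiate(customer="John", food="lobster")
print(activated, filled)
```

The story never mentions a waiter, but the default fills the server slot, which is the kind of inference the writer expects the reader to make.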

What background knowledge do we need to understand a story? What information does the writer expect us to infer? Are we likely to have both in a predetermined script? How do we know when a story has stopped following a script? (Compare: how do we know when the person we are talking to has changed the subject? Some people never notice!)

DeJong’s ‘sketchy script matcher’ FRUMP
• At Yale around 1977, DeJong developed a new form of SAM (Lehnert’s Script Applier Mechanism).
• It sought only to fill initially determined predicate values of interest to a user.
• It worked mainly on newspaper stories about terrorism.

For example, for a car accident FRUMP wants to find out the type of car, the object it collided with, the location of the accident, the number of people killed/injured, and who was at fault. It skims a news story to identify the appropriate script, then tries to answer the script’s expectations. It was connected to the UPI wire service.

UPI story. Pisa, Italy. Officials today searched for the black box flight recorder aboard an Italian air force transport plane to determine why the aircraft crashed into a mountainside killing 44 persons. They said the weather was calm and clear, except for some ground level fog, when the US-made Hercules C130 transport plane hit Mt Serra moments after takeoff Thursday. The pilot, described as one of the country’s most experienced, did not report any trouble in a brief radio conversation before the crash.
FRUMP summary: 44 people were killed when an airplane crashed into a mountain in Italy today.

FRUMP does not use a ‘full’ script like the restaurant script (here, one for air disasters); it simply fills a small number of slots (not necessarily ordered) like NUMBER_DEAD, WHERE_CRASH, WHEN_CRASH.
• FRUMP was never statistically evaluated.
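
The slot-filling idea can be sketched with a few patterns run over the opening sentence of the UPI story. This is a toy under stated assumptions: the regular expressions are invented for the example, and FRUMP itself worked from script expectations rather than from patterns like these.

```python
import re

# First sentence of the UPI story quoted above
STORY = (
    "Officials today searched for the black box flight recorder aboard an "
    "Italian air force transport plane to determine why the aircraft crashed "
    "into a mountainside killing 44 persons."
)

# Hypothetical patterns, one per slot of interest
PATTERNS = {
    "NUMBER_DEAD": re.compile(r"killing (\d+) (?:persons|people)"),
    "WHERE_CRASH": re.compile(r"crashed into (?:a|the) (\w+)"),
    "WHEN_CRASH": re.compile(r"\b(today|yesterday)\b"),
}


def fill_slots(text):
    """Return one filler per slot, or None where nothing matched."""
    slots = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        slots[name] = match.group(1) if match else None
    return slots


slots = fill_slots(STORY)
print(slots)
# {'NUMBER_DEAD': '44', 'WHERE_CRASH': 'mountainside', 'WHEN_CRASH': 'today'}
```

Note that the skim picks up ‘today’, which in the story is the day of the search rather than of the crash (Thursday); the FRUMP summary above contains the same slip.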

But FRUMP was the forerunner of a 1990s technology
• Information Extraction, where ‘templates’ of slots and fillers are filled from web or newspaper text at high speed and in huge volume.
• This new AI technology was created by US Government funding in the 1990s and is highly statistical and competitive between groups/universities/companies.

How do humans perform tasks? Part of the aim of research on Scripts was to find a way of giving a program the same knowledge that humans use to understand a story, and Script theory was very influential in Psychology. Similarly, in research on Expert Systems, the aim is to capture, and apply, the knowledge that human experts have. And in earlier examples, e.g. GPS, the idea was to mimic human problem-solving ability.

It makes sense to emulate humans in Artificial Intelligence research.
• One of the original motivations for AI research was to understand the human mind.
• But also to get computers to do clever things, no matter how!
• It is difficult to provide an account of intelligence without reference to what humans can do, although our conception of intelligence is now less human-based (e.g. perhaps a bee is capable of intelligent behaviour).
But if we are concerned to emulate humans, we need to find out how humans think, if we think psychology has ways of telling us that reliably.

Ways of finding out how people work
• Introspection (most AI experiments, like CD/Scripts)
• Protocol analysis (activity reports, as with GPS)
• Psychology experiments
• One problem for expert systems is that the introspection of experts is unreliable (plumbers can’t always tell you how they do it).
• Much psychology is unsurprising but sometimes helpful, e.g. the finding that people usually can’t remember surface words, only content, which is consistent with CD’s claims.

Return to Expert Systems
SHRDLU and the blocks microworld. Domain-specific knowledge (as opposed to domain-general knowledge). Understood a substantial subset of English by representing and reasoning about a very restricted domain. Had knowledge of its microworld (but no real understanding). But the program was too complex to be extended to the real world. Expert systems also relied on depth of knowledge of a constrained domain, but were commercially exploitable: ‘real’ applications.

SHRDLU was a dead end: the program was very complex and had little to do with the real world. General realisation that programs that performed well within the limits of microworlds could not capture the complexity of everyday human reasoning. Remember that SHRDLU would have to process AN INTERESTING BOOK by accessing all the books it knew in its database and all the interesting things! Hubert Dreyfus (1972): criticism of the idea that reasoning and intelligence could be captured by logical rules.

Dreyfus was part of the first major reaction against the claims of AI in the 1970s (cf. UK Govt. Lighthill Report). Weizenbaum (1976) pointed out that his ELIZA ‘had come close to passing Turing Test’ (!). Humans are too ready to attribute intelligence to unintelligent devices. Risk of oversold programs. But some of this was just breast-beating for profit (Weizenbaum’s Computer Power and Human Reason was Reader’s Digest Book of the Month!): overselling how much one had done even while repenting!

References for Knowledge Representation
Rich and Knight (1991) Artificial Intelligence, McGraw-Hill, Inc. Chapter 4.
Cawsey, A. (1997) Essentials of Artificial Intelligence, Prentice-Hall. (See also web reference on course page.)
Russell and Norvig (1995) Artificial Intelligence: A Modern Approach. Chapter 3.

Introspective evidence of stages of learning a skill or expertise, e.g. car driving or chess playing
• Novice: car driver or chess player is consciously following rules.
• Expert: can decide what to do ‘without thinking’, making decisions based on the resemblance of the current situation to many previously experienced situations.
• The best chess players can usually instantly recognise what is a good move.
• An expert driver knows when slowing down is needed without thinking about it (e.g. it becomes difficult to drive if you consciously reflect about gear shifting and try to decide what to do).

If this intuition is correct, there is more to real expert understanding than following rules. BUT there are a few problems where (rule-driven) expert systems can perform as well as experts. And even in the absence of claims that expert systems think like humans, these may well be useful tools. They probably work best when used as a consultant or aide to a human expert or novice. Examples are medical diagnostic systems, optimal layout systems for space, and scheduling algorithms. Feigenbaum’s DENDRAL at Stanford identifies chemical compounds.

Criticisms by Hubert Dreyfus
Dreyfus points out ways in which AI theorists have overclaimed about what they can do, e.g. Feigenbaum claims that ‘DENDRAL has been in use for many years at university and industrial chemical labs around the world’. But ‘..when we called several university and industrial sites that do mass spectroscopy, we were surprised to find that none of them use DENDRAL..’
Dreyfus: programming attempts to capture ordinary, or common sense, knowledge and reasoning ability are doomed to failure. Such knowledge cannot be captured by programs because it is too contextual and open-ended. For Dreyfus, the real expert is not following rules.

Strong AI: building programs that actually think (or striving towards this).
Weak AI 1: applications, i.e. trying to perform tasks that would require intelligence if performed by humans.
• Some attempt to simulate human solutions
Weak AI 2: modelling human cognition.
Expert Systems sometimes do better than human experts, e.g. Buchanan (1982): MYCIN did better than a panel of experts in evaluating ten selected meningitis cases. But expert systems benefit from being applied in an area where the computer can exploit an ability to follow rules.

Four major problems for expert systems
• Brittleness. Cannot fall back on general knowledge, e.g. if a mistake is made in entering data for a medical expert system, entering that the patient is 130 years old and weighs 40 pounds, the ES would not guess that the values had been switched.
• No meta-knowledge. Expert systems do not know their own limitations.
• Knowledge acquisition. Still a bottleneck in new domains.
• Validation. Difficult to know what to compare it to (unless compared to human experts diagnosing real world problems).

Domain-specific knowledge versus domain-independent knowledge
Expert systems: good at domain-specific knowledge, bad at domain-independent. PUFF knows nothing about medical complaints except conditions of the lung (i.e. knowledge very specific), and may not even know whether lungs are above or below knees (example of common knowledge about human anatomy). Does that matter? Would we care if it diagnosed us efficiently? Why are we obsessed with being a human whole?

Is an ES like an idiot savant: a person who is basically retarded, but able to perform very well in one limited domain, e.g. calculating the day on which particular dates fall?
From Lenat and Guha (1990) (in Rich and Knight, 1991, Artificial Intelligence):
System: How old is the patient?
Human: (looking at his 1957 Chevrolet) 33
System: Are there any spots on the patient’s body?
Human: (noticing rust spots) Yes.
System: What colour are the spots?
Human: Reddish-brown.
System: The patient has measles (probability 0.9)
More like ‘automated reference manuals’ (Copeland, 1993).

Advantages of Expert Systems
Human experts can lose expertise. Ease of transfer of artificial expertise. No effect of emotion in artificial expertise. Expert systems are a low cost alternative: expensive to develop but cheap to operate.
Limitations: lack of creativity, not adaptive, lack of sensory experience, narrow focus, and no commonsense knowledge (or meta-knowledge).

Lack of wider understanding
Winograd (SHRDLU’s programmer): ‘..There is a danger inherent in the label ‘expert system’. When we talk of a human expert we connote someone whose depth of understanding serves not only to solve specific well-formulated problems, but also to put them into a larger context. We distinguish between experts and idiot savants. Calling a program an expert is misleading….’
This can lead to inappropriate expectations. But expert systems may be useful if users can be educated about proper expectations (are people getting used to limited machines?).

See the following two paragraphs (from Hayes-Roth, 1983): summaries of the pulmonary function diagnosis of a particular patient, one by a human expert, the other by an expert system (PUFF).
Conclusions: the low diffusing capacity, in combination with obstruction and a high total lung capacity is consistent with a diagnosis of emphysema. Although bronchodilators were only slightly useful in this one case, prolonged use may prove beneficial to the patient.
PULMONARY FUNCTION DIAGNOSIS: MODERATELY SEVERE OBSTRUCTIVE AIRWAYS DISEASE. EMPHYSEMATOUS TYPE.

Conclusions: Overinflation, fixed airway obstruction and low diffusing capacity would all indicate moderately severe obstructive airway disease of the emphysematous type. Although there is no response to bronchodilators on this occasion, more prolonged use may prove to be more helpful.
PULMONARY FUNCTION DIAGNOSIS: OBSTRUCTIVE AIRWAYS DISEASE, MODERATELY SEVERE EMPHYSEMATOUS TYPE.

No totally automatic ways of constructing expert knowledge bases, but there are programs which interact with domain experts to extract expert knowledge efficiently, e.g. finding holes in the knowledge and prompting the expert to fill them, and/or checking the knowledge for consistency. An alternative to interviewing the expert: the program looks at sample problems and solutions and infers its own rules, e.g. a bank’s problem of deciding whether to approve a loan. Instead of interviewing loan officers, look at past loans, and try to generate rules that will maximise the number of good loans in the future.
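
The loan example can be made concrete with a very small rule learner. This is a minimal sketch, assuming a single made-up attribute (income) and invented past loans; it picks the income threshold that classifies the most past loans correctly, which is only one simple way a program might infer a rule from examples.

```python
# Hypothetical past loans: (applicant income in thousands, loan was repaid?)
PAST_LOANS = [
    (15, False), (22, False), (28, True), (35, True),
    (40, False), (55, True), (70, True),
]


def induce_threshold(loans):
    """Find the rule 'approve IF income >= threshold' that fits past loans best."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({income for income, _ in loans}):
        # Count loans where the rule's decision agrees with what happened
        correct = sum((income >= threshold) == repaid for income, repaid in loans)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct


threshold, correct = induce_threshold(PAST_LOANS)
print(f"IF income >= {threshold} THEN approve  ({correct}/{len(PAST_LOANS)} past loans correct)")
# IF income >= 28 THEN approve  (6/7 past loans correct)
```

The induced rule still gets one past loan wrong (the applicant with income 40), a reminder that rules inferred from examples need validating just as hand-built ones do.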

Expert system shells are also marketed, e.g. EMYCIN (Empty MYCIN). A shell consists of an expert system without the domain-specific knowledge. A new knowledge domain can be entered, making use of the same rule mechanisms.

Evaluate expert systems: good idea or not? How important is it to have systems that are commercially viable, and made use of in the real world? Would you be happy to rely on a medical Expert System instead of a doctor?
Advantages / Disadvantages

Expert systems rely on domain-specific knowledge, and also on heuristics operating on that knowledge. Knowledge base: need to find a way of representing knowledge (MYCIN: production rules). Also need to draw appropriate inferences: the inference engine. Need to work out what knowledge is appropriate, and to get it into the knowledge base.

Knowledge engineering
Based on protocol analysis (GPS pioneered this): human subjects are encouraged to think aloud as they solve problems. The protocols are later analysed to reveal the concepts and procedures employed. Protocol analysis was used alongside Logic Theorist by Newell and Simon. Interaction between expert system builder, knowledge engineer, and human experts in some problem area. Some computational psychologists (e.g. Schvaneveldt) used networks to represent knowledge elicited as associations of concepts.

Automated Knowledge Acquisition and Evaluation
Alternative to time-consuming and expensive knowledge engineering. Evaluation depends entirely on the task for which the ES are designed. If they function as assistants (like DENDRAL), we need only that they do not miss any solutions with respect to a given set of constraints, and that they take a reasonable length of time. If, like MYCIN, they generate whole solutions, we need evaluation against human experts (or rival expert systems).

Evaluation of expert systems. Comparison to experts: need to follow blind experimental procedures, i.e. raters must not know which solutions are the humans’ and which the computer’s. DENDRAL: used as an expert’s assistant, rather than a stand-alone expert. Heuristic search technique constrained by knowledge of the human expert. ‘…supports hundreds of international users every day, assisting in structure elucidation problems for such things as antibiotics and impurities in manufactured chemicals..’ (Jackson, 1990).

MYCIN: performance compares favourably with human experts. But it was never used in hospitals. Suggested reasons (Jackson, 1990):
• Its knowledge base is incomplete since it does not cover anything like the full spectrum of infectious diseases.
• Running it would have required more computing power than hospitals could afford.
• Interface not good.
• Trade union protectionism by US doctors?

MYCIN (Shortliffe and Buchanan, Stanford). Expert system which attempts to recommend appropriate therapies for patients with bacterial infections. Four-part decision process:
• Deciding if the patient has a significant infection
• Determining the possible organisms involved
• Selecting a set of drugs that might be appropriate
• Choosing the most appropriate drug or combination of drugs.

MYCIN has five components.
• A knowledge base
• A dynamic patient database
• A consultation program
• An explanation program
• A knowledge acquisition program, for adding or changing rules.

Once MYCIN finds the identities of the disease-causing organisms, it tries to select a therapy to treat the disease.
IF the identity of the organism is pseudomonas
THEN therapy should be selected from among the following drugs:
• Colistin (.98)
• Polymyxin (.96)
• Gentamicin (.96)
• Carbenicillin (.65)
• Sulfisoxazole (.64)
(Decimal numbers show the probability of arresting growth of pseudomonas.)

Expert systems typically use production rules (IF-THEN rules), e.g. a MYCIN rule:
IF
• The stain of the organism is gram-positive, and
• The morphology of the organism is coccus, and
• The growth conformation of the organism is clumps,
THEN there is suggestive evidence (0.7) that the identity of the organism is staphylococcus.
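
A production rule of this kind can be held as plain data and tested against a set of findings. The sketch below is illustrative only: the `Rule` class, the attribute names and `apply_rule` are assumptions for the example and are not MYCIN’s actual representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    premises: tuple    # (attribute, value) pairs that must all hold
    conclusion: tuple  # (attribute, value) suggested when the rule fires
    certainty: float   # strength of the evidence, e.g. the 0.7 in the rule above


# The staphylococcus rule quoted above, written as data
STAPH_RULE = Rule(
    premises=(
        ("stain", "gram-positive"),
        ("morphology", "coccus"),
        ("growth conformation", "clumps"),
    ),
    conclusion=("identity", "staphylococcus"),
    certainty=0.7,
)


def apply_rule(rule, findings):
    """Return (conclusion, certainty) if every premise matches, else None."""
    if all(findings.get(attr) == value for attr, value in rule.premises):
        return rule.conclusion, rule.certainty
    return None


findings = {
    "stain": "gram-positive",
    "morphology": "coccus",
    "growth conformation": "clumps",
}
result = apply_rule(STAPH_RULE, findings)
print(result)  # (('identity', 'staphylococcus'), 0.7)
```

Keeping rules as data rather than as program code is what lets the same matching mechanism be reused with a different set of rules.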

MYCIN contains more than 500 such rules. Complex interactions of rules give a high level of performance: at the level of human specialists in blood infections (and much better than GPs) (Shortliffe, 1976). The UK NHS is said to be shifting to ‘evidence based medicine’ and is VERY short of experts, so be optimistic!

Diagnostic knowledge (the knowledge base) is represented as a set of rules:
IF
• The site of the culture is blood, and
• The stain of the organism is gram-negative, and
• The morphology of the organism is rod, and
• The patient has been seriously burned
THEN there is evidence (0.4) that the identity of the organism is pseudomonas.

MYCIN control structure. Has a top-level goal:
IF (1) there is an organism which requires therapy, and (2) consideration has been given to any other organisms requiring therapy
THEN compile a list of possible therapies, and determine the best one in this list.
These rules are used to reason backward to the clinical data (backward chaining). Possible bacteria causing infection are considered in turn. MYCIN attempts to prove whether they are involved.
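
Backward chaining can be sketched in a few lines: start from the goal, find a rule that concludes it, and recursively try to prove that rule’s premises until known clinical data is reached. This is a simplified illustration, not MYCIN’s control structure: the rule wording is paraphrased, the second rule is invented for the example, and certainty factors and questions to the user are left out.

```python
# (conclusion, premises). The first rule paraphrases the pseudomonas rule;
# the second is a hypothetical link to the top-level goal.
RULES = [
    ("identity is pseudomonas",
     ["site is blood", "stain is gram-negative",
      "morphology is rod", "patient seriously burned"]),
    ("organism requires therapy", ["identity is pseudomonas"]),
]


def prove(goal, facts, rules, trace):
    """Try to establish goal by working backward from it to known facts.

    No protection against circular rules: fine for this toy rule set only.
    """
    if goal in facts:
        return True
    for conclusion, premises in rules:
        if conclusion == goal and all(prove(p, facts, rules, trace) for p in premises):
            trace.append(conclusion)  # record each goal established by a rule
            return True
    return False


clinical_data = {"site is blood", "stain is gram-negative",
                 "morphology is rod", "patient seriously burned"}
trace = []
proved = prove("organism requires therapy", clinical_data, RULES, trace)
print(proved, trace)
# True ['identity is pseudomonas', 'organism requires therapy']
```

The trace of established goals is also the raw material for an explanation facility: it records which conclusions led to the final answer.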

Another actual expert system: DENDRAL. The project began at Stanford University (USA) in 1965 (Feigenbaum and Lederberg). Aim: to determine the molecular structure of an unknown organic compound. Analysed data from a mass spectrometer. A mass spectrometer bombards a chemical sample with a beam of electrons, causing the compound to fragment and its components to be rearranged. But a complex molecule can fragment in different ways; one can only make predictions about which bonds will break.

Has data from the mass spectrogram (i.e. after bonds have broken), and has to work out what the original compound was. Although there are constraints (i.e. it has identified the chemical formula of the compound, and the presence/absence of certain substructural features), there are still many possibilities. The DENDRAL planner can assist in the decision about which constraints to impose.

DENDRAL could figure out (on the basis of a vast amount of data from mass spectrographs) which organic compound was being analysed. From the relevant data, it formulated hypotheses about the compound’s molecular structure, and tested the hypotheses by way of further predictions. Output was a list of possible molecular compounds ranked in terms of decreasing plausibility.

• Required constraints: based on conclusions already drawn.
• Forbidden constraints: rule out possibilities that don’t fit the data, or whose resultant structures are chemically unstable.
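
The role of the two kinds of constraint can be sketched as a filter over candidate structures, followed by ranking. Everything in the sketch is made up for illustration: the candidates, their substructure labels and the plausibility scores are not chemical data and not DENDRAL’s own representation.

```python
# Toy candidates: (name, substructures present, plausibility score). All invented.
CANDIDATES = [
    ("structure A", {"ketone", "ethyl"}, 0.6),
    ("structure B", {"ketone", "methyl", "peroxide"}, 0.9),
    ("structure C", {"ketone", "methyl"}, 0.8),
    ("structure D", {"ether", "methyl"}, 0.7),
]


def generate_and_test(candidates, required, forbidden):
    """Keep candidates with every required feature and no forbidden one,
    then rank the survivors by decreasing plausibility."""
    survivors = [
        (name, score)
        for name, features, score in candidates
        if required <= features and not (forbidden & features)
    ]
    return sorted(survivors, key=lambda pair: pair[1], reverse=True)


ranked = generate_and_test(CANDIDATES, required={"ketone"}, forbidden={"peroxide"})
print(ranked)  # [('structure C', 0.8), ('structure A', 0.6)]
```

Structure B scores highest but is ruled out by the forbidden constraint, and structure D lacks the required feature, so constraints prune the search before any ranking happens.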

BUT: does not emulate the ways in which humans would actually solve problems. DENDRAL (in the 1960s): beginning of the divide between simulation of human behaviour, and trying to arrive at intelligence by any means available. Problems:
• Best way to achieve intelligent behaviour may be to emulate human intelligence.
• Most interesting aspect of AI is the light it throws on understanding the human mind.
• Yet…expert systems do work!

Examples of domains for Expert Systems:
• Engineering
  - Design
  - Fault finding
  - Manufacturing planning
  - Scheduling
• Scientific analysis
• Medical diagnosis
• Financial analysis

[Figure: Expert System Shell. Components shown: user, case-specific data, user interface, explanation system, inference engine, knowledge base, knowledge base editor.]

Knowledge base: contains the representation of domain-specific knowledge. Inference engine: performs the reasoning. The two are kept separate. Normal method for representing knowledge in an expert system: IF-THEN rules. Often rules do not have certain conclusions: dealing with uncertainty. Main approaches to knowledge representation in AI:
• Logic
• Frames and semantic networks
• If-then rules within a rule-based system

General characteristics
Expert system: program designed to replicate the decision-making process of a human expert. Basic idea: experts have a great deal of knowledge, and this knowledge could be provided in some formal manner to a program.
• Requires a knowledge base. The knowledge base is entered by a knowledge engineer: ‘knowledge engineering’ involves interviewing and observing experts, and converting their words and actions into a knowledge base.
• Reasoning mechanisms to apply knowledge to problems.
• Mechanism for explaining their decisions.

Example: rules for diagnosing a household emergency.
Rule 1: IF coughing THEN ADD smoky
Rule 2: IF wet AND NOT raining THEN ADD burstpipe
Rule 3: IF NOT coughing AND alarm-rings THEN ADD burglar
Rule 4: IF smoky AND hot THEN ADD fire
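
Read as forward-chaining rules over a working memory of facts, the four rules above can be run with a very small interpreter. This is a sketch under two assumptions not stated in the rules themselves: NOT is treated as ‘not currently in working memory’, and rules are retried until nothing new is added.

```python
# Each rule: (facts that must be present, facts that must be absent, fact to ADD)
RULES = [
    ({"coughing"}, set(), "smoky"),               # Rule 1
    ({"wet"}, {"raining"}, "burstpipe"),          # Rule 2
    ({"alarm-rings"}, {"coughing"}, "burglar"),   # Rule 3
    ({"smoky", "hot"}, set(), "fire"),            # Rule 4
]


def forward_chain(initial_facts):
    """Fire rules repeatedly until no rule adds a new fact."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for present, absent, conclusion in RULES:
            if (conclusion not in facts
                    and present <= facts
                    and not (absent & facts)):
                facts.add(conclusion)
                changed = True
    return facts


result = forward_chain({"coughing", "hot"})
print(sorted(result))  # ['coughing', 'fire', 'hot', 'smoky']
```

Rule 4 can only fire after Rule 1 has added smoky, which shows how a conclusion is reached through the interaction of rules rather than by any single one.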
