Classic Case Studies John MacIntyre 0191 515 3778 john.macintyre@sunderland.ac.uk COM362 Knowledge Engineering Classic Case Studies
The Classics
• DENDRAL:
  • determine molecular structure of an unknown compound
  • started in 1965
• MYCIN:
  • medical diagnosis system
  • started in 1972

DENDRAL
• Developed at Stanford University in 1965
• Possibly the first computer program EVER to rival human experts in a specialized field
• Determines the molecular structure of an unknown compound
• Used a modified form of “generate and test” methodology

The DENDRAL Problem
• Chemist is presented with an unknown chemical compound
• Chemist must determine the molecular structure
• Therefore needs to find out which atoms are in the structure
• Needs to know how the atoms are connected to form molecules

The DENDRAL Problem
• Data from mass spectrometer
• Not straightforward!
  • Molecules can fragment in different ways
  • Need to make some predictions about how molecules are LIKELY to break
  • Sub-components of the molecule may be found in many different compounds
  • Chemists therefore determine compound sub-components, and apply constraints that other sub-components must satisfy

The DENDRAL Problem
• Not a trivial problem!
• Consider the formula: C6H13NO2
• There are 10,000 isomers of this compound!
• Each permutation can be uniquely identified
• Could simply generate each of the 10,000 permutations in turn and test
• Very expensive in computing time!
• Therefore we would like to constrain the generation of candidate permutations to save time

Constrained Generation
• CONGEN:
  • DENDRAL program for constrained generation of complete chemical structures
  • Manipulates symbols representing atoms and molecules
  • Uses a set of constraints on how atoms can be interconnected
  • Chemist can specify and vary the initial constraints (e.g. based on experimental evidence)

Specifying Constraints
• Defining “constraining structures”:
  • specify “superatoms” that the compound must contain
  • typically in organic compounds, rings or chains of carbon atoms linked to hydrogens
• Defining other constraints:
  • open for the chemist to hypothesize
  • e.g. “compound must contain a carbon ring of 6 carbon atoms”, etc.

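The benefit of building a required "superatom" into the generator, rather than generating everything and testing afterwards, can be illustrated with a toy sketch. This is not CONGEN and not real chemistry: the "compound" is assumed to be a simple linear chain of atoms, and the names `generate_chains`, `atoms` and `superatom_units` are invented for the illustration.

```python
from itertools import permutations


def generate_chains(units):
    """Return every distinct linear ordering of the units, joined into a string."""
    return sorted({"".join(p) for p in permutations(units)})


# Toy "compound" (an assumption): three carbons, one nitrogen, one oxygen in a chain.
atoms = ["C", "C", "C", "N", "O"]

# Plain generate-and-test: enumerate every chain, then test each one
# against the requirement "must contain the fragment CCC".
all_chains = generate_chains(atoms)
passing = [chain for chain in all_chains if "CCC" in chain]

# Constrained generation: treat the required fragment as a single superatom,
# so chains that break the constraint are never generated at all.
superatom_units = ["CCC", "N", "O"]
constrained = generate_chains(superatom_units)

print(len(all_chains), "generated without the constraint")
print(len(constrained), "generated with the superatom constraint")
```

Both routes arrive at the same surviving candidates, but the constrained generator produces far fewer structures to test, which is the saving the slides describe.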
Assessing Candidates
• CONGEN may produce hundreds or thousands of candidate structures
• First pass at assessing the candidates:
  • Use basic rules of mass spectrometry to test candidates and remove the most unlikely ones
  • MSPRUNE: another DENDRAL program which does this
• MSRANK: ranks remaining structures according to how their graphs match expected graphs for known compounds

Scoring Candidates
• Peaks (features) in the spectral graphs are weighted to represent their importance
• Weighted scores are produced to give the rank ordering for each candidate structure
• Essentially this is a “hypothesize-and-test” strategy

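A weighted ranking of this kind might be sketched as follows. The scoring function, peak positions, weights and candidate names are all invented for the illustration; the slides do not give the actual formula used by MSRANK.

```python
def score(predicted_peaks, observed_peaks, weights):
    """Sum the weights of the predicted peaks that were actually observed."""
    return sum(weights.get(peak, 1.0) for peak in predicted_peaks if peak in observed_peaks)


# Hypothetical observed spectrum: peak positions only.
observed = {43, 57, 71}

# Hypothetical importance weights for each peak.
weights = {43: 1.0, 57: 2.0, 71: 3.0}

# Hypothetical candidate structures and the peaks each one would be expected to produce.
candidates = {
    "structure-A": {43, 57},
    "structure-B": {57, 71},
    "structure-C": {29, 43},
}

# Rank candidates from best to worst match.
ranking = sorted(
    candidates,
    key=lambda name: score(candidates[name], observed, weights),
    reverse=True,
)

for name in ranking:
    print(name, score(candidates[name], observed, weights))
```

Each candidate is a hypothesis about the structure; the score tests that hypothesis against the observed spectrum.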
Evaluating DENDRAL
• Available on the network of Stanford University, California
• Used by hundreds of people around the world every day
• Has been used to challenge long-published chemical literature successfully
• The first stepping-stone between “traditional” problem solving and modern expert systems

Features of DENDRAL
• Uses information from domain experts to help limit the search space for candidate structures
• Uses an explicit representation of knowledge - fragmentation rules
• No real inference mechanism - iterative passes through the rules controlled by user

The Keys to Success?
• DENDRAL was successful because:
  • It did not set out to replace the expert, only to assist the expert
  • The search technique is based on a proven model of knowledge with known mathematical properties
  • There is a well-specified language which can be used to represent the structures easily

MYCIN
• Developed at Stanford University in 1972
• Regarded as the first true “expert system”
• Assists physicians in the treatment of blood infections
• Many revisions and extensions to MYCIN over the years

The MYCIN Problem
• Physician wishes to specify an “antimicrobial agent” - basically an antibiotic - to kill bacteria or arrest their growth
• Some agents are poisonous!
• No agent is effective against all bacteria
• Most physicians are not expert in the field of antibiotics

The Decision Process
• There are four questions in the process of deciding on treatment:
  • Does the patient have a significant infection?
  • What are the organism(s) involved?
  • What set of drugs might be appropriate to treat the infection?
  • What is the best choice of drug or combination of drugs to treat the infection?

MYCIN Components
• KNOWLEDGE BASE:
  • facts and knowledge about the domain
• DYNAMIC PATIENT DATABASE:
  • information about a particular case
• CONSULTATION PROGRAM:
  • asks questions, gives advice on a particular case
• EXPLANATION PROGRAM:
  • answers questions and justifies advice
• KNOWLEDGE ACQUISITION PROGRAM:
  • adds new rules and changes existing rules

Basic MYCIN Structure
(Diagram. Components shown: Physician User, Consultation Program, Dynamic Patient Data, Static Knowledge Base, Explanation Program, Knowledge Acquisition Program, Infectious Disease Expert)

The MYCIN Knowledge Base
• Where the rules are held
• Basic rule structure in MYCIN is:
  if condition_1 and … and condition_m hold
  then draw conclusion_1 and … and conclusion_n
• Rules written in the LISP programming language
• Rules can include certainty factors to help weight the conclusions drawn

An Example Rule
IF: (1) The stain of the organism is Gram negative, and
    (2) The morphology of the organism is rod, and
    (3) The aerobicity of the organism is aerobic
THEN: There is strongly suggestive evidence (0.8) that the class of the organism is Enterobacteriaceae

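One way to picture such a rule as data is sketched below. MYCIN's rules were written in LISP; this Python version is only an illustration, and the `Rule` class, its field names and the `fires` helper are assumptions made for the sketch, not MYCIN's own representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    premises: tuple    # (parameter, value) pairs that must all hold
    conclusion: tuple  # (parameter, value) concluded when the rule fires
    cf: float          # certainty attached to the conclusion


# The example rule from the slide, encoded with hypothetical parameter names.
rule = Rule(
    premises=(
        ("stain", "gram-negative"),
        ("morphology", "rod"),
        ("aerobicity", "aerobic"),
    ),
    conclusion=("class", "Enterobacteriaceae"),
    cf=0.8,
)


def fires(rule, facts):
    """Return True when every premise of the rule matches the known facts."""
    return all(facts.get(param) == value for param, value in rule.premises)


# Hypothetical case data for one organism.
organism = {"stain": "gram-negative", "morphology": "rod", "aerobicity": "aerobic"}

if fires(rule, organism):
    param, value = rule.conclusion
    print(f"{param} is {value} (certainty {rule.cf})")
```

Keeping rules as data rather than as program code is what lets separate explanation and knowledge acquisition programs inspect and change them.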
Calculating Certainty
• Rule certainties are treated like probabilities (a simplification: MYCIN's certainty factors are not strictly probabilities)
• Therefore the rules of probability must be applied when combining rules
• Multiplying probabilities which are less than certain results in lower and lower certainty!
• e.g. 0.8 × 0.6 = 0.48

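The effect of chaining uncertain rules can be checked with a few lines of arithmetic. This sketch follows the simple multiplicative model stated on the slide; MYCIN's actual certainty-factor calculus is more involved. The function name `chain_certainty` and the third value (0.7) are assumptions added for illustration.

```python
import math


def chain_certainty(certainties):
    """Multiply the certainties of each step in a chain of inferences."""
    return math.prod(certainties)


# The slide's example: two rules applied one after the other.
two_steps = chain_certainty([0.8, 0.6])

# Adding a third uncertain step (hypothetical value) lowers certainty further.
three_steps = chain_certainty([0.8, 0.6, 0.7])

print(round(two_steps, 3), round(three_steps, 3))
```

Every additional step that is less than certain can only reduce the overall certainty of the final conclusion.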
Other Types of Knowledge
• Facts and definitions such as:
  • lists of all organisms known to the system
  • “knowledge tables” of clinical parameters and the values they can take (e.g. morphology)
  • classification system for clinical parameters and the context in which they are applied (e.g. referring to patient or organism)
• Much of MYCIN’s knowledge refers to 65 clinical parameters

MYCIN’s Context Trees
• Used to organise case data
• Help to visualise how information within the case is related
• Easily extended and adapted as more clinical evidence becomes available

Example Context Tree
(Diagram. Nodes shown: PATIENT-1; CULTURE-1, CULTURE-2, CULTURE-3, OPERATION; ORGANISM-1, ORGANISM-2, ORGANISM-3; DRUG-1, DRUG-2)

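A context tree of this kind maps naturally onto a nested data structure. In the sketch below the node names come from the slide, but the parent-child links are assumed for illustration, since the flattened diagram does not show which organism belongs to which culture or which drug to which organism. The names `CONTEXT_TREE` and `walk` are likewise invented.

```python
# Each key is a context; its value holds the contexts beneath it.
# The links between nodes are assumptions, not taken from the original diagram.
CONTEXT_TREE = {
    "PATIENT-1": {
        "CULTURE-1": {
            "ORGANISM-1": {"DRUG-1": {}, "DRUG-2": {}},
            "ORGANISM-2": {},
        },
        "CULTURE-2": {"ORGANISM-3": {}},
        "CULTURE-3": {},
        "OPERATION": {},
    }
}


def walk(tree, depth=0):
    """Yield (depth, name) for every context, parents before children."""
    for name, children in tree.items():
        yield depth, name
        yield from walk(children, depth + 1)


# Print the tree with indentation showing how case data is related.
for depth, name in walk(CONTEXT_TREE):
    print("  " * depth + name)
```

Extending the case as new evidence arrives is then just a matter of adding another entry under the relevant parent.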
MYCIN Control Structure
• Uses a goal-based strategy to attempt to solve, in the first instance, a TOP LEVEL GOAL RULE
• Establishes sub-goals required to satisfy the top level goal
• Therefore establishes the concept of backward chaining

Top Level Goal
IF: (1) There is an organism which requires therapy; and
    (2) consideration has been given to any other organism requiring therapy
THEN: compile a list of possible therapies, and determine the best one in this list

MYCIN Subgoals
• Sub-goals are a generalised form of the top-level goal
• Hence sub-goals consider the proposition that there is a particular organism
• Exhaustive search on all relevant rules to test this proposition (until or unless one succeeds with total certainty)
• More like exhaustive search than backward chaining

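The goal-driven idea can be sketched in a few lines. This is a deliberately simplified, true/false version: it stops at the first rule that succeeds and ignores certainty factors, whereas the slides describe MYCIN as searching all relevant rules. The rule set, fact names and the `prove` function are hypothetical.

```python
# Hypothetical rule base: each goal maps to a list of alternative premise lists.
RULES = {
    "needs_therapy": [["significant_infection", "organism_identified"]],
    "organism_identified": [["gram_negative", "rod_shaped"]],
}


def prove(goal, facts, rules, trace):
    """Try to establish a goal by working backwards from it to known facts."""
    trace.append(goal)  # record the order in which goals are visited
    if goal in facts:
        return facts[goal]  # known case data, as if supplied by the physician
    for premises in rules.get(goal, []):
        # Each premise becomes a sub-goal to be established in turn.
        if all(prove(premise, facts, rules, trace) for premise in premises):
            return True
    return False


# Hypothetical case data.
facts = {"significant_infection": True, "gram_negative": True, "rod_shaped": True}

trace = []
result = prove("needs_therapy", facts, RULES, trace)
print(result)
print(" -> ".join(trace))
```

The trace shows the top-level goal being broken into sub-goals, which are in turn broken down until they reach facts about the case.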
Selection of Therapy
• Done after the diagnostic phase is complete
• Two phases:
  • Selection of a list of candidate drugs
  • Choice of preferred drugs or combinations of drugs from the list
• Therapy rules use information on:
  • Sensitivity of organism to drug
  • Contraindications on the drug

Example Recommendation
IF: The identity of the organism is Pseudomonas
THEN: I recommend therapy from the following drugs:
  1 - COLISTIN (0.98)
  2 - POLYMYXIN (0.96)
  3 - GENTAMICIN (0.96)
  4 - CARBENICILLIN (0.65)
  5 - SULFISOXAZOLE (0.64)

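The two-phase selection (candidate list first, then a preferred choice) might be sketched as below. The drug names and scores are taken from the example recommendation; everything else is assumed. In particular the contraindication entry is an invented placeholder with no clinical meaning, and the threshold, function names and `max_drugs` limit are hypothetical.

```python
# Scores from the slide's example recommendation for Pseudomonas.
SENSITIVITY = {
    "Pseudomonas": {
        "COLISTIN": 0.98,
        "POLYMYXIN": 0.96,
        "GENTAMICIN": 0.96,
        "CARBENICILLIN": 0.65,
        "SULFISOXAZOLE": 0.64,
    }
}

# Placeholder only: "condition-x" is invented and is NOT real clinical knowledge.
CONTRAINDICATIONS = {"GENTAMICIN": {"condition-x"}}


def candidate_drugs(organism, threshold=0.6):
    """Phase 1: list drugs the organism is sensitive to, best first."""
    scores = SENSITIVITY.get(organism, {})
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [drug for drug, value in ranked if value >= threshold]


def choose(candidates, patient_conditions, max_drugs=2):
    """Phase 2: drop contraindicated drugs and keep the top few."""
    allowed = [
        drug for drug in candidates
        if not (CONTRAINDICATIONS.get(drug, set()) & patient_conditions)
    ]
    return allowed[:max_drugs]


candidates = candidate_drugs("Pseudomonas")
print(choose(candidates, {"condition-x"}))
```

Separating the two phases keeps the organism-specific knowledge (sensitivity) apart from the patient-specific knowledge (contraindications).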
Evaluating MYCIN
• Many studies show that MYCIN’s recommendations compare favourably with experts for diseases like meningitis
• One study compared MYCIN with expert and non-expert physicians on real patients:
  • MYCIN matched experts
  • MYCIN was better than non-experts

MYCIN Limitations
• A research tool - never intended for practical application
• Limited knowledge base - only covers a small number of infectious diseases
• Needed more computing power than most hospitals had at the time!
• Doctors reluctant to use it
• Poor interface

Conclusions
• DENDRAL was a ground-breaking program as it showed that computers could match experts in a specific domain
• DENDRAL was always intended as an “expert assistant”
• MYCIN was the first “expert system” which included an inference control structure
• MYCIN is limited for practical use

Further Reading
• Introduction to Expert Systems
  • P. Jackson, Addison Wesley, 1990
• Expert Systems: Principles and Programming
  • J. Giarratano, G. Riley, PWS Publishing, 1994
• Artificial Intelligence: Tools, Techniques and Applications
  • T. O’Shea, M. Eisenstadt, Open University, 1984