Chapter 3 Knowledge Acquisition 知識擷取

3.1 INTRODUCTION
• The goal of knowledge acquisition（知識擷取） is to elicit expertise（專業知識） from domain experts（領域專家）.
[Figure labels: Expert; Expertise Transfer; Computerized Representation; Knowledge base]
G.J. Hwang

Advantages of Employing Knowledge Acquisition（知識擷取） Systems：
• They do not depend only on training cases（訓練範例）.
• Real-time analysis is possible.
• Real-time consistency checking is possible.
• They can be integrated with KE tools.
• Knowledge bases（知識庫） can be automatically generated.

REVIEWS OF PREVIOUS WORKS
• Substantive Knowledge： To identify the current state, e.g. “Am I in danger of being attacked?”
• Strategic Knowledge： To determine what to do next, e.g. “Climb to 30000 feet.”

[Figure: taxonomy of knowledge acquisition（知識擷取） systems. Knowledge types: Substantive Knowledge, Strategic Knowledge. Task types: Classification, Decision making, Control, Planning. Approaches: Repertory Grid Approach, Other Approach. Systems named: MORE, SALT, MOLE, ASK, TEIRESIAS, KRITON, ETS, NeoETS, KSSO, KITTEN, AQUINAS, KNACK, RuleCon]

The Acquisition of Substantive Knowledge
• Repertory Grid（知識表格）-Oriented Methods：
Step 1. Elicit elements to be classified.
Step 2. Elicit constructs from experts. Each time, three elements are selected, and the expert is asked to give a construct that distinguishes one element from the other two.
Step 3. Rate the grid by assigning a rating (1-5) to each entry.
Step 4. Generate the implication graph.

Step 1：Elicit elements from experts.
Step 2：Elicit constructs from experts.

Step 3：Rate each entry of the grid.
Step 4：Generate the implication graph.
[Figure: implication graph over the constructs headache, red, purple, high fever]

Rules generated from the grid：
First column：
IF high_fever and red and purple and (not headache)
THEN Disease = Measles
CF = MIN(0.8, 1.0, 0.8, 0.8) = 0.8
Second column：
IF (not high_fever) and (not red) and (not purple) and (not headache)
THEN Disease = German Measles
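The rule for the first column can be reproduced with a short sketch. The grid itself did not survive on the slide, so the ratings below are hypothetical, as is the mapping from a 1-5 rating to a certainty value (5 or 1 gives 1.0, 4 or 2 gives 0.8, and 3 is treated as neutral). Only the MIN combination and the resulting rule come from the slide.

```python
def rule_from_column(ratings):
    """Build an IF-part and a CF from one grid column (construct -> rating 1..5)."""
    # Assumed mapping: extreme ratings are fully certain, near-extreme less so.
    certainty = {5: 1.0, 4: 0.8, 2: 0.8, 1: 1.0}
    conditions, factors = [], []
    for construct, rating in ratings.items():
        if rating == 3:
            continue  # assumed: a neutral rating contributes no condition
        conditions.append(construct if rating > 3 else f"(not {construct})")
        factors.append(certainty[rating])
    # The slide combines the per-condition certainties with MIN.
    return " and ".join(conditions), min(factors)

# Hypothetical ratings for the Measles column.
measles = {"high_fever": 4, "red": 5, "purple": 4, "headache": 2}
condition, cf = rule_from_column(measles)
print(f"IF {condition} THEN Disease = Measles  CF = {cf}")
```

Taking the minimum means a rule is never more certain than its weakest condition.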

Advantages of applying repertory grids（知識表格）
Easy to analyze the elicited knowledge：
• Similarity analysis of constructs.
• Similarity analysis of elements.
• Analysis of the relationships among constructs.
• Detection of missed elements.
• Detection of logical errors.

3.2 ELICITATION OF SUBSTANTIVE KNOWLEDGE
• Knowledge Representation（知識表示法）
Example： “A dog has 4 legs”, with the expert being very sure.

An acquisition table is a repertory grid（知識表格） of multiple data types：
• Boolean：true or false.
• Single value：an integer, a real number, or a symbol.
• Set of values：a set of integers, real numbers, or symbols.
• Range of values：a range of integers or real numbers.
• ‘X’：no relation.
• ‘U’：unknown or undecidable.
• Ratings：
2：very likely to be.
1：maybe.

3.3 Some Problems of Repertory Grids（知識表格）
• Problem of Element Selection

Problem of Multi-Level Knowledge and Acquirability
[Figure: hierarchy with nodes INPUT DATA, SUBGOAL, and GOAL]

The Concept of Acquirability：
The value of a terminal attribute of a decision tree must either be a constant or be acquirable from users.
For example：
IF (leaf-shape = scale) and (class = Gymnosperm) THEN family = Cypress.
Here, class is not an acquirable attribute.

[Figure: nodes Leaf Shape, Class, and Family, with an unknown input marked “？？？”]

Domain basis and classification knowledge：
• Domain basis：Diseases → Acute Exanthemas, Other diseases
• Classification knowledge：Acute Exanthemas → Measles, German measles, Dengue fever, …

Problem of Missing Embedded Meanings（隱含知識）
• When a diagnostician says that the features of catching a cold are headache, feeling tired, cough, sneeze, …, he means “if a person catches a cold, he may have those features.”
• We usually represent the expertise as the following rule：
(Headache = yes) and (Feel_tired = yes) and (cough = yes) and … --> Disease = Catch_cold

The embedded meaning（隱含知識） of the diagnostician, “if one or some features do not appear, it is still possible that the patient has caught a cold,” is ignored.

3.4 EMCUD：A New Model for Eliciting
• Knowledge Representation（知識表示法）：
Conventional Repertory Grid（知識表格） or Acquisition Table + Attribute Ordering Table（屬性序列表格） (AOT)

Eliciting embedded meanings（隱含知識） by constructing the Attribute Ordering Table（屬性序列表格）
• A value in an AOT may be：
‘D’：The attribute dominates（主導權） the object.
‘X’：The attribute has no relation with the object.
an integer：The attribute is of some degree of importance to the object. (A smaller integer means less important.)

An example of Repertory Grid（知識表格）：
The rule generated from first column：
RULE3： (13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
where F(confidence) = 1.0 if confidence = 2
                    = 0.8 if confidence = 1
and Certainty Factor CF = MIN(F(2), F(1), F(2)) = 0.8
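The certainty factor of RULE3 follows directly from the mapping F. The sketch below assumes that the three arguments of MIN correspond to A1, A2, and A3 in that order; the grid that would confirm this is not preserved.

```python
def F(confidence):
    """Map an acquisition-table rating to a certainty value, as on the slide."""
    return {2: 1.0, 1: 0.8}[confidence]

# Assumed assignment of ratings to attributes, following the order in MIN(F(2), F(1), F(2)).
rule3_ratings = {"A1": 2, "A2": 1, "A3": 2}

cf_rule3 = min(F(rating) for rating in rule3_ratings.values())
print(cf_rule3)  # 0.8
```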

An example of constructing AOT.
EMCUD：If A1 ∉ {9,10,12}, is it possible that GOAL = Obj1?
EXPERT：No. /* This implies that A1 dominates Obj1 and AOT<Obj1,A1> = ‘D’ */
EMCUD：If A2 ≠ YES, is it possible that GOAL = Obj1?
EXPERT：Yes. /* A2 does not dominate Obj1 */
EMCUD：If A1 > 16 or A1 ≤ 13, is it possible that GOAL = Obj3?
EXPERT：Yes. /* A1 does not dominate Obj3 */
EMCUD：If A2 ≠ YES, is it possible that GOAL = Obj3?
EXPERT：Yes. /* A2 does not dominate Obj3 */
EMCUD：If A3 ≠ 4.3, is it possible that GOAL = Obj3?
EXPERT：No. /* A3 does dominate Obj3 */

EMCUD：Please rank A1 and A2 in the order of importance to Obj3 by choosing one of the following expressions：
1) A1 is more important than A2
2) A1 is less important than A2
3) A1 is as important as A2
EXPERT：1 /* A1 is more important to Obj3 than A2, hence AOT<Obj3,A1> = 2 and AOT<Obj3,A2> = 1 */

Elicit Embedded Meanings（隱含知識）
From RULE3, the following embedded rules（隱含規則） will be generated by negating the predicates of A1 and A2：
RULE3,1：NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
RULE3,2：(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
RULE3,3：NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3

Certainty Sequence (CS)： Represents the degree of certainty degradation.
CS(RULEi,j) = SUM(AOT<Obji,Ak>) for each Ak in the negated predicates of RULEi,j
For example：
CS(RULE3,3) = AOT<Obj3,A1> + AOT<Obj3,A2> = 2 + 1 = 3
The embedded rules（隱含規則） generated from RULE3：
RULE3,1：NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3, CS = 2
RULE3,2：(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3, CS = 1
RULE3,3：NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3, CS = 3
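The enumeration of embedded rules and their CS values can be sketched as follows. The dictionary layout and the function name are illustrative assumptions. The AOT row for Obj3 (A1 = 2, A2 = 1, A3 = ‘D’) is taken from the dialogue with the expert. Only attributes with an integer entry are negated, since a dominating attribute must keep its predicate.

```python
from itertools import combinations

def embedded_rules(aot_row):
    """Return (negated attributes, CS) for every non-empty subset of negatable attributes."""
    negatable = [attr for attr, value in aot_row.items() if isinstance(value, int)]
    rules = []
    for size in range(1, len(negatable) + 1):
        for negated in combinations(negatable, size):
            cs = sum(aot_row[attr] for attr in negated)  # CS = sum of AOT entries
            rules.append((negated, cs))
    return rules

aot_obj3 = {"A1": 2, "A2": 1, "A3": "D"}
rules = embedded_rules(aot_obj3)
for negated, cs in rules:
    print("negate", ", ".join(negated), "-> CS =", cs)
```

The three results correspond to RULE3,1 (CS = 2), RULE3,2 (CS = 1), and RULE3,3 (CS = 3).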

Construct Constraint List
• Sort the embedded rules according to the CS values：
RULE3,2：CS = 1
RULE3,1：CS = 2
RULE3,3：CS = 3
• A prune-and-search algorithm：
EMCUD：Do you think RULE3,1 is acceptable?
Expert：Yes. /* then RULE3,2 is also accepted */
EMCUD：Do you think RULE3,3 is acceptable?
Expert：No. /* then CS = 3 is recorded in the constraint list */
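The prune-and-search dialogue behaves like a binary search over the rules sorted by CS. The sketch below assumes that acceptance is monotone in CS: if a rule is accepted, every rule with a smaller CS is accepted too. The function name and the `is_acceptable` callback, which stands in for the expert, are hypothetical.

```python
def find_constraint(rules, is_acceptable):
    """rules: list of (name, cs). Returns (accepted rule names, first rejected CS or None)."""
    ordered = sorted(rules, key=lambda rule: rule[1])
    low, high = 0, len(ordered)  # ordered[:low] accepted, ordered[high:] rejected
    while low < high:
        mid = (low + high) // 2
        if is_acceptable(ordered[mid][0]):
            low = mid + 1   # everything up to mid is accepted without asking
        else:
            high = mid      # mid and everything after it is rejected
    constraint = ordered[low][1] if low < len(ordered) else None
    return [name for name, _ in ordered[:low]], constraint

asked = []
def expert(name):
    """Stand-in for the expert's answers in the dialogue above."""
    asked.append(name)
    return name != "RULE3,3"

rules = [("RULE3,1", 2), ("RULE3,2", 1), ("RULE3,3", 3)]
accepted, constraint = find_constraint(rules, expert)
print(asked, accepted, constraint)
```

With these three rules the sketch asks about RULE3,1 and then RULE3,3, the same two questions as in the dialogue.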

Calculate Certainty Factors（確定因子）
Confirm：1.0
Strongly support：0.8
Support：0.6
May support：0.4
CF_ij = Upper-Bound_i − (CS_ij / MAX(CS_i)) × (Upper-Bound_i − Lower-Bound_i)
MAX(CS_i)：maximum CS value of the embedded rules generated from RULE_i.
Upper-Bound_i：certainty factor of the original rule RULE_i (the worked example sets Upper-Bound = CF(RULE3)).
Lower-Bound_i：certainty factor of the embedded rule with MAX(CS_i) /* the rule with least confidence */

An example of calculating certainty factors（確定因子）
For the embedded rules（隱含規則） from RULE3：
1. Upper-Bound = CF(RULE3) = 0.8
2. Since RULE3,3 is not accepted, the embedded rule with MAX(CS) is RULE3,1：
EMCUD：If RULE3 strongly supports GOAL = Obj3, what about RULE3,1?
Expert：1. /* The Lower-Bound = 0.6 */
CF3,1 = 0.8 − (2/2) × (0.8 − 0.6) = 0.6
CF3,2 = 0.8 − (1/2) × (0.8 − 0.6) = 0.7
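The two certainty factors can be checked with a small sketch of the interpolation formula. The function name is an assumption; the bounds and CS values are those of the example.

```python
def embedded_cf(cs, max_cs, upper_bound, lower_bound):
    """Interpolate linearly between the upper and lower bound according to CS."""
    return upper_bound - (cs / max_cs) * (upper_bound - lower_bound)

cf_3_1 = embedded_cf(cs=2, max_cs=2, upper_bound=0.8, lower_bound=0.6)
cf_3_2 = embedded_cf(cs=1, max_cs=2, upper_bound=0.8, lower_bound=0.6)
print(round(cf_3_1, 2), round(cf_3_2, 2))  # 0.6 0.7
```

A rule with CS = 0 (nothing negated) would receive the upper bound, and the accepted rule with the largest CS receives the lower bound.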

The process of eliciting embedded meanings（隱含知識）：
[Figure: flow from the repertory grid and the original rules, through the Attribute-Ordering Table (eliciting embedded rules → possible embedded rules), the Constraint List (thresholding → accepted embedded rules), and the mapping function (mapping → certainty factors of the embedded rules)]

[Figure: example ACQUISITION TABLE and the corresponding AOT]

Conventional Repertory Grids（知識表格）：
IF (cough=YES)&(tired=YES)&(headache=YES) THEN DISEASE=pneumonia CF=0.8
EMCUD：
IF (cough=YES)&(tired<>YES)&(headache=YES) THEN DISEASE=pneumonia CF=0.67
IF (cough=YES)&(tired=YES)&(headache<>YES) THEN DISEASE=pneumonia CF=0.73
IF (cough=YES)&(tired<>YES)&(headache<>YES) THEN DISEASE=pneumonia CF=0.6

OBJECT CHAIN：A METHOD FOR QUESTION SELECTION
• For a grid with 50 elements (or objects), there are 19600 possible choices of questions to elicit constructs (or attributes).
• Initial repertory grid（知識表格） and the object chains：
OBJECT CHAIN
Obj1 --> 2,3,4,5
Obj2 --> 1,3,4,5
Obj3 --> 1,2,4,5
Obj4 --> 1,2,3,5
Obj5 --> 1,2,3,4
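The figure of 19600 is the number of ways to pick three elements out of 50, since each elicitation question presents a triad of elements. A quick check:

```python
import math

# Number of distinct triads that can be drawn from 50 elements.
triads = math.comb(50, 3)
print(triads)  # 19600
```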

The expert gives attribute P1 to distinguish Obj1 and Obj2 from Obj3
OBJECT CHAIN
Obj1 --> 2,5
Obj2 --> 1,5
Obj3 --> 4
Obj4 --> 3
Obj5 --> 1,2

The expert gives attribute P2 to distinguish Obj2 and Obj5 from Obj1
OBJECT CHAIN
Obj1 --> NULL
Obj2 --> 5
Obj3 --> NULL
Obj4 --> NULL
Obj5 --> 2

The expert gives attribute P3 to distinguish Obj2 from Obj5
OBJECT CHAIN
Obj1 --> NULL
Obj2 --> NULL
Obj3 --> NULL
Obj4 --> NULL
Obj5 --> NULL
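The way the chains shrink can be sketched in a few lines, reading each chain as the set of objects not yet distinguished from its owner. The Boolean ratings for P1, P2, and P3 are hypothetical; the slides do not show the grid, so the values were chosen only to reproduce the chains listed on them.

```python
def refine_chains(chains, ratings):
    """Remove from each chain every object that the new attribute separates from its owner."""
    return {
        obj: [other for other in others if ratings[other] == ratings[obj]]
        for obj, others in chains.items()
    }

objects = [1, 2, 3, 4, 5]
chains = {obj: [o for o in objects if o != obj] for obj in objects}

# Hypothetical ratings, consistent with the chains shown on the slides.
p1 = {1: True, 2: True, 3: False, 4: False, 5: True}
p2 = {1: False, 2: True, 3: True, 4: False, 5: True}
p3 = {1: True, 2: True, 3: True, 4: True, 5: False}

after_p1 = refine_chains(chains, p1)
after_p2 = refine_chains(after_p1, p2)
after_p3 = refine_chains(after_p2, p3)
print(after_p1, after_p2, after_p3, sep="\n")
```

Elicitation stops when every chain is empty, which is the NULL state on the last slide.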

Advantages：
• Fewer questions are asked (log2 n to n-1 questions).
• All of the objects are classified.
• Every question matches the current requirement of classifying objects.
Disadvantages：
• It may force the expert to think in a specific direction.
• Some important attributes may be ignored.

Eliciting hierarchy of grids：
• For the expert system（專家系統） of classifying families of plants
Goal is FAMILY (科)

Since class is not acquirable, it becomes the goal of a new grid.
Goal is CLASS

Since type is not acquirable, it becomes the goal of a new grid.
Goal is TYPE

Decision tree of the hierarchy of grids：
[Figure: decision tree with nodes FAMILY OF PLANT, LEAF SHAPE, NEEDLE PATTERN, CLASS, TYPE, FLAT, STEM POSITION, ONE TRUNK]

3.5 An Application and Performance Evaluation of EMCUD
• Application Domain： Diagnosis of Acute Exanthema
• Hardware： Personal Computer
• Software： Personal Consultant Easy

The codes of diseases and their translations：
1 - Measles
2 - German measles
3 - Chickenpox
4 - Smallpox
5 - Scarlet
6 - Exanthem subitum
7 - Fifth disease
8 - Meningococcemia
9 - Rocky Mt. Spotted fever
10 - Typhus fevers
11 - Infectious mononucleosis
12 - Enterovirus infections
13 - Drug eruptions
14 - Eczema herpeticum
Table 3.3：Testing results of the old and new prototypes.

3.6 Knowledge integration（知識整合） from multiple experts
• To build a reliable expert system, the cooperation of several experts is usually required.
• Difficulties：
  • Synonyms of elements (possible solutions)
  • Synonyms of traits (attributes to classify the solutions)
  • Conflicts of ratings

Each expert has his or her own way of doing things.
[Figure: Habitual domain of Expert 1 and Habitual domain of Expert 2, combined into Integrated Knowledge]
Integrated knowledge uses more attributes to make choices from more possible decisions.

[Figure: Knowledge Engineer and Expert 1, Expert 2, …, Expert N, who are labelled “Busy” or “Far away”]
It is difficult to have all of the experts work together.

Phase 1 interview： Expert 1, Expert 2, …, Expert N → Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N
The unions of element sets and construct sets → Common Repertory Grid
Phase 2 interview： Expert 1, Expert 2, …, Expert N eliminate some redundant vocabularies → Common Repertory Grid
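The union step between Phase 1 and Phase 2 can be illustrated with a minimal sketch. The grid layout (a dictionary with `elements` and `constructs` lists), the function name, and the sample entries are all hypothetical. Eliminating redundant vocabulary is left to the experts in Phase 2 and is not modelled here.

```python
def common_grid(grids):
    """Union the element sets and construct sets of several grids, keeping first-seen order."""
    elements, constructs = [], []
    for grid in grids:
        for element in grid["elements"]:
            if element not in elements:
                elements.append(element)
        for construct in grid["constructs"]:
            if construct not in constructs:
                constructs.append(construct)
    return {"elements": elements, "constructs": constructs}

# Hypothetical Phase 1 results from two experts.
grid_1 = {"elements": ["Measles", "German measles"], "constructs": ["high fever", "headache"]}
grid_2 = {"elements": ["Measles", "Chickenpox"], "constructs": ["headache", "red"]}

merged = common_grid([grid_1, grid_2])
print(merged)
```

Each expert would then rate the same common grid in Phase 3, which is what makes the ratings comparable.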

Phase 3 interview： Expert 1, Expert 2, …, Expert N → Rated Common Repertory Grid 1, Rated Common Repertory Grid 2, …, Rated Common Repertory Grid N
Knowledge Integration → Integrated Repertory Grid → Rule Generation

Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N
The unions of element sets and construct sets → Common Repertory Grid
Phase 2 interview： Expert 1, Expert 2, …, Expert N eliminate some redundant vocabularies → Common Repertory Grid
Phase 3 interview： Expert 1, Expert 2, …, Expert N

Rated Common Repertory Grid 1, Rated Common Repertory Grid 2, …, Rated Common Repertory Grid N
Knowledge Integration → Integrated Repertory Grid → Flat Repertory Grid
Generate AOT → AOT → Filled AOT 1, Filled AOT 2, …, Filled AOT N
Integration of AOTs → Integrated AOT → Rule Generation
