ML: Classical methods from AI Decision-Tree induction Exemplar-based Learning Rule Induction TBEDL


Presentation Transcript


  1. ML: Classical methods from AI • Decision-Tree induction • Exemplar-based Learning • Rule Induction • TBEDL

  2. Rule Induction • We will follow (again) the ACL’99 tutorial “Symbolic Machine Learning for NLP” (Mooney & Cardie 99) • Sequential Covering • Greedy Covering • Strategies for Learning a Single Rule: • Top-Down vs. Bottom-Up
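Sequential (greedy) covering with a top-down single-rule learner can be sketched as follows. This is a minimal propositional illustration, not code from the tutorial: the attribute names, the toy data, and the choice of precision as the selection heuristic are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, is_positive)
Rule = Dict[str, str]                  # conjunction of attribute == value tests


def covers(rule: Rule, x: Dict[str, str]) -> bool:
    """A rule covers an example when every one of its tests holds."""
    return all(x.get(attr) == val for attr, val in rule.items())


def learn_one_rule(examples: List[Example]) -> Rule:
    """Top-down search: start from the empty (most general) rule and greedily
    add the test with the highest precision until no negative is covered."""
    rule: Rule = {}
    covered = list(examples)
    while any(not is_pos for _, is_pos in covered):
        # Candidate tests come from the positives still covered, so every
        # candidate keeps at least one positive example.
        candidates = sorted({(a, v) for x, is_pos in covered if is_pos
                             for a, v in x.items() if a not in rule})
        best, best_key = None, (-1.0, -1)
        for attr, val in candidates:
            subset = [y for x, y in covered if x.get(attr) == val]
            key = (sum(subset) / len(subset), sum(subset))  # (precision, positives)
            if key > best_key:
                best, best_key = (attr, val), key
        if best is None:  # no attribute left to specialise on
            break
        rule[best[0]] = best[1]
        covered = [(x, y) for x, y in covered if covers(rule, x)]
    return rule


def sequential_covering(examples: List[Example]) -> List[Rule]:
    """Learn one rule, remove the positives it covers, repeat."""
    rules: List[Rule] = []
    remaining = list(examples)
    while any(is_pos for _, is_pos in remaining):
        rule = learn_one_rule(remaining)
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining if not (y and covers(rule, x))]
    return rules


# Hypothetical toy data, for illustration only.
TOY_EXAMPLES: List[Example] = [
    ({"sky": "sunny", "wind": "weak"}, True),
    ({"sky": "sunny", "wind": "strong"}, True),
    ({"sky": "rainy", "wind": "weak"}, True),
    ({"sky": "rainy", "wind": "strong"}, False),
    ({"sky": "cloudy", "wind": "strong"}, False),
]

print(sequential_covering(TOY_EXAMPLES))
# → [{'sky': 'sunny'}, {'wind': 'weak'}]
```

A bottom-up learner would instead start from a single positive example (the most specific rule) and drop tests while no negative becomes covered; the outer covering loop stays the same.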

  3. Rule Induction • Propositional FOIL • Relational Learning and Inductive Logic Programming (ILP) • FOIL • Applications: • Text Categorization • Information Extraction
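FOIL chooses which literal to add to a rule with an information-gain heuristic. In the propositional case, if the rule covers p0 positives and n0 negatives before the literal is added and p1 positives and n1 negatives afterwards, the gain is p1 · (log2(p1/(p1+n1)) − log2(p0/(p0+n0))). The sketch below computes that quantity; the function name and the convention of returning 0 when no positive survives are assumptions of this example.

```python
import math


def foil_gain(p0: int, n0: int, p1: int, n1: int) -> float:
    """FOIL gain of specialising a rule with one more literal.

    p0, n0: positives / negatives covered before adding the literal.
    p1, n1: positives / negatives covered after adding the literal.
    """
    if p1 == 0:
        # Assumed convention: a literal that keeps no positives is worthless.
        return 0.0
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)


# A literal that keeps 2 of 3 positives and removes both negatives.
print(round(foil_gain(p0=3, n0=2, p1=2, n1=0), 3))
```

The factor p1 rewards literals that keep many positives covered, so a test that is perfectly precise but covers a single example scores lower than one that is equally precise and more general.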

  4. Rule Induction and NLP • Text Categorization (Cohen 95,96; Craven et al. 98; Slattery & Craven 98) • Semantic Parsing (Zelle & Mooney 93,94,96) • Information Extraction (Soderland 95,96,99; Freitag 98a,98b,98c; Califf & Mooney 97,99; Turmo & Rodríguez 01) • Generation (Radev 98)

  5. Information Extraction (Turmo & Rodríguez, 01)

  6. Information Extraction (Turmo & Rodríguez, 01) • Example sentence: “Vira a marrón oscuro al corte” (“It turns dark brown when cut”)

  7. Information Extraction (Turmo & Rodríguez, 01)

  8. Information Extraction (Turmo & Rodríguez, 01) • Basic concepts • Colour: <n3, n4> • Derived concepts • Color_state: <n5, n9>

  9. Information Extraction (Turmo & Rodríguez, 01) • Overall results? • isa_color(A, A) :- pos_s_adj(A), has_hypernym_03464977n(A), ancestor(A, C), pos_s_adj(C). • isa_color(A, A) :- has_hypernym_03460270n(A), brother(C, A), pos_nc(C), has_hypernym_00009919n(C). • … • Using FOIL (First Order Inductive Learner, Quinlan, 1990) as the basic learner • 38 rules were learned by FOIL for color; only 1 was ill-formed

  10. Information Extraction (Turmo & Rodríguez, 01) • Drawbacks of the learning process • Insufficient number of positive examples • Active Learning • Artificial examples • Relevance of negative examples • Use of empirical observations • Freitag’s baseline • Use of a distance measure between examples • Use of clustering techniques

  11. Internet IE: Information Extraction • The Web→KB Project • CMU Text Learning Group (Tom Mitchell, Andrew McCallum, Mark Craven, etc.) • Situation: more than 350 million Web pages are available from a personal workstation; however, none of them is understandable to your computer • Goal: To automatically create a computer-understandable knowledge base whose content mirrors that of the WWW • Utility: Allowing much more effective information retrieval and supporting knowledge-based inference and problem solving on the World Wide Web • How: Using machine learning to create information extraction methods for each of the desired types of knowledge

  12. Internet IE: WebKB architecture • Entities • Person: department_of, projects_of, name_of, ... • Student: advisors_of, courses_TAed_by • Faculty: projects_led_by, students_of

  13. Internet IE: WebKB architecture • Web Pages • “Fundamentals of CS Home Page”: Instructors: Jim, Tom • “Jim’s Home Page”: I teach several courses: Fundamentals of CS, Intro to AI. My research includes: Intelligent web agents, Human computer interaction

  14. Internet IE: WebKB architecture • KB Instances • Fundamentals-of-CS: instructors_of: jim, tom; home_page: • Jim: courses_taught_by: fundamentals-of-CS, intro-to-AI; home_page:

  15. Internet IE: WebKB architecture • INPUT: Web pages, Ontology • TRAINING: Learning algorithm, Learning algorithm, Learning algorithm, ... • RESULT: Classification rules, Relation extraction rules, Extraction rules, ... • TEST: WWW, WebKB

  16. Internet IE: Learning Tasks • Recognizing class instances by classifying bodies of text • Recognizing relation instances by classifying chains of hyperlinks • Recognizing class and relation instances by extracting small fields of text from Web pages

  17. Internet IE: Learning Tasks • Recognizing class instances by classifying bodies of text • Bayesian text categorization • Several text representations • Exploiting hyperlink relations • Relational text categorization • Clustering of documents • Exploiting combination of several classifiers
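As an illustration of Bayesian text categorization over a bag-of-words representation, here is a minimal multinomial naive Bayes sketch with add-one smoothing. It is not the WebKB implementation; the class labels, the toy documents, and the function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Model = Tuple[Counter, Dict[str, Counter], Set[str]]


def train_nb(docs: List[Tuple[List[str], str]]) -> Model:
    """Count documents per class and word occurrences per class."""
    class_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocab: Set[str] = set()
    for words, label in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify_nb(model: Model, words: List[str]) -> str:
    """Return the class maximising log P(class) + sum of log P(word | class)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = "", -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][w] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy pages, each reduced to a list of words.
TOY_PAGES = [
    (["course", "homework", "syllabus"], "course"),
    (["course", "lecture", "exam"], "course"),
    (["research", "publications", "advisor"], "student"),
    (["advisor", "thesis", "research"], "student"),
]

model = train_nb(TOY_PAGES)
print(classify_nb(model, ["lecture", "homework"]))  # → course
print(classify_nb(model, ["thesis", "advisor"]))    # → student
```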

  18. Internet IE: Learning Tasks • Recognizing relation instances by classifying chains of hyperlinks • Discovering hyperlink paths of unknown and variable size • First-order representation • Induction of relational rules (FOIL) • course(A) ∧ person(B) ∧ link_to(B,A) ⇒ instructor_of(A,B) • research_project(A) ∧ person(C) ∧ link_to(L1,A,B) ∧ link_to(L2,B,C) ∧ neighbour_word_people(L1) ⇒ member_proj(A,C)
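To make the first rule on this slide concrete (a course A, a person B, and a hyperlink from B's page to A's page imply instructor_of(A,B)), the sketch below applies it to a small fact base. The facts are hypothetical and only loosely modelled on the example pages of slide 13; this shows how a learned rule is applied, not how FOIL induces it.

```python
from typing import Set, Tuple

# Hypothetical ground facts, for illustration only.
COURSES: Set[str] = {"fundamentals-of-cs", "intro-to-ai"}
PERSONS: Set[str] = {"jim", "tom"}
# link_to(X, Y): the page of X contains a hyperlink to the page of Y.
LINK_TO: Set[Tuple[str, str]] = {
    ("jim", "fundamentals-of-cs"),
    ("jim", "intro-to-ai"),
    ("fundamentals-of-cs", "jim"),
    ("fundamentals-of-cs", "tom"),
}


def instructor_of(courses: Set[str], persons: Set[str],
                  link_to: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Derive every (A, B) with course(A), person(B) and link_to(B, A)."""
    return {(a, b) for (b, a) in link_to if a in courses and b in persons}


print(sorted(instructor_of(COURSES, PERSONS, LINK_TO)))
# → [('fundamentals-of-cs', 'jim'), ('intro-to-ai', 'jim')]
```

In this toy fact base Tom is not derived as an instructor, because the rule requires a link from the person's page to the course page and only the reverse link exists.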

  19. Internet IE: Learning Tasks • Recognizing class and relation instances by extracting small fields of text from Web pages • Sequence Rules with Validation (Freitag, 98; 99): • FOIL-based general-purpose relational learner for IE • Rules for extracting names of home page owners: • length(F,<,3) ∧ in_title(A) ∧ prev_word(A,"GMT") ∧ unknown(A) ∧ not(length(A,=,4)) ∧ follow_word(A,B) ∧ length(B,>,4) ⇒ ownername(F) • 77.4% accuracy!

  20. Internet IE: Evaluation • Training corpora (hand-labelled according to the prescribed ontology): • 8,000 Web pages • 1,400 Web-page pairs • From the computer science department Web sites at four universities: Cornell, University of Texas at Austin, University of Washington, and University of Wisconsin • Experimental test on the Web site of the computer science department at Carnegie Mellon University

  21. Internet IE: Evaluation

  22. Internet IE: Evaluation • Class instances • Relation instances

  23. Rule Induction: Summary • Connection to Dan Roth’s work at the Cognitive Computation Group (Univ. of Illinois at Urbana-Champaign)
