Progress on Building the Component Library
Bruce Porter, Peter Clark
Ken Barker, Art Souther, John Thompson
James Fan, Dan Tecuci, Peter Yeh
Charles Benton, Marwan Elrakabawy, Cheyenne Kohnlein
November 1, 2000
The Purpose of the Component Library
• To represent the set of common actions, states, objects, and properties so that SMEs can build KBs simply by instantiating and assembling them.
• Representing actions has been our primary focus for four months.
• Most team members have used a few prototype components to build relatively simple scenarios. Now we’re trying to build a more comprehensive set of components properly.
Refresher… Slides from kickoff meeting in New Orleans
Representation of Bioremediation
[Figure: concept graph for Bioremediation. Nodes: Soil, Bio-technologist, Microbes, Oil, Fertilizer, Rate, Amount, and a Script of subevents Get, Apply, Break Down, Absorb linked by "then". Slot labels: contains, environment, agent, remediator, pollutant, product, patient, absorbed, rate, amount, script. Qualitative influences: I-, Q+, Q-.]
An underlying abstraction...
[Figure: the Bioremediation graph shown with a Conversion abstraction: Rate, Amount, raw-materials, product, Substance, with influences I-, Q+, Q-.]
Another abstraction...
[Figure: the Bioremediation graph shown with a Digest abstraction: food, eater, Agent, Substance, and a Script of Break Down then Absorb.]
Another abstraction...
[Figure: the Bioremediation graph shown with a Treatment abstraction: Agent, substance, patient, and a Script of Get then Apply.]
The Space of Actions
• Based on various linguistic resources and an analysis of two texts by Alberts, we’re working toward this set of about 190 action components.
• We’ve built components for about half of them, as shown here.
• Our coding rate has increased significantly, and we can now productively add more personnel.
Schedule
• Through the end of 2000:
  • Focus on action components, completing about 90% of those currently planned.
  • Start coding pump-priming knowledge, building basic representations of about 200 objects and events.
• January through March 2001:
  • Focus on exercising the component library by encoding significant portions of Alberts. This work doubles as essential pump-priming.
  • Begin to represent generic objects, especially “role concepts” (more on this later).
  • Integrate the component library with core knowledge developed by other team members (more on this later).
What’s in a Component?
• The specification gives the definition, slot constraints, and links to standard linguistic sources. Here’s an example.
• The KM code gives the axioms and an explicit interface to the user. Here’s an example. Note that the code includes only local axioms; KM infers the rest. Here’s the complete expansion.
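The split between local axioms and the inferred expansion can be illustrated with a minimal Python sketch. This is not KM code: the `Component` class, its `expand` method, and the Move/Transfer slot values are hypothetical, and plain inheritance stands in for whatever inference KM actually performs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Component:
    """Hypothetical component: a name, an optional parent, and local slot constraints."""
    name: str
    parent: Optional["Component"] = None
    local_slots: Dict[str, str] = field(default_factory=dict)

    def expand(self) -> Dict[str, str]:
        """Return the local constraints merged over everything inherited from ancestors."""
        inherited = self.parent.expand() if self.parent else {}
        # A local constraint overrides an inherited one with the same slot name.
        return {**inherited, **self.local_slots}


# Illustrative values only (assumptions); they are not taken from the actual library.
move = Component("Move", local_slots={"object": "Tangible-Entity", "destination": "Place"})
transfer = Component("Transfer", parent=move, local_slots={"recipient": "Agent"})

print(transfer.local_slots)  # only what was written locally for Transfer
print(transfer.expand())     # local constraints plus inherited ones
```

The point of the sketch is that the author of `Transfer` writes one constraint, while a reader inspecting the expansion sees three.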
Our Process for Building a Component
• Form initial clusters of actions (e.g. transfer) based on an analysis of Alberts, Roget’s clusters, Cyc, and other linguistic sources.
• Write a specification for each action.
• Search Alberts for all occurrences (including all morphological variants) of each action, and make sure that the representation will accommodate them. Here’s the result of analyzing the actions in one chapter. These “coded examples” will be useful for training SMEs.
• Organize the actions taxonomically and pull out commonalities that can be handled with various types of composition.*
• Code the actions in KM along with simple test cases, commit them to the CVS-managed library, and run all test cases daily. Larger scenarios will provide the next level: integration testing.*
* These points will be elaborated below.
How to access the Component Library
• Click here to visit the component library.
• It’s updated every day unless some test case fails.
• We’ll add a feature to download the entire library via FTP.
The Dictionary of Slots
• We want a simple, small, and slowly growing set of slots. Ours currently has 78 slots (53 relations and 25 properties) and is inspired by well-studied sets of semantic roles from linguistics (surveyed in Ken’s dissertation).
• Slots should apply intuitively to knowledge expressed informally. We have early evidence based on 3 large experiments.
• The semantics of the slots must be axiomatized. Here are some examples.
• Slots must make the distinctions necessary for inferencing (at least to the fidelity of the KR language).
• The slot language must continue to evolve.
Non-taxonomic composition: Clichés
• A cliché is a small pattern of axioms that recurs throughout the hierarchy. For example:
  • Reflexive: required slots: agent, object; agent = object
  • Reciprocal: required slots: agent, object; the agent is the object of an instance of this action that has this object as its agent
  • Undo(A): precondition: the object is the object of the resulting-state of action A; postcondition: the object is no longer the object of the resulting-state of action A
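The Reflexive and Reciprocal clichés above can be read as small checks over action instances. The following Python sketch is one possible reading, not the library's KM axioms; the `ActionInstance` record, the function names, and the Greet/Wash examples are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ActionInstance:
    """Hypothetical record of one occurrence of an action."""
    action: str
    agent: str
    obj: str  # the filler of the 'object' slot


def is_reflexive(a: ActionInstance) -> bool:
    """Reflexive cliché: the agent and the object are the same individual."""
    return a.agent == a.obj


def is_reciprocal(a: ActionInstance, others: Iterable[ActionInstance]) -> bool:
    """Reciprocal cliché: some instance of the same action has a's object as its
    agent and a's agent as its object."""
    return any(
        b.action == a.action and b.agent == a.obj and b.obj == a.agent
        for b in others
    )


# Made-up instances, for illustration only.
events = [
    ActionInstance("Greet", "Ann", "Bob"),
    ActionInstance("Greet", "Bob", "Ann"),
    ActionInstance("Wash", "Ann", "Ann"),
]

print(is_reflexive(events[2]))           # True
print(is_reciprocal(events[0], events))  # True
```

Because each cliché is defined only in terms of slots, the same check applies to any action anywhere in the hierarchy, which is what makes the pattern non-taxonomic.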
Non-taxonomic composition: Utility Concepts
• Concepts that have natural homes within the hierarchy, but also form a part of the semantics of concepts across the hierarchy
• Copy:
  • reasonable as a standalone concept
  • also part of Transcribe, Forge, Encode, Reproduce, etc.
Non-taxonomic composition: model-as
• Many concepts in the KB are “role concepts”
  • e.g., container, nutrient
  • are generic
  • are highly reusable (can be applied in many concepts)
• The same object appears under many models in text:
  • “If the DNA containing the 5S rRNA genes is …”
  • “many DNA sequences produce two or more distinct proteins”
  • “The DNA guides the synthesis of specific RNA molecules…”
  • “The DNA is enclosed in …”
  • “The idea that DNA transfers information…”
• By separating the “model” (e.g. container) and its application (e.g. to DNA), we can apply and reuse the same model in many ways.
Applying models
• Traditional: “Hard-wire” models to the modeled things
  Cell generalizations: Container, Consumer, …?
• Better: Define machine-selectable “views”
  Cell model-as:
    Container (wall = membrane, ..)
    Consumer (consumes = organic molecules, ..)
    Vehicle (transported = DNA, …)
    ….
• Control when and how components apply
• Allows generic components to be used multiple ways (more reuse); this is difficult in the traditional approach!
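The “views” idea can be sketched in Python as a concept that carries a list of model-as entries, each pairing a generic model with role bindings. The bindings come from the Cell example on this slide; the `View` and `Concept` classes and the `view_as` method are hypothetical and are not the KM encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class View:
    """Hypothetical model-as entry: a generic model plus its role bindings."""
    model: str
    bindings: Dict[str, str]


@dataclass
class Concept:
    name: str
    views: List[View] = field(default_factory=list)

    def view_as(self, model: str) -> Dict[str, str]:
        """Select one view by model name; raise KeyError if the concept has none."""
        for view in self.views:
            if view.model == model:
                return view.bindings
        raise KeyError(f"{self.name} has no view as {model}")


# Bindings copied from the slide's Cell example.
cell = Concept("Cell", [
    View("Container", {"wall": "membrane"}),
    View("Consumer", {"consumes": "organic molecules"}),
    View("Vehicle", {"transported": "DNA"}),
])

print(cell.view_as("Container"))  # {'wall': 'membrane'}
```

Unlike hard-wiring Cell as a subclass of Container, the view is selected explicitly, so the caller controls when the Container model applies.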
How others can contribute to the Component Library
• Because the Library is only 4 months old and we’ve focused on particular types of knowledge, much remains to be done. We have several suggestions for how it might be usefully expanded.
How SMEs might index the Component Library
• SMEs will undoubtedly adjust to our tools somewhat, but they start with English. We should index the Library by English terms.
• Here’s a simple way to do that ... (next slide)
Mapping from Verbs to Actions
• SME: I would like to use transport.
• Shaken: Which of these senses of transport would you like?
  - v. send from one person or place to another (see: Transfer)
  - v. move while supporting … (see: Carry)
  - v. hold spellbound
  - v. transport commercially
  - v. move something or somebody around (see: Move)
  - n. the commercial enterprise of transporting goods and materials
  - n. something that serves as a means of transportation (see: Transport-Device)
  - n. a mechanism to transport magnetic tape over the head …
  - n. an exchange of molecules across a membrane (see: Molecular-Transport)
  - n. a state of being carried away by overwhelming emotion
• We get “for free” the mapping from transport to Transfer, Carry, Move, Transport-Device, and Molecular-Transport by linking our components to synsets in WordNet.
• The red components are currently in the Library; the blue components are planned.
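The lookup in this dialogue can be sketched as a word-to-senses table in which some senses carry a link to a component. The Python below uses a hand-copied table of the senses listed on this slide as a stand-in for real WordNet synsets; the `SENSE_INDEX` structure and the `components_for` function are hypothetical names, not part of Shaken.

```python
from typing import Dict, List, Optional, Tuple

# (part of speech, gloss, linked component or None), copied from the slide.
Sense = Tuple[str, str, Optional[str]]

SENSE_INDEX: Dict[str, List[Sense]] = {
    "transport": [
        ("v", "send from one person or place to another", "Transfer"),
        ("v", "move while supporting …", "Carry"),
        ("v", "hold spellbound", None),
        ("v", "transport commercially", None),
        ("v", "move something or somebody around", "Move"),
        ("n", "the commercial enterprise of transporting goods and materials", None),
        ("n", "something that serves as a means of transportation", "Transport-Device"),
        ("n", "a mechanism to transport magnetic tape over the head …", None),
        ("n", "an exchange of molecules across a membrane", "Molecular-Transport"),
        ("n", "a state of being carried away by overwhelming emotion", None),
    ],
}


def components_for(word: str) -> List[str]:
    """Return the components linked to any sense of the word, in sense order."""
    found: List[str] = []
    for _pos, _gloss, component in SENSE_INDEX.get(word.lower(), []):
        if component is not None and component not in found:
            found.append(component)
    return found


print(components_for("transport"))
# ['Transfer', 'Carry', 'Move', 'Transport-Device', 'Molecular-Transport']
```

Senses with no linked component are still presented to the SME in the dialogue; they simply contribute nothing to the mapping.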
Other Types of Knowledge We’re Encoding
• Properties usually surface as adjectives. We have a framework for representing them, and a plan for populating the KB.
• Pump-priming knowledge. We have proposed a scenario for January 2001 and started to represent knowledge of biological objects. We start with taxonomies and partonomies (like those SMEs build), then convert them automatically to KM.
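As a rough illustration of converting taxonomies and partonomies automatically, the Python sketch below turns is-a and part-of tables into frame-like strings. The output only approximates KM's frame syntax from memory, the `has-part` slot name is an assumption, and the `to_km` function and the cell example data are hypothetical; this is not the team's converter.

```python
from typing import Dict, List


def to_km(taxonomy: Dict[str, str], partonomy: Dict[str, List[str]]) -> List[str]:
    """Emit one frame-like string per is-a link and one per whole with parts.

    taxonomy maps a class to its superclass; partonomy maps a whole to its parts.
    """
    frames: List[str] = []
    for child, parent in taxonomy.items():
        frames.append(f"({child} has (superclasses ({parent})))")
    for whole, parts in partonomy.items():
        part_exprs = " ".join(f"(a {part})" for part in parts)
        # 'has-part' is an assumed slot name, used here for illustration.
        frames.append(f"(every {whole} has (has-part ({part_exprs})))")
    return frames


# Made-up example data, not taken from the project's knowledge base.
taxonomy = {"Eukaryotic-Cell": "Cell"}
partonomy = {"Eukaryotic-Cell": ["Nucleus", "Cell-Membrane"]}

for frame in to_km(taxonomy, partonomy):
    print(frame)
```

Keeping the input as two plain tables mirrors the idea that SMEs supply taxonomies and partonomies while the conversion to the KR language is mechanical.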
Coordinating our efforts on developing Core Knowledge
• The Core Knowledge Workshop in Austin next month
• Proposed agenda:
  • Address representation challenges: continuous processes, modes of existence, time, space, causality, modals and counterfactuals, …
  • Develop a detailed plan for integrating other core theories, such as ‘Everyday Semantics’
  • Design the Core Knowledge for Shaken 1.0
• Schedule:
  • Duration: we suggest 3 days
  • Dates: we suggest mid-December