Engineering a Knowledge Base for an Intelligent Personal Assistant
Vinay K. Chaudhri, Adam Cheyer, Richard Guili, Bill Jarrold, Karen L. Myers, John Niekrasz
Outline
• Problem
• KB Development
• Knowledge Engineering Challenges
• Deploying the Knowledge Base
• Future Work
• Summary and Conclusions
Problem
• Cognitive Assistant that Learns and Organizes (CALO)
  • Learn from experience
  • Be told what to do
  • Explain what it is doing
  • Reflect on experience
  • Respond robustly to surprises
• Situated in an office environment
CALO Functions
• Schedule & Organize in Time
• Monitor & Manage Tasks
• Organize & Manage Information
• Prepare Information Products
• Observe & Mediate Interactions
• Acquire, Allocate Resources
An ontology is needed so that these subcomponents can share data and knowledge.
Learning in Context
Provides greater value, needs fewer examples
[Figure: an email thread is fed to a learning algorithm that answers questions such as "Important?", "Meeting?", …? Labels in the figure: Manager of, Works on, Leader of, Relevant to, Meeting for; Projects, Files, Meetings.]
• To: Bob@sri.com. Subj: fMRI meeting. "We need to meet soon to discuss the paper deadline."
• To: Sue@sri.com. Subj: Re: fMRI meeting. "Ok, I suggest Wednesday at 4pm."
• To: Bob@sri.com. Subj: Re: fMRI meeting. "See you then. Attached is the current draft."
Example Functionality
• CALO will automatically put together a portfolio of information (e.g., mail, files, web pages) relevant to your projects and to upcoming meetings.
• CALO will summarize, prioritize, and classify an email.
• CALO will identify the action items and produce an annotated meeting record.
Test Questions: PQs and IQs (Parameterized Questions & Instantiated Questions)
• What |sc:%Meeting| is being discussed or suggested in |io:%EmailMessage|?
• What is the duration suggested for the meeting discussed in |io:%EmailMessage|?
• What is the time suggested for the meeting discussed in |io:%EmailMessage|?
• What date is mentioned in |io:%Email|?
• What location is mentioned in |io:%Email|?
• What time is mentioned in |io:%Email|?
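A minimal sketch of how a parameterized question might be turned into an instantiated question, assuming the |prefix:%Class| notation marks a slot to be filled with an instance of the named class. The slides do not say what the sc and io prefixes mean, so the sketch ignores them; the function name instantiate and the instance id email-0042 are made up for illustration.

```python
import re

# Placeholder syntax as it appears in the test questions: |prefix:%ClassName|
PLACEHOLDER = re.compile(r"\|(\w+):%(\w+)\|")


def instantiate(pq: str, bindings: dict[str, str]) -> str:
    """Replace each |prefix:%Class| placeholder with the instance bound to Class."""
    def substitute(match: re.Match) -> str:
        class_name = match.group(2)  # the prefix in group(1) is ignored here
        if class_name not in bindings:
            raise KeyError(f"no binding for class {class_name}")
        return bindings[class_name]

    return PLACEHOLDER.sub(substitute, pq)


pq = "What time is mentioned in |io:%Email|?"
iq = instantiate(pq, {"Email": "email-0042"})  # hypothetical instance id
print(iq)
```

A question with an unbound placeholder raises KeyError instead of silently leaving the slot in place, which keeps half-instantiated questions out of a test set.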
KB Development
• Knowledge Representation Framework
• Development Process
• Overview of Knowledge Content
Knowledge Representation Framework
The “core” ontology
• The Component Library (CLIB)
  • Barker, Porter, Clark, K-CAP 2001
  • CLIB is written in KM (Knowledge Machine)
  • Reusable, composable, domain-independent library
  • Richly axiomatized event classes (e.g., Move, Attach)
Knowledge Representation Framework
• We used OWL to share the knowledge with modules that needed to load the ontology
• We developed a KM-to-OWL translator
• We limited the translation to the subset of KM that can be expressed in OWL
Knowledge Representation Framework
• SPARK procedure language for representing knowledge about performing automated tasks
  • The expressiveness of SPARK was essential for representing the complex process structures that office tasks require
• We represent uncertain knowledge using weighted rules
  • Weights are necessary to capture the output from learning methods
  • We are still able to expose a deterministic interface to the rest of the system
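One way to picture a deterministic interface over weighted knowledge is sketched here. The slides do not say how CALO combines weights, so summing the weights per candidate answer and returning the heaviest one is an assumption, and the query strings and weights are made up.

```python
from collections import defaultdict

# Hypothetical weighted conclusions, e.g. produced by learning components.
# Each entry is (query, candidate answer, weight).
weighted_conclusions = [
    ("is_meeting_request(email1)", True, 0.8),
    ("is_meeting_request(email1)", False, 0.3),
    ("is_important(email1)", True, 0.4),
    ("is_important(email1)", False, 0.6),
]


def answer(query, conclusions):
    """Return the single candidate with the highest total weight, or None."""
    totals = defaultdict(float)
    for q, candidate, weight in conclusions:
        if q == query:
            totals[candidate] += weight  # assumption: weights simply add up
    if not totals:
        return None  # nothing is known about this query
    return max(totals, key=totals.get)


print(answer("is_meeting_request(email1)", weighted_conclusions))
print(answer("is_important(email1)", weighted_conclusions))
```

Callers see one answer per query and never handle the weights, which is the sense in which the interface is deterministic in this sketch.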
Development Process
• Distributed team with over 20 different research groups
• We solicited requirements
  • List of classes and relations
  • Formal axioms
• Large-scale reuse of ontologies
  • iCalendar
  • Work of Radarnetworks
• Ontology simplification
  • Eliminate unneeded constructs
  • Simplify representation
• Distributed development
  • Use Protégé for knowledge authoring
Overview of Knowledge Content: The “Office Ontology”
• People: first name, last name
• Contacts: postal address, home address, work address
• Emails: sender, receiver, etc.
• Calendars: start, end, repetition
• Projects/Tasks
• Meetings: meeting types, discussion topics, meeting roles
• Organizations: organizational roles
• Learning Methods: capability of learning methods, data needed
• Provenance: source of a piece of information
Overview of Knowledge Content: Example Class
ChatSessionMessage
Comment: "Instances of #$ChatSessionMessage are complete messages passed between chat participants during a #$ChatSession. For example, if Bob and Fred are involved in CALO Online Chat Bob might send the chat message 'Hi Fred' to Fred. Such a message is a #$ChatSessionMessage. More specifically, it is a #$ChatTextMessage. Please see the subclasses of #$ChatSessionMessage because developers are more likely to be referencing its subclasses. A negative example of #$ChatSessionMessage would be a portion of the message sent from Bob to Fred such as 'Hi Fr'."
Superclasses: ElectronicMessage, ComputerEncodedInformation
Overview of Knowledge Content
• Process model system (PTIME) has approx. 50 process models
• In Core plus Office Ontology
  • Approx. 1000 classes
  • Approx. 500 relations
Knowledge Engineering Challenges
• Reusing iCalendar
• Representing Meetings
• Representing Tasks
• Ensuring Interoperability
Reusing iCalendar
• Prune the relations
  • Not all of the relations were needed
  • We did not want to bloat the ontology
  • We retained only what was needed
• Define symbol name mappings
  • We renamed the relations to fit our standard naming convention
  • But we retained the mappings to the original names
• Link to the rest of the ontology
  • We needed to define iCalendar relations using the existing vocabulary of People and Time
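The pruning and renaming steps can be sketched as a single mapping table. The left-hand names are real iCalendar properties (RFC 5545); the right-hand relation names are hypothetical, since the slides do not list the names used in the ontology. Linking values to the People and Time vocabulary is not covered by this sketch.

```python
# iCalendar property name -> ontology relation name (relation names are made up).
# Properties absent from this table are pruned on import.
ICAL_TO_ONTOLOGY = {
    "DTSTART": "startTime",
    "DTEND": "endTime",
    "RRULE": "repetitionRule",
    "ATTENDEE": "hasAttendee",
    "LOCATION": "hasLocation",
}

# Retain the mapping back to the original iCalendar names.
ONTOLOGY_TO_ICAL = {v: k for k, v in ICAL_TO_ONTOLOGY.items()}


def import_event(ical_props: dict) -> dict:
    """Keep only the mapped properties and rename them to ontology relations."""
    return {
        ICAL_TO_ONTOLOGY[name]: value
        for name, value in ical_props.items()
        if name in ICAL_TO_ONTOLOGY
    }


def export_event(event: dict) -> dict:
    """Translate ontology relation names back to iCalendar property names."""
    return {
        ONTOLOGY_TO_ICAL[name]: value
        for name, value in event.items()
        if name in ONTOLOGY_TO_ICAL
    }


# Made-up event; GEO is a valid iCalendar property that this table prunes.
raw = {
    "DTSTART": "20060405T160000",
    "GEO": "37.45;-122.18",
    "ATTENDEE": "mailto:bob@sri.com",
}
event = import_event(raw)
print(event)
print(export_event(event))
```

Keeping the inverse table next to the forward one means a renamed relation can always be traced back to its iCalendar source.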
Representing Meetings
• Communication Model
• Modeling multi-modal communication
• Modeling Discourse Structure
• Modeling Meeting Activity
Representing Meetings: Modeling Discourse Structure
• Modeling Dialog Structures
  • Define Communicate subclasses such as Statement, Question, BackChannel, etc.
• Modeling Argument Structure
  • Define coarser-level actions such as Raising an Issue, Proposal, Acceptance, etc.
Representing Meetings: Modeling the Meeting Activity
• Provide ways to segment a meeting
• Physical state of participants
  • Sitting, standing, etc.
• Agenda state of participants
  • Position within a previously defined meeting structure
Representing Tasks
• Tasks are modeled in terms of
  • a task type
  • a set of input and output parameters
  • whether a parameter is required or optional
  • allowed constraints
• Task instances are used throughout the system
  • Descriptive properties
    • Priority, documentation, source, location, resource allocation and usage
  • Temporal properties
    • Creation time, start time (adapted from iCalendar)
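The task model above maps naturally onto a few record types. This is only an illustrative sketch: the class and field names are assumptions rather than CALO's actual schema, only some of the listed properties are shown, and constraints are omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Parameter:
    name: str
    direction: str         # "input" or "output"
    required: bool = True  # required vs. optional


@dataclass
class TaskType:
    name: str
    parameters: list[Parameter] = field(default_factory=list)


@dataclass
class TaskInstance:
    task_type: TaskType
    arguments: dict[str, Any]
    # Descriptive properties (a subset of those listed above)
    priority: Optional[int] = None
    source: Optional[str] = None
    # Temporal properties
    creation_time: Optional[str] = None
    start_time: Optional[str] = None

    def missing_required_inputs(self) -> list[str]:
        """Names of required input parameters that have no argument yet."""
        return [
            p.name
            for p in self.task_type.parameters
            if p.direction == "input" and p.required and p.name not in self.arguments
        ]


# Hypothetical task type with one required input, one optional input, one output.
schedule_meeting = TaskType(
    "ScheduleMeeting",
    [
        Parameter("participants", "input"),
        Parameter("location", "input", required=False),
        Parameter("meeting", "output"),
    ],
)
task = TaskInstance(schedule_meeting, {"location": "room-5"}, priority=1)
print(task.missing_required_inputs())
```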
Ensuring Interoperability
• OWL does not allow n-ary relationships
  • Task representation requires representing the position of an argument in a list
  • Needed to reify each argument so that the position could be specified
• OWL does not allow specialization of primitive data types
  • Special kinds of strings such as Postal Code and Telephone Number are of interest
  • Needed to define a hierarchy of “pseudo ranges”
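The reification idea can be shown with plain triples. Each argument becomes its own node that carries a position and a value, so only binary relations are needed. The relation names hasArgument, position, and value are illustrative, not the ones used in CALO, and the sketch uses Python tuples rather than an OWL toolkit.

```python
from itertools import count

_node_ids = count(1)


def reify_arguments(task, args):
    """Turn task(arg0, arg1, ...) into binary triples that preserve argument order."""
    triples = []
    for position, value in enumerate(args):
        node = f"arg{next(_node_ids)}"  # fresh individual standing for one argument
        triples.append((task, "hasArgument", node))
        triples.append((node, "position", position))
        triples.append((node, "value", value))
    return triples


def arguments_in_order(task, triples):
    """Recover the ordered argument list from an unordered set of triples."""
    nodes = [o for s, p, o in triples if s == task and p == "hasArgument"]
    position = {s: o for s, p, o in triples if p == "position"}
    value = {s: o for s, p, o in triples if p == "value"}
    return [value[n] for n in sorted(nodes, key=position.get)]


triples = reify_arguments("task1", ["bob", "room-5", "4pm"])
# Triple stores are unordered, so recovery must not depend on insertion order.
print(arguments_in_order("task1", list(reversed(triples))))
```

The cost is three triples per argument instead of one n-ary fact, which is the trade-off that reification makes in exchange for staying within binary relations.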
Deploying the Knowledge Base
• Querying the knowledge base
  • Uniform point of access provided by a query manager
• Updating the knowledge base
  • Methods to update the instance data if the ontology changes
  • Mechanism to propagate additions to the ontology at runtime
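A uniform point of access can be pictured as one object that forwards each query to every registered knowledge source. The slides give no detail on the actual query manager, so the class, its methods, and the two dictionary-backed sources are all hypothetical.

```python
class QueryManager:
    """Single entry point that forwards a query to every registered source."""

    def __init__(self):
        self._sources = {}

    def register(self, name, lookup):
        # lookup: callable taking (subject, relation) and returning a list of values
        self._sources[name] = lookup

    def ask(self, subject, relation):
        """Collect (source name, value) pairs from all sources, in registration order."""
        results = []
        for name, lookup in self._sources.items():
            for value in lookup(subject, relation):
                results.append((name, value))
        return results


# Two made-up sources backed by dictionaries.
calendar_facts = {("meeting1", "startTime"): ["Wednesday 4pm"]}
email_facts = {("meeting1", "discussedIn"): ["email-0042"]}

qm = QueryManager()
qm.register("calendar", lambda s, r: calendar_facts.get((s, r), []))
qm.register("email", lambda s, r: email_facts.get((s, r), []))

print(qm.ask("meeting1", "startTime"))
```

Callers depend only on ask, so a source can be added or replaced without changing any client code.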
Future Work
• Align different OWL files
• Ontology to help “Stacked Learning”
  • A software engineer can solve a new learning problem by writing its specification in the ontology
  • An end user can specify a goal, and CALO can compute how to learn to meet that goal
  • CALO can infer a user’s goal and learn how to achieve that goal