Introduction to Web Science

Reusing knowledge

Six challenges of the Knowledge Life Cycle • Acquire • Model • Reuse • Retrieve • Publish • Maintain

Reusing knowledge • Three reusable types of objects • Ontologies • Problem Solving Methods • Knowledge Bases • Plus we can also use additional sources (WWW)

Problems with reuse • Locating the knowledge to be reused is difficult • Distributed agents may be unaware that the knowledge they need is available (this is the challenge of knowledge retrieval) • Knowledge may simply be in the wrong form for the task

Two particular reuse tasks • Question answering • Dialogue systems

What is Question Answering? • The main aim of QA is to present the user with a short answer to a question rather than a list of possibly relevant documents. • As it becomes more and more difficult to find answers on the WWW using standard search engines, question answering technology will become increasingly important.

Question Types (1) • Clearly there are many different types of questions: • When was Mozart born? • Question requires a single fact as an answer. • Answer may be found verbatim in text, e.g. “Mozart was born in 1756”. • How did Socrates die? • Finding an answer may require reasoning. • In this example “die” has to be linked with “drinking poisoned wine”.

Question Types (2) • How do I assemble a bike? • The full answer may require fusing information from many different sources. • The complexity can range from simple lists to script-based answers. • Is the Earth flat? • Requires a simple yes/no answer.

Evaluating QA Systems • The biggest independent evaluations of question answering systems have been carried out at TREC (Text Retrieval Conference) • Five hundred factoid questions are provided and the groups taking part have a week in which to process the questions and return one answer per question. • No changes are allowed to your system between the time you receive the questions and the time you submit the answers.

A Generic QA Framework • A search engine is used to find the n most relevant documents in the document collection • These documents are then processed with respect to the question to produce a set of answers which are passed back to the user • Most of the differences between question answering systems are centred around the document processing stage
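The two stages of the framework can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-overlap ranking stands in for a real search engine, and `extract_years` is a hypothetical document processing step that only handles questions whose answer is a year. None of these names come from the slides.

```python
import re
from collections import Counter

# Toy document collection (illustrative data only).
COLLECTION = [
    "Mozart was born in 1756 in Salzburg.",
    "Wolfgang Amadeus Mozart, born 1756, died in 1791.",
    "Salzburg is a city in Austria.",
]

def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(question, collection, n):
    """Stage 1: return the n documents sharing the most words with the question."""
    question_words = set(tokenize(question))

    def overlap(doc):
        return len(question_words & set(tokenize(doc)))

    return sorted(collection, key=overlap, reverse=True)[:n]

def extract_years(question, documents):
    """Stage 2 (hypothetical): treat every four-digit number as a candidate answer."""
    return [year for doc in documents for year in re.findall(r"\b\d{4}\b", doc)]

def answer_question(question, collection, extract=extract_years, n=2):
    """Retrieve n documents, process them, and rank candidates by frequency."""
    documents = retrieve(question, collection, n)
    candidates = extract(question, documents)
    return [answer for answer, _ in Counter(candidates).most_common()]

answers = answer_question("When was Mozart born?", COLLECTION)
```

Swapping in a different `extract` function changes the document processing stage without touching retrieval, which is where the slide says most systems differ.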

A Simplified Approach • The answers to the majority of factoid questions are easily recognised named entities, such as countries, cities, dates, people’s names, etc. • The relatively simple techniques of gazetteer lists and named entity recognisers allow us to locate these entities within the relevant documents – the most frequent of which can be returned as the answer • This leaves just one issue that needs solving – how do we know, for a specific question, what the type of the answer should be?
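A minimal sketch of the gazetteer idea, assuming the expected answer type is already known and that a gazetteer is simply a set of names per type. The `GAZETTEER` contents and the function name are made up for illustration; a real named entity recogniser would do far more than exact string matching.

```python
import re
from collections import Counter

# Hypothetical gazetteer lists: entity type -> known names.
GAZETTEER = {
    "location": {"Salzburg", "Vienna", "Austria"},
    "person": {"Mozart", "Haydn"},
}

def most_frequent_entity(expected_type, documents, gazetteer=GAZETTEER):
    """Return the most frequent gazetteer entry of the expected type, or None."""
    counts = Counter()
    for name in gazetteer.get(expected_type, ()):
        pattern = r"\b" + re.escape(name) + r"\b"
        for doc in documents:
            counts[name] += len(re.findall(pattern, doc))
    best = counts.most_common(1)
    if not best or best[0][1] == 0:
        return None  # unknown type, or no entity of that type was found
    return best[0][0]

DOCS = [
    "Mozart was born in Salzburg.",
    "Salzburg is in Austria.",
    "Mozart later moved to Vienna.",
]
answer = most_frequent_entity("location", DOCS)
```

Here `answer` is "Salzburg" because it is the location mentioned most often, which is exactly the heuristic described above and not a guarantee of correctness.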

A Simplified Approach (1) • The simplest way to determine the expected type of an answer is to look at the words which make up the question: • who – suggests a person • when – suggests a date • where – suggests a location

A Simplified Approach (2) • Clearly this division does not account for every question but it is easy to add more complex rules: • country – suggests a location • how much – suggests an amount of money • author – suggests a person • birthday – suggests a date • college – suggests an organization • These rules can be easily extended as we think of more questions to ask
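The keyword rules from these two slides can be written down as a small lookup table. The sketch below is one possible encoding, not a prescribed one: the decision to try the more specific keywords before the question words is an assumption, as are the names `RULES` and `expected_answer_type`.

```python
import re

# Keyword -> expected answer type, taken from the slides.
# Assumption: specific keywords are checked before the generic question words.
RULES = [
    ("how much", "money"),
    ("country", "location"),
    ("author", "person"),
    ("birthday", "date"),
    ("college", "organization"),
    ("who", "person"),
    ("when", "date"),
    ("where", "location"),
]

def expected_answer_type(question):
    """Return the answer type suggested by the first matching rule, or None."""
    words = re.findall(r"[a-z]+", question.lower())
    padded = " " + " ".join(words) + " "  # pad so only whole words match
    for keyword, answer_type in RULES:
        if f" {keyword} " in padded:
            return answer_type
    return None

question_type = expected_answer_type("When was Mozart born?")
```

Extending the system to new kinds of question means appending another pair to `RULES`. A question such as "How did Socrates die?" matches no rule and yields `None`.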

Problems (1) • The most frequently occurring instance of the right type might not be the correct answer. • For example, if you are asking when someone was born, it may be that their death was more notable and hence will appear more often (e.g. John F Kennedy’s assassination). • There are many questions for which correct answers are not named entities: • How did Ayrton Senna die? – in a car crash

Problems (2) • The gazetteer lists and named entity recognisers are unlikely to cover every type of named entity that may be asked about: • Even those types that are covered may well not be complete. • It is of course relatively easy to build new lists, e.g. Birthstones.

Does a gazetteer of people’s names contain all the names? • Amber • Precious • Diamond • Asia • Summer • Holly • Are these people’s names?

Dialogue (1) • A sequence of utterances • Exchange of information among multiple dialogue participants • Stays coherent over time • Driven by a certain goal • finding the most suitable restaurant in a foreign city, • booking the cheapest flight to a given city, • controlling the state of the devices in a home, • or the goal might also be the interaction itself (chatting)

Dialogue (2) • The most natural means of communication for humans, perceived as very expressive, efficient and robust • However, dialogue is a very complex protocol • it follows certain conventions or protocols that are adopted by the participants • humans usually use their extensive knowledge and reasoning capabilities to understand the conversational partner • the dialogue utterances are often imperfect – ungrammatical or elliptical

Ellipsis • People often utter partial phrases to avoid repetition • A: At what time is “Titanic” playing? • B: 8pm • A: And “The 5th element”? • It is necessary to keep track of the conversation to complete such phrases
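Keeping track of the conversation can be illustrated with a very small context object. This sketch makes strong assumptions: titles are written in straight double quotes, the elliptical follow-up always has the form `And "X"?`, and only the most recent full question is remembered. The class and pattern names are hypothetical.

```python
import re

QUOTED = re.compile(r'"([^"]+)"')
ELLIPTICAL = re.compile(r'^\s*and\s+"([^"]+)"\s*\?\s*$', re.IGNORECASE)

class DialogueContext:
    """Remembers the last full question so that a partial one can be completed."""

    def __init__(self):
        self.frame = None  # (text before the title, text after the title)

    def resolve(self, utterance):
        follow_up = ELLIPTICAL.match(utterance)
        if follow_up:
            if self.frame is None:
                return utterance  # nothing to complete the phrase with
            prefix, suffix = self.frame
            return prefix + follow_up.group(1) + suffix
        full = QUOTED.search(utterance)
        if full:
            # Store the question with a hole where the title was.
            self.frame = (utterance[:full.start(1)], utterance[full.end(1):])
        return utterance

context = DialogueContext()
first = context.resolve('At what time is "Titanic" playing?')
second = context.resolve('And "The 5th element"?')
```

After the two calls, `second` holds the completed question about "The 5th element", which can then be handed to the rest of the system like any other full question.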

Deixis • Some words can only be interpreted in context: • Previous context (anaphora) • “The monkey took the banana and ate it” • Future context (cataphora) • “Give me that. The book by the lamp.” • Temporal/spatial • “The man behind me will be dead tomorrow.” • (Who is the man? When did/will he die?)

Indirect Meaning • The meaning of a discourse may be far from literal. • B: I can’t reach him. • A: There is the telephone. • B: I am not in my office. • A: Okay. • Undertones & implications are often employed for effect or efficiency

Turn Taking • People seem to know very well when they can take their turn • There is little overlap (5%) • Gaps are often a few tenths of a second • Appears fluid, but it is not obvious why • A computational model of overlap does not exist • This causes problems for dialogue systems

Conversational fillers • Phrases like “a-ha”, “yes”, “hmm” or “eh” are often uttered to fill pauses in the conversation, or to indicate attention or reflection • The challenge here is to recognize when they should be understood as a request for turn taking and when they should be ignored

Most common dialogue domains • Flight and train timetable information and reservation • Smart homes • Automated directory enquiries • Yellow pages enquiries • Weather information

Components of a Dialogue System

Automatic Speech Recognition • Transforms speech to text • Two basic types • Grammar-based ASR • The set of accepted phrases is defined by regular/context-free grammars (i.e. the language model takes the form of a grammar) • Usually speaker independent • Dictation machine • Recognizes “any utterance” • N-gram language model • Often speaker dependent
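The difference between the two kinds of language model can be shown on plain word sequences. This sketch does no acoustic processing at all: it assumes the words are already available and only contrasts a regular grammar, which accepts or rejects a phrase, with a bigram model, which assigns every phrase a probability. The grammar, training sentences and add-one smoothing are illustrative choices.

```python
import re
from collections import Counter

# Grammar-style model: a regular expression defining the accepted phrases.
GRAMMAR = re.compile(r"show me (flights|trains) to (boston|london)")

def accepted_by_grammar(utterance):
    return GRAMMAR.fullmatch(utterance.lower()) is not None

def train_bigram(sentences):
    """Count context words and word pairs, with sentence boundary markers."""
    contexts, bigrams, vocabulary = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocabulary.update(words)
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams, len(vocabulary)

def bigram_probability(sentence, model):
    """Probability of the sentence under the bigram model (add-one smoothing)."""
    contexts, bigrams, vocab_size = model
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for previous, current in zip(words, words[1:]):
        probability *= (bigrams[(previous, current)] + 1) / (contexts[previous] + vocab_size)
    return probability

MODEL = train_bigram(["show me flights to boston", "show me trains to london"])
p_fluent = bigram_probability("show me flights to london", MODEL)
p_scrambled = bigram_probability("london to flights me show", MODEL)
```

The grammar rejects anything outside its phrase set, while the bigram model gives any word sequence a non-zero score and simply prefers the fluent ordering (`p_fluent` is larger than `p_scrambled`).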

Natural Language Understanding • Analyzes a textual utterance and returns its formal semantic representation • Logical formula • Named entities • etc.

Dialogue Manager • Coordinates activity of all components • Maintains representation of the current state of the dialogue • Communicates with external applications • Decides about the next dialogue step

Three types of DM • Finite-state • dialogue flow determined by a finite-state automaton • Frame-based • form filling • Plan (task) based • a dynamic plan is constructed to reach the dialogue goal • … in practice, you often find extended versions or combinations of the above-mentioned approaches!

Finite State Automata
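A finite-state dialogue manager can be sketched as a set of states, one prompt per state, and a transition function driven by the user's reply. The travel-booking states below are invented for illustration and are not taken from the slide's diagram.

```python
# One system prompt per state (hypothetical travel booking dialogue).
PROMPTS = {
    "ask_origin": "Where are you travelling from?",
    "ask_destination": "Where are you travelling to?",
    "confirm": "Shall I book it?",
}

def next_state(state, reply, slots):
    """Transition function: the current state and the reply fix the next state."""
    if state == "ask_origin":
        slots["origin"] = reply
        return "ask_destination"
    if state == "ask_destination":
        slots["destination"] = reply
        return "confirm"
    if state == "confirm":
        # Anything other than "yes" restarts the dialogue.
        return "done" if reply.strip().lower() == "yes" else "ask_origin"
    raise ValueError(f"unknown state: {state}")

def run_dialogue(replies):
    """Feed scripted user replies through the automaton."""
    state, slots, prompts = "ask_origin", {}, []
    for reply in replies:
        if state == "done":
            break
        prompts.append(PROMPTS[state])
        state = next_state(state, reply, slots)
    return state, slots, prompts

final_state, booking, asked = run_dialogue(["Sheffield", "London", "yes"])
```

The system keeps the initiative throughout: the user can only answer the question attached to the current state, which is what makes this design simple but rigid.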

Frame Based
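Form filling can be sketched as a frame of named slots that any utterance may fill, in any order, with the system asking only for what is still missing. The slot names, the city and day lists, and the pattern-based extraction are assumptions made for this example.

```python
import re

CITIES = ["leeds", "london", "paris", "sheffield"]
DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# Question to ask for each slot that is still empty, in priority order.
QUESTIONS = {
    "origin": "Where are you travelling from?",
    "destination": "Where are you travelling to?",
    "day": "On which day do you want to travel?",
}

def update_frame(frame, utterance):
    """Fill every slot for which the utterance provides a value."""
    text = utterance.lower()
    city = "|".join(CITIES)
    day = "|".join(DAYS)
    patterns = {
        "origin": rf"\bfrom ({city})\b",
        "destination": rf"\bto ({city})\b",
        "day": rf"\b({day})\b",
    }
    for slot, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            frame[slot] = match.group(1)
    return frame

def next_question(frame):
    """Return the question for the first empty slot, or None if the form is full."""
    for slot, question in QUESTIONS.items():
        if frame.get(slot) is None:
            return question
    return None

frame = update_frame({}, "I want to go from Sheffield to London")
pending = next_question(frame)
```

Because one utterance filled two slots, the system skips straight to asking about the day. A finite-state design would have asked for origin and destination separately.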

Plan Based • Take a problem solving approach • There are goals to be reached • Plans are made to reach those goals • The goals and plans of the other participants must be iteratively inferred or predicted • Potential for handling complicated dialogues • Suffers from today’s technological limitations • In more complex cases the planning problem can become computationally intractable • Example: Bathroom consultant

Natural Language Generation • Produces a textual utterance (the so-called surface realization) from an internal (formal) representation of the answer • The surface realization can include formatting information • Speaking style, pauses • Background sounds

Text-To-Speech • Transforms the surface realization into an acoustic representation (sound signal)

Typical parameters • Commercial systems: • small vocabulary (~100 words) • closed domain • system initiative • Research systems: • larger (but still small) vocabulary (~10000 words) • closed domain • (limited) mixed initiative

Different Initiatives • System-initiative: • system always has control, user only responds to system questions • User-initiative: • user always has control, system passively answers user questions • Mixed-initiative: • control switches between system and user using fixed rules • Variable-initiative: • control switches between system and user dynamically based on participant roles, dialogue history, etc.

Multi Modal Dialogue Systems • Several possible input/output modalities to communicate with dialogue systems • speech, text, pointing, graphics, gestures, face configurations, body positions, emotions, etc. • No single “most convenient” modality (different modalities have different advantages) • entering day of week: click on a calendar • entering Zip code: use keyboard • performing commands: speech • complex queries: express them as typed natural language • Several modalities are useful • when one modality is not applicable, e.g. eyes or hands are busy, silent environment • or when one is difficult to use, e.g. small devices with limited keyboard and small screen

Case Study • Comic • Companions

The Comic Avatar

Wizard of Oz

Putting it together

The Companions Architecture

The Companions Robot

The Companions Interface 1

The Companions Interface 2

Questions?
