PLOW: A Collaborative Task Learning Agent Authors: James Allen, Nathanael Chambers, et al. Presented by: Rex, Linger, Xiaoyi Nov. 23, 2009
Outline • Introduction • The PLOW System • Demonstration • Learning Tasks • Evaluation • Strengths & Weaknesses • Related Work • Q&A and Discussion
Introduction • Aims to further human-machine interaction • Quickly learns new tasks from human users with a modest amount of interaction • Acquires task models from intuitive, language-rich demonstration
Background • Previous work: learn new tasks by observation, i.e., by watching experts' demonstrations • Paper's contribution: acquires tasks much more quickly, typically from a single example, possibly with some clarification dialogue
Language Processing • Focus on how language is used for task learning, rather than on how the processing itself is accomplished • Built on the TRIPS dialogue system • Based on a domain-independent representation
Instrumentation • The key issue is getting the right level of analysis for the instrumentation: user actions are observed through the browser's DOM
Task Learning • Challenges • Identifying the correct parameterization • Hierarchical structure • Identifying the boundaries of iterative loops • Loop termination conditions • Task goals
Primitive Action Learning • NL Interpretation + GUI Interpretation • Heuristic search through DOM • Semantic metric • Structural distance metric
Primitive Action Learning [Diagram: natural language interpretation and GUI interpretation are combined via a heuristic search through the DOM, scored by a semantic metric and a structural distance metric]
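To make the two metrics concrete, here is a minimal sketch of how a heuristic DOM search might combine them. The node representation, scoring functions, and equal weighting are illustrative assumptions, not PLOW's published implementation.

```python
from dataclasses import dataclass, field

@dataclass
class DomNode:
    text: str                           # visible text of the element
    depth: int                          # depth in the DOM tree
    children: list = field(default_factory=list)

def semantic_score(node, description):
    """Crude semantic metric: word overlap between the node's text
    and the natural-language description of the target object."""
    node_words = set(node.text.lower().split())
    desc_words = set(description.lower().split())
    return len(node_words & desc_words) / len(desc_words) if desc_words else 0.0

def structural_score(node, anchor_depth):
    """Crude structural distance metric: prefer nodes near the depth
    at which the demonstrated action occurred."""
    return 1.0 / (1 + abs(node.depth - anchor_depth))

def find_object(root, description, anchor_depth):
    """Heuristic search: score every DOM node and return the best match."""
    best, best_score = None, float("-inf")
    stack = [root]
    while stack:
        node = stack.pop()
        score = semantic_score(node, description) + structural_score(node, anchor_depth)
        if score > best_score:
            best, best_score = node, score
        stack.extend(node.children)
    return best
```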
Parameterization Learning • Identify the appropriate parameterization • Object roles: • Input/output parameters • Constants • Relational dependencies • Derived from information in the language
Parameterization Learning — example (finding hotels near an address) • Output: hotels (the result list) • Input: address • Constant: the literal query term "hotels" • Relational dependency: zip is a role of the address
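A hedged sketch of how such a parameterization could be represented, using the hotel-search example above; the class and field names are hypothetical and not PLOW's internal format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Parameter:
    name: str
    role: str                        # "input", "output", or "constant"
    value: Optional[str] = None      # fixed value, for constants only
    part_of: Optional[str] = None    # relational dependency on another parameter

# Learned parameterization for "find hotels near an address"
find_hotels_params = [
    Parameter("hotels",  role="output"),                    # the result list
    Parameter("address", role="input"),                     # supplied at run time
    Parameter("query",   role="constant", value="hotels"),  # always typed literally
    Parameter("zip",     role="input", part_of="address"),  # zip is a role of the address
]
```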
Hierarchical Structure Learning • Utterances mark the beginning of a new (sub)procedure • The same utterance states the procedure's goal • Other utterances mark the end of the procedure
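A minimal sketch of how boundary-marking utterances could delimit the hierarchy; the trigger phrases and data structures are illustrative stand-ins for PLOW's dialogue interpretation.

```python
class Procedure:
    def __init__(self, goal):
        self.goal = goal
        self.steps = []          # primitive actions or nested Procedures

def build_hierarchy(utterances):
    """Nest procedures using utterances that mark boundaries and goals."""
    root = Procedure("(top level)")
    stack = [root]
    for u in utterances:
        text = u.lower()
        if text.startswith("let me teach you"):   # new (sub)procedure + its goal
            sub = Procedure(goal=u)
            stack[-1].steps.append(sub)
            stack.append(sub)
        elif text == "i'm done":                  # end of the current procedure
            stack.pop()
        else:                                     # ordinary demonstrated step
            stack[-1].steps.append(u)
    return root
```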
Iteration Learning • Learning iterative procedures • Supported by language (the user describes the loop) • PLOW attempts the iteration itself • The user corrects it or provides more examples • A rule/pattern is learned (see the sketch below)
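A sketch of how an induced iteration might be executed over paginated results; fetch, extract_items, and find_next are hypothetical stand-ins for PLOW's instrumented browser actions.

```python
def run_iteration(fetch, extract_items, find_next, start_url, max_pages=50):
    """Execute a learned loop: collect items from each result page and
    follow the 'next' link until it disappears (the termination condition)."""
    results, url = [], start_url
    for _ in range(max_pages):         # safety bound on the iteration
        page = fetch(url)              # hypothetical instrumented-browser call
        results.extend(extract_items(page))
        next_url = find_next(page)     # returns None when no "next" link exists
        if next_url is None:           # learned loop-termination condition
            break
        url = next_url
    return results
```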
Evaluation • 16 test subjects with general training • Compared against 3 other systems: • One learned entirely from passive observation • One used a sophisticated GUI, primarily designed for editing procedures but extended to allow defining new ones • One used an NL-like query and specification language that requires detailed knowledge of the HTML behind the web page
Evaluation – First part • Subjects taught the different systems some subset of predefined test questions • Evaluators created new test examples by specifying values for the input parameters, then scored the execution results • PLOW scored 2.82 out of 4 (the other systems' scores are not reported)
Evaluation – Second part • 10 new test questions designed by an outside group, unknown to the developers prior to the test • Subjects had one work day to teach whichever of these tasks they wished, using whichever of the task learning systems they preferred • PLOW was used to teach 30 of the 55 task models • PLOW also received the highest average score in the test (2.2/4)
Strengths • Integrates natural language recognition and understanding (TRIPS, 2001) • "Play by play" mode gives a good user experience • Makes it easier to identify parameters, loop boundaries, and termination conditions, to build hierarchical structure, and to recognize goals • Generalizes from one short demonstration • Learns not only the task but also the underlying rule
Strengths • Error correction from users: "This is wrong. Here is another title." • PLOW confirms correctness with the user when extracting data from lists • Less domain knowledge and less training required • Close to "one-click automation"
Weaknesses • Some remarks on the evaluation: • PLOW was guaranteed to be able to learn the 17 predetermined test questions; were the other systems? • The 10 new tasks had different levels of difficulty, e.g., "For what reason did <person> travel for <project> between <start date> and <end date>?" • No detailed analysis of the evaluation results: does PLOW really learn robust task models from a single example, or is it just better on certain types of tasks?
Weaknesses • Learning and reasoning rely on NL understanding: • What happens when it encounters new concepts? • Does it require certain patterns of speaking? Are its NL understanding capabilities sufficient? • Still needs one full work day per person to teach 3 simple tasks • Users have to construct good task models themselves; PLOW has no error detection mechanism for users
Related Work • Sheepdog, 2004 • An implemented system for capturing, learning, and playing back technical support procedures on the Windows desktop • Handles complex technical support procedures, in contrast to the relatively simple procedures in PLOW • Records traces to form an alignment problem and uses IOHMMs to build procedure models; needs many training examples
Related Work • Tailor, 2005 • A system that allows users to modify task information through instruction • Recognizes the user's instructions by combining rules with a parser (JavaNLP) • Maps sentences to hypothesized changes • Reasons about the effects of changes and detects unanticipated behavior • Also handles relatively simple tasks
Related Work • CHINLE, 2008 • A system that automatically constructs PBD systems for applications based on their interface specifications • Supports learning from incomplete data and partial learning from inconsistent data, whereas PLOW can learn subsets of certain tasks but does not tolerate user mistakes