Goals

Creating a Corpus for a Conversational Assistant for Everyday Tasks
Henry Kautz, Young Song, Ian Pereira, Mary Swift, Walter Lasecki, Jeff Bigham, James Allen
University of Rochester

Goals • Fine-grained activity recognition combining speech and RGBD vision • Learning and recognizing multi-step activities from (one-shot) instruction • Learning names and properties of objects from instruction • Tracking and assistance using task model

Sensors: overhead mics, power meter, lapel mic, video, Kinect, open/close sensors, RFID sensors

Language Logical Form “I’m going to make a cup of tea.”

Extracted Events “I put it on the stove.” :event ont::put :agent user :theme v123 :start 0 :end 32 :utt 2 :speechtime/eventtime reln: overlap
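As an illustration only, the extracted event above could be held in a small record type. The class name `ExtractedEvent`, its field names, and the `duration` helper are assumptions made for this sketch; the slide does not say what units `:start` and `:end` use, so the sketch treats them as opaque integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedEvent:
    """Hypothetical container mirroring the slots shown on the slide."""
    event: str                   # ontology type, e.g. "ont::put"
    agent: str                   # who performs the action
    theme: str                   # identifier of the object acted on
    start: int                   # event start (units not stated on the slide)
    end: int                     # event end (same unknown units)
    utt: int                     # index of the utterance describing the event
    speech_event_relation: str   # relation between speech time and event time

    def duration(self) -> int:
        """Length of the event in the slide's (unspecified) time units."""
        return self.end - self.start


# The example from the slide: "I put it on the stove."
put_event = ExtractedEvent(
    event="ont::put",
    agent="user",
    theme="v123",
    start=0,
    end=32,
    utt=2,
    speech_event_relation="overlap",
)
```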

Domains • Making tea - 12 subjects x 3 episodes • Making sandwiches • Building things with blocks • Coarse-grained home activities • Snack bar surveillance

Labeling Corpus • Need to label data for • Supervised learning methods • Evaluating supervised or unsupervised methods • “Gold standard” • Define event ontology • Hand label • Review / correct by second investigator • 1 hour per 2 minutes • Alternative?

Crowd AR • Idea • Try to recognize activities using current model • When confidence is low, ask human workers to label video segment • Mediate response • Update model with new labels
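The loop described on this slide can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation: the function `crowd_ar_step`, its callable parameters, and the confidence threshold of 0.6 are all assumptions introduced for illustration.

```python
from typing import Callable, List, Optional, Tuple


def crowd_ar_step(
    segment: str,
    recognize: Callable[[str], Tuple[str, float]],
    ask_workers: Callable[[str], List[str]],
    mediate: Callable[[List[str]], Optional[str]],
    update: Callable[[str, str], None],
    threshold: float = 0.6,  # assumed value; the slide gives no number
) -> Tuple[str, bool]:
    """Label one video segment; return (label, whether the crowd was asked)."""
    # 1. Try to recognize the activity with the current model.
    label, confidence = recognize(segment)
    if confidence >= threshold:
        return label, False

    # 2. Confidence is low: ask human workers to label the segment.
    worker_labels = ask_workers(segment)

    # 3. Mediate the workers' responses into a single label.
    agreed = mediate(worker_labels)
    if agreed is None:
        # No agreement was reached; fall back to the model's guess.
        return label, True

    # 4. Update the model with the new label.
    update(segment, agreed)
    return agreed, True
```

In use, `recognize` would wrap the current activity model, `ask_workers` the crowd interface, and `update` whatever retraining step the model supports.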

Worker Interface • Workers watch a live video stream of an activity and enter open-ended text labels into the bottom text field • They can see the responses of other workers and of the learning model (HMM) to the right of the video, and agree with them by clicking on them.

Mediator • An example of the graph created by the input mediator • Green nodes represent sufficient agreement between multiple workers (here N = 2). • The final sequence matches the baseline despite incorrect (over-specific) submissions by 2 out of the 3 workers, and a spelling error by one worker on the word “walk”.
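The agreement rule can be illustrated with a much simpler stand-in than the graph the slide shows. The sketch below assumes worker labels have already been grouped into time slots, and it accepts a label only when at least N workers submitted it. The function name `mediate_sequence` and the slot grouping are hypothetical, and unlike the actual mediator this version does not merge misspelled or over-specific labels.

```python
from collections import Counter
from typing import List, Optional


def mediate_sequence(
    slots: List[List[str]], min_agree: int = 2
) -> List[Optional[str]]:
    """For each time slot, keep the top label if enough workers agree on it."""
    sequence: List[Optional[str]] = []
    for labels in slots:
        # Normalize case and whitespace so "Walk" and "walk" count together.
        counts = Counter(label.strip().lower() for label in labels)
        if not counts:
            sequence.append(None)
            continue
        label, count = counts.most_common(1)[0]
        # Accept the label only with sufficient agreement (here N = 2).
        sequence.append(label if count >= min_agree else None)
    return sequence


# Three workers, three time slots (made-up labels for illustration).
example = mediate_sequence([
    ["walk", "Walk", "wlak"],         # one misspelling, two agree
    ["sit", "sit on the sofa", "sit"],  # one over-specific, two agree
    ["stand", "get up", "rise"],      # no two workers agree
])
```

With these made-up inputs the first two slots resolve to "walk" and "sit", while the third stays unlabeled because no label reaches the agreement threshold.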

Interactive Recognition and Labeling Experiments • Domain: coarse-grained activities • Model: HMM
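For readers unfamiliar with how an HMM turns a stream of observations into a sequence of activity labels, here is a minimal Viterbi decoding sketch. The two states, the two observation symbols, and every probability in it are invented for illustration; none of them come from the experiments.

```python
import math
from typing import Dict, List


def viterbi(
    observations: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most likely state sequence (all probabilities must be > 0)."""
    # Log-probability of the best path ending in each state at time 0.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back: List[Dict[str, str]] = [{}]

    for obs in observations[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in states:
            # Best predecessor for state s at this time step.
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model with made-up parameters: two coarse activities, two sensor readings.
states = ["walk", "sit"]
start_p = {"walk": 0.5, "sit": 0.5}
trans_p = {"walk": {"walk": 0.8, "sit": 0.2}, "sit": {"walk": 0.2, "sit": 0.8}}
emit_p = {"walk": {"moving": 0.9, "still": 0.1}, "sit": {"moving": 0.1, "still": 0.9}}

decoded = viterbi(["moving", "moving", "still", "still"],
                  states, start_p, trans_p, emit_p)
```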

Privacy

Monitoring Multi-Agent Scenarios • Surveillance of department honor snack bar • 85% correct on 11 trials

Parameterized & Complex Activities • Average number of objects and actions correctly labeled by worker groups of different sizes over two different activity sequences. • As the group size increases, more objects and actions are labeled.
