COMP 650: POMDP’s real life applications

COMP 650: POMDP’s real life applications • Rahul Kumar • Department of Computer Science • Rice University • April 18, 2013

Long-term user intention prediction for wheelchair navigation using POMDP • Reference: Taha et al., POMDP-based long-term user intention prediction for wheelchair navigation. • *Image taken from http://robotzeitgeist.com/tag/dementia

Outline • Motivation • Introduction • POMDP quick review • Problem specification / formulation • On-line assistance • Experimental results • Conclusions

Motivation: Why make wheelchairs smart? • Growing aging population. • Increase in accidents and other calamities. • Diseases that affect motor control.

Reactive Wheelchairs • "Reactive" refers to systems that do not use a representation of the environment. • The most popular approach among intention-recognition wheelchairs. • They rely on local or temporal information collected online. • Systems with limited power or processing capability use this technique. • Examples: Rolland-III, NavChair, etc.

POMDP - 1 • General framework for sequential decision making where states are hidden and actions are stochastic. • Widely used in assistive applications.

POMDP -2 • S – set of states • A – set of actions • Z – set of observations • T – conditional transition probabilities, T: S x A x S -> [0,1] • O – conditional observation probabilities, O: A x S x Z -> [0,1] • R – reward function, R: A x S -> real number
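
As a rough illustration of the components listed on this slide, the sketch below stores them in a small Python container. The class name `POMDP`, the field names (including `O` for the observation probabilities), the dictionary layout, and all numbers in the toy instance are assumptions made for this example, not part of the presented work.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class POMDP:
    """Hypothetical container for the components of a POMDP."""
    states: List[str]
    actions: List[str]
    observations: List[str]
    # T[(s, a)][s_next] = P(s_next | s, a)
    T: Dict[Tuple[str, str], Dict[str, float]]
    # O[(a, s_next)][z] = P(z | a, s_next)
    O: Dict[Tuple[str, str], Dict[str, float]]
    # R[(a, s)] = immediate reward for taking action a in state s
    R: Dict[Tuple[str, str], float]

    def is_well_formed(self, tol: float = 1e-9) -> bool:
        """Check that every conditional distribution sums to 1."""
        rows = list(self.T.values()) + list(self.O.values())
        return all(abs(sum(row.values()) - 1.0) <= tol for row in rows)


# Toy instance with made-up numbers: two states and a single action.
toy = POMDP(
    states=["s1", "s2"],
    actions=["stay"],
    observations=["z1", "z2"],
    T={("s1", "stay"): {"s1": 1.0}, ("s2", "stay"): {"s2": 1.0}},
    O={("stay", "s1"): {"z1": 0.8, "z2": 0.2},
       ("stay", "s2"): {"z1": 0.3, "z2": 0.7}},
    R={("stay", "s1"): 0.0, ("stay", "s2"): 1.0},
)
print(toy.is_well_formed())
```

Sparse dictionaries are used here only for readability; a real model with many states would more likely use arrays.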

POMDP agent overview • Diagram components: Environment, Observation, State Estimator, Belief State, Policy, Action
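
The state estimator in this overview is usually a Bayes filter over the hidden state. The sketch below shows the standard discrete belief update, b'(s') proportional to P(z | a, s') times the sum over s of P(s' | s, a) b(s). The function name, the dictionary layout, and the toy numbers are assumptions made for illustration.

```python
def update_belief(belief, action, observation, T, O):
    """Discrete Bayes filter step.

    belief: {state: probability}, covering every state
    T[(s, a)][s_next] = P(s_next | s, a)
    O[(a, s_next)][z] = P(z | a, s_next)
    """
    unnormalized = {}
    for s_next in belief:
        # Prediction: probability of reaching s_next under the action.
        predicted = sum(T[(s, action)].get(s_next, 0.0) * p
                        for s, p in belief.items())
        # Correction: weight by the likelihood of the observation.
        likelihood = O[(action, s_next)].get(observation, 0.0)
        unnormalized[s_next] = likelihood * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Toy model with made-up numbers: two states, one action that never moves.
T = {("s1", "stay"): {"s1": 1.0}, ("s2", "stay"): {"s2": 1.0}}
O = {("stay", "s1"): {"z1": 0.8, "z2": 0.2},
     ("stay", "s2"): {"z1": 0.3, "z2": 0.7}}
prior = {"s1": 0.5, "s2": 0.5}
posterior = update_belief(prior, "stay", "z1", T, O)
print(posterior)
```

In this toy case, observing `z1` shifts the belief toward `s1`, because `z1` is more likely there.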

POMDP Generation • An efficient POMDP system needs a well-designed • State space • Transition model • Observation model

POMDP model generation architecture

State Space • Spatial states: wheelchair location = {s1, s2, s3, …} • Destination states: places of interest = {d1, d2, …} • Joint representation of both = {s1d1, s2d1, …}
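
The joint representation on this slide can be sketched as a Cartesian product of locations and destinations. The example below also shows how a belief over joint states could be collapsed into a belief over destinations, which is the quantity of interest for intention prediction. Function names, the tuple encoding `(location, destination)`, and the numbers are assumptions for illustration.

```python
from itertools import product


def joint_states(locations, destinations):
    """Pair every location with every destination as (location, destination)."""
    # Destinations vary slowest, matching the slide's order s1d1, s2d1, ...
    return [(loc, dest) for dest, loc in product(destinations, locations)]


def destination_marginal(belief):
    """Sum a belief over joint states down to a belief over destinations."""
    marginal = {}
    for (_, dest), p in belief.items():
        marginal[dest] = marginal.get(dest, 0.0) + p
    return marginal


states = joint_states(["s1", "s2"], ["d1", "d2"])
belief = {("s1", "d1"): 0.5, ("s2", "d1"): 0.25, ("s1", "d2"): 0.25}
print(states)
print(destination_marginal(belief))
```

The size of the joint space is the number of locations times the number of destinations, which is one reason the choice of places of interest matters for tractability.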

State extraction and representation

Transition model • Specifies the probability of moving from one state to another when a given action is executed. • Actions = global navigation commands = {North, South, East, West, Stop}. • Observations = joystick movements = {Up, Down, Right, Left, NoInput}. • The transition probabilities are calculated directly from the map topology.
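
One possible way to derive a transition model from a map topology is sketched below. The slides do not give the exact rule, so everything here is an assumption: the topology is a hypothetical adjacency dictionary, a command succeeds with probability `p_success` when an edge exists, the wheelchair otherwise stays in place, and the destination part of the state never changes.

```python
ACTIONS = ["North", "South", "East", "West", "Stop"]


def build_transitions(topology, destinations, actions=ACTIONS, p_success=0.9):
    """Return T[((loc, dest), action)] = {(next_loc, dest): probability}.

    topology[loc][action] names the neighbouring location, if there is one.
    """
    T = {}
    for loc, edges in topology.items():
        for dest in destinations:
            for action in actions:
                target = edges.get(action)  # None for "Stop" or a missing edge
                if target is None:
                    row = {(loc, dest): 1.0}
                else:
                    row = {(target, dest): p_success,
                           (loc, dest): 1.0 - p_success}
                T[((loc, dest), action)] = row
    return T


# Hypothetical two-room map: s2 lies east of s1.
topology = {"s1": {"East": "s2"}, "s2": {"West": "s1"}}
T = build_transitions(topology, ["d1", "d2"])
print(T[(("s1", "d1"), "East")])
```

Keeping the destination fixed across transitions encodes the idea that the user's intention does not change during a single trip.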

Observation model • Learned from training data collected from a particular user. • In indoor settings, a wheelchair user usually performs repetitive tasks. • For example, a task can be going from the living room to the kitchen.
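
A simple way to turn user training data into an observation model is to count joystick inputs per state and normalize. The sketch below assumes a hypothetical log of `(state, joystick_input)` pairs and uses add-one (Laplace) smoothing so that unseen inputs keep a small nonzero probability. Neither the log format nor the smoothing is taken from the slides.

```python
from collections import Counter, defaultdict

JOYSTICK = ["Up", "Down", "Right", "Left", "NoInput"]


def estimate_observation_model(log, observations=JOYSTICK, alpha=1.0):
    """Estimate P(z | state) from (state, z) pairs with additive smoothing."""
    counts = defaultdict(Counter)
    for state, z in log:
        counts[state][z] += 1
    model = {}
    for state, c in counts.items():
        total = sum(c.values()) + alpha * len(observations)
        model[state] = {z: (c[z] + alpha) / total for z in observations}
    return model


# Hypothetical recordings for a user at location s1 heading to destination d1.
log = [(("s1", "d1"), "Right")] * 3 + [(("s1", "d1"), "Up")]
model = estimate_observation_model(log)
print(model[("s1", "d1")])
```

Because users tend to repeat the same tasks indoors, even a modest amount of logged data can make these per-state distributions quite peaked.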

Reward function • -1 for each action. • +100 for an action that leads to the destination.
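
The reward scheme on this slide can be written as a small function. The sketch assumes that the +100 replaces the -1 step cost on the final move (the slide does not say whether the two are added), and it uses a hypothetical mapping `destination_location` from destination labels to map locations.

```python
STEP_COST = -1.0
GOAL_REWARD = 100.0


def reward(next_state, destination_location):
    """Reward for an action that ends in next_state = (location, destination)."""
    loc, dest = next_state
    # Assumption: the goal reward replaces the step cost on arrival.
    return GOAL_REWARD if destination_location[dest] == loc else STEP_COST


# Hypothetical layout: destination d1 is located at s3.
destination_location = {"d1": "s3"}
path = [("s2", "d1"), ("s3", "d1")]  # states reached after each action
total = sum(reward(state, destination_location) for state in path)
print(total)
```

The per-action penalty pushes the policy toward short routes, while the large terminal reward makes reaching the intended destination dominate the total.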

On-line navigation

Experimental result -1 • Artificial data was generated based on the activity of the user in the environment. • The ZMDP software package was used; it provides several heuristic search algorithms for POMDPs and MDPs. • Starting points were known but destinations were unknown. • 100% success in predicting the destination.

Experiment result -2

Experiment result -3

Conclusion • A POMDP is employed for long-term user intention prediction in wheelchair navigation. • Unlike other papers, the approach does not rely on behavior selection.

Future work • Enhance the capabilities and the intelligence of the system through automated activity monitoring and task extraction.

POMDP Hands * Image taken from http://matanyahorowitz.com/index.php

Overview • Motivation • Approach / Big Picture • Example / Intuition • Model Construction • Results

Motivation • If you know all shapes and positions exactly, you can generate a trajectory that will work. • *Slide taken from Hsiao et al.

Problem at hand • How can a robot determine the configuration of an object that it has to manipulate?

Approach/Big Picture • Partition the space: identify and separate regions that have similar properties. • Reduce uncertainty in the configuration by taking actions that act as “funnels”, i.e. that map large sets of initial states to smaller sets of resulting states. • Work with a set of guarded compliant motions; these actions act as funnels.
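
The funnel idea can be seen in a one-dimensional toy example: a guarded move that continues until contact sends every possible starting position to the same final position. The discretized corridor, its bounds, and the function names below are assumptions made for this sketch and are far simpler than the manipulation setting in the talk.

```python
def guarded_move(position, direction, lo=0, hi=5):
    """Step in `direction` (-1 or +1) until the next step would hit a wall.

    Assumes lo <= position <= hi on an integer grid.
    """
    while lo <= position + direction <= hi:
        position += direction
    return position


def image(states, direction):
    """Set of states that can result from applying the move to any state."""
    return {guarded_move(s, direction) for s in states}


uncertain_start = set(range(6))       # the position could be anywhere in 0..5
after_move = image(uncertain_start, -1)
print(len(uncertain_start), "->", len(after_move))
```

Six possible starting states collapse to a single resulting state, so the action removes uncertainty without needing any sensing other than contact.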

Example Partial policy graph for robot

Abstract model construction • Action space: two guarded compliant move commands for each degree of freedom. • Transition probability: estimated by sampling a large number of triplets from given initial states. • Observation probability: contact sensors have some uncertainty in determining contact. • Reward: 15 for reaching the goal, -50 for lifting in the wrong configuration, -1 for each motion, -5 for being in unstable or boundary states.
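
Estimating transition probabilities from samples can be sketched as counting outcomes. The example below assumes the triplets are (state, action, next state) and uses a hypothetical `toy_simulate` function with made-up probabilities in place of a physics simulator; neither detail is confirmed by the slides.

```python
import random
from collections import Counter, defaultdict


def estimate_transitions(simulate, states, actions, samples_per_pair=1000, seed=0):
    """Estimate T[(s, a)] = {s_next: probability} from sampled outcomes."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for s in states:
        for a in actions:
            for _ in range(samples_per_pair):
                counts[(s, a)][simulate(s, a, rng)] += 1
    return {key: {s_next: n / samples_per_pair for s_next, n in c.items()}
            for key, c in counts.items()}


def toy_simulate(state, action, rng):
    """Hypothetical stand-in for a simulator of an abstract state."""
    if state == "free" and action == "move_down":
        # Made-up number: the move reaches contact 80% of the time.
        return "contact" if rng.random() < 0.8 else "free"
    return state


T_hat = estimate_transitions(toy_simulate, ["free", "contact"], ["move_down"])
print(T_hat)
```

More samples per state-action pair reduce the variance of the estimates at the cost of more simulation time.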

Solving POMDP

Experiment • Similar to the previous problem, except that the block is stepped. • High-fidelity simulation: 92% success, average reward -1.59. • Fixed policy: 81% success, average reward -10.632.

Video demonstration

Future work • Address the problem of shape uncertainty. • Handle interaction with other objects.

References • Hsiao et al., Grasping POMDPs

THANK YOU!
