This presentation covers predictive heuristics for decision-making in dynamic, uncertain environments: reasoning under uncertainty and in real time, using experience-grounded knowledge that is continuously re-evaluated. Search is guided by the predicted value of interesting future states, where value is defined by relevance to goals and their utility. The approach is domain-independent, adds a process of commitment, and supports continuous updates and re-scheduling of decisions.
HUMANOBS
Predictive Heuristics for Decision-Making in Real-World Environments
Helgi Páll Helgason, Kristinn R. Thorisson, Eric Nivel, Pei Wang
Reykjavik University / Icelandic Institute for Intelligent Machines; Temple University, Philadelphia
AGI 2013, Beijing, August 2013
Problem
> Multi-objective decision making
> Realistic environments: dynamic, stochastic, continuous
> Insufficient knowledge and resources
> Knowledge is grounded in experience and re-evaluated continuously
> Reasoning under uncertainty and in real time
Problem
[Figure: state S0 branching over time into possible successor states S0,0 ... S0,3 with probabilities 0.2, 0.6, 0.1 and 0.4]
> Time is not discrete
> Set of possible actions not always enumerable
> Set of possible resulting states not always enumerable
[Figure: state S0 branching over time into possible successor states S0,0 ... S0,3]
> Search guided by the predicted value of interesting future states
> Value: relevance to goals
> Set of possible courses of action → ordered set of interesting actions (sketched below)
> Interestingness derived from experience (learning, attention control and self-compilation) and from current activity (goals); real-valued
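How the ordered set of interesting actions might look in code: a minimal Python sketch, assuming a hypothetical Action record with a real-valued interestingness score. None of these names come from AERA itself; in the paper the score is derived from experience and current goals.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Action:
    name: str
    interestingness: float  # real-valued; higher = more relevant to current goals

def order_by_interest(candidates: Iterable[Action]) -> List[Action]:
    # Turn the (unordered) set of candidate actions into an ordered set,
    # so that search considers the most interesting actions first.
    return sorted(candidates, key=lambda a: a.interestingness, reverse=True)
```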
> Predictors are controlled by success rate and confidence
> A predicted state has a likelihood:
  likelihood(S) = confidence(P) * (SuccessRate(P) - 0.5) + 0.5
> A goal's utility:
  utility(G) = priority(G) * urgency(G)
  where urgency is the time horizon (from now), computed relative to the horizons of all other goals
> A goal's achievement in a state S: 1 if the goal is achieved in S, -1 otherwise (all three terms sketched below)
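The three terms above can be written directly as code. A minimal sketch, assuming simple Predictor and Goal records; the field names are illustrative, not AERA's internal representation.

```python
from dataclasses import dataclass

@dataclass
class Predictor:
    confidence: float    # in [0, 1]
    success_rate: float  # in [0, 1], maintained from experience

@dataclass
class Goal:
    priority: float
    urgency: float  # time horizon from now, relative to the other goals' horizons

def likelihood(p: Predictor) -> float:
    # likelihood(S) = confidence(P) * (SuccessRate(P) - 0.5) + 0.5
    return p.confidence * (p.success_rate - 0.5) + 0.5

def utility(g: Goal) -> float:
    # utility(G) = priority(G) * urgency(G)
    return g.priority * g.urgency

def achievement(goal_achieved_in_s: bool) -> float:
    # 1 if the goal is achieved in S, -1 otherwise
    return 1.0 if goal_achieved_in_s else -1.0
```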
> The expected value of a state S, reached from S0, is computed at the time of S0:
  ExpectedValue(S, S0) = (product of the likelihoods of all intermediate states leading from S0 to S) * (sum over all goals of achievement in S * utility)
> The expected value is used as the (predictive) heuristic (sketched below)
> Domain-independence
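Read literally, the formula yields the following sketch. The list-based inputs are an illustrative encoding chosen here for brevity; the original operates on predicted states and goals directly.

```python
from typing import Sequence

def expected_value(step_likelihoods: Sequence[float],
                   goal_utilities: Sequence[float],
                   goal_achieved: Sequence[bool]) -> float:
    # Product of the likelihoods of the intermediate states leading from S0 to S ...
    path_likelihood = 1.0
    for lk in step_likelihoods:
        path_likelihood *= lk
    # ... times the sum over goals of achievement (+1 / -1) * utility.
    goal_value = sum((1.0 if ok else -1.0) * u
                     for u, ok in zip(goal_utilities, goal_achieved))
    return path_likelihood * goal_value

# Example: two intermediate states, two goals, only the first goal achieved in S.
# expected_value([0.8, 0.6], [1.2, 0.5], [True, False])
#   -> 0.48 * (1.2 - 0.5) = 0.336
```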
> Implemented in AERA
> With an additional process of commitment (eliminate redundant goals, solve conflicts)
> Scheduling of search driven by the predicted success of goals (learned from experience), in addition to the expected value of predicted future states
> Anytime operation
> Continuous updates of expected values as new goals are produced and new states predicted → re-scheduling (sketched below)
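As a rough illustration of the re-scheduling loop only: a hypothetical scheduler that re-ranks goals whenever a new goal is produced or a new state is predicted. The class and method names are assumptions; AERA's actual scheduler additionally weighs the learned predicted success of goals.

```python
from typing import Dict, List

class GoalScheduler:
    """Sketch of continuous re-scheduling driven by expected values."""

    def __init__(self) -> None:
        self._expected: Dict[str, float] = {}  # goal id -> latest expected value

    def update(self, goal_id: str, value: float) -> None:
        # Called on every newly produced goal or newly predicted state.
        self._expected[goal_id] = value

    def schedule(self) -> List[str]:
        # Anytime operation: the current best ordering can be queried at any point.
        return sorted(self._expected, key=self._expected.get, reverse=True)
```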