This presentation covers predictive heuristics for decision-making in dynamic, uncertain environments: reasoning under uncertainty and in real time, using experience-grounded knowledge that is continuously re-evaluated. Search is guided by the predicted value of interesting future states, where value is defined by relevance to goals and their utility. The approach is domain-independent, adds a process of commitment, and supports continuous updates and re-scheduling of decisions.
HUMANOBS
Predictive Heuristics for Decision-Making in Real-World Environments
Helgi Páll Helgason, Kristinn R. Thorisson, Eric Nivel, Pei Wang
Reykjavik University / Icelandic Institute for Intelligent Machines; Temple University, Philadelphia
AGI 2013, Beijing, August 2013
Problem
> Multi-objective decision making
> Realistic environments: dynamic, stochastic, continuous
> Insufficient knowledge and resources
> Knowledge is grounded in experience and re-evaluated continuously
> Reasoning under uncertainty and in real time
Problem
[Figure: state S0 branching over time into possible successor states S0,0 ... S0,3 with probabilities 0.2, 0.6, 0.1 and 0.4]
> Time is not discrete
> Set of possible actions not always enumerable
> Set of possible resulting states not always enumerable
[Figure: state S0 branching over time into possible successor states S0,0 ... S0,3]
> Search guided by the predicted value of interesting future states
> Value: relevance to goals
> Set of possible courses of action → ordered set of interesting actions (sketched below)
> Interestingness derived from experience (learning, attention control and self-compilation) and from current activity (goals); real-valued
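How the ordered set of interesting actions might look in code: a minimal Python sketch, assuming a hypothetical Action record with a real-valued interestingness score. None of these names come from AERA itself; in the paper the score is derived from experience and current goals.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Action:
    name: str
    interestingness: float  # real-valued; higher = more relevant to current goals

def order_by_interest(candidates: Iterable[Action]) -> List[Action]:
    # Turn the (unordered) set of candidate actions into an ordered set,
    # so that search considers the most interesting actions first.
    return sorted(candidates, key=lambda a: a.interestingness, reverse=True)
```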
> Predictors are controlled by success rate and confidence
> A predicted state has a likelihood:
  likelihood(S) = confidence(P) * (SuccessRate(P) - 0.5) + 0.5
> A goal's utility:
  utility(G) = priority(G) * urgency(G)
  where urgency is the time horizon (from now), computed relative to the horizons of all other goals
> A goal's achievement in a state S: 1 if the goal is achieved in S, -1 otherwise (all three terms sketched below)
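The three terms above can be written directly as code. A minimal sketch, assuming simple Predictor and Goal records; the field names are illustrative, not AERA's internal representation.

```python
from dataclasses import dataclass

@dataclass
class Predictor:
    confidence: float    # in [0, 1]
    success_rate: float  # in [0, 1], maintained from experience

@dataclass
class Goal:
    priority: float
    urgency: float  # time horizon from now, relative to the other goals' horizons

def likelihood(p: Predictor) -> float:
    # likelihood(S) = confidence(P) * (SuccessRate(P) - 0.5) + 0.5
    return p.confidence * (p.success_rate - 0.5) + 0.5

def utility(g: Goal) -> float:
    # utility(G) = priority(G) * urgency(G)
    return g.priority * g.urgency

def achievement(goal_achieved_in_s: bool) -> float:
    # 1 if the goal is achieved in S, -1 otherwise
    return 1.0 if goal_achieved_in_s else -1.0
```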
> The expected value of a state S, reached from S0, is computed at the time of S0:
  ExpectedValue(S, S0) = (product of the likelihoods of all intermediate states leading from S0 to S) * (sum over all goals of achievement in S * utility)
> The expected value is used as the (predictive) heuristic (sketched below)
> Domain-independence
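Read literally, the formula yields the following sketch. The list-based inputs are an illustrative encoding chosen here for brevity; the original operates on predicted states and goals directly.

```python
from typing import Sequence

def expected_value(step_likelihoods: Sequence[float],
                   goal_utilities: Sequence[float],
                   goal_achieved: Sequence[bool]) -> float:
    # Product of the likelihoods of the intermediate states leading from S0 to S ...
    path_likelihood = 1.0
    for lk in step_likelihoods:
        path_likelihood *= lk
    # ... times the sum over goals of achievement (+1 / -1) * utility.
    goal_value = sum((1.0 if ok else -1.0) * u
                     for u, ok in zip(goal_utilities, goal_achieved))
    return path_likelihood * goal_value

# Example: two intermediate states, two goals, only the first goal achieved in S.
# expected_value([0.8, 0.6], [1.2, 0.5], [True, False])
#   -> 0.48 * (1.2 - 0.5) = 0.336
```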
> Implemented in AERA
> With an additional process of commitment (eliminate redundant goals, solve conflicts)
> Scheduling of search driven by the predicted success of goals (learned from experience), in addition to the expected value of predicted future states
> Anytime operation
> Continuous updates of expected values as new goals are produced and new states predicted → re-scheduling (sketched below)
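As a rough illustration of the re-scheduling loop only: a hypothetical scheduler that re-ranks goals whenever a new goal is produced or a new state is predicted. The class and method names are assumptions; AERA's actual scheduler additionally weighs the learned predicted success of goals.

```python
from typing import Dict, List

class GoalScheduler:
    """Sketch of continuous re-scheduling driven by expected values."""

    def __init__(self) -> None:
        self._expected: Dict[str, float] = {}  # goal id -> latest expected value

    def update(self, goal_id: str, value: float) -> None:
        # Called on every newly produced goal or newly predicted state.
        self._expected[goal_id] = value

    def schedule(self) -> List[str]:
        # Anytime operation: the current best ordering can be queried at any point.
        return sorted(self._expected, key=self._expected.get, reverse=True)
```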