550 likes | 570 Views
Explore the implications of discounting future events in systems theory through detailed analysis and models. Discover how properties, algorithms, and game graphs are utilized to understand system dynamics.
E N D
Discounting the Future in Systems Theory Luca de Alfaro, UC Santa Cruz Tom Henzinger, UC Berkeley Rupak Majumdar, UC Los Angeles Chess Review May 11, 2005 Berkeley, CA
Property c ("eventually c") a b c c … some trace has the property c
Property c ("eventually c") a b c c … some trace has the property c c … all traces have the property c
Richer Models FAIRNESS: -automaton Parity game ADVERSARIAL CONCURRENCY: game graph graph Stochastic game PROBABILITIES: Markov decision process
Concurrent Game 1,12,2 1,11,22,2 1,22,1 a b c 2,1 player "left"player "right" -for modeling open systems [Abramsky, Alur, Kupferman, Vardi, …] -for strategy synthesis ("control") [Ramadge, Wonham, Pnueli, Rosner]
Property c 1,1 2,2 1,1 1,2 2,2 1,2 2,1 a b c 2,1 hhleftii c … player "left" has a strategy to enforce c
Property c 1,1 2,2 1,11,2 2,2 1,2 2,1 a b c Pr(1): 0.5 Pr(2): 0.5 2,1 hhleftii c … player "left" has a strategy to enforce cleft c … player “left" has a randomized strategy to enforce c
Qualitative Models Trace: sequence of observations Property p: assigns a reward to each trace boolean rewards Model m: generates a set of traces (game) graph Value(p,m): defined from the rewards of the generated traces 9 or 8 (98) B
Stochastic Game a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2
Property c Probability with which player "left" can enforce c ? a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2
Semi-Quantitative Models Trace: sequence of observations Property p: assigns a reward to each trace boolean rewards Model m: generates a set of traces (game) graph Value(p,m): defined from the rewards of the generated traces sup or inf (sup inf) [0,1] µ R
A Systems Theory Class of properties p over traces Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values
A Systems Theory w-regular properties Class of properties p over traces Algorithm for computing Value(p,m) over models m GRAPHS Distance between models w.r.t. property values bisimilarity m-calculus
Transition Graph Q states d: Q 2Q transition relation
Graph Regions Q states d: Q 2Q transition relation = [ Q !B ] regions 9pre, 8pre: 9 9pre(R) 8pre(R) R µ Q 8
Graph Property Values: Reachability R Given RµQ, find the states from which some trace leads to R. R
Graph Property Values: Reachability R = (m X) (R Ç9pre(X)) Given RµQ, find the states from which some trace leads to R. R R R [ pre(R) . . . R [ pre(R) [ pre2(R)
Concurrent Game Q states l, r moves of both players d: Q l r Q transition function
Game Regions Q states l, r moves of both players d: Q l r Q transition function = [ Q !B ] regionslpre, rpre: q lpre(R) iff (l l ) (r r) d(q,l,r) R 2,* lpre(R) R µ Q 1,2 1,1
Game Property Values: Reachability left R Given RµQ, find the states from which player "left" has a strategy to force the game to R. R
Game Property Values: Reachability left R = (m X) (R Ç lpre(X)) Given RµQ, find the states from which player "left" has a strategy to force the game to R. R left R R [ lpre(R) . . . R [ lpre(R) [ lpre2(R)
An Open Systems Theory w-regular properties Class of winning conditions p over traces GAME GRAPHS Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values alternating bisimilarity [Alur, H, Kupferman, Vardi] (lpre,rpre) fixpoint calculus
An Open Systems Theory w-regular properties hhleftiiR Class of winning conditions p over traces GAME GRAPHS Algorithm for computing Value(p,m) over models m Every deterministic fixpoint formula f computes Value(p,m), where p is the linear interpretation [Vardi] of f. (lpre,rpre) fixpoint calculus (m X) (R Ç lpre(X))
An Open Systems Theory Two states agree on the values of all fixpoint formulas iff they are alternating bisimilar [Alur, H, Kupferman, Vardi]. GAME GRAPHS Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values alternating bisimilarity (lpre,rpre) fixpoint calculus
Stochastic Game Q states l, r moves of both players d: Q l r Dist(Q) probabilistic transition function
Quantitative Game Regions Q states l, r moves of both players d: Q l r Dist(Q) probabilistic transition function = [ Q ! [0,1] ] quantitative regions lpre, rpre: lpre(R)(q) = (sup l l ) (inf r r) R(d(q,l,r))
Quantitative Game Regions Q states l, r moves of both players d: Q l r Dist(Q) probabilistic transition function = [ Q ! [0,1] ] quantitative regions lpre, rpre: lpre(R)(q) = (sup l l ) (inf r r) R(d(q,l,r)) B 9 8
Probability with which player "left" can enforce c : (m X) (c Ç lpre(X)) Ç = pointwise max 0 0 1 a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2
Probability with which player "left" can enforce c : (m X) (c Ç lpre(X)) Ç = pointwise max 0 1 1 a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2
Probability with which player "left" can enforce c : (m X) (c Ç lpre(X)) Ç = pointwise max 0.8 1 1 a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2
Probability with which player "left" can enforce c : (m X) (c Ç lpre(X)) Ç = pointwise max 0.96 1 1 a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2
Probability with which player "left" can enforce c : (m X) (c Ç lpre(X)) Ç = pointwise max 1 1 1 a b c right right 1 2 1 2 left left a: 0.6 b: 0.4 a: 0.5 b: 0.5 a: 0.0 c: 1.0 a: 0.0 c: 1.0 1 1 a: 0.1 b: 0.9 a: 0.2 b: 0.8 a: 0.7 b: 0.3 a: 0.0 b: 1.0 2 2 In the limit, the deterministic fixpoint formulas work for all w-regular properties [de Alfaro, Majumdar].
A Probabilistic Systems Theory w-regular properties Class of properties p over traces MARKOV DECISION PROCESSES Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values quantitative bisimilarity [Desharnais, Gupta, Jagadeesan, Panangaden] quantitative fixpoint calculus
A Probabilistic Systems Theory quantitativew-regular properties Class of properties p over traces max expected value of satisfying R MARKOV DECISION PROCESSES Algorithm for computing Value(p,m) over models m Every deterministic fixpoint formula f computes expected Value(p,m), where p is the linear interpretation of f. quantitative fixpoint calculus (m X) (R Ç9pre(X))
Qualitative Bisimilarity e: Q2! {0,1} … equivalence relation F … function on equivalences F(e)(q,q') = 0 if q and q' disagree on observations = min { e(r,r’) | r2 9pre(q) Ær’2 9pre(q’) } else Qualitative bisimilarity … greatest fixpoint of F
Quantitative Bisimilarity d: Q2! [0,1] … pseudo-metric ("distance") F … function on pseudo-metrics F(d)(q,q') = 1 if q and q' disagree on observations ¼ max of supl infr d(d(q,l,r),d(q',l,r)) supr infl d(d(q,l,r),d(q',l,r)) else Quantitative bisimilarity … greatest fixpoint of F Natural generalization of bisimilarity from binary relations to pseudo-metrics.
A Probabilistic Systems Theory Two states agree on the values of all quantitative fixpoint formulas iff their quantitative bisimilarity distance is 0. MARKOV DECISION PROCESSES Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values quantitative bisimilarity quantitative fixpoint calculus
Great BUT … 1 The theory is too precise. Even the smallest change in the probability of a transition can cause an arbitrarily large change in the value of a property. 2 The theory is not computational. We cannot bound the rate of convergence for quantitative fixpoint formulas.
Solution: Discounting Economics: A dollar today is better than a dollar tomorrow. Value of $1.- today: 1 Tomorrow: a for discount factor 0 < a < 1 Day after tomorrow: a2 etc.
Solution: Discounting Economics: A dollar today is better than a dollar tomorrow. Value of $1.- today: 1 Tomorrow: a for discount factor 0 < a < 1 Day after tomorrow: a2 etc. Engineering: A bug today is worse than a bug tomorrow.
Discounted Reachability Reward (ac) =ak if c is first true after k transitions 0 if c is never true The reward is proportional to how quickly c is satisfied.
Discounted Property ac 1 a a b c a2 ac
Discounted Property ac 1 a a b c a2 ac Discounted fixpoint calculus:pre(f) a ¢ pre(f)
Fully Quantitative Models Trace: sequence of observations Property p: assigns a reward to each trace real reward Model m: generates a set of traces (game) graph Value(p,m): defined from the rewards of the generated traces sup or inf (sup inf) [0,1] µ R
Discounted Bisimilarity d: Q2! [0,1] … pseudo-metric ("distance") F … function on pseudo-metrics F(d)(q,q') = 1 if q and q' disagree on observations ¼ max of supl infr d(d(q,l,r),d(q',l,r)) supr infl d(d(q,l,r),d(q',l,r)) else Quantitative bisimilarity … greatest fixpoint of F a ¢
A Discounted Systems Theory discounted w-regular properties Class of winning rewards p over traces STOCHASTIC GAMES Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values discounted bisimilarity discounted fixpoint calculus
A Discounted Systems Theory discounted w-regular properties max expected reward aR achievable by left player Class of expected rewards p over traces STOCHASTIC GAMES Algorithm for computing Value(p,m) over models m Every discounted deterministic fixpoint formula f computes Value(p,m), where p is the linear interpretation of f. discounted fixpoint calculus (m X) (R Ça ¢ lpre(X))
A Discounted Systems Theory The difference between two states in the values of discounted fixpoint formulas is bounded by their discounted bisimilarity distance. STOCHASTIC GAMES Algorithm for computing Value(p,m) over models m Distance between models w.r.t. property values discounted bisimilarity discounted fixpoint calculus