This talk explores a formalism for learning stochastic systems, such as Markov chains, from observations, using learning algorithms from AI. It discusses learning algorithms, error analysis, and the use of logics to specify global behavior. The goal is to provide more meaningful bounds for the learnt model, focusing on global behaviors rather than local transition probabilities.
Towards Trusted Machine Learning? Learning Stochastic Systems with Global PAC Bounds
Hugo Bazille (INRIA/IRISA), Blaise Genest (CNRS/IRISA), Cyrille Jégourel (SUTD, Singapore), Sun Jun (SMU, Singapore)
Formalism
A stochastic system is observed; a learning algorithm from AI (which converges towards the Markov system at infinity) takes the observations, resetting after finite time, and produces a stochastic model (e.g. a Markov chain) together with PAC bounds (as in statistical model checking): with probability 97%, error < 3%.
Information on the system:
- Set of states: known in this talk
- Support of transitions: depends
- Transition probabilities: never
Observation W: sequence of states observed, e.g. s1 s2 s3 s1 s3 s5
Learning algorithms:
- Frequency estimator: Estimated_Proba(s1,s2) = #occurrences of (s1 s2) in W / #occurrences of s1 in W
- ε-Laplace smoothing: Estimated_Proba(s1,s2) = (#occurrences of (s1 s2) in W + ε) / (#occurrences of s1 in W + ε · #possible successors of s1)
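A minimal sketch of both estimators in Python (my own illustration, not code from the talk); for simplicity it smooths over all states rather than only the known support:

```python
from collections import Counter

def learn_markov_chain(traces, states, epsilon=0.0):
    """Estimate transition probabilities from observed traces W.

    epsilon = 0.0 -> plain frequency estimator
    epsilon > 0.0 -> epsilon-Laplace smoothing: every transition
                     keeps probability > 0 (here smoothed over all
                     states for simplicity, not only the support).
    """
    pair_counts, visit_counts = Counter(), Counter()
    for trace in traces:
        for s, t in zip(trace, trace[1:]):
            pair_counts[(s, t)] += 1
            visit_counts[s] += 1

    proba = {}
    for s in states:
        denom = visit_counts[s] + epsilon * len(states)
        if denom > 0:
            for t in states:
                proba[(s, t)] = (pair_counts[(s, t)] + epsilon) / denom
    return proba

# Observation W from the slide: the single trace s1 s2 s3 s1 s3 s5.
W = [["s1", "s2", "s3", "s1", "s3", "s5"]]
S = ["s1", "s2", "s3", "s5"]
print(learn_markov_chain(W, S))               # frequency estimator
print(learn_markov_chain(W, S, epsilon=0.5))  # epsilon-Laplace smoothing
```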
State-of-the-art error analysis in AI: bounds on transition probabilities,
P(|Estimated_Proba(s,t) − Prob(s,t)| < 3%) > 97%,
using statistical Chernoff bounds, given enough observations of the transitions.
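As an illustration of the scale involved, a two-sided Hoeffding/Chernoff bound (one standard form; not necessarily the exact bound used in the talk) gives the number of observations of a state needed for a 3% error at 97% confidence:

```python
import math

def samples_needed(eps, delta):
    """Hoeffding bound: P(|p_hat - p| >= eps) <= 2*exp(-2*n*eps^2),
    so n >= ln(2/delta) / (2*eps^2) observations of state s suffice."""
    return math.ceil(math.log(2 / delta) / (2 * eps * eps))

# "with probability 97%, error < 3%" from the slides:
print(samples_needed(0.03, 0.03))  # ~2334 observations of s
```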
Goals
- Understand learning algorithms from AI: e.g., where is ε-Laplace smoothing useful? Which estimator to use?
- Provide more meaningful bounds for the learnt model: global bounds on behaviors of the model rather than local transition probabilities.
How? Use logics (LTL, CTL, …) to specify global behavior: the probabilities to fulfil a logical formula in the model and in the system must be close.
One fixed « time before failure » formula X(¬s0 U sF): the probability to see a fault before seeing the initial state again.
⇒ Frequency estimator, giving the Markov chain A_W.
Reset strategy: reset when we reach sF or s0.
Result 1:
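A sketch of the reset strategy for this fixed property, assuming the system can be sampled step by step; the 3-state `step` function below is a hypothetical stand-in for a real system:

```python
import random

def estimate_time_before_failure(step, s0, sF, runs=10_000, rng=random):
    """Frequency estimator for X(not s0 U sF): start in s0, take one
    step, then run until we see sF (failure) or return to s0, then
    reset. The estimate is the fraction of runs that hit sF first."""
    failures = 0
    for _ in range(runs):
        s = step(s0, rng)           # the first step, for the X(...)
        while s not in (s0, sF):    # reset condition: sF or s0 reached
            s = step(s, rng)
        failures += (s == sF)
    return failures / runs

# Hypothetical system: from s0 go to "mid"; from "mid", fail with
# probability 0.1, otherwise return to s0.
def step(s, rng):
    if s == "s0":
        return "mid"
    return "sF" if rng.random() < 0.1 else "s0"

print(estimate_time_before_failure(step, "s0", "sF"))  # ~0.1
```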
What about learning a Markov chain good for all properties?
⇒ Find A_W and ε such that, uniformly over all formulas φ, |P_A(φ) − P_{A_W}(φ)| ≤ ε.
Result 2:
- All properties of LTL? Not possible (arbitrarily nested formulas ask for arbitrarily high precision). [Daca et al.'16]: possible for depth-k formulas, but O(exp(k)).
- All properties of PCTL? Not possible (PCTL cannot handle any statistical error).
- All properties of CTL? Possible, and not so costly!
Global PAC bounds for CTL
Error wrt Probability(Xφ | φUψ | Gφ), for all CTL formulae φ, ψ.
Observations W = {« u s v s »}: reset when the same state is seen twice.
We need to know the support of transitions (not the case for the fixed time-before-failure property).
⇒ We use Laplace smoothing to ensure the probabilities are all > 0.
Conditioning of the Markov chain
R(2) = R(3) = {1} and R(1) = {}.
P₁(Leave ≤ m) = 1 − (1 − 2t)^m ≈ 2mt
⇒ Cond(M) ≈ 2mt
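A quick numeric check of the first-order approximation (the values of t and m are arbitrary illustrations):

```python
# P1(Leave <= m) = 1 - (1 - 2t)^m: probability of leaving state 1
# within m steps when each step leaves with probability 2t.
t, m = 0.001, 20
exact = 1 - (1 - 2 * t) ** m
approx = 2 * m * t  # first-order expansion, valid while 2*m*t << 1
print(exact, approx)  # 0.0392... vs 0.04
```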
Error bound: γ wrt Probability(Xφ | φUψ | Gφ), for all CTL formulae φ, ψ.
… true thanks to Laplace smoothing.
Conclusion
- Understand learning algorithms from AI: the frequency estimator is enough for one fixed property; ε-Laplace smoothing keeps transitions > 0, is useful for providing bounds for CTL, and gives a rationale to fix ε.
- Provide global bounds on behaviors of the model learnt, for CTL. Not possible for PCTL or unbounded LTL.
Local vs global probabilities
Global property φ: reach state 3. From state 1: P_A(φ) = 1/2.
When learning A_W, small statistical errors are possible: a local error ε on a transition probability. From state 1, φ then has probability (t − ε)/2t.
The global error depends on the conditioning t of the system: e.g., for t = 2ε, P_{A_W}(φ) = 1/4.
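The arithmetic of this example, spelled out (ε here is an arbitrary illustrative value; t = 2ε follows the slide):

```python
# True system: from state 1, reach state 3 with probability 1/2.
# Learnt system: a local error eps on a transition of weight t
# shifts the global probability to (t - eps) / (2 * t).
def global_proba(t, eps):
    return (t - eps) / (2 * t)

eps = 0.01
t = 2 * eps                  # the slide's case: t = 2*eps
print(global_proba(t, 0.0))  # 0.5  -> P_A(phi)  = 1/2 (no error)
print(global_proba(t, eps))  # 0.25 -> P_AW(phi) = 1/4: a 1% local
                             # error became a 25% global error
```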
PCTL cannot handle statistical error
PCTL property φ: (s2 ⇒ Proba(X sF) ≥ 1/2).
P_A(φ) = 1 in the original system A. If we learn A_W from observations W, we can make a small statistical error ε (learning 1/2 − ε instead of 1/2) and obtain P_{A_W}(φ) = 0.
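A tiny sketch of why the threshold is brittle: an arbitrarily small ε flips the truth value of the PCTL property:

```python
# PCTL property phi: (s2 => Proba(X sF) >= 1/2).
def satisfies_phi(p_next_fail):
    return p_next_fail >= 0.5

eps = 1e-6
print(satisfies_phi(0.5))        # True:  P_A(phi)  = 1 in the system
print(satisfies_phi(0.5 - eps))  # False: P_AW(phi) = 0 in the learnt model
```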
Global PAC bounds for CTL (continued)
We need to know the support of transitions (not the case for the fixed time-before-failure property): without it, we may learn a chain that misses a transition of very small probability t in the system.
⇒ We use Laplace smoothing to ensure the probabilities are all > 0.