Statistical Model-Checking of Black-Box Systems

Statistical Model-Checking of “Black-Box” Probabilistic Systems
VESTA
Koushik Sen, Mahesh Viswanathan, Gul Agha
University of Illinois Urbana-Champaign

Motivation
• Simulation of probabilistic systems is used for performance evaluation and reliability analysis
• Can we use the traces obtained from simulation for formal verification?
• Statistical model-checking

Assumptions for “black-box” probabilistic systems
• Stochastic discrete event system
  • Paths are of the form $s_0 \xrightarrow{t_0} s_1 \xrightarrow{t_1} \cdots$
  • Labeling function $L : S \to 2^{AP}$
  • Probability measure $\mu$ on the set of paths with a common prefix is unknown
• Each state has a unique identifier
  • Not required if properties are without nested probabilistic operators
• We have no control over the execution of the system
• Samples can be generated through discrete event simulation
• Time domain may be continuous or discrete
• Example:
  • Systems having an underlying continuous-time Markov chain (CTMC) model
  • Systems having an underlying discrete-time Markov chain (DTMC) model

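Properties in CSL sub-logic
• $\phi ::= \mathit{true} \mid a \mid \phi \wedge \phi \mid \neg\phi \mid P_{\bowtie p}(\psi)$
• $\psi ::= \phi\, U^{<t} \phi \mid X \phi$, where $\bowtie\; \in \{<, >, \geq, \leq\}$
• $P_{<0.5}(\Diamond^{<10}\, \mathit{full})$, with $\Diamond^{<t} \phi$ read as $\mathit{true}\; U^{<t} \phi$
  • Probability that the queue becomes full within 10 units of time is less than 0.5
• $P_{>0.98}(\neg \mathit{retransmit}\; U^{<200}\; \mathit{receive})$
  • Probability that a message is received successfully within 200 time units, without any need for retransmission, is greater than 0.98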

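Statistical Approaches (Younes et al. 02, 04)
[Diagram: the model drives a Monte-Carlo simulator coupled to the model-checker; given the property, the model-checker answers Yes or No, with error bounds $\alpha$, $\beta$.]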

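Our Approach
• Sample generation is decoupled from the tool
  • Run the implementation to generate samples, or
  • Get samples from Monte-Carlo simulation of the model
[Diagram: samples from the model or implementation, together with the property, go to the model-checker, which answers “Yes” with error $\alpha$, “No” with error $\alpha$, or “Don’t know”.]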

Statistical Model Checking
• Given a model $M$, a set of samples $S$ (generated from $M$), and a property $\phi$
• $A(S, s_0, \phi) \in \{$“yes” with error $\alpha$, “no” with error $\alpha$, “don’t know”$\}$
• $A(S, s_0, \phi)$ = “yes” with error $\alpha$ $\Rightarrow$ $\alpha = \Pr[A(S, s_0, \phi) = \text{“yes”} \mid M, s_0 \not\models \phi]$
• $A(S, s_0, \phi)$ = “no” with error $\alpha$ $\Rightarrow$ $\alpha = \Pr[A(S, s_0, \phi) = \text{“no”} \mid M, s_0 \models \phi]$
• The smaller the error (also called the p-value), the better the confidence

Model-Checking Overview
• Check satisfaction of a formula
  • Check satisfaction of its sub-formulas
  • Use the results to check satisfaction of the formula
• $\phi_1 \wedge \phi_2$ is satisfied at $s$ iff (easy)
  • $\phi_1$ is satisfied at $s$
  • $\phi_2$ is satisfied at $s$
• $\phi_1\, U^{<t} \phi_2$ is satisfied on a path $s_1 s_2 \ldots$ iff there is an $i$ such that (easy)
  • at $s_i$, $\phi_2$ is satisfied
  • at $s_j$ (for all $j < i$), $\phi_1$ is satisfied
  • $\mathit{time}(s_i) - \mathit{time}(s_1) < t$
• $P_{<p}(\psi)$ is satisfied at $s$ iff (how??)
  • the probability that a path from $s$ satisfies $\psi$ is less than $p$
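The path-level check for bounded until can be sketched in Python. The path representation as (entry time, labels) pairs, the function name, and the choice to treat a sample that ends before a witness is found as not satisfying the formula are all assumptions made for illustration; they are not taken from VeStA.

```python
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical path representation: (entry time, atomic propositions true in the state).
State = Tuple[float, FrozenSet[str]]


def holds_bounded_until(path: List[State],
                        phi1: Callable[[FrozenSet[str]], bool],
                        phi2: Callable[[FrozenSet[str]], bool],
                        t: float) -> bool:
    """Return True iff phi1 U^{<t} phi2 holds on the finite sample path."""
    if not path:
        return False
    start = path[0][0]
    for time, labels in path:
        if time - start >= t:
            return False          # time bound exceeded before phi2 held
        if phi2(labels):
            return True           # phi2 at s_i; phi1 held at every earlier state
        if not phi1(labels):
            return False          # phi1 broken before phi2 was reached
    return False                  # assumption: sample ended without a witness


path = [(0.0, frozenset({"p"})), (4.5, frozenset({"p"})), (9.0, frozenset({"q"}))]
print(holds_bounded_until(path, lambda l: "p" in l, lambda l: "q" in l, 12.0))
```

The single pass mirrors the three conditions on the slide: the first state satisfying the right-hand formula is the witness, every earlier state must satisfy the left-hand formula, and the witness must be entered before the time bound.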

Checking $P_{<0.6}(p\, U^{<12} q)$ statistically at $s$
• Sample contains, say, 30 paths from $s$
• On 21 paths, $p\, U^{<12} q$ is satisfied
• $21/30 > 0.6$
• Can we say that $P_{<0.6}(p\, U^{<12} q)$ is violated at $s$?
• Statistically, yes, provided we quantify the error in our decision
• error $= \alpha = \Pr[\text{on 21 (or more) out of 30 paths } p\, U^{<12} q \text{ holds} \mid \text{probability that } p\, U^{<12} q \text{ holds on a path is less than 0.6}] \leq \Pr[X \geq 21]$, where $X \sim \mathrm{Binomial}(30, 0.6)$
[Figure: sample paths from $s$, each checked against $p\, U^{<12} q$.]
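The bound on the error in this example is a binomial upper tail and can be evaluated directly. The sketch below uses only the Python standard library; the function name is an illustrative choice.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Pr[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Numbers from the slide: 21 of 30 sampled paths satisfy the path formula,
# and the threshold in the property is 0.6.
alpha_bound = binomial_upper_tail(30, 21, 0.6)
print(f"error <= {alpha_bound:.4f}")
```

With only 30 paths the bound is loose. It shrinks as the sample grows, provided the observed ratio stays away from the threshold.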

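Error (p-value)
• Let $r$ = (# of paths on which $p\, U^{<12} q$ holds) / (# of total paths)
• Let $p = \Pr[p\, U^{<12} q \text{ holds on a path}]$
• “no” answer (formula is violated): error $= \Pr[r \geq 21/30 \mid p \leq 0.6]$
• “yes” answer (formula holds): error $= \Pr[r \leq 10/30 \mid p \geq 0.6]$
[Figure: two number lines from 0.0 to 1.0 marking the threshold 0.6, the true probability $p$, and the observed ratio $r$ (21/30 in the first, 10/30 in the second).]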
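Both p-values on this slide can be computed with one helper. The decision rule below (compare the observed ratio with the threshold, then report the matching binomial tail evaluated at the threshold) is an assumption made for this sketch; the slides give the error expressions but not the exact rule the tool applies.

```python
from math import comb
from typing import Tuple


def tail(n: int, lo: int, hi: int, p: float) -> float:
    """Pr[lo <= X <= hi] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(lo, hi + 1))


def check_prob_less_than(successes: int, n: int, p: float) -> Tuple[str, float]:
    """Decide P_{<p}(psi) from `successes` satisfying paths out of `n`.

    Assumed rule (not from the slides): compare the observed ratio with p and
    report the matching binomial tail at the boundary value p as the error.
    """
    if successes / n >= p:
        return "no", tail(n, successes, n, p)    # Pr[r >= observed | prob = p]
    return "yes", tail(n, 0, successes, p)       # Pr[r <= observed | prob = p]


print(check_prob_less_than(21, 30, 0.6))  # upper-tail case from the slide
print(check_prob_less_than(10, 30, 0.6))  # lower-tail case from the slide
```

Evaluating the tail at the boundary value is the worst case for each answer, because the upper tail grows and the lower tail shrinks as the true probability increases.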

Nested: checking $P_{<0.6}(\phi_1\, U^{<12} \phi_2)$ at $s$
• $\phi_1$ and $\phi_2$ contain nested probabilistic operators
• Checking $\phi_1\, U^{<12} \phi_2$ over a path
  • Answers are not simply “yes” or “no”
  • Answers can be
    • “yes” with error $\alpha$
    • “no” with error $\alpha$
    • “don’t know”
• Need a modified decision procedure
  • Handle “don’t know” to get useful answers
  • Incorporate error of decision for sub-formulas

Checking $P_{<0.6}(\phi_1\, U^{<12} \phi_2)$ at $s$ (Problem)
Solution
• Resolve “don’t know” (?) in adversarial fashion
  • Observation region
• Create an “uncertainty region” to incorporate the error associated with sub-formulas
[Figure: sample paths from $s$ checked against $\phi_1\, U^{<12} \phi_2$, some marked “?”.]

To check $P_{<0.6}(\phi_1\, U^{<12} \phi_2)$ at $s$
• Need to check if (# of “yes” paths) / (# of total paths) < 0.6
• Let # of “yes” paths = 20, # of “no” paths = 7, # of “don’t know” paths = 3
• # of “yes” paths lies between
  • 20: resolve all “don’t know” paths as “no” paths
  • 23: resolve all “don’t know” paths as “yes” paths
• Create an uncertainty region $[0.6 - \delta_1, 0.6 + \delta_2]$
  • $\delta_1$ and $\delta_2$ depend on the error of the decisions along all the sample paths
• Check if $[20/30, 23/30]$ falls outside $[0.6 - \delta_1, 0.6 + \delta_2]$
[Figure: number line from 0.0 to 1.0 marking 0.6, the uncertainty region around it, and the observation region from 20/30 to 23/30.]
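The comparison between the two regions can be sketched as follows. The function names and the widths used for the uncertainty region are placeholders chosen for illustration; how the widths are derived from the sub-formula errors is not reproduced here.

```python
from typing import Tuple


def observation_region(yes: int, unknown: int, total: int) -> Tuple[float, float]:
    """Range of the 'yes' ratio when every "don't know" path is resolved
    adversarially: all as "no" (lower end) or all as "yes" (upper end)."""
    return yes / total, (yes + unknown) / total


def compare_regions(obs: Tuple[float, float], p: float,
                    delta1: float, delta2: float) -> str:
    """Locate the observation region relative to [p - delta1, p + delta2]."""
    low, high = obs
    if high < p - delta1:
        return "below"      # candidate "yes" for a P_{<p} property
    if low > p + delta2:
        return "above"      # candidate "no" for a P_{<p} property
    return "overlap"        # "don't know"


obs = observation_region(yes=20, unknown=3, total=30)
# delta1 = delta2 = 0.05 are placeholder widths, not values from the slides.
print(obs, compare_regions(obs, 0.6, 0.05, 0.05))
```

Only when the whole observation region clears the uncertainty region does the procedure commit to an answer; any overlap yields “don’t know”.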

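Case 1: “yes” answer
[Figure: number line from 0.0 to 1.0 marking 0.6 and the uncertainty region $[0.6 - \delta_1, 0.6 + \delta_2]$; the observed ratio $r$ lies below the region, and the error estimate is shown relative to the true probability $p$.]

Case 2: “no” answer
[Figure: the same number line; the observed ratio $r$ lies above the uncertainty region, and the error estimate is shown relative to the true probability $p$.]

Case 3: “don’t know” answer
[Figure: the same number line; the observation region overlaps the uncertainty region, so no error is reported.]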

From nested error to uncertainty region
• Random variable $X = 1$ if $\pi \models \psi$ and 0 otherwise
• Let random variable $Z = 1$ if $A(S, \pi, \psi)$ = “yes” with error $\alpha'$, and 0 if $A(S, \pi, \psi)$ = “no” with error $\alpha'$
• $X \sim \mathrm{Bernoulli}(p')$ (say)
• $Z \sim \mathrm{Bernoulli}(p'')$ (say)
  • We get samples from this distribution
  • Can estimate $p''$
• However, to verify $P_{\geq p}(\psi)$
  • check whether $p' \geq p$ or not
• Relate $p'$ and $p''$
  • $p' - \alpha' p' \leq p'' \leq p' + (1 - p')\alpha'$
  • $p' - \delta_1 \leq p'' \leq p' + \delta_2$ [uncertainty region]
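The inequality relating the two probabilities can be spelled out in one step. This is a reconstruction, assuming that the error $\alpha'$ bounds both $\Pr[Z = 1 \mid X = 0]$ and $\Pr[Z = 0 \mid X = 1]$ and that “don’t know” outcomes are set aside:

```latex
\begin{align*}
p'' &= \Pr[Z = 1 \mid X = 1]\, p' + \Pr[Z = 1 \mid X = 0]\, (1 - p') \\
p'' &\geq (1 - \alpha')\, p' = p' - \alpha' p' \\
p'' &\leq p' + \alpha' (1 - p')
\end{align*}
```

The lower bound drops the second term and uses $\Pr[Z = 1 \mid X = 1] \geq 1 - \alpha'$; the upper bound uses $\Pr[Z = 1 \mid X = 1] \leq 1$ and $\Pr[Z = 1 \mid X = 0] \leq \alpha'$. Under this reading the lower and upper offsets are $\alpha' p'$ and $\alpha' (1 - p')$, each at most $\alpha'$.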

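Conjunction $A(S, s, \phi_1 \wedge \phi_2)$
• Let $A(S, s, \phi_1) = x_1$ with error $\alpha_1$
• and $A(S, s, \phi_2) = x_2$ with error $\alpha_2$
• where $x_i \in \{$“yes”, “no”, “don’t know”$\}$
• If $x_1$ = “yes” and $x_2$ = “yes” then $A(S, s, \phi_1 \wedge \phi_2)$ = “yes” with error $\max(\alpha_1, \alpha_2)$
• If $x_1$ = “no” or $x_2$ = “no” then $A(S, s, \phi_1 \wedge \phi_2)$ = “no” with error $\alpha_1 + \alpha_2 - \alpha_1 \alpha_2$
• Else “don’t know”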
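The combination rule for conjunction translates directly into a small function. The tuple representation of an answer is an assumption made for this sketch, as is the convention that a “don’t know” answer carries error 0.0; the slide does not say how a missing error enters the formula.

```python
from typing import Tuple

# (verdict, error); assumption: "don't know" is passed with error 0.0.
Answer = Tuple[str, float]


def conjunction(a1: Answer, a2: Answer) -> Answer:
    """Combine A(S, s, phi1) and A(S, s, phi2) into A(S, s, phi1 and phi2)."""
    (x1, e1), (x2, e2) = a1, a2
    if x1 == "yes" and x2 == "yes":
        return "yes", max(e1, e2)
    if x1 == "no" or x2 == "no":
        return "no", e1 + e2 - e1 * e2
    return "don't know", 0.0


print(conjunction(("yes", 1e-4), ("yes", 3e-4)))
print(conjunction(("no", 1e-4), ("yes", 3e-4)))
print(conjunction(("yes", 1e-4), ("don't know", 0.0)))
```

With the 0.0 convention, a “no” paired with a “don’t know” simply inherits the error of the “no” answer.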

Evaluation
• Implementation: VeStA
  • http://osl.cs.uiuc.edu/~ksen/vesta/
• Tandem Queuing Network
• Cyclic Polling System
• Grid World Example
• Answers matched the numerical model-checker
• Error ($\alpha$) of the order $10^{-8}$ in all of our experiments
  • Very high confidence in our results
• Disadvantage: space requirement is high
  • Required to store all samples before model-checking

Future Work
• Use machine learning to get rid of state identifiers
  • Possible for CTMC models [Sen et al. QEST '04]
• State identifiers are not required if there is no nested probabilistic operator
  • In practice most interesting properties are without nested probabilistic operators
• Verify probabilistic properties of various network protocols
  • Earlier intractable due to large state space
