Markov Models and Applications Henrik Schiøler, Hans-Peter Schwefel • Mm1 Discrete time Markov processes • Mm2 Continuous time Markov processes • Mm3 M/M/1 type models • Mm4 Advanced queueing models • Mm5 Hidden Markov Models and their application (hps) Note: the slide-set will be complemented by formulas, mathematical derivations, and examples on the blackboard! hps@kom.auc.dk http://www.kom.auc.dk/~hps
Motivation: Stochastic models • Goals: • Quantitative analysis of (communication) systems • E.g., Quality of Service • Enhanced algorithms for information processing • 'Extrapolation', error concealment, localisation, fault detection, etc. • Stochastic impact • Error models • Randomization in transmission protocols • Abstraction of complex systems using statistics • Human impact (e.g. traffic, mobility models) • Frequent use of stochastic models • Simulation models → stochastic simulation • Analytic models, e.g. Markovian type, stochastic Petri nets
Content • Intro • Revision: Discrete Time Markov Processes • Definition, basic properties • State-probabilities, steady-state analysis • Parameter Estimation, Example: Mobility Model • Hidden Markov Models • Definition & Example • Efficient computation of Pr(observation) • Most likely state sequence • Parameter Estimation • Application Examples of HMMs • Link error models • Mobility models, positioning • Fault-detection • Error concealment • Summary & Exercises
Discrete Time Markov Processes • Definition • State-space: finite or countably infinite, w.l.o.g. E={1,2,...,N} (N=∞ also allowed) • Transition probabilities: p_jk = Pr(transition from state j to state k) • X_i = RV indicating the state of the Markov process in step i • 'Markov property': state in step i only depends on state in step i-1 • Pr(X_i=s | X_{i-1}=s_{i-1}, X_{i-2}=s_{i-2}, ..., X_0=s_0) = Pr(X_i=s | X_{i-1}=s_{i-1}) • Computation of state probabilities • Initial state probabilities (step 0): π_0 • Probability of state-sequence s_0, s_1, ..., s_i: Pr(X_0=s_0, X_1=s_1, ..., X_i=s_i) = ... • Pr(X_i=k) = Σ_j [Pr(X_{i-1}=j) · p_jk] • π_i = π_{i-1} P • State-holding time: geometric with parameter p_ii • Parameter estimation for 'observable' discrete time Markov chains (see the sketch below) • Example: 2-state Markov chain (state = link behavior at packet transmission {erroneous, ideal}) • Parameter estimation, Markov property validation, limitations
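A minimal Matlab sketch of parameter estimation for an observable discrete-time Markov chain, assuming a recorded state sequence s with values in {1,...,N} (the function name estimateP and its interface are illustrative, not from the slides; the row-wise division requires Matlab R2016b or later):

% Estimate transition probabilities by counting observed transitions.
function P = estimateP(s, N)
    C = zeros(N, N);                 % C(j,k) = number of observed j->k transitions
    for i = 2:length(s)
        C(s(i-1), s(i)) = C(s(i-1), s(i)) + 1;
    end
    P = C ./ sum(C, 2);              % row-normalize: p_jk = C(j,k) / sum_k C(j,k)
end

Rows of C with no observed transitions yield NaN entries and need separate treatment (e.g., a uniform row).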
Discrete Time Markov Processes (cntd.) • Properties • Homogeneity: P independent of step i • Irreducibility: each state is reachable from any other state (in potentially multiple steps) • Transient states, positive recurrent states • Periodicity • Steady-state probabilities (a numerical sketch follows below) • π = lim_{i→∞} π_i • Limit exists and is independent of π_0 if the Markov chain is irreducible and aperiodic • Aperiodic & positive recurrent = ergodic ⇒ π is a probability distribution • Examples (periodicity, ergodicity, steady-state probabilities, absorbing states) • Application example: mobility model – set-up, benefits, problems
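A minimal Matlab sketch for computing the steady-state distribution numerically, assuming the chain is ergodic (the function name steadyState is illustrative):

% Solve pi = pi*P together with sum(pi) = 1 as one linear system.
function piv = steadyState(P)
    N = size(P, 1);
    A = [P' - eye(N); ones(1, N)];   % stationarity equations plus normalization
    b = [zeros(N, 1); 1];
    piv = (A \ b)';                  % least-squares solve; exact for a consistent system
end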
Content • Intro • Revision: Discrete Time Markov Processes • Definition, basic properties • State-probabilities, steady-state analysis • Parameter Estimation, Example: Mobility Model • Hidden Markov Models • Definition & Example • Efficient computation of Pr(observation) • Most likely state sequence • Parameter Estimation • Application Examples of HMMs • Link error models • Mobility models, positioning • Fault-detection • Error concealment • Summary & Exercises
Hidden Markov Models (HMMs): Definition • Main property • In each state s ∈ E, an 'observation symbol' from some alphabet V is generated probabilistically • The underlying state cannot be observed, only the sequence O=[O_1, O_2, ..., O_T] of generated symbols • HMM = <E, V, π_1, P, B> • E: state-space (discrete, finite/infinite), w.l.o.g. E={1,2,...,N} • V: set of possible observation symbols (discrete for now), w.l.o.g. V={1,2,...,M} • π_1: initial state probabilities at step 1 • P: N×N matrix of state transition probabilities p_ij = Pr(X_{k+1}=j | X_k=i) • B: N×M matrix of symbol generation probabilities: b_ij = Pr(O_k=j | X_k=i) • Example: 2-state HMM, observations = result from biased coin-toss • Note: the discrete time Markov model is a special case of the HMM, namely each column of B contains at most one non-zero element • Exercise: write a (Matlab) program with input (π_1, P, B, T) that generates a sequence of observations of length T (one possible sketch below)
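One possible sketch for the generator exercise, sampling each step by inverse-transform sampling on a probability row (the function name generateHMM is illustrative; pi1 is assumed to be a 1×N row vector):

% Generate T observations (and hidden states) from HMM <pi1, P, B>.
function [o, q] = generateHMM(pi1, P, B, T)
    o = zeros(1, T);  q = zeros(1, T);
    draw = @(p) find(rand <= cumsum(p), 1);   % sample an index from distribution p
    q(1) = draw(pi1);                         % initial state from pi1
    o(1) = draw(B(q(1), :));                  % symbol from row q(1) of B
    for t = 2:T
        q(t) = draw(P(q(t-1), :));            % next state from row q(t-1) of P
        o(t) = draw(B(q(t), :));
    end
end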
Hidden Markov Models (HMMs): Computations • Problem 1: compute the probability of observing a certain sequence o=[o_1,...,o_T] in a given HMM. • First (inefficient) approach ('brute force'): • Generate all possible state-sequences of length T: q=[q_1,...,q_T] • Sum up all Pr(o | q) weighted by Pr(q) (total probability) • Problem: the number of paths grows exponentially as N^T • More efficient (quadratic in N) approach: forward procedure (see the sketch below) • Iterative method computing probabilities for prefixes of the observation sequence: α_t := [Pr(O_1=o_1,...,O_t=o_t, X_t=1), ..., Pr(O_1=o_1,...,O_t=o_t, X_t=N)] • At step t=1: α_1(i) = Pr(O_1=o_1, X_1=i) = π_1(i) b_{i,o_1} [Matlab notation: alpha_1 = pi_1 .* B(:, o_1)'] • α_t → α_{t+1} (t=1,2,...,T-1): α_{t+1}(i) = (Σ_{j∈E} α_t(j) p_ji) · Pr(O_{t+1}=o_{t+1} | X_{t+1}=i); alpha_{t+1} = (alpha_t * P) .* B(:, o_{t+1})' • Finally: Pr(O=o) = Σ_{j∈E} α_T(j) • Computation can be illustrated in a Trellis structure • Similarly (and identifiers needed later): backward procedure • β_t := [Pr(O_{t+1}=o_{t+1},...,O_T=o_T | X_t=1), ..., Pr(O_{t+1}=o_{t+1},...,O_T=o_T | X_t=N)] • β_T = 1 (vector with all elements = 1); beta_t = (P * (B(:, o_{t+1}) .* beta_{t+1}'))'
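A minimal Matlab sketch of the forward procedure in the row-vector notation used above (the function name forwardProb is illustrative; for long sequences a scaled version is needed to avoid numerical underflow):

% Forward procedure: Pr(O=o) for HMM <pi1, P, B>, o a vector of symbol indices.
function p = forwardProb(pi1, P, B, o)
    alpha = pi1 .* B(:, o(1))';               % alpha_1 (pi1 assumed 1xN row vector)
    for t = 1:length(o)-1
        alpha = (alpha * P) .* B(:, o(t+1))'; % alpha_{t+1} from alpha_t
    end
    p = sum(alpha);                           % Pr(O=o) = sum_j alpha_T(j)
end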
HMMs: Computations (cntd.) Problem 2: Find the 'most likely' state sequence for an observation o=[o_1,...,o_T] in a given HMM. • I.e. find the sequence q*=[q_1*,...,q_T*] that maximizes Pr(X_1=q_1,...,X_T=q_T | O=o) (or, equivalently, the joint probability) • Optimization via prefix of length t (Viterbi algorithm): δ_t := [max_{q_1,...,q_{t-1}} {Pr(X_1=q_1,...,X_{t-1}=q_{t-1}, X_t=1, O_1=o_1,...,O_t=o_t)}, ..., max_{q_1,...,q_{t-1}} {Pr(X_1=q_1,...,X_{t-1}=q_{t-1}, X_t=N, O_1=o_1,...,O_t=o_t)}] • Algorithm (a sketch follows below) • δ_1 = π_1 .* B(:, o_1)' • δ_{t+1}(j) = [max_{i=1,...,N} δ_t(i) p_ij] b_{j,o_{t+1}}, ψ_{t+1}(j) = argmax_{i=1,...,N} δ_t(i) p_ij, t=1,2,...,T-1 • Maximum of probability: p* = max_{i=1,...,N} δ_T(i), q_T* = argmax_{i=1,...,N} δ_T(i) • State sequence (backtracking): q_t* = ψ_{t+1}(q_{t+1}*), t=T-1,...,1 • Efficient implementations: use logarithms to replace multiplications by additions (and avoid underflow)
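A minimal Viterbi sketch in the log domain, following the recursion above (the function name viterbiPath is illustrative; implicit expansion requires Matlab R2016b or later):

% Most likely state sequence q for observation o under HMM <pi1, P, B>.
function q = viterbiPath(pi1, P, B, o)
    T = length(o);  N = size(P, 1);
    lP = log(P);  lB = log(B);
    delta = log(pi1(:)') + lB(:, o(1))';            % log delta_1, row vector
    psi = zeros(T, N);                              % backpointers psi_t(j)
    for t = 1:T-1
        [m, psi(t+1, :)] = max(delta' + lP, [], 1); % best predecessor per state j
        delta = m + lB(:, o(t+1))';                 % log delta_{t+1}
    end
    q = zeros(1, T);
    [~, q(T)] = max(delta);                         % q_T* = argmax delta_T
    for t = T-1:-1:1
        q(t) = psi(t+1, q(t+1));                    % backtracking
    end
end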
HMMs: Computations (cntd.) Problem 3: Find the 'most likely' HMM model for an observation o=[o_1,...,o_T]. • Assumption: state-space E and symbol alphabet V are given • Hence, desired is <π_1*, P*, B*> such that Pr_{<π_1, P, B>}(O=o) is maximized • Iterative procedure for maximization (Baum-Welch algorithm): <π_1^(m), P^(m), B^(m)> → <π_1^(m+1), P^(m+1), B^(m+1)> • Compute using model <π_1^(m), P^(m), B^(m)>: • γ_t(i) := Pr(X_t=i | O=o) = α_t(i) β_t(i) / Σ_i [α_t(i) β_t(i)] • ξ_t(i,j) := Pr(X_t=i, X_{t+1}=j | O=o) = α_t(i) p_ij b_{j,o_{t+1}} β_{t+1}(j) / Σ_j Σ_i [α_t(i) p_ij b_{j,o_{t+1}} β_{t+1}(j)] • 'Expectations': • T(i) := Σ_{t=1}^{T-1} γ_t(i) = expected number of transitions from state i in o • T(i,j) := Σ_{t=1}^{T-1} ξ_t(i,j) = expected number of transitions from state i to state j in o • S(i,k) := Σ_{t=1,...,T: o_t=k} γ_t(i) = expected number of times in state i in o and observing symbol k • S(i) := Σ_{t=1,...,T} γ_t(i) = expected number of times in state i in o • Updated HMM: • π_1^(m+1) = [γ_1(1),...,γ_1(N)], p_ij^(m+1) = T(i,j)/T(i), • b_ik^(m+1) = S(i,k)/S(i) • Each update step increases Pr_{<π_1, P, B>}(O=o), but may converge only to a local maximum (a sketch of one update step follows below)
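A minimal sketch of one update step, using an unscaled forward/backward pass (adequate for short sequences; long sequences require scaling; the function name baumWelchStep is illustrative; implicit expansion requires Matlab R2016b or later):

% One Baum-Welch update <pi1,P,B> -> <pi1',P',B'> for observation o.
function [pi1, P, B] = baumWelchStep(pi1, P, B, o)
    [N, M] = size(B);  T = length(o);
    alpha = zeros(T, N);  beta = zeros(T, N);
    alpha(1, :) = pi1 .* B(:, o(1))';
    for t = 1:T-1
        alpha(t+1, :) = (alpha(t, :) * P) .* B(:, o(t+1))';  % forward pass
    end
    beta(T, :) = 1;
    for t = T-1:-1:1
        beta(t, :) = (P * (B(:, o(t+1)) .* beta(t+1, :)'))'; % backward pass
    end
    gamma = alpha .* beta;
    gamma = gamma ./ sum(gamma, 2);          % gamma_t(i) = Pr(X_t=i | O=o)
    Tij = zeros(N, N);                       % expected transition counts T(i,j)
    for t = 1:T-1
        xi = (alpha(t, :)' * (B(:, o(t+1)) .* beta(t+1, :)')') .* P;
        Tij = Tij + xi / sum(xi(:));         % normalized xi_t(i,j)
    end
    pi1 = gamma(1, :);                       % pi_1^(m+1) = gamma_1
    P = Tij ./ sum(Tij, 2);                  % p_ij^(m+1) = T(i,j)/T(i)
    for k = 1:M
        B(:, k) = sum(gamma(o == k, :), 1)'; % S(i,k)
    end
    B = B ./ sum(B, 2);                      % b_ik^(m+1) = S(i,k)/S(i)
end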
Content • Intro • Revision: Discrete Time Markov Processes • Definition, basic properties • State-probabilities, steady-state analysis • Parameter Estimation, Example: Mobility Model • Hidden Markov Models • Definition & Example • Efficient computation of Pr(observation) • Most likely state sequence • Parameter Estimation • Application Examples of HMMs • Link error models • Mobility models, positioning • Fault-detection • Error concealment • Summary & Exercises
HMMs: Application Examples • Link error models • State-space = different levels of link quality, observations V={error, correct} • Equivalent to the 'biased' coin-toss example • Extensions to multiple link-states • Advantage: more general types of burst errors • Mobility models • State-space = product space (classification of user behavior, current coordinates) • Observations = set of discrete positions of the user/device • Positioning • State-space same as mobility model • Observations now e.g. RSSI distributions
HMMs: Application Examples II • Fault-detection (example from a last-semester student project) • State-space = {congested, lightly utilized} × {good wireless link, bad wireless link} • Observations: discrete levels of RTT measurements (per packet) and packet loss events (binary) • Discussion of advantages/disadvantages, comparison to Bayesian networks • Error concealment • E.g. transmission of speech over a noisy/lossy channel • State-space = speaker model • Observations = received symbols, subject to loss/noise
Summary • Intro • Revision: Discrete Time Markov Processes • Definition, basic properties • State-probabilities, steady-state analysis • Parameter Estimation, Example: Mobility Model • Hidden Markov Models • Definition & Example • Efficient computation of Pr(observation) • Most likely state sequence • Parameter Estimation • Application Examples of HMMs • Link error models • Mobility models, positioning • Fault-detection • Error concealment • Summary & Exercises
References • L. Rabiner, B.-H. Juang: 'Fundamentals of Speech Recognition', Prentice Hall, 1993, Sections 6.1–6.4.
Exercises 1 Hidden Markov Models: Given is the following 3-state hidden Markov model with parameters π_1=[0.2,0.3,0.5], P=[0.2,0.4,0.4; 0.5,0.1,0.4; 0.2,0.2,0.6]. The observations are coin-toss results (Heads=1, Tails=2) with B=[0.8,0.2; 0.5,0.5; 0.1,0.9]. • Write a (Matlab) program that generates observation sequences of length T from the given HMM. • Write a program that efficiently computes the probability of a given observation sequence. Run the program for S='HHTHTTTHT'. Compare with a probability estimate obtained via simulation using the program from task (a). • Write a program to determine the most likely state sequence and run it for the sequence in (b).
Exercises 2 Localisation with HMMs: Consider a 5m×5m square room in which 3 access points are placed in the three corners (0,5), (5,5), (5,0). Use a grid with 1m×1m elements to discretize this geographic space. A mobile device is moving through the room, and the access points measure received signal strength, which follows the path-loss model RSSI[dB] = Round(-6 log10(d/d0) + 13 + N), with d0=0.1m. The noise N is assumed to be normally distributed with standard deviation sigma=2. Write Matlab functions to • Compute, for each grid position (i,j), the probabilities of observing an RSSI triplet (R1,R2,R3), Ri=0,...,9. • Determine the MLE of the trajectory of the mobile device for the observation sequence [1,2,1],[2,0,4],[4,2,1],[7,3,4]. • Assume that the mobile device moves equally likely in any of the possible (2-4) vertical/horizontal directions, with velocity 1m/timeunit. Set up the matrices P and B that describe the resulting HMM. (Use lexicographic order for the 2-dimensional coordinates and for the RSSI triplets.) • Determine the most likely trajectory for the above observation sequence resulting from the HMM.