Stochastic Machines. CS679 Lecture Note by Jin Hyung Kim, Computer Science Department, KAIST
Statistical Machine • Rooted in statistical mechanics • derives the thermodynamic properties of macroscopic bodies from their microscopic elements • probabilistic nature due to the enormous number of degrees of freedom • the concept of entropy plays the central role • Gibbs distribution • Markov chains • Metropolis algorithm • Simulated annealing • Boltzmann machine • a device for modeling the underlying probability distribution of a data set
Statistical Mechanics • In thermal equilibrium, the probability of state i follows the Gibbs (Boltzmann) distribution: p_i = exp(-E_i / (k_B T)) / Z • E_i : energy of state i • T : absolute temperature • k_B : Boltzmann constant • Z : partition function (normalizing sum over all states) • In the NN model, k_B is absorbed into the temperature parameter, giving p_i = exp(-E_i / T) / Z
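As a concrete illustration (not part of the original slide), here is a minimal Python sketch of the Gibbs distribution above, with the Boltzmann constant absorbed into T as in the NN setting; the state energies are made-up values.

```python
import numpy as np

def gibbs_distribution(energies, T):
    """Probability of each state under the Gibbs distribution p_i = exp(-E_i/T) / Z."""
    e = np.asarray(energies, dtype=float)
    # Subtract the minimum energy before exponentiating for numerical stability.
    w = np.exp(-(e - e.min()) / T)
    return w / w.sum()

# Illustrative energies for four states; lower energy -> higher probability.
energies = [0.0, 1.0, 2.0, 3.0]
for T in (0.5, 1.0, 5.0):
    print(T, gibbs_distribution(energies, T))
# As T -> 0 the mass concentrates on the minimum-energy state;
# as T grows the distribution approaches uniform.
```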
Markov Chain • Stochastic process with the Markov property • the state X_{n+1} at time n+1 depends only on the state X_n: P(X_{n+1} = j | X_n = i, X_{n-1}, ...) = P(X_{n+1} = j | X_n = i) = p_ij • Transition probabilities p_ij and stochastic matrix P = [p_ij] (each row sums to 1) • m-step transition probability p_ij^(m): the (i, j) entry of P^m (see the sketch below)
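A minimal sketch (with an assumed, illustrative two-state stochastic matrix) showing how the m-step transition probabilities fall out of the matrix power P^m:

```python
import numpy as np

# Illustrative 2-state stochastic matrix: rows sum to 1, p_ij = P(X_{n+1}=j | X_n=i).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# m-step transition probabilities are the entries of P^m (Chapman-Kolmogorov).
m = 3
P_m = np.linalg.matrix_power(P, m)
print(P_m)              # P_m[i, j] = P(X_{n+m}=j | X_n=i)
print(P_m.sum(axis=1))  # each row still sums to 1
```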
Markov Chain • Recurrent state: P(ever returning to state i) = 1 • Transient state: P(ever returning to state i) < 1 • mean recurrence time of state i: expectation of T_i(k), the time elapsed between the (k-1)th and kth returns to state i • steady-state probability of state i: π_i = 1 / (mean recurrence time of state i) • ergodicity: the long-term proportion of time spent in state i approaches the steady-state probability π_i
Convergence to stationary distribution • State distribution vector π^(n), evolving as π^(n) = π^(0) P^n • for an ergodic Markov chain, starting from an arbitrary initial distribution, the state distribution converges to the stationary distribution • independent of the initial distribution (see the sketch below) • Examples 11.1 and 11.2
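A small sketch of the convergence claim, reusing the illustrative chain above: two different initial distributions are driven to the same stationary distribution.

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])      # same illustrative ergodic chain as above

def evolve(pi0, n):
    """Apply pi^(k+1) = pi^(k) P for n steps (row-vector convention)."""
    pi = np.asarray(pi0, dtype=float)
    for _ in range(n):
        pi = pi @ P
    return pi

# Two very different initial distributions converge to the same stationary pi.
print(evolve([1.0, 0.0], 50))
print(evolve([0.0, 1.0], 50))
# For this chain the stationary distribution is pi = (0.8, 0.2).
```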
Principle of Detailed Balance • at thermal equilibrium, the rate of occurrence of any transition equals the rate of occurrence of the inverse transition: π_i p_ij = π_j p_ji • detailed balance implies that the distribution π_i is stationary • detailed balance is a sufficient condition for thermal equilibrium (checked numerically below)
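A quick numerical check of detailed balance for the same illustrative chain (any two-state chain is reversible, so the check passes):

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])       # illustrative chain from the sketches above
pi = np.array([0.8, 0.2])        # its stationary distribution

flows = pi[:, None] * P              # flows[i, j] = pi_i * p_ij
print(np.allclose(flows, flows.T))   # True: pi_i p_ij == pi_j p_ji for all i, j
```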
Metropolis Algorithm • a modified Monte Carlo method • suppose our objective is to reach the state minimizing an energy function (a runnable sketch follows this list) • 1. Randomly generate a new state Y from the current state X • 2. If ΔE (the energy difference between Y and X) < 0, then move to Y (set X to Y) and go to 1 • 3. Else • 3.1 draw a random number r uniformly from [0, 1) • 3.2 if r < exp(-ΔE / T), then move to Y (set X to Y) and go to 1 • 3.3 else go to 1
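A runnable sketch of the loop above at a fixed temperature; energy() and propose() are hypothetical stand-ins for a problem-specific energy function and neighbourhood move, instantiated here for a toy quadratic.

```python
import math
import random

def metropolis(x0, energy, propose, T, n_steps, seed=0):
    """Metropolis sampling/minimization at a fixed temperature T.

    energy(x) -> float           (problem-specific, assumed given)
    propose(x, rng) -> new state (random neighbour of x, assumed given)
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    for _ in range(n_steps):
        y = propose(x, rng)
        dE = energy(y) - e
        # Always accept downhill moves; accept uphill moves with prob exp(-dE/T).
        if dE < 0 or rng.random() < math.exp(-dE / T):
            x, e = y, e + dE
    return x, e

# Toy usage: minimize E(x) = x^2 with Gaussian proposal steps.
x_min, e_min = metropolis(
    x0=5.0,
    energy=lambda x: x * x,
    propose=lambda x, rng: x + rng.gauss(0.0, 0.5),
    T=0.1,
    n_steps=5000,
)
print(x_min, e_min)   # typically close to 0
```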
Metropolis Algorithm and Markov Chains • choose the transition probabilities so that the Markov chain converges to the Gibbs distribution π_i = exp(-E_i / T) / Z; then π_j / π_i = exp(-ΔE / T), where ΔE = E_j - E_i • the Metropolis algorithm is equivalent to a random walk on such a stationary Markov chain • it can be shown that this choice satisfies the principle of detailed balance (see the derivation below)
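A standard two-line verification (not from the slide) that the Metropolis acceptance rule with a symmetric proposal q_ij = q_ji satisfies detailed balance with respect to the Gibbs distribution:

```latex
\[
\pi_i = \frac{1}{Z} e^{-E_i/T}, \qquad
p_{ij} = q_{ij}\,\min\!\bigl(1,\, e^{-(E_j - E_i)/T}\bigr), \qquad q_{ij} = q_{ji}
\]
\[
\pi_i p_{ij}
  = \frac{q_{ij}}{Z}\,\min\!\bigl(e^{-E_i/T},\, e^{-E_j/T}\bigr)
  = \pi_j p_{ji}
\]
```

The final expression is symmetric in i and j, so detailed balance holds and the Gibbs distribution is stationary.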
Simulated Annealing • solves combinatorial optimization problems • a variant of the Metropolis algorithm • by S. Kirkpatrick et al. (1983) • finding a minimum-energy solution of a neural network = finding a low-temperature state of a physical system • overcomes the local-minimum problem • key idea: instead of always going downhill, try to go downhill 'most of the time'
Iterative + Statistical • Simple Iterative Algorithm (TSP): 1. find a path p 2. make p’, a variation of p 3. if p’ is better than p, keep p’ as p 4. go to 2 • Metropolis Algorithm: replace step 3 with 3’: if (p’ is better than p) or (random < Prob), then keep p’ as p • a kind of Monte Carlo method • Simulated Annealing: T is reduced as time passes
About T • Metropolis Algorithm: Prob = p(ΔE) = exp(ΔE / T), with ΔE ≤ 0 for a worsening move (the convention of the pseudocode below) • Simulated Annealing: Prob = p_t(ΔE) = exp(ΔE / T_t) • if T_t is reduced too fast, the solution quality is poor • if T_t ≥ T_0 / log(1 + t) (Geman), the system will converge to the minimum-energy configuration • T_t = k / (1 + t) (Szu) • T_t = a · T_{t-1}, where a is between 0.8 and 0.99 (a cooling-schedule comparison follows this list)
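A small sketch comparing the three cooling schedules above; the constants T_0, k, and a are illustrative.

```python
import math

T0, k, a = 10.0, 10.0, 0.9    # illustrative constants

def T_log(t):    # Geman: T_t = T_0 / log(1 + t); slow, but guarantees convergence
    return T0 / math.log(1 + t)

def T_szu(t):    # Szu: T_t = k / (1 + t)
    return k / (1 + t)

def T_geom(t):   # geometric schedule T_t = a * T_{t-1} = a**t * T_0
    return T0 * a ** t

for t in (1, 10, 100, 1000):
    print(t, round(T_log(t), 3), round(T_szu(t), 3), round(T_geom(t), 6))
# The logarithmic schedule cools far more slowly than the other two, which is
# why faster schedules are preferred in practice despite the weaker guarantee.
```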
Function SIMULATED-ANNEALING returns a solution state
  current ← a node (initial state)
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← value[next] - value[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
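A runnable Python rendering of the pseudocode above; value, successors, and schedule are hypothetical problem-specific callables, instantiated here for a toy one-dimensional maximization.

```python
import math
import random

def simulated_annealing(start, value, successors, schedule, seed=0):
    """Mirror of the pseudocode above: maximize value(), cooling per schedule[t]."""
    rng = random.Random(seed)
    current = start
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = rng.choice(successors(current))
        dE = value(nxt) - value(current)
        if dE > 0:
            current = nxt                        # always take improving moves
        elif rng.random() < math.exp(dE / T):    # dE <= 0: accept with prob e^(dE/T)
            current = nxt
        t += 1

# Toy usage: maximize value(x) = -(x - 3)^2 over the integers via +/-1 moves.
best = simulated_annealing(
    start=-20,
    value=lambda x: -(x - 3) ** 2,
    successors=lambda x: [x - 1, x + 1],
    schedule=lambda t: 5.0 * 0.99 ** t - 0.001,  # eventually <= 0, ending the loop
)
print(best)   # typically close to 3
```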
Gibbs Sampling • generates a Markov chain with the Gibbs distribution as its equilibrium distribution • gives a numerical estimate of the marginal density of a random variable X_k • uses knowledge of the conditional distribution of X_k given all the other components • x_1(1) is drawn from the distribution of X_1, given x_2(0), x_3(0), ..., x_K(0) • x_2(1) is drawn from the distribution of X_2, given x_1(1), x_3(0), ..., x_K(0) • ... • x_k(1) is drawn from the distribution of X_k, given x_1(1), ..., x_{k-1}(1), x_{k+1}(0), ..., x_K(0) • ... • x_K(1) is drawn from the distribution of X_K, given x_1(1), x_2(1), ..., x_{K-1}(1) • converges to the true marginal probability distributions (see the sketch below)
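A minimal sketch of the component-wise sampling loop above for a bivariate Gaussian target with correlation rho (an assumed example, chosen because each full conditional is available in closed form):

```python
import random

def gibbs_bivariate_gaussian(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate Gaussian with correlation rho.

    Each full conditional is Gaussian: X1 | X2 = x2 ~ N(rho*x2, 1 - rho^2),
    and symmetrically for X2 | X1, so the components are resampled in turn.
    """
    rng = random.Random(seed)
    sigma = (1.0 - rho * rho) ** 0.5
    x1, x2 = 0.0, 0.0
    samples = []
    for i in range(n_samples + burn_in):
        x1 = rng.gauss(rho * x2, sigma)   # draw x1 given the current x2
        x2 = rng.gauss(rho * x1, sigma)   # draw x2 given the freshly drawn x1
        if i >= burn_in:
            samples.append((x1, x2))
    return samples

samples = gibbs_bivariate_gaussian(rho=0.8, n_samples=20000)
# The empirical correlation of the samples should approach the target rho = 0.8.
n = len(samples)
m1 = sum(x for x, _ in samples) / n
m2 = sum(y for _, y in samples) / n
cov = sum((x - m1) * (y - m2) for x, y in samples) / n
v1 = sum((x - m1) ** 2 for x, _ in samples) / n
v2 = sum((y - m2) ** 2 for _, y in samples) / n
print(cov / (v1 * v2) ** 0.5)
```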
Gibbs Sampling • Convergence theorem: as the number of iterations goes to infinity, the samples x_k(n) generated by the Gibbs sampler converge in distribution to the true marginal distributions of the X_k, regardless of the starting values