Extending Evolutionary Programming to the Learning of Dynamic Bayesian Networks
Allan Tucker, Xiaohui Liu
Birkbeck College, University of London
Diagnosis in MTS
• Useful to know the causes of a given set of observations within a multivariate time series (MTS)
• E.g. oil refinery: why has a temperature become high whilst a pressure has fallen below a certain value?
• The Bayesian Network (BN) is one paradigm for performing diagnosis
• Evolutionary methods have been used to learn BNs
• This work extends them to Dynamic Bayesian Networks (DBNs)
Dynamic Bayesian Networks
• A static BN structure repeated over t time slices
• Contemporaneous and non-contemporaneous links
• Used for prediction and diagnosis within dynamic systems
Assumptions - 1
• Assume all variables take at least one time slice to impose an effect on another
• The more frequently a system generates data (e.g. every minute or second), the more likely this is to be true
• Contemporaneous links are therefore excluded from the DBN
Representation
• N variables at each time slice t
• A network is a list of P triples of the form (x, y, lag)
• Each triple represents a link from node x at a previous time slice t - lag to node y at time t (see the sketch below)
• Example: { (0,0,1); (1,1,1); (2,2,1); (3,3,1); (4,4,1); (0,1,3); (2,1,2); (2,4,5); (4,3,4) }
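A minimal Python sketch of this representation (the helper name parents_of and the dictionary layout are illustrative, not from the paper): the structure is just a list of (x, y, lag) triples, which can be grouped into a parent set for each node at time t.

```python
from collections import defaultdict

# The example structure from the slide: five lag-1 auto-links plus four
# cross links such as (0, 1, 3), i.e. node 0 at t-3 is a parent of node 1 at t.
structure = [
    (0, 0, 1), (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 4, 1),
    (0, 1, 3), (2, 1, 2), (2, 4, 5), (4, 3, 4),
]

def parents_of(triples):
    """Group the triples into a parent set for each node at time slice t."""
    parents = defaultdict(list)
    for x, y, lag in triples:
        parents[y].append((x, lag))
    return dict(parents)

print(parents_of(structure))
# {0: [(0, 1)], 1: [(1, 1), (0, 3), (2, 2)], 2: [(2, 1)], ...}
```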
Search Space
• Given the first assumption and the proposed representation, there are N² × MaxLag candidate links, each present or absent, so the search space contains 2^(N² × MaxLag) possible networks
• E.g. 10 variables with MaxLag = 30 gives 2^3000 structures
• Further assumptions are made to reduce this and speed up the search
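A quick check of that count (the 2^(N² × MaxLag) reading of the slide is an assumption: because every link points strictly backwards in time, any subset of the candidate links is acyclic and therefore a valid structure).

```python
def num_structures(n_vars: int, max_lag: int) -> int:
    """Number of possible DBN structures under the triple representation."""
    candidate_links = n_vars * n_vars * max_lag   # every (x, y, lag) combination
    return 2 ** candidate_links                   # each link is in or out

print(len(str(num_structures(10, 30))))  # 2**3000 has 904 decimal digits
```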
Assumptions - 2
• The Cross-Description-Length Function (CDLF) between two variables exhibits the same smoothness as its Cross-Correlation Function (CCF) cousin
• This smoothness is exploited by the Swap operator
Assumptions - 3
• The Auto-Description-Length Function (ADLF) of each variable exhibits its lowest DL at time lag = 1
• These lag-1 auto-links are therefore inserted automatically before evaluation (see the sketch below)
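A small sketch of that rule, assuming a candidate structure is a Python list of (x, y, lag) triples as above (the helper name is illustrative):

```python
def insert_auto_links(triples, n_vars):
    """Ensure every variable carries its lag-1 auto-link (x, x, 1) before scoring."""
    present = {x for x, y, lag in triples if x == y and lag == 1}
    return triples + [(i, i, 1) for i in range(n_vars) if i not in present]
```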
[Pipeline figure] Multivariate Time Series → Evolutionary Program to find links with low Description Length → Evolutionary Program (Swap) to find a Dynamic Bayesian Network with low Description Length → Dynamic Bayesian Network → Explanation Algorithm (using Stochastic Simulation) → User
EP to Find Low-DL Links
• An EP approach with self-adapting parameters is used to find a good selection of links with low DL (i.e. high mutual information)
• Representation: each individual is a single triple (x, y, lag)
• Fitness: the DL of the triple
• Solution: the resultant population as a whole (a list of low-DL links; see the sketch below)
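The slide only outlines this stage, so the following is a hedged Python sketch. It assumes discretized data stored as data[variable][time]; negative mutual information stands in for the description-length fitness, and a fixed single-field mutation stands in for the self-adapting parameters, both of which are simplifications rather than the paper's actual scheme.

```python
import random
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def fitness(triple, data):
    """Lower is better: a stand-in for the description length of a triple."""
    x, y, lag = triple
    return -mutual_information(data[x][:-lag], data[y][lag:])

def ep_low_dl_links(data, max_lag, pop_size=30, generations=50):
    n_vars = len(data)
    pop = [(random.randrange(n_vars), random.randrange(n_vars),
            random.randint(1, max_lag)) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for x, y, lag in pop:
            field = random.randrange(3)        # simplified mutation:
            if field == 0:                     # re-draw one field of the triple
                x = random.randrange(n_vars)
            elif field == 1:
                y = random.randrange(n_vars)
            else:
                lag = random.randint(1, max_lag)
            children.append((x, y, lag))
        # (mu + lambda) survivor selection on the stand-in description length
        pop = sorted(pop + children, key=lambda t: fitness(t, data))[:pop_size]
    return pop  # the final population itself is the list of low-DL links
```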
Swap Operator
• Select a triple from one parent with equal probability
• Mutate the selected lag with a uniform distribution: new lag = current lag + U(-MaxLag/10, +MaxLag/10)
[Figure: a parent shown as a row of (x, y, lag) triples, with the chosen triple's lag perturbed within the window [lag - MaxLag/10, lag + MaxLag/10] on the lag axis running from 1 to MaxLag]
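A minimal sketch of the operator, assuming a parent is a Python list of (x, y, lag) triples with integer lags; clipping the mutated lag to [1, MaxLag] is an assumption based on the lag axis in the figure.

```python
import random

def swap(parent, max_lag):
    """Pick one triple from the parent with equal probability and perturb its
    lag by a discrete uniform step in [-MaxLag/10, +MaxLag/10]."""
    child = list(parent)
    i = random.randrange(len(child))
    x, y, lag = child[i]
    step = random.randint(-(max_lag // 10), max_lag // 10)
    child[i] = (x, y, min(max_lag, max(1, lag + step)))
    return child
```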
Generating Synthetic Data
[Figure: two hand-built DBN structures used to generate synthetic data, (1) spanning time slices t-3 to t and (2) spanning t-3 to t+1]
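The slide shows the hand-built networks only as figures; below is a hedged sketch of how a discrete multivariate time series could be sampled from such a structure so that the learned DBN can be compared against the known one. The noisy-copy conditional distributions are an assumption, not the paper's actual conditional probabilities.

```python
import random

def generate_mts(structure, n_vars, n_states, length, noise=0.1):
    """Sample a discrete MTS from a DBN given as (x, y, lag) triples, using a
    toy 'noisy copy of one lagged parent' rule in place of real CPTs."""
    max_lag = max(lag for _, _, lag in structure)
    data = [[random.randrange(n_states) for _ in range(max_lag)]
            for _ in range(n_vars)]
    for t in range(max_lag, length):
        for y in range(n_vars):
            parents = [(x, lag) for x, yy, lag in structure if yy == y]
            if parents and random.random() > noise:
                x, lag = random.choice(parents)
                data[y].append(data[x][t - lag])   # copy a lagged parent value
            else:
                data[y].append(random.randrange(n_states))
    return data
```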
Oil Refinery Data
• Data recorded every minute
• Hundreds of variables; 11 interrelated variables selected
• Each variable discretized into 4 states
• Large time lags (up to 120 minutes between some variables)
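The slide does not say which discretization policy was applied to the refinery readings; this is a minimal sketch assuming simple equal-frequency (quartile) binning into 4 states.

```python
def discretize(series, n_states=4):
    """Equal-frequency binning of a real-valued series into n_states states."""
    ranked = sorted(series)
    cuts = [ranked[len(ranked) * k // n_states] for k in range(1, n_states)]
    return [sum(value > cut for cut in cuts) for value in series]

print(discretize([1.0, 5.2, 3.3, 9.9, 2.1, 7.4, 0.5, 6.8]))
# [0, 1, 1, 3, 0, 2, 0, 2]
```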
Parameters

Data:
• Synthetic: Number of Variables = 5, MaxLag = 30
• Oil Data: Number of Variables = 11, MaxLag = 60

EP - DL Links:
• Synthetic: Mutation Rate = 1%, Limit (% of all links) = 25, List Generations = 30
• Oil Data: Mutation Rate = 1%, Limit (% of all links) = 5, List Generations = 50

EP - DBN Structure:
• Synthetic: Population Size = 20, Generations = 500, OpRate (KGM/Swap) = 80%, Slide Mutation Rate = 10%
• Oil Data: Population Size = 30, Generations = 2000, OpRate (KGM/Swap) = 80%, Slide Mutation Rate = 10%
Explanations
Example diagnosis on the refinery data (a sketch of the simulation step follows below):
Input, t - 0: Tail Gas Flow in_state 0; Reboiler Temperature in_state 3
Output:
• t - 7: Top Temperature in_state 0
• t - 54: Feed Rate in_state 1
• t - 75: Reactor Inlet Temperature in_state 0
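The presentation names the final step only as an explanation algorithm using stochastic simulation; the following is a hedged sketch of one such scheme (rejection sampling, which is an assumption rather than the paper's stated method). It assumes any forward sampler for the DBN returning a trajectory indexed as traj[variable][time], evidence given as {variable: state} at time t, and queries given as (variable, lag) pairs; the slide's example would map to evidence {tail_gas_flow: 0, reboiler_temp: 3} and queries such as (top_temperature, 7), where those names stand for hypothetical integer indices of the refinery variables.

```python
from collections import Counter

def explain(sample_trajectory, evidence, queries, t, n_samples=10_000):
    """Rejection-sampling sketch: draw whole trajectories from a forward
    sampler, keep those matching the evidence at time t, and return the most
    common state for each queried (variable, lag) pair."""
    counts = {q: Counter() for q in queries}
    for _ in range(n_samples):
        traj = sample_trajectory()            # traj[variable][time] -> state
        if all(traj[v][t] == s for v, s in evidence.items()):
            for var, lag in queries:
                counts[(var, lag)][traj[var][t - lag]] += 1
    return {q: c.most_common(1)[0][0] for q, c in counts.items() if c}
```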
Future Work
• Exploring the use of different metrics
• Improving accuracy (e.g. different discretization policies, continuous DBNs)
• Learning a library of DBNs in order to classify the current state of a system