
Learning Module Networks


Presentation Transcript


  1. Learning Module Networks. Eran Segal (Stanford University). Joint work with: Aviv Regev (Harvard), Nir Friedman (Hebrew U.), Dana Pe’er (Hebrew U.), Daphne Koller (Stanford)

  2. Learning Bayesian Networks [Figure: data table over the stocks MSFT, INTL, NVLS, MOT] • Density estimation • Model data distribution in population • Probabilistic inference: prediction, classification • Dependency structure • Interactions between variables • Causality • Scientific discovery

  3. Stock Market [Chart: prices of MSFT, DELL, INTL, NVLS, MOT, Jan.’02 – Jan.’03] • Learn dependency of stock prices as a function of • Global influencing factors • Sector influencing factors • Price of other major stocks

  4. Stock Market [Chart: prices of MSFT, DELL, INTL, NVLS, MOT, Jan.’02 – Jan.’03] • Learn dependency of stock prices as a function of • Global influencing factors • Sector influencing factors • Price of other major stocks [Figure: nodes for MSFT, DELL, INTL, NVLS, MOT]

  5. Stock Market [Chart: prices of MSFT, DELL, INTL, NVLS, MOT, Jan.’02 – Jan.’03] • Learn dependency of stock prices as a function of • Global influencing factors • Sector influencing factors • Price of other major stocks [Figure: a Bayesian network over DELL, INTL, MSFT, NVLS, MOT]

  6. Stock Market [Figure: fragment of the learned Bayesian network] • 4411 stocks (variables) • 273 trading days (instances) from Jan.’02 – Mar.’03 Problems: • Statistical robustness • Interpretability

  7. Key Observation [Chart: prices of MSFT, DELL, INTL, NVLS, MOT, Jan.’02 – Jan.’03] • Many stocks depend on the same influencing factors in much the same way • Example: Intel, Novellus, Motorola, and Dell depend on the price of Microsoft • Many other domains with similar characteristics • Gene expression • Collaborative filtering • Computer network performance • …

  8. The Module Network Idea [Figure: a Bayesian network over MSFT, MOT, DELL, INTL, AMAT, HPQ with one CPD per variable, next to a module network that groups MSFT into Module I, MOT/DELL/INTL into Module II, and AMAT/HPQ into Module III, with one shared CPD per module]

  9. Problems and Solutions • Statistical robustness: share parameters and dependencies between variables with similar behavior • Interpretability: explicit modeling of modular structure

  10. Outline • Module Network • Probabilistic model • Learning the model • Experimental results

  11. Module Network Components • Module Assignment Function • A(MSFT)=MI • A(MOT)=A(DELL)=A(INTL)=MII • A(AMAT)=A(HPQ)=MIII [Figure: module network with Module I = {MSFT}, Module II = {MOT, INTL, DELL}, Module III = {AMAT, HPQ}]

  12. Module Network Components • Module Assignment Function • Set of parents for each module • Pa(MI)=∅ • Pa(MII)={MSFT} • Pa(MIII)={DELL, INTL} [Figure: module network with these parent edges]

  13. Module Network Components • Module Assignment Function • Set of parents for each module • CPD template for each module [Figure: module network with a shared CPD attached to each module]
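The three components on slides 11-13 map directly onto a small data structure. The sketch below is a minimal Python representation for illustration; the class and field names are my own, not the authors'.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ModuleNetwork:
    assignment: Dict[str, str]        # module assignment function A: variable -> module
    parents: Dict[str, List[str]]     # set of parent variables for each module
    cpd_template: Dict[str, object]   # one shared CPD template per module

# The running example from slides 11-13:
example = ModuleNetwork(
    assignment={"MSFT": "MI", "MOT": "MII", "DELL": "MII", "INTL": "MII",
                "AMAT": "MIII", "HPQ": "MIII"},
    parents={"MI": [], "MII": ["MSFT"], "MIII": ["DELL", "INTL"]},
    cpd_template={"MI": None, "MII": None, "MIII": None},   # CPDs omitted in this sketch
)
```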

  14. Ground Bayesian Network • A module network induces a ground BN over X • A module network defines a coherent probability distribution over X if the ground BN is acyclic [Figure: the ground Bayesian network over MSFT, MOT, INTL, DELL, AMAT, HPQ induced by the module network]

  15. Module Graph • Nodes correspond to modules • Mi → Mj if at least one variable in Mi is a parent of Mj • Theorem: the ground BN is acyclic if the module graph is acyclic • Acyclicity can be checked efficiently using the module graph [Figure: module graph MI → MII → MIII alongside the module network]
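A sketch of the module-graph acyclicity check described on this slide: build the edges Mi → Mj whenever a variable assigned to Mi is a parent of module Mj, then test for cycles with a topological sort. The data structures mirror the ModuleNetwork sketch above; this is an illustration, not the authors' implementation.

```python
from collections import defaultdict, deque

def module_graph(assignment, parents):
    edges = defaultdict(set)
    for mj, parent_vars in parents.items():
        for v in parent_vars:
            edges[assignment[v]].add(mj)        # Mi -> Mj (a self-edge counts as a cycle)
    return edges

def is_acyclic(modules, edges):
    """Kahn's algorithm: the module graph is acyclic iff every module can be ordered."""
    indeg = {m: 0 for m in modules}
    for mi in edges:
        for mj in edges[mi]:
            indeg[mj] += 1
    queue = deque(m for m in modules if indeg[m] == 0)
    visited = 0
    while queue:
        m = queue.popleft()
        visited += 1
        for mj in edges.get(m, ()):
            indeg[mj] -= 1
            if indeg[mj] == 0:
                queue.append(mj)
    return visited == len(modules)

# Running example: MI -> MII -> MIII, which is acyclic.
assignment = {"MSFT": "MI", "MOT": "MII", "DELL": "MII", "INTL": "MII",
              "AMAT": "MIII", "HPQ": "MIII"}
parents = {"MI": [], "MII": ["MSFT"], "MIII": ["DELL", "INTL"]}
print(is_acyclic(["MI", "MII", "MIII"], module_graph(assignment, parents)))   # True
```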

  16. Outline • Module Network • Probabilistic model • Learning the model • Experimental results

  17. Learning Overview • Given data D, find assignment function A and structure S that maximize the Bayesian score • Bayesian score = log P(A, S) + log P(D | S, A): the assignment/structure prior plus the marginal data likelihood • The marginal likelihood P(D | S, A) = ∫ P(D | S, A, θ) P(θ | S, A) dθ integrates the data likelihood against the parameter prior

  18. MI MII|MSFT MIII|DELL,INTL Likelihood Function MSFT Module I MOT INTL DELL Module II AMAT HPQ Likelihood function decomposes by modules Module III Instance 1 Instance 2 Sufficient statistics of (X,Y) Instance 3

  19. Bayesian Score Decomposition • The Bayesian score decomposes by modules: each module's term depends only on module j's parents and module j's variables • Example operator: Delete INTL → Module III changes only Module III's term [Figure: module network with the operator "Delete INTL → Module III"]
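A decomposed stand-in for the Bayesian score on this slide: the per-module log-likelihood sketched above minus a BIC-style penalty. BIC only approximates the marginal likelihood, and the authors use a proper Bayesian score, so treat this as an illustrative simplification that keeps the same per-module decomposition.

```python
import numpy as np

def module_score(X, cols, module_vars, module_parents):
    if not module_vars:
        return 0.0
    loglik = module_log_likelihood(X, cols, module_vars, module_parents)   # sketch after slide 18
    n = X.shape[0] * len(module_vars)            # pooled sample size for this module
    n_params = len(module_parents) + 2           # regression weights + intercept + variance, shared
    return loglik - 0.5 * n_params * np.log(n)

def total_score(X, cols, modules, parents):
    """The total score is a sum over modules: an operator that touches one module
    only changes that module's term, which is what makes the search efficient."""
    return sum(module_score(X, cols, modules[m], parents[m]) for m in modules)
```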

  20. Bayesian Score Decomposition • The Bayesian score decomposes by modules • Example operator: reassigning MOT from Module II to Module I (A(MOT)=2 → A(MOT)=1) changes only the terms of Modules I and II [Figure: module network with the reassignment A(MOT)=2 → A(MOT)=1]

  21. Algorithm Overview • Find assignment function A and structure S that maximize the Bayesian score [Flowchart: find initial assignment A, then alternate between improving the dependency structure S and improving the assignments A]

  22. Initial Assignment Function • Find variables (stocks) whose values are similar across instances (trading days) and group them • Example: A(MOT)=MII, A(INTL)=MII, A(DELL)=MII [Figure: data matrix with variables AMAT, MOT, DELL, MSFT, INTL, HPQ as rows and instances x[1]–x[4] as columns, clustered into three groups]
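A sketch of the initialization on this slide: cluster variables whose values are similar across instances and use the clusters as the initial module assignment. Plain k-means over the columns of the data matrix is one reasonable choice; the authors' exact clustering procedure may differ.

```python
import numpy as np

def initial_assignment(X, cols, n_modules, n_iters=50, seed=0):
    """X: instances x variables. Each variable is a point whose coordinates are its
    values across instances, so similar variables land in the same module."""
    rng = np.random.default_rng(seed)
    V = X.T.astype(float)
    centers = V[rng.choice(len(V), size=n_modules, replace=False)].copy()
    labels = np.zeros(len(V), dtype=int)
    for _ in range(n_iters):
        dists = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_modules):
            if np.any(labels == k):              # keep empty clusters at their old center
                centers[k] = V[labels == k].mean(axis=0)
    return {v: f"M{k + 1}" for v, k in zip(cols, labels)}
```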

  23. Algorithm Overview • Find assignment function A and structure S that maximize the Bayesian score [Flowchart: find initial assignment A, then alternate between improving the dependency structure S and improving the assignments A]

  24. Learning Dependency Structure • Heuristic search with operators • Add/delete a parent for a module • Cannot reverse edges • Handle acyclicity • Can be checked efficiently on the module graph • Efficient computation • After applying an operator for module Mj, only the scores of operators for module Mj need updating [Figure: candidate operators MSFT → Module II, INTL → Module I, INTL → Module III evaluated against the module graph MI, MII, MIII]
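A greedy search sketch using the operators on this slide: repeatedly add or delete a single parent of a module, keep the change with the best score gain, and reject any operator that makes the module graph cyclic. It reuses module_score, module_graph, and is_acyclic from the sketches above and is not the authors' code.

```python
def learn_structure(X, cols, modules, assignment, parents):
    """modules: dict module -> list of its variables; parents: dict module -> parent list."""
    variables = list(assignment)
    improved = True
    while improved:
        improved = False
        for m in modules:
            current = module_score(X, cols, modules[m], parents[m])
            best_gain, best_parents = 1e-9, None
            for v in variables:
                if assignment[v] == m:
                    continue                               # a module's own variables cannot be its parents
                candidate = sorted(set(parents[m]) ^ {v})  # toggle: add if absent, delete if present
                trial = {**parents, m: candidate}
                if not is_acyclic(list(modules), module_graph(assignment, trial)):
                    continue                               # operator would create a cycle
                gain = module_score(X, cols, modules[m], candidate) - current
                if gain > best_gain:
                    best_gain, best_parents = gain, candidate
            if best_parents is not None:
                parents[m] = best_parents                  # only module m's operator scores change
                improved = True
    return parents
```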

  25. Learning Dependency Structure • Structure search done at module level • Parent selection • Reduced search space relative to BN • Acyclicity checking • Individual variables only used for computation of sufficient statistics

  26. Algorithm Overview • Find assignment function A and structure S that maximize the Bayesian score [Flowchart: find initial assignment A, then alternate between improving the dependency structure S and improving the assignments A]

  27. Learning Assignment Function • A(DELL)=MI → Score: 0.7 [Figure: DELL placed in Module I]

  28. Learning Assignment Function • A(DELL)=MI → Score: 0.7 • A(DELL)=MII → Score: 0.9 [Figure: DELL placed in Module II]

  29. Learning Assignment Function • A(DELL)=MI → Score: 0.7 • A(DELL)=MII → Score: 0.9 • A(DELL)=MIII → cyclic! [Figure: DELL placed in Module III creates a cycle]

  30. Learning Assignment Function • A(DELL)=MI → Score: 0.7 • A(DELL)=MII → Score: 0.9 • A(DELL)=MIII → cyclic! [Figure: DELL kept in Module II, the highest-scoring acyclic choice]

  31. Ideal Algorithm • Learn the module assignment of all variables simultaneously

  32. Problem • Due to acyclicity, assignments cannot be optimized for each variable separately • Example: the moves A(DELL)=Module IV and A(MSFT)=Module III may each look best on their own, yet applying both creates a cycle in the module graph [Figure: module network and module graph over Modules I–IV]

  33. Problem • Due to acyclicity, assignments cannot be optimized for each variable separately • Example: the moves A(DELL)=Module IV and A(MSFT)=Module III may each look best on their own, yet applying both creates a cycle in the module graph [Figure: module network and module graph over Modules I–IV]

  34. Learning Assignment Function • Sequential update algorithm • Iterate over all variables • For each variable, find its optimal assignment given the current assignment to all other variables • Efficient computation • When changing assignment from Mi to Mj, only need to recompute score for modules i and j
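A sketch of the sequential update algorithm on this slide: visit each variable in turn and move it to the module that most improves the score, holding all other assignments fixed and rejecting moves that make the module graph cyclic. Only the source and destination modules' terms are recomputed. It builds on the sketches above and is an illustration, not the authors' implementation.

```python
def update_assignments(X, cols, modules, assignment, parents):
    changed = True
    while changed:
        changed = False
        for v in list(assignment):
            old = assignment[v]
            best_m, best_gain = old, 1e-9
            for m in modules:
                if m == old:
                    continue
                before = (module_score(X, cols, modules[old], parents[old])
                          + module_score(X, cols, modules[m], parents[m]))
                # tentatively move v from module `old` to module `m`
                modules[old].remove(v); modules[m].append(v); assignment[v] = m
                if is_acyclic(list(modules), module_graph(assignment, parents)):
                    after = (module_score(X, cols, modules[old], parents[old])
                             + module_score(X, cols, modules[m], parents[m]))
                    if after - before > best_gain:
                        best_m, best_gain = m, after - before
                # undo the tentative move before trying the next module
                modules[m].remove(v); modules[old].append(v); assignment[v] = old
            if best_m != old:
                modules[old].remove(v); modules[best_m].append(v); assignment[v] = best_m
                changed = True
    return assignment
```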

  35. Learning the Model • Initialize module assignment A • Optimize structure S • Optimize module assignment A • For each variable, find its optimal assignment given the current assignment to all other variables [Figure: module network over MSFT, MOT, INTL, DELL, AMAT, HPQ]
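Putting the pieces together for this slide: alternate structure search and assignment updates, starting from the clustering-based initialization, until the total score stops improving. Every function here is one of the sketches attached to earlier slides, so this is an end-to-end illustration under the same assumptions rather than the authors' algorithm.

```python
def learn_module_network(X, cols, n_modules, max_rounds=20, tol=1e-6):
    assignment = initial_assignment(X, cols, n_modules)                         # slide 22
    module_names = sorted(set(assignment.values()))
    modules = {m: [v for v in cols if assignment[v] == m] for m in module_names}
    parents = {m: [] for m in module_names}
    prev = float("-inf")
    for _ in range(max_rounds):
        parents = learn_structure(X, cols, modules, assignment, parents)        # slide 24
        assignment = update_assignments(X, cols, modules, assignment, parents)  # slide 34
        score = total_score(X, cols, modules, parents)                          # slide 19
        if score - prev < tol:
            break
        prev = score
    return assignment, parents
```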

  36. Related Work [Table comparing Bayesian networks, parameter sharing, PRMs, OOBNs, Langseth et al., and module networks on four axes: shared parameters, learn parameter sharing, shared structure, learn structure; module networks both learn the parameter sharing and learn the shared structure]

  37. Outline • Module Network • Probabilistic model • Learning the model • Experimental results • Statistical validation • Case study: Gene regulation

  38. Learning Algorithm Performance [Plots: variables changed (% of total) and structure changes per algorithm iteration; Bayesian score (avg. per gene) per algorithm iteration, over roughly 20 iterations]

  39. Generalization to Test Data • Synthetic data: 10 modules, 500 variables • Best performance achieved for models with 10 modules [Plot: test data likelihood (per instance) vs. number of modules, for 25, 50, 100, 200, and 500 training instances]

  40. Generalization to Test Data • Synthetic data: 10 modules, 500 variables • Gain beyond 100 instances is small [Plot: test data likelihood (per instance) vs. number of modules, for 25–500 training instances]

  41. Structure Recovery • Synthetic data: 10 modules, 500 variables • 74% of 2250 parent-child relationships recovered [Plot: recovered structure (% correct) vs. number of modules, for 25–500 training instances]

  42. Stock Market • 4411 variables (stocks), 273 instances (trading days) • Comparison to Bayesian networks (cross-validation) [Plot: test data log-likelihood gain per instance vs. number of modules, relative to Bayesian network performance]

  43. Regulatory Networks • Learn structure of regulatory networks: • Which genes are regulated by each regulator

  44. Gene Expression Data • Each experiment measures the mRNA level for all genes in one condition • Learn the dependency of gene expression as a function of the expression of regulators [Figure: genes × experiments expression matrix, with induced and repressed levels color-coded]

  45. Gene Expression • 2355 variables (genes), 173 instances (arrays) • Comparison to Bayesian networks [Plot: test data log-likelihood gain per instance vs. number of modules, relative to Bayesian network performance]

  46. Biological Evaluation • Find sets of co-regulated genes (regulatory modules) • Find the regulators of each module [Figure: evaluation summary, 46/50 and 30/50] Segal et al., Nature Genetics, 2003

  47. Experimental Design • Hypothesis: regulator ‘X’ activates process ‘Y’ • Experiment: knock out ‘X’ and repeat the experiment [Figure: regression-tree fragment with splits on HAP4 (true/false) and Ypl230W] Segal et al., Nature Genetics, 2003

  48. Differentially Expressed Genes • Time-course expression measured for wild type vs. the knockouts Δypl230w, Δkin82, and Δppt1 • 341, 281, and 602 differentially expressed genes in the three knockout experiments (at >16x, >4x, and >4x fold change, respectively) [Figure: expression time courses, 0–24 hrs. and 0–60 min., for wild type vs. each knockout] Segal et al., Nature Genetics, 2003

  49. Biological Experiments Validation • Were the differentially expressed genes predicted as targets? • Rank modules by enrichment for differentially expressed genes • All regulators regulate predicted modules
  Δypl230w: module 39 Protein folding (7/23, 1e-4); module 29 Cell differentiation (6/41, 2e-2); module 5 Glycolysis and folding (5/37, 4e-2); module 34 Mitochondrial and protein fate (5/37, 4e-2)
  Δppt1: module 14 Ribosomal and phosphate metabolism (8/32, 9e-3); module 11 Amino acid and purine metabolism (11/53, 1e-2); module 15 mRNA, rRNA and tRNA processing (9/43, 2e-2); module 39 Protein folding (6/23, 2e-2); module 30 Cell cycle (7/30, 2e-2)
  Δkin82: module 3 Energy and osmotic stress I (8/31, 1e-4); module 2 Energy, osmolarity & cAMP signaling (9/64, 6e-3); module 15 mRNA, rRNA and tRNA processing (6/43, 2e-2)
  Segal et al., Nature Genetics, 2003

  50. Summary • Probabilistic model for learning modules of variables and their structural dependencies • Improved performance over Bayesian networks • Statistical robustness • Interpretability • Application to gene regulation • Reconstruction of many known regulatory modules • Prediction of targets for unknown regulators
