Bayesian network models of Biological signaling pathways karensachs@stanford.edu
From Phospho-molecular profiling to Signaling pathways • High throughput data: flow measurements for Cell1, Cell2, Cell3, Cell4 ... Cell600 • Signaling pathways: PIP3, PIP2, PKC, PKA, Plc, p38, Raf, Jnk, Erk, Akt • Picture: John Albeck
Outline • What are signaling pathways? • What kind of data is available to study them? • How do we use Bayesian networks to learn their structure? • Two extensions: • Markov neighborhood algorithm • Bayesian network based cyclic networks (BBCs)
Cells respond to their environment • Secrete cytokines • Proliferation • Cell death • Inside each cell is a molecular network
“Central Dogma” • DNA → (Transcription) → mRNA → (Translation) → Protein → (Modification) → Modified Protein • DNA: ‘Blueprint’- instructions for production of all proteins • mRNA: Delivers instructions for specific gene • Ribosome: Protein-production factory
Signaling & Genetic pathways [Figure: signaling molecules A, B, C; TF; DNA; RNA; Cell response]
Outline • What are signaling pathways? • What kind of data is available to study them? • How do we use Bayesian networks to learn their structure? • Two extensions: • Markov neighborhood algorithm • Bayesian network based cyclic networks (BBCs)
Bayesian Networks • Graph • Node: Measured level/activity of protein • Edge: Influence (dependency) between proteins • Conditional probability distributions • Each node has a conditional probability given its parents, e.g. P(B|A=‘On’) [Figure: example network over Protein A, Protein B, Protein C, Protein D, Protein E, with levels 0, 1, 2]
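A minimal sketch of what "each node has a conditional probability given its parents" buys us: the joint distribution factorizes into one small table per node. The chain A → B → C and every number below are hypothetical, chosen only for illustration; they are not taken from the data.

```python
from itertools import product

# Hypothetical chain A -> B -> C; each protein is either "On" or "Off".
# Every probability below is made up for illustration.
p_a = {"On": 0.3, "Off": 0.7}
p_b_given_a = {"On": {"On": 0.9, "Off": 0.1},
               "Off": {"On": 0.2, "Off": 0.8}}
p_c_given_b = {"On": {"On": 0.8, "Off": 0.2},
               "Off": {"On": 0.1, "Off": 0.9}}

def joint(a, b, c):
    """P(A=a, B=b, C=c) as the product of each node's table given its parents."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

states = ("On", "Off")
# Summing the factorized joint over all 8 configurations should give 1.
total = sum(joint(a, b, c) for a, b, c in product(states, repeat=3))
print(joint("On", "On", "On"))
print(total)
```

Three small tables replace one table over all joint configurations, which is what keeps larger networks tractable.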
How do we use Bayesian Networks to infer pathways? The Technical Details • Score candidate models (analytical solution!) • Use a heuristic search to find high scoring models
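A hedged sketch of the scoring step. The slide does not name the score, so this assumes a Bayesian Dirichlet score with a uniform prior (the K2 form), which has a closed-form marginal likelihood for discrete data. The function name `log_score` and the toy dataset are hypothetical. A heuristic search would call such a function repeatedly while adding, removing, or reversing edges.

```python
from collections import Counter
from math import lgamma

def log_score(data, child, parents, n_states):
    """Log marginal likelihood of one node given a candidate parent set,
    assuming a uniform Dirichlet prior (all pseudo-counts equal to 1)."""
    counts = Counter()          # (parent configuration, child state) -> count
    parent_totals = Counter()   # parent configuration -> count
    for row in data:
        key = tuple(row[p] for p in parents)
        counts[(key, row[child])] += 1
        parent_totals[key] += 1
    score = 0.0
    for key, n_j in parent_totals.items():
        score += lgamma(n_states) - lgamma(n_states + n_j)
        for k in range(n_states):
            # lgamma(1 + n) - lgamma(1); lgamma(1) is 0.
            score += lgamma(1 + counts[(key, k)])
    return score

# Toy binary data in which B usually copies A (illustrative only).
data = ([{"A": 0, "B": 0}] * 18 + [{"A": 0, "B": 1}] * 2 +
        [{"A": 1, "B": 1}] * 18 + [{"A": 1, "B": 0}] * 2)

with_parent = log_score(data, "B", ["A"], n_states=2)
without_parent = log_score(data, "B", [], n_states=2)
print(with_parent, without_parent)
```

On this toy data the model in which A is a parent of B scores higher than the model with no edge, so a search over structures would keep the edge.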
Protein data • Western blot
Protein data • Protein arrays
Protein data • Mass Spectrometry All of these lysate approaches give 1 measurement per protein for 10^3-10^7 cells
Flow Cytometry: Single Cell Analysis Thousands of datapoints
Stimulations and perturbations • Activators: 1. a-CD3 2. a-CD28 3. ICAM-2 4. PMA 5. b2cAMP • Inhibitors: 6. G06976 7. AKT inh 8. Psitect 9. U0126 10. LY294002 [Figure: pathway diagram with perturbation targets numbered 1-10. Nodes: CD3, CD28, LFA-1, LAT, SLP-76, VAV, Cytohesin, JAB-1, Zap70, Lck, RAS, PI3K, PLCg, PIP2, PIP3, PKC, PKA, Akt, Raf, MAPKKK, MAPKKK, Mek1/2, MEK4/7, MEK3/6, Erk1/2, JNK, p38]
T-Lymphocyte Data • Primary human T-Cells • 9 conditions (6 specific interventions) • 9 phosphoproteins, 2 phospholipids • 600 cells per condition • 5400 data-points • Conditions (multi-well format): perturbation a, perturbation b, ... perturbation n • 12 Color Flow Cytometry: Mek1/2, PIP2, PIP3, PKC, PKA, Plc, p38, Raf, Jnk, Erk, Akt • Datasets of cells: condition ‘a’, condition ‘b’, condition…‘n’ • Omar Perez
Statistical Dependencies • Edges can be directed (primarily) due to the use of interventions [Figure: network over A, B, C, D, E; axes Phospho A, Phospho B]
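A small simulation of why interventions let edges be directed, under the assumption of a hypothetical two-protein system in which A drives B. Observational data alone only show that A and B are dependent. Forcing A shifts B, while forcing B leaves A unchanged, and that asymmetry points the edge from A to B. The sampling probabilities are made up.

```python
import random

def sample(rng, do_a=None, do_b=None):
    """Draw one cell from a hypothetical A -> B system.
    do_a / do_b force a variable to a value, mimicking an intervention."""
    a = do_a if do_a is not None else int(rng.random() < 0.5)
    b = do_b if do_b is not None else int(rng.random() < (0.9 if a else 0.1))
    return a, b

def mean(values):
    return sum(values) / len(values)

rng = random.Random(0)
n = 5000
observed = [sample(rng) for _ in range(n)]
forced_a = [sample(rng, do_a=1) for _ in range(n)]
forced_b = [sample(rng, do_b=1) for _ in range(n)]

# Effect of forcing A on B, and of forcing B on A.
shift_in_b = mean([b for _, b in forced_a]) - mean([b for _, b in observed])
shift_in_a = mean([a for a, _ in forced_b]) - mean([a for a, _ in observed])
print(shift_in_b, shift_in_a)
```

The first shift is large and the second is near zero, which is the pattern expected when A is upstream of B.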
Overview • Conditions (multi well format): perturbation a, perturbation b, ... perturbation n • Multiparameter Flow Cytometry: Mek1/2, PIP2, PIP3, PKC, PKA, Plc, p38, Raf, Jnk, Erk, Akt • Datasets of cells: condition ‘a’, condition ‘b’, condition…‘n’ • Bayesian Network Analysis • Influence diagram of measured variables
Inferred Network [Figure: inferred network. Legend: Phospho-Proteins, Phospho-Lipids, Perturbed in data. Nodes: PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, P44/42, Akt, PIP2]
How well did we do? • Direct phosphorylation [Figure: inferred network. Legend: Phospho-Proteins, Phospho-Lipids, Perturbed in data. Nodes: PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, P44/42, Akt, PIP2]
Features of Approach • Direct phosphorylation: Mek → Erk • Difficult to detect using other forms of high-throughput data: • Protein-protein interaction data • Microarrays
How well did we do? • Indirect Signaling [Figure: inferred network. Legend: Phospho-Proteins, Phospho-Lipids, Perturbed in data. Nodes: PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, P44/42, Akt, PIP2]
Indirect signaling • Indirect connections can be found even when the intermediate molecule(s) are not measured • Dismissing edges [Figures: inferred PKC → Jnk versus the path PKC → Mapkkk → Mek4/7 → Jnk, with Mapkkk and Mek4/7 not measured; Raf → Mek → Erk]
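Both bullets on the indirect signaling slide can be illustrated with a simulated chain X → Y → Z. This is a sketch on made-up Gaussian data, and it uses partial correlation as a simple stand-in for the conditional independence that a network score captures; it is not the analysis run on the flow cytometry data. If Y is unmeasured, X and Z still look connected. If Y is measured, the X to Z dependence vanishes once Y is accounted for, so a direct X → Z edge can be dismissed.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(1)
n = 5000
# Hypothetical linear chain X -> Y -> Z with independent noise at each step.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 0.5) for xi in x]
z = [yi + rng.gauss(0, 0.5) for yi in y]

r_xz, r_xy, r_yz = pearson(x, z), pearson(x, y), pearson(y, z)
# Correlation between X and Z after removing the part explained by Y.
partial_xz_given_y = (r_xz - r_xy * r_yz) / sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))
print(r_xz, partial_xz_given_y)
```

The marginal correlation is strong while the partial correlation is close to zero, matching the two cases on the slide.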
Indirect signaling - Complex example • Is this a mistake? • The real picture • Phospho-protein specific • More than one pathway of influence [Figures: PKC, Raf, Mek versus PKC, Ras, Raf s497, Raf s259, Mek]
How well did we do? • 15/17 Classic [Figure: inferred network. Legend: Expected Pathway, Phospho-Proteins, Phospho-Lipids, Perturbed in data. Nodes: PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, P44/42, Akt, PIP2]
Signaling pathway reconstruction • 15/17 Classic • 17/17 Reported • 3 Missed [Sachs et al 2005] [Figure: inferred network. Legend: Expected Pathway, Reported, Reversed, Missed, Phospho-Proteins, Phospho-Lipids, Perturbed in data. Nodes: PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, Erk, Akt, PIP2]
Caveats • Inhibitor specificity • Binding site similar across proteins [Cartoon caption: "I think I’ll bind here"] • Reagent availability and specificity • Data quality • These are issues in many biological apps!
Outline • What are signaling pathways? • What kind of data is available to study them? • How do we use Bayesian networks to learn their structure? • Two extensions: • Markov neighborhood algorithm • Bayesian network based cyclic networks (BBCs)
Building larger networks • 4 color capability: Model 12 variables • 12 color capability: Model 50-100 variables • ~80 proteins involved in MAPK signaling (11, at the cutting edge, is NOT enough!) [Figure: network over PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, P44/42, Akt, PIP2]
Measured subsets = Incomplete dataset (Missing data) • Insufficient information for standard approaches (will perform poorly) • Use a set of biologically motivated assumptions to constrain search, and to reduce the number of experiments: (11 choose 4) = 330
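The count on this slide is the number of distinct four-molecule panels that can be drawn from 11 molecules. A quick check, with the molecules numbered 1 to 11 purely for illustration:

```python
from itertools import combinations
from math import comb

n_molecules = 11   # measured molecules in the T-cell dataset
panel_size = 4     # molecules measurable together with 4 colors

# Closed-form count of 4-molecule subsets.
n_subsets = comb(n_molecules, panel_size)

# The same count by explicit enumeration of every possible panel.
panels = list(combinations(range(1, n_molecules + 1), panel_size))
print(n_subsets, len(panels))
```

Measuring every panel would take 330 experiments, which is the cost the constrained search is meant to cut down.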
Constraining the search • Identify candidate parents • Using ‘Markov neighborhoods’ (for each variable) • Plus potential perturbation parents
Approach overview • Measured subsets: Molecules 1, 3, 7, 9; Molecules 1, 2, 6, 11; Molecules 2, 4, 7, 10 • Bayesian Network Analysis (Constrained search) [Figure: network over Mek1/2, PIP2, PKA, PKC, PIP3, p38, Jnk, Plc, Erk, Raf, Akt]
Neighborhood reduction • 4 color capability • Conditional independencies in the substructure? (A, B, C) [Figure: nodes A, B, C, D, E, F; label "411"]
Accurate Reproduction of Model • ~15 experiments, 4-colors • Confidence value different from original model [Figure: network over PKC, PKA, Raf, Plc, Jnk, P38, Mek, PIP3, Erk, Akt, PIP2]
Active learning approach [Figure: network over Mek1/2, PIP2, PKA, PKC, PIP3, p38, Jnk, Plc, Erk, Raf, Akt]
Outline • What are signaling pathways? • What kind of data is available to study them? • How do we use Bayesian networks to learn their structure? • Two extensions: • Markov neighborhood algorithm • Bayesian network based cyclic networks (BBCs)
Learning cyclic structures with Bayesian networks • Biological networks contain many loops • Bayesian networks are constrained to be acyclic So…
Overcoming acyclicity • Signaling pathways contain many cycles • Bayesian networks are constrained to be acyclic • How can we accurately model pathways with cycles? • Develop a new, Bayesian network derived algorithm that models cycles… [Figure: GRB2/SOS, Ras, Raf, MEK, Erk]
Bayesian Network Based Cyclic Networks (BBNs) • I. Break loops with molecule inhibitors • II. Use BN to learn the structure (now not cyclic!) • III. Close loops • Solomon Itani [Figure: GRB2/SOS, Ras, Raf, MEK, Erk, with Mek inhibitor]
Bayesian Network Based Cyclic Networks (BBNs) • I. Break loops with molecule inhibitors • Detect loops: P(A)^{A*} ~= P(A) • II. Use BN to learn the structure (now not cyclic!) • III. Close loops: P(B|Pa(B))^{A*} ~= P(B|Pa(B)), A → B [Figure: GRB2/SOS, Ras, Raf, MEK, Erk]
Future work • Larger network from overlapping sets (Markov neighborhood) • Dynamic models over time • Differences in signaling (sub-populations, treatment conditions, cell types, disease states)
Acknowledgements • Garry Nolan • Dana Pe’er • Doug Lauffenburger • Omar Perez • Dennis Mitchell • Shigeru Okumura • Mesrob Ohannessian • Solomon Itani • Funding: LLS post doctoral fellowship
Mathematical Intuition • [Figure: nodes A, B, C] C is independent of A given B. • [Figure: nodes A, B, C, D] C is independent of A given B and D. • No need to introduce time!!! • When loops are broken, the result is a BN!!!
Prediction: Erk → Akt • Erk1/2 unperturbed • Erk → Akt not well established in literature • Predictions: • Erk1/2 influences Akt • While correlated, Erk1/2 does not influence PKA [Figure: PKC, PKA, Raf, Mek, Erk1/2, Akt]
Validation • SiRNA on Erk1/Erk2 • Select transfected cells • Measure Akt and PKA [Figure: P-Akt and P-PKA for control, stimulated versus Erk1 siRNA, stimulated; P=9.4e-5 (P-Akt), P=0.28 (P-PKA)]