Implication Networks from Large Gene-expression Datasets

Debashis Sahoo, PhD Candidate, Electrical Engineering, Stanford University. Joint work with David Dill, Andrew Gentles, Rob Tibshirani, and Sylvia Plevritis. Integrative Cancer Biology Program (ICBP), Stanford University.


Presentation Transcript


  1. Implication Networks from Large Gene-expression Datasets • Debashis Sahoo, PhD Candidate, Electrical Engineering, Stanford University • Joint work with David Dill, Andrew Gentles, Rob Tibshirani, Sylvia Plevritis • Integrative Cancer Biology Program (ICBP), Stanford University

  2. Motivation • Current approaches • Clustering • Co-expression • Linear regression • Mutual information [Scatter plot: CCNB2 vs. BUB1B]

  3. Hidden Relationships • Pearson’s correlation = -0.1 • GABRB1 and ACPP are not linearly related. • There is a Boolean relationship • ACPP high ⇒ GABRB1 low • GABRB1 high ⇒ ACPP low [Scatter plot: GABRB1 vs. ACPP]

  4. Outline • Motivation • Boolean analysis • Boolean implication network • Biological insights • Conserved Boolean network • Conclusion

  6. Boolean Analysis Workflow • Get data: GEO [Edgar et al. 02] • Normalize: RMA [Irizarry et al. 03] • Determine thresholds • Discover Boolean relationships • Biological interpretation

  7. Determine Threshold • A threshold is determined for each gene. • The arrays are sorted by gene expression. • StepMiner is used to determine the threshold [Sahoo et al. 07]. [Plot: CDH expression vs. sorted arrays; labels: High, Intermediate, Low, Threshold]
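
The thresholding step above can be sketched as a one-step fit to the sorted expression values. This is a simplified stand-in, not the StepMiner implementation: it picks the split that minimizes the squared error of a two-level step and places the threshold midway between the two level means. The function names, the midpoint rule, and the `margin` default of 0.5 for the intermediate band are assumptions made for illustration.

```python
def step_threshold(values):
    """Fit a single step to the sorted values and return a threshold.

    Assumption: the threshold is the midpoint of the two fitted level means.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two expression values")

    # Prefix sums let us score every split point in O(n) overall.
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best_sse, best_k = None, None
    for k in range(1, n):  # k values on the low side, n - k on the high side
        left_sum = prefix[k]
        right_sum = prefix[n] - prefix[k]
        left_sse = prefix_sq[k] - left_sum ** 2 / k
        right_sse = (prefix_sq[n] - prefix_sq[k]) - right_sum ** 2 / (n - k)
        sse = left_sse + right_sse
        if best_sse is None or sse < best_sse:
            best_sse, best_k = sse, k

    low_mean = prefix[best_k] / best_k
    high_mean = (prefix[n] - prefix[best_k]) / (n - best_k)
    return (low_mean + high_mean) / 2


def discretize(value, threshold, margin=0.5):
    """Return 'high', 'low', or None for values in the intermediate band.

    The margin of 0.5 is an assumed default, not taken from the slides.
    """
    if value > threshold + margin:
        return "high"
    if value < threshold - margin:
        return "low"
    return None


expression = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]  # toy values
threshold = step_threshold(expression)
levels = [discretize(v, threshold) for v in expression]
```

Values that fall inside the intermediate band return `None` so that later steps can ignore them instead of forcing a high or low call.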

  8. Discovering Boolean Relationships • Analyze pairs of genes. • Analyze the four different quadrants. • Identify sparse quadrants. • Record the Boolean relationships. • ACPP high ⇒ GABRB1 low • GABRB1 high ⇒ ACPP low [Scatter plot: GABRB1 vs. ACPP, quadrants numbered 1 to 4]
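
A minimal sketch of the quadrant analysis for one gene pair follows. It assumes each gene already has a threshold, and it flags a quadrant as sparse with a naive fraction cutoff (`max_fraction`), which is a hypothetical placeholder for a proper statistical test. The function names and the toy values are illustrative only.

```python
from collections import Counter


def to_level(value, threshold):
    """Label a single expression value relative to its gene's threshold."""
    return "high" if value > threshold else "low"


def quadrant_counts(a_values, b_values, a_threshold, b_threshold):
    """Count arrays in each (A level, B level) quadrant."""
    counts = Counter()
    for a, b in zip(a_values, b_values):
        counts[(to_level(a, a_threshold), to_level(b, b_threshold))] += 1
    return counts


def sparse_quadrants(counts, max_fraction=0.05):
    """Return quadrants holding at most max_fraction of all points.

    Assumption: a plain fraction cutoff stands in for a statistical test.
    """
    total = sum(counts.values())
    quadrants = [(a, b) for a in ("low", "high") for b in ("low", "high")]
    return [q for q in quadrants if counts[q] <= max_fraction * total]


# Toy data: no array has both genes high.
acpp = [9, 8, 9, 1, 2, 1, 1, 2]
gabrb1 = [1, 2, 1, 9, 8, 1, 2, 1]
counts = quadrant_counts(acpp, gabrb1, a_threshold=5, b_threshold=5)
sparse = sparse_quadrants(counts)
```

In this toy input only the (high, high) quadrant is empty, which corresponds to ACPP high ⇒ GABRB1 low and, equivalently, GABRB1 high ⇒ ACPP low.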

  9. Boolean Relationships • There are six possible Boolean relationships • A low ⇒ B low • A low ⇒ B high • A high ⇒ B low • A high ⇒ B high • Equivalent • Opposite
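
The six relationships can be read off from which quadrants are sparse. The sketch below encodes that mapping under the assumption that a quadrant is written as (A level, B level); the table and function names are hypothetical. One sparse quadrant gives an asymmetric implication, and two sparse quadrants on a diagonal give a symmetric relationship.

```python
# Which implication follows when a given (A level, B level) quadrant is sparse.
RELATION_BY_SPARSE_QUADRANT = {
    ("low", "high"): "A low => B low",
    ("low", "low"): "A low => B high",
    ("high", "high"): "A high => B low",
    ("high", "low"): "A high => B high",
}

EQUIVALENT = {("low", "high"), ("high", "low")}  # only the diagonal is populated
OPPOSITE = {("low", "low"), ("high", "high")}    # only the anti-diagonal is populated


def classify(sparse):
    """Name the Boolean relationship(s) implied by a set of sparse quadrants."""
    sparse = frozenset(sparse)
    if sparse == EQUIVALENT:
        return ["Equivalent"]
    if sparse == OPPOSITE:
        return ["Opposite"]
    return sorted(RELATION_BY_SPARSE_QUADRANT[q] for q in sparse)
```

For example, `classify([("high", "high")])` yields the single implication "A high => B low", while an empty input yields no relationship.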

  10. Four Asymmetric Boolean Relationships • A low ⇒ B low • A low ⇒ B high • A high ⇒ B low • A high ⇒ B high • Examples: PTPRC low ⇒ CD19 low; XIST high ⇒ RPS4Y1 low; FAM60A low ⇒ NUAK1 high; COL3A1 high ⇒ SPARC high [Scatter plots: CD19 vs. PTPRC, RPS4Y1 vs. XIST, NUAK1 vs. FAM60A, SPARC vs. COL3A1]

  11. Two Symmetric Boolean Relationships • Equivalent [Scatter plot: CCNB2 vs. BUB1B] • Opposite [Scatter plot: EED vs. XTP7]

  12. Outline • Motivation • Boolean analysis • Boolean implication network • Biological insights • Conserved Boolean network • Conclusion

  13. Boolean Implication Network • Boolean implications form a directed graph • Nodes: for each gene A, two nodes (A high, A low) • Edges: from A high to B low when A high ⇒ B low [Diagram: nodes A high, B low, C high]
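
The graph described above can be sketched with a dictionary of adjacency sets, using (gene, level) tuples as nodes. The class and method names are hypothetical. As a design choice that the slide does not state, the sketch also stores the contrapositive of each implication, since A high ⇒ B low is logically the same statement as B high ⇒ A low.

```python
from collections import defaultdict

OPPOSITE_LEVEL = {"high": "low", "low": "high"}


class ImplicationNetwork:
    """Directed graph whose nodes are (gene, level) pairs."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_implication(self, gene_a, level_a, gene_b, level_b):
        """Record 'gene_a level_a => gene_b level_b' and its contrapositive."""
        self.edges[(gene_a, level_a)].add((gene_b, level_b))
        self.edges[(gene_b, OPPOSITE_LEVEL[level_b])].add(
            (gene_a, OPPOSITE_LEVEL[level_a])
        )

    def implied_by(self, gene, level):
        """Return the nodes directly implied by (gene, level), sorted."""
        return sorted(self.edges.get((gene, level), set()))


network = ImplicationNetwork()
network.add_implication("ACPP", "high", "GABRB1", "low")
```

After this call, `network.implied_by("GABRB1", "high")` returns the contrapositive target `("ACPP", "low")`.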

  14. Size of the Boolean Networks [Figure: network size by relationship type: high ⇒ high, high ⇒ low, low ⇒ low, low ⇒ high, Equivalent, Opposite]

  15. Boolean Networks Are Not Scale Free [Plots for Human: #probesets vs. #relationships for Total, Symmetric, and Asymmetric relationships]

  16. Outline • Motivation • Boolean analysis • Boolean implication network • Biological insights • Conserved Boolean network • Conclusion

  17. Gender Specific • XIST: X-inactivation-specific transcript; expressed in females • RPS4Y1: Y-linked gene; expressed in males only • Boolean relationship: XIST high ⇒ RPS4Y1 low [Day et al. 07] [Scatter plot: RPS4Y1 vs. XIST]

  18. Tissue Specific • ACPP: acid phosphatase, prostate; prostate-specific gene • GABRB1: GABA A receptor, beta 1; brain specific • Boolean relationship: ACPP high ⇒ GABRB1 low [Scatter plot: GABRB1 vs. ACPP]

  19. Development • HOXD3: homeobox D3; fruit fly antennapedia homolog • HOXA13: homeobox A13; fruit fly ultrabithorax homolog • Boolean relationship: HOXD3 high ⇒ HOXA13 low [Rinn et al. 07] [Scatter plot: HOXA13 vs. HOXD3]

  20. Differentiation • PTPRC: protein tyrosine phosphatase, receptor type, C (B220); expressed in B cell precursors and mature B cells • CD19: expressed in mature B cells • Boolean relationship: PTPRC low ⇒ CD19 low [Scatter plot: CD19 vs. PTPRC]

  21. Biological Insights • Gender: RPS4Y1 vs. XIST • Tissue: GABRB1 vs. ACPP • Development: HOXA13 vs. HOXD3 • Differentiation: CD19 vs. PTPRC

  22. Outline • Motivation • Boolean analysis • Boolean implication network • Biological insights • Conserved Boolean network • Conclusion

  23. Conserved Boolean Networks • Find orthologs between human, mouse and fly using EUGene database [Gilbert, 02]. • Search for orthologous gene pairs that have the same Boolean relationship. [Diagram: network sizes Fly 17M, Human 208M, Mouse 336M; overlap counts 4M and 41K]
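
The conservation search above can be sketched as a lookup through an ortholog map. The function name, the tuple layout (gene, relation, gene), and the toy inputs are assumptions for illustration; real ortholog tables can map one gene to several, which this sketch ignores.

```python
def conserved_relationships(rels_a, rels_b, orthologs_a_to_b):
    """Return relationships in species A that also hold between orthologs in B.

    rels_a, rels_b: sets of (gene1, relation, gene2) tuples.
    orthologs_a_to_b: dict mapping a species-A gene to one species-B gene.
    """
    conserved = set()
    for gene1, relation, gene2 in rels_a:
        ortholog1 = orthologs_a_to_b.get(gene1)
        ortholog2 = orthologs_a_to_b.get(gene2)
        if ortholog1 is None or ortholog2 is None:
            continue  # no ortholog known for at least one gene
        if (ortholog1, relation, ortholog2) in rels_b:
            conserved.add((gene1, relation, gene2))
    return conserved


# Toy inputs, for illustration only.
human = {
    ("CCNB2", "equivalent", "BUB1B"),
    ("GABRB1", "high=>low", "BUB1B"),
    ("XIST", "high=>low", "RPS4Y1"),
}
mouse = {
    ("Ccnb2", "equivalent", "Bub1b"),
    ("Gabrb1", "high=>low", "Bub1b"),
}
human_to_mouse = {"CCNB2": "Ccnb2", "BUB1B": "Bub1b", "GABRB1": "Gabrb1"}
shared = conserved_relationships(human, mouse, human_to_mouse)
```

Symmetric relationships such as "equivalent" would need both gene orders stored (or a canonical order) for the tuple lookup to be order independent.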

  24. Conserved Boolean Relationships • Two largest connected components in the network of equivalent genes • 178 genes: highly enriched for cell-cycle and DNA replication • 32 genes: highly enriched for synaptic functions [Scatter plots: Mouse Ccnb2 vs. Bub1b, Human CCNB2 vs. BUB1B, Fly CycB vs. Bub1]
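
Finding the connected components of the equivalent-gene network is a standard graph traversal. The sketch below uses breadth-first search over an undirected graph built from equivalent pairs; the function name and the placeholder gene names in the usage line are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(pairs):
    """Group genes linked by equivalent relationships, largest group first."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            gene = queue.popleft()
            component.add(gene)
            for nxt in neighbors[gene]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Placeholder pairs: one three-gene component and one two-gene component.
components = connected_components(
    [("CCNB2", "BUB1B"), ("BUB1B", "GENE_X"), ("GENE_Y", "GENE_Z")]
)
```

Each resulting gene set could then be passed to an enrichment analysis, which is outside the scope of this sketch.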

  25. Conserved Asymmetric Boolean Relationships • GABRB1-expressing cells have low cell-cycle (BUB1B) activity. [Scatter plots: Mouse Gabrb1 vs. Bub1b, Human GABRB1 vs. BUB1B, Fly Lcch3 vs. Bub1]

  26. Outline • Motivation • Boolean analysis • Boolean implication network • Biological insights • Conserved Boolean network • Conclusion

  27. Conclusion • Boolean analysis • Boolean relationships are directly visible on the scatter plot. • Enables discovery of asymmetric relationships. • Can reveal known biological processes. • Has potential for new biological discovery. • Boolean network • Is large • Is not scale free

  28. Acknowledgements • Leonore A Herzenberg • James Brooks • Joe Lipsick • Gavin Sherlock • Howard Chang • Stuart Kim • The Felsher Lab: • Natalie Wu • Cathy Shachaf • Dean Felsher • Funding: ICBP Program (NIH grant: 5U56CA112973-02)

  29. The End

  30. Example

  31. Determine Threshold • It's hard to determine a threshold for this gene. • StepMiner usually puts the threshold in the middle in this case.

  32. Statistical Tests • Compute the expected number of points under the independence model: $\text{statistic} = \dfrac{\text{expected} - \text{observed}}{\sqrt{\text{expected}}}$ • Compute maximum likelihood estimate of the error rate: $\text{error rate} = \dfrac{1}{2}\left(\dfrac{a_{00}}{a_{00}+a_{01}} + \dfrac{a_{00}}{a_{00}+a_{10}}\right)$ [Diagram: quadrant counts $a_{00}$, $a_{01}$, $a_{10}$, $a_{11}$]
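
The two tests above can be sketched for the case where the quadrant counted by a00 is the candidate sparse quadrant. Several details here are assumptions not shown on the slide: the index convention (first index for gene A, second for gene B, 0 = low, 1 = high), the expected count computed as the product of the row and column totals divided by the grand total, and the cutoffs of 3.0 for the statistic and 0.1 for the error rate.

```python
import math


def sparse_quadrant_test(a00, a01, a10, a11):
    """Return (statistic, error_rate) for the a00 quadrant.

    Assumed convention: a_ij counts arrays with gene A at level i and
    gene B at level j, where 0 means low and 1 means high.
    """
    total = a00 + a01 + a10 + a11
    row_total = a00 + a01   # arrays with A low
    col_total = a00 + a10   # arrays with B low
    if total == 0 or row_total == 0 or col_total == 0:
        raise ValueError("not enough points to test this quadrant")

    # Expected count in the quadrant if A and B were independent.
    expected = row_total * col_total / total
    statistic = (expected - a00) / math.sqrt(expected)
    error_rate = 0.5 * (a00 / row_total + a00 / col_total)
    return statistic, error_rate


def is_sparse(a00, a01, a10, a11, min_statistic=3.0, max_error_rate=0.1):
    """Apply assumed cutoffs; both default values are illustrative."""
    statistic, error_rate = sparse_quadrant_test(a00, a01, a10, a11)
    return statistic > min_statistic and error_rate < max_error_rate
```

The same function covers the other three quadrants if the counts are permuted so that the quadrant under test is passed first, with its row and column neighbors second and third.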
