
Pathway Ranking Tool

Dimitri Kosturos, Linda Tsai. SoCalBSI, 8/21/2003. Project overview: BioDiscovery, Inc. at Marina del Rey. Analyzing microarray data at the pathway level instead of the individual gene level. Methods: Enrichment Analysis, Permutational Statistics.


Presentation Transcript


  1. Pathway Ranking Tool • Dimitri Kosturos, Linda Tsai • SoCalBSI, 8/21/2003

  2. Project Overview • BioDiscovery, Inc. at Marina del Rey • Analyzing microarray data at the pathway level instead of the individual gene level • Methods: - Enrichment Analysis - Permutational Statistics - S. Metric - Multivariate test

  3. Project Overview, cont. • Validation of statistical methods • 2 data sets: Brain Tumor, Interferon-gamma • Sources of annotation: BioCarta, KEGG, Gene Ontology

  4. Project Flowchart (figure; box labels: microarray, algorithm, phenotype, pathway)

  5. Research and Development in GeneSight • GeneSight is a data analysis software package • Features: - Statistical significance testing - Multiple data visualizations - Automated gene annotation - Complete result reports - Pathway analysis (?)

  6. Biology of Brain Tumor • Glioblastoma multiforme (GBM) is the most malignant of the glial tumors, classified as grade IV. • Many brain tumors are currently incurable. • Average survival time: 1 year

  7. Bad Genes Foment Trouble • Proto-oncogenes: promote normal cell growth; their mutated forms (oncogenes) drive excessive growth • Tumor suppressor genes: retard cell growth • http://www.med.harvard.edu/publications/On_The_Brain/Volume4/Number2/SP95Awry.html

  8. Biology of Interferon • Interferons are a class of cytokines that mediate antiviral, antiproliferative, and antitumor activities, among others. • IFN gamma is produced by T lymphocytes in response to mitogens or to antigens. • IFNs bind to their receptors and initiate the JAK-STAT signaling cascade.

  9. Biology of Interferon, cont. http://www.grt.kyushu-u.ac.jp/eny-doc/pathway/ifn_gamma.html

  10. Gene Annotations • Grouping related genes together into pathways: (A) BioCarta, ex: p53 Signaling Pathway; (B) KEGG, ex: Citrate cycle (TCA cycle) • Grouping genes into structured, controlled vocabularies (ontologies): Gene Ontology - Biological Process, ex: angiogenesis, apoptosis - Molecular Function, ex: DNA binding activity - Cellular Component, ex: nucleus, mitochondria

  11. Traditional method of ranking gene pathways • Steps: 1. Mann-Whitney test: obtain the list of probe sets that satisfy a chosen p-value threshold. 2. Cluster analysis: count how many of the listed probe sets occur in each cluster (pathway). • Example: original data: 12,625 genes. 1. Select genes with p-value < 0.001, which narrows the list to 927 genes. 2. Group those 927 genes into clusters.
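The two steps of the traditional method can be sketched in a few lines of Python. This is only an illustration, not the GeneSight implementation: the function names (`mann_whitney_p`, `rank_pathways`), the dictionary-based data layout, and the use of the normal approximation without tie correction are all assumptions made here.

```python
import math

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney p-value (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))

def rank_pathways(expr, is_tumor, pathways, alpha=0.001):
    """Step 1: flag genes with p < alpha. Step 2: count flagged genes per pathway.

    expr: gene -> list of expression values (one per sample)
    is_tumor: list of booleans, one per sample
    pathways: pathway name -> list of gene names
    """
    flagged = set()
    for gene, values in expr.items():
        normal = [v for v, t in zip(values, is_tumor) if not t]
        tumor = [v for v, t in zip(values, is_tumor) if t]
        if mann_whitney_p(normal, tumor) < alpha:
            flagged.add(gene)
    counts = {name: len(flagged & set(genes)) for name, genes in pathways.items()}
    ranking = sorted(counts.items(), key=lambda item: -item[1])
    return flagged, ranking
```

Pathways with the most flagged genes come first in `ranking`. A fuller enrichment analysis would also account for pathway size, which this sketch leaves out.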

  12. Mann-Whitney Test, Denovo Glioblastoma p<0.001

  13. How Affy. Microarray Chips Work • Best results: genes hybridize perfectly with Perfect Match, and not at all with Mismatch. • PM: Perfect Match • MM: Mismatch • http://www.ucl.ac.uk/oncology/MicroCore/HTML_resource/Norm_Affy1.htm

  14. Example of GeneSight PlotData: Theoretical Tumor Expression Levels (Log Transformed)
      Columns are conditions; rows are genes (probe sets).

                     Normal   Normal   Tumor   Tumor
      Probe Set A      4.5      3.8     10.2    11.1
      Probe Set B      2.3      2.7     13.5    13.6
      Probe Set C      7.8      8.2      1.4     1.8
      Probe Set A      3.5      4.2      8.9     9.6

      Notice column replicates, Probe Set replicates.

  15. Given Data Sets • Given two data sets: Brain Tumor, IFN-γ • The Brain Tumor data set has 5+ tumor types; however, only 2 tumor types were used (Denovo Glioblastoma, Progressive Glioblastoma) • IFN-γ data set: the entire data set was used.

  16. What and why? • Goal: write a prototype extension to GeneSight that uses permutational statistics to develop a custom distribution for a given Microarray data set. • Overall significance: the software provides a list of (potentially) significant pathways that enables researchers to focus their work.

  17. What is permutational statistics? (In this context.) • Choose different Control (C) and Experiment (E) groupings of the samples (permute). Example with samples 1 2 3 4: one grouping labels them E E C C, another E C E C. • By iterating through an adequate number of permutations, we can determine if a pathway is likely to be significant (p-value).
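A small worked example may help here. The sketch below enumerates every way to relabel the samples as Experiment or Control and reports how often a relabeling gives a mean difference at least as large as the observed one. The function name `permutation_p_value` and the choice of mean difference as the statistic are assumptions for illustration; the slides apply the same idea to a pathway-level statistic.

```python
from itertools import combinations

def permutation_p_value(values, experiment_idx):
    """Exact two-sided permutation p-value for the difference in group means.

    values: one measurement per sample
    experiment_idx: indices of the samples actually labeled Experiment
    """
    n = len(values)
    k = len(experiment_idx)

    def mean_diff(idx):
        exp = [values[i] for i in range(n) if i in idx]
        ctl = [values[i] for i in range(n) if i not in idx]
        return sum(exp) / len(exp) - sum(ctl) / len(ctl)

    observed = abs(mean_diff(set(experiment_idx)))
    # Every possible choice of k Experiment samples, including the real one.
    groupings = list(combinations(range(n), k))
    # Small tolerance so floating-point noise does not drop exact ties.
    extreme = sum(1 for g in groupings if abs(mean_diff(set(g))) >= observed - 1e-12)
    return extreme / len(groupings)
```

With the four samples of the example (two E, two C) there are only 6 groupings, so even perfectly separated values such as `[10.2, 11.1, 4.5, 3.8]` cannot score below 2/6. More replicates are needed before small p-values become possible.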

  18. Permutational Stats. • There are two versions of the S. Metric currently implemented. • M = number of genes flagged as significant • Total = total number of genes in the pathway • S. Metric I = (formula not preserved in the transcript) • S. Metric II = (formula not preserved in the transcript)

  19. (Layman's) How Statistics Works • Flow: Data, then Statistic (here S. Metric I, II), then P-Value • Permute the data and recompute the statistic each time. • After all permutations are done, calculate the p-value.

  20. Initial Significance Flagging: S. Metric Algorithm • Take at least 10,000 unique permutations. A unique permutation is determined by a Permute class.
      For each condition
          For each permutation
              For each gene
                  Calc. mean diff.
                  Calc. T-stat
              End For
              For each pathway
                  store the statistic
              End For
          End For
          calcPvalue(stored statistic)
      End For
      Output: pValue
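The loop structure above can be sketched in Python for a single pair of conditions. Several pieces are assumptions made for illustration only: the S. Metric formulas are not available in this transcript, so the pathway statistic used here is a stand-in (the fraction of pathway genes whose absolute Welch t-statistic reaches a cutoff `t_cut`); the function names are hypothetical; and unique groupings are drawn by simple rejection sampling rather than by the Permute class the slide mentions.

```python
import math
import random

def welch_t(a, b):
    """Welch t-statistic for two groups (each needs at least 2 values)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / se if se > 0 else 0.0

def gene_t_stats(expr, experiment_idx):
    """For each gene: t-statistic of Experiment samples vs Control samples."""
    exp = set(experiment_idx)
    stats = {}
    for gene, values in expr.items():
        e = [v for i, v in enumerate(values) if i in exp]
        c = [v for i, v in enumerate(values) if i not in exp]
        stats[gene] = welch_t(e, c)
    return stats

def flagged_fraction(t_stats, genes, t_cut):
    """Stand-in pathway statistic (NOT the slides' S. Metric): flagged / total."""
    flagged = sum(1 for g in genes if abs(t_stats[g]) >= t_cut)
    return flagged / len(genes)

def pathway_p_values(expr, experiment_idx, pathways, n_perm=10000, t_cut=2.0, seed=0):
    rng = random.Random(seed)
    n = len(next(iter(expr.values())))
    k = len(experiment_idx)
    real = frozenset(experiment_idx)
    # Cannot draw more unique groupings than exist (excluding the real one).
    target = min(n_perm, math.comb(n, k) - 1)
    groupings = set()
    while len(groupings) < target:
        candidate = frozenset(rng.sample(range(n), k))
        if candidate != real:
            groupings.add(candidate)
    observed_t = gene_t_stats(expr, real)
    observed = {p: flagged_fraction(observed_t, g, t_cut) for p, g in pathways.items()}
    null = {p: [] for p in pathways}
    for grouping in groupings:              # for each permutation
        t_stats = gene_t_stats(expr, grouping)   # for each gene
        for p, g in pathways.items():       # for each pathway: store the statistic
            null[p].append(flagged_fraction(t_stats, g, t_cut))
    # p-value: share of groupings (counting the real one) at least as extreme.
    return {p: (1 + sum(s >= observed[p] for s in null[p])) / (1 + len(groupings))
            for p in pathways}
```

Computing the per-gene statistics once per permutation and reusing them for every pathway mirrors the order of the loops on the slide and avoids repeating the t-tests for genes shared between pathways.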

  21. Limitations • Computational Power (Memory, CPU) • Required number of replicates (8,8)
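One plausible reading of the (8,8) replicate requirement, offered here as an assumption rather than something the slides state, is that it is the smallest balanced design with more than 10,000 distinct Control/Experiment groupings. The count of distinct groupings is a binomial coefficient, which the short sketch below computes (the helper name `unique_groupings` is hypothetical).

```python
from math import comb

def unique_groupings(n_control, n_experiment):
    """Number of distinct ways to choose which samples are labeled Experiment."""
    return comb(n_control + n_experiment, n_experiment)

for reps in (4, 6, 7, 8):
    print(f"({reps},{reps}) replicates: {unique_groupings(reps, reps)} unique groupings")
```

With 7 replicates per group there are 3,432 groupings, which falls short of 10,000; with 8 per group there are 12,870.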

  22. Output of result

  23. Validation of pathway analysis: Method 1 ???? • Problem: lack of insignificant pathways

  24. Validation of pathway analysis: Method 2 • Comparison of Prediction Methods (figure; y-axis: # of identified significant pathways, 0 to 16; x-axis: # of pathways in BioCarta sorted by p-value, 1 to 96)
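A curve like the one in this figure could be produced as follows. This is a guess at the procedure based only on the axis labels: sort pathways by p-value and, at each rank, count how many pathways expected to be significant have appeared so far. The function name `cumulative_hits` and the input format are assumptions.

```python
def cumulative_hits(p_values, expected_significant):
    """Running count of expected-significant pathways along the p-value ranking.

    p_values: pathway name -> p-value from one prediction method
    expected_significant: set of pathway names expected to be significant
    """
    ranked = sorted(p_values, key=p_values.get)  # smallest p-value first
    hits = 0
    curve = []
    for name in ranked:
        if name in expected_significant:
            hits += 1
        curve.append(hits)
    return curve
```

Plotting one such curve per method on the same axes shows which method places the expected pathways nearer the top of its ranking.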

  25. Result: Brain Tumor - BioCarta

  26. Result: IFNG - Molecular Function (GO)

  27. Biological Limitations • Prediction of pathways to be significant in the conditions of interest is subjective. • Assumption of similar biological states between Denovo Glioblastoma and Progressive Glioblastoma.

  28. Future Direction • Finish modifying the Multivariate Statistic for use in the permutational method. This method uses PCA and Multivariate statistics. • Finish Validating the data produced using the Multivariate Statistic.

  29. Initial Results of Multivariate Stat. Sorted by p-value.

  30. Conclusion • It is not clear which is better: the S. Metric or traditional Enrichment Analysis. • Improvements can be made to the S. Metric.

  31. Acknowledgements • Dr. Bruce Hoff • Dr. Anton Petrov • SoCalBSI: Dr. Jamil Momand, Dr. Sandra Sharp, Dr. Nancy Warter-Perez, Dr. Wendie Johnston • National Science Foundation • National Institutes of Health
