
Sai Moturu




Presentation Transcript


  1. Sai Moturu

  2. Introduction • Current approaches to microarray data analysis • Analysis of experimental data, followed by a separate later step in which biological information is incorporated to make inferences • Integrative analysis technique in this paper • Integrates gene annotation with expression data to discover intrinsic associations between the two data sources based on co-occurrence patterns

  3. Methods and Data • Association Rules Discovery • Gene expression data • Gene annotation: Gene ontology categories, metabolic pathways and transcriptional regulators • Applied to two previously studied experiments

  4. Association Rules Discovery • Antecedent -> Consequent (X -> Y) • Measures of Quality • Support: P(X∪Y) • Confidence: P(Y|X) = P(X∪Y)/P(X) • Improvement: Confidence/P(Consequent) = P(X∪Y)/(P(X)*P(Y))
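
The three quality measures on this slide can be checked on a toy example. The sketch below is illustrative only: the function name `rule_metrics`, the item labels and the five toy gene transactions are assumptions, not taken from the presentation. It reads P(X∪Y) as the fraction of genes whose item set contains every item of both X and Y.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and improvement for the rule antecedent -> consequent.

    transactions: list of frozensets, one per gene (hypothetical layout).
    """
    n = len(transactions)
    x = frozenset(antecedent)
    y = frozenset(consequent)
    count_x = sum(1 for t in transactions if x <= t)
    count_y = sum(1 for t in transactions if y <= t)
    count_xy = sum(1 for t in transactions if (x | y) <= t)

    support = count_xy / n                                # P(X∪Y)
    confidence = count_xy / count_x if count_x else 0.0   # P(X∪Y) / P(X)
    p_y = count_y / n
    improvement = confidence / p_y if p_y else 0.0        # confidence / P(Y)
    return {
        "support_count": count_xy,
        "support": support,
        "confidence": confidence,
        "improvement": improvement,
    }


# Toy data: each frozenset holds one gene's annotation and expression items.
transactions = [
    frozenset({"pathway:glycolysis", "t6:under"}),
    frozenset({"pathway:glycolysis", "t6:under"}),
    frozenset({"pathway:glycolysis"}),
    frozenset({"pathway:TCA", "t6:over"}),
    frozenset({"t6:under"}),
]
metrics = rule_metrics(transactions, {"pathway:glycolysis"}, {"t6:under"})
print(metrics)
```

An improvement above 1 means the consequent is more frequent among genes carrying the antecedent than among all genes, which is why the later slides use a minimum improvement of 1.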

  5. Association Rules Discovery • Itemsets • Each gene, with the set of experiments in which it is over- or underexpressed • Gene characteristics • Constraint • The antecedent must consist of gene annotation • Expression Thresholds • Genes with log expression values > 1 are considered overexpressed and those with values < -1 underexpressed (a two-fold change)
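
A minimal sketch of how one gene's item set could be assembled from these thresholds. The function names, the `condition:over` / `condition:under` label format and the example values are hypothetical; the log values are assumed to be base-2 ratios, which is what the "two fold" remark suggests.

```python
def expression_items(log_ratios, threshold=1.0):
    """Turn {condition: log ratio} into discrete over/under items."""
    items = set()
    for condition, value in log_ratios.items():
        if value > threshold:
            items.add(f"{condition}:over")
        elif value < -threshold:
            items.add(f"{condition}:under")
        # Values inside [-threshold, threshold] produce no item.
    return items


def build_transaction(annotations, log_ratios, threshold=1.0):
    """Combine a gene's annotation items with its expression items."""
    return frozenset(annotations) | frozenset(expression_items(log_ratios, threshold))


# Hypothetical gene: one pathway annotation, three measured time points.
gene = build_transaction(
    {"pathway:TCA cycle"},
    {"t1": 0.2, "t6": 1.8, "t7": -1.3},
)
print(sorted(gene))
```

Keeping annotation and expression items in one set per gene is what lets a rule miner count co-occurrences between the two data sources directly.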

  6. Mining Association Rules • The association rules of interest have low support values and high confidence values • A variant of the Apriori algorithm is used, one that has previously helped in mining low-support, high-confidence patterns that are biologically significant
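
To make the search concrete, here is a brute-force sketch of mining rules whose antecedent contains only annotation items. It is not the Apriori variant the slide refers to, and it would not scale to thousands of genes; `mine_rules`, the `go:` prefix convention and the toy data are assumptions chosen for illustration.

```python
from itertools import combinations


def mine_rules(transactions, is_annotation, min_support_count,
               min_confidence, min_improvement=1.0, max_size=2):
    """Enumerate annotation -> expression rules that pass all three thresholds."""
    n = len(transactions)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    all_items = sorted(set().union(*transactions))
    annotations = [i for i in all_items if is_annotation(i)]
    expressions = [i for i in all_items if not is_annotation(i)]
    floor = max(1, min_support_count)  # also avoids dividing by zero below

    rules = []
    for a_size in range(1, max_size + 1):
        for ante in combinations(annotations, a_size):
            x = frozenset(ante)
            count_x = count(x)
            if count_x < floor:
                continue  # no rule built on x can reach the support floor
            for c_size in range(1, max_size + 1):
                for cons in combinations(expressions, c_size):
                    y = frozenset(cons)
                    count_xy = count(x | y)
                    if count_xy < floor:
                        continue
                    confidence = count_xy / count_x
                    improvement = confidence / (count(y) / n)
                    if confidence >= min_confidence and improvement >= min_improvement:
                        rules.append((x, y, count_xy, confidence, improvement))
    return rules


toy = [
    frozenset({"go:A", "c1:over"}),
    frozenset({"go:A", "c1:over"}),
    frozenset({"go:A", "c1:over", "c2:under"}),
    frozenset({"go:B", "c2:under"}),
    frozenset({"go:B"}),
    frozenset({"c1:over"}),
]
rules = mine_rules(toy, lambda item: item.startswith("go:"),
                   min_support_count=2, min_confidence=0.8, max_size=1)
for rule in rules:
    print(rule)
```

The support floor is expressed as a gene count, matching the way the results slides quote minimum support as a small integer.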

  7. Filtering • A major drawback of association rules is the huge number of rules generated • Many of the rules are also redundant • Two filters address this • Redundant filter • Single antecedent filter
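
The slide does not define either filter. As an illustration only, the sketch below implements one common notion of redundancy, which is an assumption and may differ from the definition used in the paper: a rule is dropped when another rule with the same consequent has a strictly smaller antecedent and support and confidence at least as high. The dictionary layout and the name `drop_redundant` are hypothetical.

```python
def drop_redundant(rules):
    """Remove rules dominated by a simpler rule with the same consequent.

    Assumed definition, not taken from the presentation.
    """
    kept = []
    for rule in rules:
        dominated = any(
            other is not rule
            and other["consequent"] == rule["consequent"]
            and other["antecedent"] < rule["antecedent"]  # strict subset
            and other["support"] >= rule["support"]
            and other["confidence"] >= rule["confidence"]
            for other in rules
        )
        if not dominated:
            kept.append(rule)
    return kept


example_rules = [
    {"antecedent": frozenset({"A"}), "consequent": frozenset({"Y"}),
     "support": 5, "confidence": 0.9},
    {"antecedent": frozenset({"A", "B"}), "consequent": frozenset({"Y"}),
     "support": 5, "confidence": 0.9},   # adds B but gains nothing: dropped
    {"antecedent": frozenset({"A", "C"}), "consequent": frozenset({"Y"}),
     "support": 3, "confidence": 1.0},   # higher confidence: kept
]
filtered = drop_redundant(example_rules)
print(len(filtered))
```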

  8. Diauxic shift dataset • Gene expression accompanying the metabolic shift from fermentation to respiration that occurs when fermenting yeast cells exhaust their glucose • Expression levels recorded at 7 time points • External information • Metabolic pathways • Transcriptional regulators

  9. Results • Association rules among metabolic pathways and expression patterns • 1126 out of over 6000 genes were annotated with at least one pathway • Association rules with minimum support of 5, minimum confidence of 40% and minimum improvement of 1 • Redundant and single antecedent filters applied • 21 association rules

  10. Results • Association rules among transcriptional regulators and expression patterns • 3490 genes were annotated with at least one regulator • Association rules with minimum support of 5, minimum confidence of 80% and minimum improvement of 1 • Redundant filter applied • 28 association rules

  11. Results • Association rules among transcriptional regulators, metabolic pathways and expression patterns • 3882 genes • Association rules with minimum support of 5, minimum confidence of 80% and minimum improvement of 1 • Redundant filter applied • 37 association rules

  12. Results

  13. Results

  14. Results

  15. Serum stimulation dataset • Gene expression program of human fibroblast after serum exposure • External information • Gene ontology terms

  16. Results • Association rules among biological process annotation and expression patterns • 4092 genes of over 8000 • Support of 4, min confidence of 10% and min improvement of 1 • Single antecedent and redundant filters applied • 12 associations

  17. Results • Association rules among terms from all GO categories • 4630 genes of over 8000 • Support of 4, min confidence of 10% and min improvement of 1 • Redundant filter applied • 31 associations

  18. Results

  19. Results

  20. Results

  21. Conclusions • Some of the biological implications matched the ones found experimentally • The others could be explored further • Integrative data analysis is very useful for meaningful discoveries using gene expression data
