
Analyzing Global Gene Expression


nevina


Presentation Transcript


  1. Analyzing Global Gene Expression

  2. Microarray Data • A “snapshot” of the amount of a particular gene being transcribed in a tissue • Measured for tens of thousands of genes • Use of multiple tissues on a single array allows for direct comparisons between tissues

  3. Objectives of Microarray Studies • Which genes are affected when exposed to a “treatment”? • Hit it with a stick and see what happens • Given a “profile” of levels of expression for many genes, can the unknown “treatment” be predicted? • Tumor or disease classification • Time course experiments allow the study of coregulation of genes, and for the reconstruction of regulatory networks

  4. Many computational and statistical problems • Image analysis (spot identification, background, etc.) • Data management and pipelining • “Normalization” of data • Clustering coregulated genes • Classifying tissue types • Regulatory network inference • Promoter identification (when combined with genomic sequence data)

  5. Normalization • [Figure: scatter plot with axes Cy5 signal (log2) and Cy3 signal (log2)]

  6. Normalization by iterative linear regression • Fit a line (y = mx + b) to the data set • Set aside outliers (residuals > 2 × s.e.) • Repeat until r² changes by < 0.001 • Then apply the slope and intercept to the original data set • D. Finkelstein et al., http://www.camda.duke.edu/CAMDA00/abstracts.asp
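The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of Finkelstein et al.: "s.e." is read as the residual standard error of the fit, Cy3 is taken as x and Cy5 as y, and "apply slope and intercept" is read as rescaling every Cy5 value by (y − b) / m so that the fitted line becomes y = x. The function names `fit_line` and `iterative_linear_normalize` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return m, b, r2

def iterative_linear_normalize(cy3, cy5, tol=0.001, k=2.0, max_iter=50):
    """Fit, set aside outliers, refit until r2 is stable, then rescale Cy5.

    Expects at least three (log2) data points per channel.
    Returns (normalized_cy5, m, b).
    """
    xs, ys = list(cy3), list(cy5)
    m, b, r2 = fit_line(xs, ys)
    for _ in range(max_iter):
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        # Assumption: "s.e." is the residual standard error (n - 2 d.o.f.).
        se = math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
        kept = [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) <= k * se]
        if len(kept) == len(xs) or len(kept) < 3:
            break  # no outliers left, or too few points to refit
        xs = [p[0] for p in kept]
        ys = [p[1] for p in kept]
        new_m, new_b, new_r2 = fit_line(xs, ys)
        converged = abs(new_r2 - r2) < tol
        m, b, r2 = new_m, new_b, new_r2
        if converged:
            break
    # Assumption: applying slope and intercept means mapping the fitted
    # line onto y = x for every point in the ORIGINAL data set.
    normalized = [(y - b) / m for y in cy5]
    return normalized, m, b

# Toy data: Cy5 = 2 * Cy3 + 1, with one spot perturbed to act as an outlier.
cy3 = [float(i) for i in range(20)]
cy5 = [2.0 * x + 1.0 for x in cy3]
cy5[10] += 8.0
normalized, slope, intercept = iterative_linear_normalize(cy3, cy5)
```

Outliers are excluded only while estimating the line; they are still rescaled in the final step, so genuinely differentially expressed spots remain in the output.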

  7. Normalization (Linear) • [Figure: scatter plot with axes Cy5 signal (log2) and Cy3 signal (log2)]

  8. Normalization (Linear) • [Figure: scatter plot with axes Cy5 signal (log2) and Cy3 signal (log2)]

  9. Looking for significance in microarray data • Tools: • SAM • Cluster • TreeView

  10. Identifying differential expression • SAM: Significance Analysis of Microarrays (Tusher et al., PNAS 2001) • http://www-stat.stanford.edu/~tibs/SAM/index.html

  11. More freeware tools for microarray analysis • Indexed at Y.F. Leung’s Functional Genomics site: http://ihome.cuhk.edu.hk/~b400559/ • MeV (TIGR): www.tigr.org • MAExplorer (NCI): www.lecb.ncifcrf.gov/MAExplorer/ • Expression Profiler (EBI): http://ep.ebi.ac.uk/ • Many of these tools require a Java Virtual Machine

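  12. Data Transformation (MM 4.1) • Compute activation or repression by the ratio of red/green control • However, raw ratios are asymmetric: activation gives values above 1 with no upper bound, while repression is squeezed between 0 and 1, so a 4-fold increase (4) and a 4-fold decrease (0.25) look very different • Solution: log transformation of the data • log10(4) ≈ 0.6 while log10(0.25) ≈ −0.6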
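The symmetry gained from the log transformation can be checked with a short Python sketch. The function name `log10_ratio` is hypothetical, and base 10 is used only because the slide does; other bases give the same symmetry.

```python
import math

def log10_ratio(red, green):
    """Log10 of the red/green intensity ratio; both inputs must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive to take a log ratio")
    return math.log10(red / green)

up = log10_ratio(4.0, 1.0)         # 4-fold activation
down = log10_ratio(1.0, 4.0)       # 4-fold repression (ratio 0.25)
unchanged = log10_ratio(2.0, 2.0)  # no change
```

After the transformation, equal fold changes in opposite directions have equal magnitude and opposite sign, and an unchanged gene sits at zero.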

  13. Pearson correlation coefficient • Provides a measure of similarity between expression patterns • Calculate mean and standard deviation for the rows in question (Table 4.2) • Subtract the appropriate mean from each value in a row and divide by the standard deviation to generate a normalized row of data • Multiply corresponding values from each row and keep a running total • Divide the total by number of elements in the row to get the correlation coefficient
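The steps on this slide translate directly into Python. This is a minimal sketch, assuming the standard deviation is the population form (dividing by the number of elements), which is what makes the final "divide by the number of elements" step yield the usual Pearson coefficient. The function name `pearson` is hypothetical.

```python
import math

def pearson(row_a, row_b):
    """Pearson correlation of two equal-length rows, following the slide's steps."""
    n = len(row_a)
    if n == 0 or n != len(row_b):
        raise ValueError("rows must be non-empty and of equal length")

    def standardize(row):
        # Step 1: mean and (population) standard deviation of the row.
        mean = sum(row) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in row) / n)
        if sd == 0:
            raise ValueError("a constant row has no defined correlation")
        # Step 2: subtract the mean and divide by the standard deviation.
        return [(v - mean) / sd for v in row]

    z_a = standardize(row_a)
    z_b = standardize(row_b)
    # Step 3: multiply corresponding values and keep a running total.
    total = 0.0
    for a, b in zip(z_a, z_b):
        total += a * b
    # Step 4: divide the total by the number of elements.
    return total / n

same = pearson([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
opposite = pearson([1.0, 2.0, 3.0, 4.0], [-1.0, -2.0, -3.0, -4.0])
```

The inputs are expected to be log-transformed ratios, so that activation and repression of equal fold contribute equally.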

  14. Merit of this coefficient • If patterns are identical, the value should be 1.0 • For reciprocal patterns, the value should be –1.0 • USE LOG TRANSFORMED DATA for computation of Pearson coefficient • Used in clustering

  15. Clustering genes • Combine rows pairwise based on Pearson coefficients until all rows are accounted for • Eisen et al. 1998. Cluster analysis and display of genome-wide expression patterns. PNAS 95:14863-14868
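The pairwise combining step can be sketched as a small agglomerative loop in Python. This is an illustration rather than the Cluster program itself: it assumes that a merged pair is replaced by the size-weighted average of its members' profiles, and that a constant profile is given a correlation of 0. The names `pearson` and `cluster_rows` are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation; returns 0.0 for a constant row (an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def cluster_rows(rows):
    """rows: dict of gene name -> list of log ratios.

    Returns the merge history as (members_a, members_b, correlation) tuples,
    most similar pair first.
    """
    clusters = [((name,), list(values)) for name, values in rows.items()]
    merges = []
    while len(clusters) > 1:
        # Find the pair of current clusters with the highest correlation.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: pearson(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (names_i, prof_i), (names_j, prof_j) = clusters[i], clusters[j]
        r = pearson(prof_i, prof_j)
        w_i, w_j = len(names_i), len(names_j)
        # Assumption: the merged node is the size-weighted mean profile.
        merged = [(w_i * x + w_j * y) / (w_i + w_j) for x, y in zip(prof_i, prof_j)]
        merges.append((names_i, names_j, r))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((names_i + names_j, merged))
    return merges

rows = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [1.0, 2.0, 3.0, 5.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
merges = cluster_rows(rows)
```

In this toy input g1 and g2 rise together and are joined first; g3, which falls as they rise, joins last with a negative correlation. The search over all pairs at every step is quadratic in the number of clusters, which is acceptable for a sketch but slow for tens of thousands of genes.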

  16. Guilt by association • Genes exhibiting similar expression patterns are thought to be involved in common physiological processes • Can be used to find potential regulatory sequences

  17. Controlling isozyme expression • Isozymes are distinct enzymes that catalyze the same reaction • Isozymes often differ in kinetic properties, cofactor requirements, and/or localization • Promoter functionality

  18. DeRisi Paper and exercise
