
Microarray Analysis

Microarray Analysis. Jesse Mecham, CS 601R. Microarray analysis comes down to experimental design, preprocessing, and data analysis. Experimental design covers elimination of confounding factors (same cell line, minimal exposure), timing of sampling, and technological considerations.


Presentation Transcript


  1. Microarray Analysis • Jesse Mecham • CS 601R

  2. Microarray Analysis • It all comes down to • Experimental Design • Preprocessing • Data Analysis

  3. Experimental Design • Elimination of confounding factors • Same cell line, minimal exposure • Timing of sampling • Technological considerations • Hybridization considerations • Chip/tag selection

  4. Slide to Data

       Gene              Value
       D26528_at           193
       D26561_cds1_at      -70
       D26561_cds2_at      144
       D26561_cds3_at       33
       D26579_at           318
       D26598_at          1764
       D26599_at          1537
       D26600_at          1204
       D28114_at           707

  5. Preprocessing • Data import • Background adjustment • Normalization • Summarization of multiple probes per transcript • Quality control
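The "summarization of multiple probes per transcript" step can be illustrated with a small sketch. The slides do not say which summary is used; the median below is an assumed choice, and the transcript IDs and intensities are hypothetical values made up for illustration.

```python
from collections import defaultdict
from statistics import median


def summarize_probes(probe_values):
    """Collapse several probe intensities into one value per transcript.

    probe_values: iterable of (transcript_id, intensity) pairs.
    The median is an assumed summary; it is less sensitive to a single
    outlying probe than the mean.
    """
    grouped = defaultdict(list)
    for transcript_id, intensity in probe_values:
        grouped[transcript_id].append(intensity)
    return {tid: median(values) for tid, values in grouped.items()}


# Hypothetical probe-level data (not taken from the slides).
probes = [
    ("geneA", 100.0), ("geneA", 140.0), ("geneA", 900.0),
    ("geneB", 50.0), ("geneB", 70.0),
]
summary = summarize_probes(probes)
print(summary)
```

Here the outlying probe (900.0) for the hypothetical `geneA` does not pull the summary away from the other two probes.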

  6. Data Import • Incorporate various file formats into desired data formats • Different vendors have different representations • Sometimes desired data is not provided

  7. Background Adjustment • It all comes down to one word…noise • Optical distortion • Non-specific hybridization • Equipment damage
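One simple way to handle such noise is to subtract a per-spot background estimate from the measured foreground. The slide does not name a method, so the sketch below is only an assumed approach; the `floor` parameter and the intensities are hypothetical.

```python
def background_adjust(foreground, background, floor=1.0):
    """Subtract a per-spot background estimate from foreground intensity.

    Results are floored at `floor` (an assumed choice) so that a later
    log transform stays defined when background exceeds foreground.
    """
    if len(foreground) != len(background):
        raise ValueError("foreground and background must have equal length")
    return [max(f - b, floor) for f, b in zip(foreground, background)]


# Hypothetical spot intensities (not taken from the slides).
fg = [500.0, 120.0, 90.0]
bg = [100.0, 100.0, 100.0]
adjusted = background_adjust(fg, bg)
print(adjusted)
```

The third spot shows why a floor is useful: plain subtraction would give a negative intensity there.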

  8. M vs. A • M represents differential ratio M = (log R – log G) • A represents the fluorescence intensity A = (log R + log G)/2 • Desirable transformation would show uniform distribution of differential across intensities
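The two formulas on this slide can be computed directly. This sketch assumes base-2 logarithms, which the slide does not specify, and uses hypothetical red and green intensities.

```python
import math


def ma_values(red, green):
    """Return (M, A) for one spot from its red and green intensities.

    M = log R - log G        (differential ratio)
    A = (log R + log G) / 2  (average log intensity)
    Base-2 logs are an assumption; the slide does not give the base.
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive to take logs")
    log_r = math.log2(red)
    log_g = math.log2(green)
    return log_r - log_g, (log_r + log_g) / 2


# Hypothetical spot: red is four times as bright as green.
m, a = ma_values(800.0, 200.0)
print(m, a)
```

With base-2 logs, M is the number of doublings between the two channels, so a four-fold difference gives M close to 2.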

  9. Normalization • Normalization between samples needs to be established for a variety of reasons • Different reverse transcription efficiency levels • We are using PCR to amplify in separate plates • Hybridization inequalities • Variations in solution used in hybridization reaction • Spatial abnormalities between plates • Particularly apparent for in-house plates
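As a rough illustration of what normalization does, the sketch below centers the M values of one array on their median, so that a typical spot shows no differential. This is only one simple option and an assumption here; the slides do not say which method was used, and intensity-dependent methods are also common. The M values are hypothetical.

```python
from statistics import median


def median_center(m_values):
    """Shift M values so that their median is zero.

    Assumes most genes are not differentially expressed, so the
    median M on an array should sit at zero after normalization.
    """
    center = median(m_values)
    return [m - center for m in m_values]


# Hypothetical M values with a dye bias of roughly +0.6.
m_raw = [0.5, 0.7, 0.4, 2.6, 0.6]
m_norm = median_center(m_raw)
print(m_norm)
```

After centering, the one strongly differential spot (2.6 before, about 2.0 after) still stands out, while the shared offset is removed.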

  10. Background Example

  11. Possible Problem in Background?

  12. Summarizing Data • Process of reducing the various samples into an analysis • The crux of microarray analysis • Can apply a linear or a non-linear model using any of the following techniques • Support Vector Machines (SVM) • Neural Networks • Empirical Bayes

  13. Quality Control • Concerned with accuracy and reproducibility • Dr. Piatetsky-Shapiro (last week’s colloquium) was primarily concerned with this area of microarray analysis • Detection of errors (cross-validation) • Isolation and validation of significant results • Corrective behavior
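The cross-validation mentioned on this slide depends on splitting samples into training and held-out sets. The sketch below shows a k-fold split over sample indices; the function name and the sample count are hypothetical, and the slides do not describe which scheme was used.

```python
def k_fold_indices(n_samples, k):
    """Return k (train, test) index splits for k-fold cross-validation.

    Samples are assigned to folds round-robin; each sample lands in
    exactly one test set. In practice the order would usually be
    shuffled first, which is omitted here to keep the sketch simple.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical experiment with 16 arrays, split into 4 folds.
splits = k_fold_indices(16, 4)
for train, test in splits:
    print(len(train), test)
```

A model would be fitted on each `train` set and checked on the matching `test` set, so every array is used for validation exactly once.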

  14. Time for Fun • Dataset • ApoAI.RData • The apolipoprotein AI (ApoAI) gene is known to play a pivotal role in high density lipoprotein (HDL) metabolism. Mice which have the ApoAI gene knocked out (KO) have very low HDL cholesterol levels. • Purpose is to determine how ApoAI deficiency affects the action of other genes in the liver • Help determine what molecular pathways ApoAI operates on
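A comparison of this kind is often framed gene by gene as a two-sample test between knockout and wild-type mice. The sketch below computes a Welch t-statistic for one gene; the slides do not say which statistic was used, so this is an assumed stand-in, and the log-ratio values are hypothetical rather than taken from ApoAI.RData.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch two-sample t-statistic (unequal variances allowed).

    Each group needs at least two values, and the groups must not
    both be constant (that would give a zero standard error).
    """
    var_a = variance(group_a)  # sample variance, n - 1 denominator
    var_b = variance(group_b)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / std_err


# Hypothetical log-ratios for one gene in KO and WT mice.
ko = [-3.1, -2.9, -3.3, -3.0]
wt = [0.1, -0.2, 0.0, 0.2]
t_stat = welch_t(ko, wt)
print(t_stat)
```

A large negative value suggests the gene is expressed at a lower level in the knockout group; ranking genes by such a statistic is one simple way to shortlist candidates for follow-up.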

  15. Markers • All mRNA data from both knockout and wild-type were marked GREEN • KO and WT are marked RED • Oftentimes, both populations are run on the same plate, with one marked RED and the other marked GREEN

  16. R (www.r-project.org) • “S”-like GNU project language and environment for statistical computing • Great free package for linear and non-linear statistical modeling • Also includes: • an effective data handling and storage facility, • a suite of operators for calculations on arrays, in particular matrices, • a large, coherent, integrated collection of intermediate tools for data analysis, • graphical facilities for data analysis and display either on-screen or on hardcopy, and • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.

  17. Bioconductor (http://bioconductor.org) • Open source package for statistical analysis of genomic data • Includes both statistical and graphical tools • Active project with a constant influx of new packages • Does not include more complex analysis tools at this time (SVMs, etc.)

  18. With Controls

  19. Controls Removed
