Introduction to DNA Microarrays Michael F. Miles, M.D., Ph.D. Depts. of Pharmacology/Toxicology and Neurology and the Center for Study of Biological Complexity mfmiles@vcu.edu 225-4054
Biological Regulation: “You are what you express” • Levels of regulation • Methods of measurement • Concept of genomics
Regulation of Gene Expression • Transcriptional • Altered DNA binding protein complex abundance or function • Post-transcriptional • mRNA stability • mRNA processing (alternative splicing) • Translational • RNA trafficking • RNA binding proteins • Post-translational • Many forms!
Regulation of Gene Expression • Genes are expressed when they are transcribed into RNA • Amount of mRNA indicates gene activity • Some genes are expressed in all tissues -- but are still regulated! • Some genes are expressed selectively depending on tissue, disease, environment • Dynamic regulation of gene expression allows long-term responses to environment
[Diagram: progression from Acute Drug Use to Chronic Drug Use to “Addiction” (Compulsive Drug Use). Labels: Mesolimbic dopamine; ? Other; Reinforcement; Intoxication; Altered Signaling; Gene Expression; Tolerance; Dependence; Sensitization; ? Synaptic Remodeling; Persistent Gene Exp.]
Progress in Studies on Gene Regulation • 1960: mRNA, tRNA discovered • 1970: Nucleic acid hybridization, protein/RNA electrophoresis • 1980: Molecular cloning; Southern, Northern & Western blots; 2-D gels • 1990: Subtractive hybridization, PCR, differential display, MALDI/TOF MS • 2000: Genome sequencing, DNA/protein microarrays
Nucleic Acid Hybridization: How It Works
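Primer on Nucleic Acid Hybridization • Hybridization rate depends on time, the concentration of nucleic acids, and the reassociation rate constant for the nucleic acid: $C/C_0 = 1/(1 + k C_0 t)$, where $C$ is the concentration of single-stranded nucleic acid remaining at time $t$, $C_0$ is the starting concentration, and $k$ is the reassociation rate constant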
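The reassociation equation on this slide can be explored numerically. The following is a minimal Python sketch, not part of the original slides; the function names and the values chosen for k and C0 are hypothetical and serve only to illustrate the shape of the curve.

```python
def fraction_single_stranded(c0, t, k):
    """Return C/C0 = 1 / (1 + k*C0*t) for second-order reassociation."""
    if c0 < 0 or t < 0 or k < 0:
        raise ValueError("c0, t and k must be non-negative")
    return 1.0 / (1.0 + k * c0 * t)


def half_cot(k):
    """Return the C0*t value at which C/C0 = 0.5, which is 1/k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / k


if __name__ == "__main__":
    k = 2.0    # hypothetical rate constant, in L/(mol*s)
    c0 = 1e-3  # hypothetical starting concentration, in mol/L
    for t in (0, 100, 500, 1000, 5000):  # seconds
        frac = fraction_single_stranded(c0, t, k)
        print(f"t = {t:>5} s  C0*t = {c0 * t:6.2f}  C/C0 = {frac:.3f}")
```

Because only the product C0·t enters the equation, a tenfold more concentrated sample reaches the same fraction reassociated in one tenth of the time.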
A Bit of History • ~1992-1996: Oligo arrays developed by Fodor, Stryer, Lockhart and others at Stanford/Affymetrix, and by Southern in Great Britain • ~1994-1995: cDNA arrays, usually attributed to Pat Brown and Dari Shalon at Stanford, who first used a robot to print the arrays. In 1994, Shalon started Synteni, which was bought by Incyte in 1998. • However, in 1982 Augenlicht and Kobrin proposed a DNA array (Cancer Research), and in 1984 they made a 4000-element array to interrogate human cancer cells. (Rejected by Science, Nature and the NIH)
Use of S-score in Hierarchical Clustering of Brain Regional Expression Patterns [Figure: clustered heat maps comparing AvgDiff and S-score; color scale from -2 to +2 relative change; brain regions NAC, VTA, PFC, HIP]
Expression Profiling: A Non-biased, Genomic Approach to Resolving the Mechanisms of Addiction [Diagram labels: Cycles of Expression Profiling; Merge with Biological Databases; Candidate Gene Studies]
Utility of Expression Profiling • Non-biased, genome-wide • Hypothesis generating • Gene hunting • Pattern identification: • Insight into gene function • Molecular classification • Phenotypic mechanisms
[Workflow diagram; labels: Experimental Design • Hybridization and Scanning • Comparisons (S-score, d-chip) • De-noise • Statistical Filtering (e.g. SAM) • Filtered Gene Lists • GE Database (SQL Server) • Clustering Techniques • Overlay Biological Databases (PubGen, GenMAPP, QTL, etc.) • Provisional Gene “Patterns” • Molecular Validation (RT-PCR, in situ, Western) • Candidate Genes • Behavioral Validation]
Synthesis and Analysis of 2-color Spotted cDNA Arrays: “Brown Chips”
Synthesis of High Density Oligonucleotide Arrays by Photolithography/Photochemistry
GeneChip Features • Parallel analysis of >30K human, rat or mouse genes/EST clusters with 15-20 oligos (25 mer) per gene/EST • entire genome analysis (human, yeast, mouse) • 3-4 orders of magnitude dynamic range (1-10,000 copies/cell) • quantitative for changes >25% ?? • SNP analysis
Oligonucleotide Array Analysis [Diagram: Total RNA (AAAA) primed with Oligo(dT)-T7 • RTase/Pol II: dsDNA • T7 pol with CTP-biotin: Biotin-cRNA • Hybridization to PM and MM probes • Streptavidin-phycoerythrin staining • Scanning]
Stepwise Analysis of Microarray Data • Low-level analysis -- image analysis, expression quantitation • Primary analysis -- is there a change in expression? • Secondary analysis -- what genes show correlated patterns of expression? (supervised vs. unsupervised) • Tertiary analysis -- is there a phenotypic “trace” for a given expression pattern?
Affymetrix Arrays: Image Analysis “.DAT” file “.CEL” file
Affymetrix Arrays: PM-MM Difference Calculation Probe pairs control for non-specific hybridization of oligonucleotides
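The PM-MM idea can be illustrated with a few lines of Python. This is a simplified sketch, not the vendor's algorithm: it takes the plain mean of the perfect-match (PM) minus mismatch (MM) differences across the probe pairs of one probe set, whereas real analysis software typically adds outlier handling. The function names and the intensity values are hypothetical.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of one probe set."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    return mean([p - m for p, m in zip(pm, mm)])


def fraction_positive_pairs(pm, mm):
    """Fraction of probe pairs in which PM is brighter than MM."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    return sum(1 for p, m in zip(pm, mm) if p > m) / len(pm)


if __name__ == "__main__":
    # Hypothetical intensities for a probe set with five probe pairs.
    pm = [1200, 950, 1430, 800, 1010]
    mm = [300, 410, 380, 820, 290]
    print("average difference:", average_difference(pm, mm))
    print("fraction of positive pairs:", fraction_positive_pairs(pm, mm))
```

A probe set whose MM signals rival its PM signals yields a small or negative average difference, which suggests that much of the signal is non-specific.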
Variability in Ln(FC) [Figure, panel (a): axes Ln(FC1) and Ln(FC2)]
Position Dependent Nearest Neighbor (PDNN) - 2003 • Zhang, Miles and Aldape (2003) A model of molecular interactions on short oligonucleotide microarrays: implications for probe design and data analysis. Nature Biotech. In press.
Chip Normalization Procedures • Whole chip intensity • Assumes relatively few changes, uniform error/noise across chip and abundance classes • Spiked standards • Requires exquisite technical control, assumes uniform behavior • Internal Standards • Assumes no significant regulation • “Piece-wise” linear normalization
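Whole-chip intensity normalization, the first procedure listed above, can be sketched as a simple scaling step. The Python below is an illustrative assumption about one common form of it (scale every chip so that its mean intensity matches a shared target); the function name, the choice of the mean rather than a trimmed mean or median, and the data are all hypothetical.

```python
from statistics import mean


def normalize_chips(chips, target=None):
    """Scale each chip so that its mean intensity equals `target`.

    chips  -- dict mapping chip name to a list of intensities
    target -- desired mean; defaults to the mean of the chip means
    Returns (normalized chips, scaling factor per chip).
    """
    chip_means = {name: mean(values) for name, values in chips.items()}
    if target is None:
        target = mean(list(chip_means.values()))
    factors = {name: target / m for name, m in chip_means.items()}
    normalized = {
        name: [x * factors[name] for x in values]
        for name, values in chips.items()
    }
    return normalized, factors


if __name__ == "__main__":
    # Hypothetical data: chip_b was scanned roughly twice as bright.
    chips = {
        "chip_a": [120.0, 340.0, 80.0, 900.0],
        "chip_b": [250.0, 700.0, 150.0, 1780.0],
    }
    normalized, factors = normalize_chips(chips)
    for name in chips:
        print(name, "factor:", round(factors[name], 3))
```

A single factor per chip is only appropriate under the assumptions the slide states: relatively few genes change, and noise is uniform across the chip and across abundance classes. A scaling factor far from 1 is itself worth flagging as a quality issue.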
Normalization Confounds: Non-uniform Chip Behavior [Figure: S-score plotted by gene]
Slide Normalization: Pieces and Pins • “Lowess” normalization • Pin-specific profiles • After print-tip normalization • Source: http://www.ipam.ucla.edu/publications/fg2000/fgt_tspeed9.pdf • See also: Schuchhardt, J. et al., NAR 28: e47 (2000)
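The idea of correcting each print tip (pin) separately can be shown with a deliberately simplified Python sketch. Note the substitution: instead of fitting an intensity-dependent lowess curve per pin, as the slide describes, this example only subtracts each pin's median log2(red/green) ratio. The function name, the tuple layout and the data are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def print_tip_median_center(spots):
    """Center log2(red/green) ratios within each print tip.

    spots -- list of (pin_id, red, green) tuples with positive intensities
    Returns a list of (pin_id, centered log ratio) in the input order.
    This is median centering, a simpler stand-in for per-pin lowess.
    """
    log_ratios = []
    by_pin = defaultdict(list)
    for pin, red, green in spots:
        m = math.log2(red / green)
        log_ratios.append((pin, m))
        by_pin[pin].append(m)
    centers = {pin: median(values) for pin, values in by_pin.items()}
    return [(pin, m - centers[pin]) for pin, m in log_ratios]


if __name__ == "__main__":
    # Hypothetical spots: pin 1 prints with a red bias, pin 2 does not.
    spots = [
        (1, 400, 100), (1, 800, 100), (1, 1600, 100),
        (2, 100, 100), (2, 50, 100), (2, 200, 100),
    ]
    for pin, m in print_tip_median_center(spots):
        print(f"pin {pin}: normalized log2 ratio = {m:+.2f}")
```

Median centering removes a constant offset per pin but not a bias that changes with spot intensity, which is the case a lowess fit is meant to handle.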
Quality Assessment • Gene specific: R/G correlation, %BG, %spot • Array specific: normalization factor, % genes present, linearity, control/spike performance (e.g. 5’/3’ ratio, intensity) • Across arrays: linearity, correlation, background, normalization factors, noise
Statistical Analysis of Microarrays: “Not Your Father’s Oldsmobile”
Sources of Variability • Target Preparation • Group target preps • Chip Run • Minor, BUT… • Be aware of processing order • Chip Lot • Stagger lots across experiment if necessary • Chip Scanning Order • Cross and block chip scanning order
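One way to "cross and block" the processing or scanning order is to build blocks that each contain one sample from every experimental group, shuffled within the block. The Python below is a sketch under that assumption; the function name, group labels and sample names are hypothetical, and a real design may need to block on additional factors such as chip lot.

```python
import random


def blocked_run_order(samples_by_group, seed=0):
    """Return a run order built from blocks of one sample per group.

    samples_by_group -- dict mapping group name to a list of sample names
    seed             -- seed for the random generator, for reproducibility
    """
    if not samples_by_group:
        return []
    rng = random.Random(seed)
    # Copy and shuffle within each group so block membership is random.
    groups = {g: list(s) for g, s in samples_by_group.items()}
    for samples in groups.values():
        rng.shuffle(samples)
    n_blocks = max(len(samples) for samples in groups.values())
    order = []
    for b in range(n_blocks):
        block = [s[b] for s in groups.values() if b < len(s)]
        rng.shuffle(block)  # randomize position within the block
        order.extend(block)
    return order


if __name__ == "__main__":
    design = {
        "control": ["c1", "c2", "c3"],
        "treated": ["t1", "t2", "t3"],
    }
    print(blocked_run_order(design, seed=42))
```

With this layout no group is processed entirely before another, so drift over the course of a run is spread across groups rather than confounded with treatment.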
Secondary Analysis: Expression Patterns • Supervised multivariate analyses • Support vector machines • Non-supervised clustering methods • Hierarchical • K-means • SOM
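Of the unsupervised methods listed, hierarchical clustering is the simplest to sketch. The Python below is an illustrative implementation under stated assumptions, not the method used for the analyses in these slides: it uses 1 minus the Pearson correlation as the distance, average linkage, and stops when a requested number of clusters remains. Function names and expression values are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hierarchical_clusters(profiles, n_clusters):
    """Agglomerative clustering with average linkage.

    profiles   -- dict mapping gene name to a list of expression values
    n_clusters -- number of clusters at which merging stops
    Distance between genes is 1 - Pearson r.
    """
    clusters = [[gene] for gene in profiles]

    def distance(a, b):
        # Average pairwise distance between the members of two clusters.
        return mean([1 - pearson(profiles[g], profiles[h])
                     for g in a for h in b])

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Hypothetical expression profiles across four conditions.
    profiles = {
        "geneA": [1.0, 2.1, 2.9, 4.2],
        "geneB": [2.0, 3.9, 6.1, 8.0],
        "geneC": [4.1, 3.0, 2.2, 0.9],
        "geneD": [7.8, 6.1, 4.0, 2.1],
    }
    print(hierarchical_clusters(profiles, n_clusters=2))
```

A correlation-based distance groups genes by the shape of their profile rather than by absolute level, which is why geneA and geneB fall together despite a twofold difference in magnitude. This brute-force version recomputes all distances at every merge and is suitable only for small gene lists.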
Use of S-score in Hierarchical Clustering of Brain Regional Expression Patterns [Figure: clustered heat maps comparing AvgDiff and S-score; color scale from -2 to +2 relative change; brain regions NAC, VTA, PFC, HIP]
[Diagram: Expression Profiling linked to other data sources. Labels: Prot-Prot Interactions; BioMed Lit Relations; Expression Networks; HomoloGene Ontology; Pharmacology; Genetics; Behavior]
Array Analysis: Conclusions • Be careful! Assess quality control parameters rigorously • Single arrays or experiments are of limited value • Normalization and weighting for noise are critical procedures • Across investigator/platform/species comparisons will most easily be done with relative data