Experimental Validation of Microarray Data Patrick Tan MD PhD
Talk Outline - Purposes of Experimental Validation - Validation in silico - Northern Blotting - PCR Assays (measure DNA/RNA) - Antibody Assays (measure protein) - Other molecular assays (CGH, SKY) - Validation Across Centres and Populations - Phenotypic Validation
What is Validation? 1. To declare or make legally valid. 2. To mark with an indication of official sanction. 3. To establish the soundness of; corroborate. - How Robust (“sound”) are your findings? - Can they be replicated? - Can they be replicated by another laboratory/centre? - Are they dependent on the specific experimental design?
Why Validate Your Microarray Data? - To establish confidence in your scientific claims - To identify areas for future research (eg specific genes) - May lead to unexpected findings - A common requirement of reviewers (!)
Potential Sources of Error in Microarray Data - Cross-hybridization of probe sequences - Mistakes in Probe Assignment (particularly cDNA arrays) - Artefacts caused by sample processing (eg RNA amplification) - Artefacts caused by array experiments (eg Cy3/Cy5 dye-bias)
What is Required for Validation? Independent Verification : - Different experimental technique - Different researchers/laboratories - Different biological samples
Challenges of Validating Microarray Data - Microarrays are High-throughput, Validation techniques are Not - Different scales of comparison - eg different techniques for normalizing experimental measurements - Lack of Correlation between Transcriptome and Proteome - Not inexpensive (eg Taqman Probes)
A Typical Validation Scenario : Biological Samples -> Array Experiment -> Statistical Analysis -> Differentially Regulated Genes (Candidates) -> Selection of Genes for Validation -> Experimental Validation
Which Genes are “Suitable” for Validation? - Strong Regulation - Abundant absolute expression - Reasonably well-characterized (Reagents may be available) - Relevant to Further Research (ESTs?)
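The selection criteria above can be sketched as a simple filter. This is a minimal illustration, not the speaker's procedure: the `Candidate` fields, the gene names and both thresholds (`min_abs_log2`, `min_intensity`) are hypothetical and would need tuning for a real array platform.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    log2_ratio: float      # log2(test / reference) from the array (hypothetical values)
    mean_intensity: float  # average spot intensity, arbitrary units
    annotated: bool        # True for a characterized gene, False for an EST


def select_for_validation(candidates, min_abs_log2=1.0, min_intensity=500.0):
    """Keep strongly regulated, abundant, characterized genes; strongest first."""
    kept = [
        c for c in candidates
        if abs(c.log2_ratio) >= min_abs_log2     # strong regulation
        and c.mean_intensity >= min_intensity    # abundant absolute expression
        and c.annotated                          # reagents more likely to exist
    ]
    return sorted(kept, key=lambda c: abs(c.log2_ratio), reverse=True)


# Illustrative candidates only; none of these come from the talk.
candidates = [
    Candidate("geneA", 3.2, 4000.0, True),
    Candidate("geneB", -2.1, 1200.0, True),
    Candidate("geneC", 0.4, 9000.0, True),    # weak regulation
    Candidate("geneD", 2.8, 150.0, True),     # low expression
    Candidate("est123", 3.5, 2500.0, False),  # uncharacterized EST
]
selected = [c.gene for c in select_for_validation(candidates)]
print(selected)
```

Excluding ESTs outright is one possible reading of the last criterion; a project aimed at novel genes might instead rank them separately.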
Validation in silico (Validation by internet) - Have these genes been previously identified in the literature and shown to be regulated? - Usually performed to assess data quality or as a preliminary analysis
Example : In silico Validation Comparative Genomics of Burkholderia pseudomallei • Gram negative bacterium. • Environmental saprophyte endemic to SEA • Potential Biowarfare Agent (USA Category B) • Causative agent of the disease Melioidosis
Use of Whole-Genome B. pseudomallei DNA Microarrays • A Discovery Platform for • Strain Typing Markers • Patterns of Genetic Variation • Mechanisms of Virulence
Genomes Compared • B. pseudomallei - 18 Regional Isolates • B. mallei - 2 isolates - Causative agent of glanders - Biowarfare Category B - Distinct Ecological Niche (Non-rhizospheric) • B. thailandensis - 3 isolates - Very closely related to B. pm but clinically avirulent
Array-Based Comparative Genomic Hybridization [Figure: genomic DNA is extracted from the Reference Strain (B pm K96243) and from a Sample Isolate (e.g. B pm #22, B mallei, B thailandensis); one is labelled with Cy5-dCTP and the other with Cy3-dCTP, followed by Competitive Hybridisation. Additional label: BP 76]
B. pseudomallei vs B. thailandensis • A cluster of 21 Genes controls Type I O-PS Production in B. pseudomallei (Reckseidler et al. (2001)) • 11 genes from this cluster are deleted from B. thailandensis by Southern Blot analysis • Are these deletions also present in the microarray data? • Are there additional deletions in this cluster?
B. pseudomallei vs B. thailandensis • 9 genes found commonly deleted in both studies. • 2 genes - wcbF and wcbK - not represented in our microarray. • Additional 7 genes found deleted using microarray.
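The comparison of the two studies amounts to a few set operations. The sketch below reproduces the counts from the slides (21 genes in the cluster, 11 deleted by Southern blot, 9 in common, 2 not on the array, 7 additional). Only wcbF and wcbK are named in the talk; every other gene identifier here is a placeholder.

```python
# Placeholder identifiers: only wcbF and wcbK are named on the slides.
southern_deleted = {f"southern_gene_{i}" for i in range(1, 10)} | {"wcbF", "wcbK"}
other_cluster_genes = {f"cluster_gene_{i}" for i in range(1, 11)}
cluster = southern_deleted | other_cluster_genes       # 21 genes in total

on_array = cluster - {"wcbF", "wcbK"}                  # genes with a probe on the array
array_deleted = (southern_deleted & on_array) | {
    f"cluster_gene_{i}" for i in range(1, 8)           # 7 extra calls from the array
}

confirmed = southern_deleted & array_deleted           # deleted in both studies
not_testable = southern_deleted - on_array             # no probe, cannot be checked
novel = array_deleted - southern_deleted               # seen only on the array
missed = (southern_deleted & on_array) - array_deleted # on the array but not called

print(len(cluster), len(confirmed), sorted(not_testable), len(novel), len(missed))
```

Separating `not_testable` from `missed` matters: a gene without a probe is not a disagreement between the two methods.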
Advantages of in silico Validation - Can be performed in a batch manner - Rapid results - Availability of multiple public databases (eg Pubmed, SAGE, etc) - Cheap! Disadvantages - Requires prior knowledge - Requires confidence in data from others - Findings are not Novel
Validation by Northern Blotting - Arguably still the ‘gold standard’ for validation - Well-established in the literature - Experimental properties are well-known
A Typical Northern Blot [Figure: schematic of the blotting procedure; labels: RNA Mixture, RNA] Adapted from http://oregonstate.edu/instruction/bb492/fignames/Southernblot.jpeg
Example of Northern Blotting “A Global Profile of Germline Gene Expression in C. elegans” Reinke et al., 2000. Molecular Cell 6, 605-616 - Used C. elegans DNA microarrays to compare the gene expression profiles of : (A) Wild-type worms vs Mutants with no germline (glp-4) (B) Oocyte-only (fem-1) vs Sperm-only (fem-3gf)
Identification of a Candidate Gene : pgl-1 Properties of pgl-1 - Highly expressed in germline (ie downregulated in glp-4) - Sex-biased expression : increased expression in oocytes (ie upregulated in fem-1 vs fem-3gf) - Increasing expression in successive larval stages (L1-L4) - Validation of pgl-1 via Northern Blotting
Comparing pgl-1 expression by Northern and Microarray [Figure: Northern blot of pgl-1 (Kawasaki et al., 1998) with a Control Gene. Lane labels: WT adult, no germline, oocytes, sperm, adult, L2, L3, L4. Fold change (Northern), in slide order: 100, 2.6, 12.5, 11.1, 1.7. Fold change (Microarray), in slide order: 2.4, 1.8, 9.2, 1.9, 4.8]
Advantages of Northern Blotting - Very sensitive - Blots are reusable - Technical protocol is relatively simple - Can detect mRNA splice variants Disadvantages - Use of radioactivity (although non-radioactive techniques are available) - Laborious if many genes need to be tested - Assay is time-consuming
Microarray Data vs Northern Blotting - Good preservation of trends - Microarray Ratios tend to be ‘compressed’ vs Northern - Northern Blots may detect subtle regulations missed by array
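One way to put a number on ratio compression is to regress log2 microarray fold changes on log2 fold changes from the reference assay: a slope between 0 and 1 means the trends agree but the array ratios are compressed. The sketch below uses made-up fold changes, not the pgl-1 data from the talk, and the function name `log2_slope` is an assumption for illustration.

```python
import math


def log2_slope(reference_fold, array_fold):
    """Least-squares slope of log2(array) against log2(reference)."""
    x = [math.log2(v) for v in reference_fold]
    y = [math.log2(v) for v in array_fold]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical paired fold changes for four genes (illustrative only).
northern = [2.0, 4.0, 8.0, 16.0]
microarray = [1.5, 2.5, 4.0, 6.5]

slope = log2_slope(northern, microarray)
print(f"slope = {slope:.2f}")  # below 1 indicates compressed array ratios
```

Working on the log scale treats up- and down-regulation symmetrically, which raw ratios do not.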
Validation by PCR Methodologies - May replace Northern Blotting as a ‘gold standard’ - Rapid developments in the field - Diverse variations on a common technique : A) Non-quantitative techniques B) Quantitative techniques
Basic PCR [Figure] From www.faseb.org/opar/bloodsupply/pcr.html
Major Variations of PCR
                    Qualitative            Quantitative
Measurement Type    Presence or Absence    Relative Abundance
Normalization       Binary                 Control Genes
Cost                Standard PCR           Specialized Equipment and Reagents
Example of Qualitative PCR - Array-CGH Data Set for Burkholderia pseudomallei - Identified Genes that are Differentially Present Between : (A) B. pseudomallei vs B. mallei (B) B. pseudomallei vs B. thailandensis (C) Between different isolates of B. pseudomallei
Experimental Validation Using Qualitative PCR [Figure: agarose gel panels titled "ORFs Deleted in B. mallei", "ORFs Deleted in B. thailandensis", "ORFs Deleted in B. mallei and B. thailandensis" and "ORFs Deleted Between Bpm Isolates". Lanes labelled P, T and M; size markers at 1.5kb and 100bp; panel labels 3548902, 3551702, 3534302, 3534002, 3433202, 3542302, 3490802, 3528002]
B. pseudomallei Strain 576 is an Atypical Strain [Figure labels: B. thailandensis, B. mallei, Group I, Group II, Group III]
Gene   Function                                          Probe   Status in 576
rmlA   Glucose-1-phosphate thymidyltransferase           +       Not deleted
rmlC   dTDP-4-keto-6-deoxy-D-glucose 3,5 epimerase       +       ---
rmlD   dTDP-4-keto-L-rhamnose reductase                  +       ---
wzm    ABC-2 Transporter                                 +       ---
wzt    ABC-2 Transporter                                 +       ---
wbiA   LPS-O-antigen Acetylase                           +       ---
wbiB   UDP-glucose 4-epimerase                           +       ---
wbiC   glycosyltransferase                               +       ---
wbiD   dihydroxypolyprenylbenzoate methyltransferase     +       ---
wbiE   UDP-hexose transferase                            +       ---
wbiF   Rhamnosyltransferase                              +       ---
wbiG   UDP-glucose-4-epimerase                           +       ---
wbiH   UP-N-acetyl fucosamine transferase                +       ---
wbiI   Epimerase/dehydratase                             +       ---
Orf1   UP-N-acetylglucosaminyltransferase                +       Not deleted
Orf2   UDP-glucose-4-epimerase                           +       Not deleted
Strain 576 K96243
Semi-Quantitative PCR - Compares relative accumulation of amplified products during the PCR procedure - Can be performed using standard reagents - Aliquots are periodically removed from the PCR reaction (eg 10, 15, 20, 25 cycles) and analyzed on an agarose gel
Example of Semi-Quantitative PCR : Comparing the Transcriptional Changes Associated with Maturation of Dendritic Cells - Used cDNA microarrays to compare the gene expression profiles of Monocytes, Macrophages, Immature Dendritic and Mature Dendritic cells [Figure labels: Monocytes, Macrophage, Immature Dendritic, Mature Dendritic]
[Figure: agarose gel of semi-quantitative PCR products] Candidate Genes : TARC (Expressed in Dendritic Cells), RGS1 (Expressed in Monocytes and mDendritic), shown alongside a Control Gene - Removed after 20 cycles (usu >30 cycles) - Lanes : 1 - Monocytes 2 - Macrophages 3 - Activated Macrophages 4 - Immature Dendritic Cells 5 - Mature Dendritic Cells
Quantitative PCR (“Real-time PCR”) - Makes use of fluorescent labels to monitor progress of PCR amplification - Two major methods : 1) Double-stranded DNA binding agents (eg SYBR Green) 2) Sequence Specific Probes (eg Taqman, Mol Beacons)
From http://www.sigmaaldrich.com/img/assets/6600/sg_ls_mb_pcrxdiagram.gif
Taqman PCR (Exploitation of FRET) - Fluorescence from Reporter is Quenched - Cleavage and Release of Reporter From www.probes.com/handbook/figures/0710.html
Some Considerations in Quantitative PCR - Design of Gene-specific Primers A) Confirm Suitability of Primers for PCR (eg Primer3) B) Specificity of primer sequences (eg BLAST) C) 3’ bias of primers - Selection of Control Genes A) ‘Housekeeping’ Genes - not expected to vary eg b-actin, GAPDH, 16S Ribosomal RNA B) Control primers should bind at comparable efficiency to target sequences C) Expression of control genes should be comparable to target sequences
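Before running a candidate primer through dedicated tools such as Primer3 or BLAST, a quick sanity check on length, GC content and approximate melting temperature can catch obvious problems. The sketch below uses the Wallace rule (Tm = 2(A+T) + 4(G+C)), which is only a rough estimate for short oligonucleotides. The primer sequence and the acceptance ranges are hypothetical, and the check says nothing about specificity.

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule (short oligos only)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primer_report(primer, gc_range=(0.40, 0.60), length_range=(18, 25)):
    """Summarize basic properties; the default ranges are illustrative assumptions."""
    if not primer or set(primer.upper()) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc = gc_content(primer)
    return {
        "length": len(primer),
        "gc": gc,
        "tm_wallace": wallace_tm(primer),
        "gc_ok": gc_range[0] <= gc <= gc_range[1],
        "length_ok": length_range[0] <= len(primer) <= length_range[1],
    }


# Hypothetical 20-mer, not a real ESR1 or control-gene primer.
report = primer_report("AGCTTGACCTGAGGATCCAT")
print(report)
```

Forward and reverse primers would normally be checked as a pair so that their estimated melting temperatures are close to each other.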
Example of Quantitative PCR : Gene Expression Differences Between Estrogen Receptor Positive (ER+ve) and ER Negative (ER-ve) Breast Cancers - Used oligonucleotide microarrays to compare the gene expression profiles of ER+ve and ER-ve primary human breast tumors - ER status predetermined using conventional histopathology
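Quantification Graph of ESR1 Gene (3 ER +ve and 3 ER -ve samples) [Figure: real-time amplification curves; labels: Calibrator, ER +ve samples, ER -ve samples; sample order: ER-, ER+, ER+, ER-, ER-, ER+]
Melt Curves of ESR1 Gene and 18S rRNA (Control) Gene [Figure: melt curves for ER+ and ER- samples and 18S; sample order: ER-, ER+, ER+, ER-, ER-, ER+]
Results for ESR1 gene [Figure: results per sample; sample order: (ER-), (ER+), (ER+), (ER-), (ER-), (ER+)]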
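Real-time PCR results of this kind are often summarized as a fold change relative to a calibrator sample, normalized to a control gene. The sketch below uses the widely used 2^(-ddCt) scheme; the slides do not state which calculation was used in the talk, so treat this as an assumption. It also assumes near-100% amplification efficiency for both the target and control reactions. All Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_control_sample,
                        ct_target_calibrator, ct_control_calibrator):
    """Fold change of target in a sample vs a calibrator, by the 2^(-ddCt) scheme.

    Assumes both target and control amplify with ~100% efficiency.
    """
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: ESR1 as target, 18S rRNA as control gene.
fold = relative_expression(
    ct_target_sample=22.0, ct_control_sample=10.0,          # sample: dCt = 12
    ct_target_calibrator=25.0, ct_control_calibrator=10.0,  # calibrator: dCt = 15
)
print(fold)  # ddCt = -3, so the target is 8-fold higher than in the calibrator
```

A lower Ct means more starting template, which is why the exponent carries a minus sign. If the target and control primers amplify with different efficiencies, an efficiency-corrected calculation is more appropriate than this one.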
Advantages of Validation by PCR - Well-accepted standard for validation - Rapid (approx 30-45 min) - Requires very small amounts of sample - Multiple genes can be tested in individual reactions - Can potentially be multiplexed Disadvantages - Need to be careful in primer design (esp 20-mers) - Quantitative PCR requires special equipment - Lack of comparisons between different real-time technologies