Affymetrix case study. Jesper Jørgensen NsGene A/S jrj@nsgene.dk. Overview. Affymetrix GeneChip technology Data processing Expression level Normalisation Fold change Statistics Parkinson disease Ventral versus dorsal midbrain (case study) Verification of array data Q-PCR
Affymetrix case study Jesper Jørgensen NsGene A/S jrj@nsgene.dk
Overview • Affymetrix GeneChip technology • Data processing • Expression level • Normalisation • Fold change • Statistics • Parkinson disease • Ventral versus dorsal midbrain (case study) • Verification of array data • Q-PCR • In situ hybridization • Immunohistochemistry
Expression profiling • Expression profiling • Investigate mRNA expression profile. • Compare gene expression between two or more situations. • Case versus control. • Profiling methods • Differential display. • SAGE (Serial Analysis of Gene Expression). • Microarray (custom spotted arrays / Affymetrix GeneChip).
Affymetrix GeneChip technology [Figure: a gene (5’ to 3’) covered by multiple oligo probes, each present as a PM and an MM probe] Figure adapted from: David Givol, Weizmann Institute of Science, http://www.weizmann.ac.il/home/ligivol/research_interests.html
Probe set design • A probe set = 11-20 PM/MM pairs (probe design is not optimized)
Preparation of samples for GeneChip U133B U133A Amplification (T7 RNA polymerase) Figure modified from: Knudsen (2002), “A Biologist's Guide to Analysis of DNA Microarray Data", Wiley.
Expression level (probe signal) • Li-Wong model: n is a scaling factor obtained by fitting. • Several other models exist. Irizarry et al. (2002) use log-transformed PM values after carrying out a global background adjustment and across-array normalisation. Irizarry et al. (2002) Biostatistics
qspline normalisation (M/A plot) [Figure panels: Before, After] • Assumption: Most genes are unchanged. • M/A plot: Raw chip data are used to plot, for each probe, the logarithm of the ratio between two chips versus the logarithm of the mean expression for the two chips. Workman et al. (2002) Genome Biology, vol. 3, no. 9.
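As a rough illustration of the M/A plot described above, the sketch below computes M and A for a handful of probes. The intensities and the helper name `ma_values` are made up for illustration. A is taken here as the log2 of the mean of the two chips, following the wording of the slide; many tools instead use the mean of the two log2 values.

```python
import math

def ma_values(chip1, chip2):
    """Return (M, A) lists for paired probe intensities from two chips.

    M = log2 ratio between the chips, A = log2 of their mean intensity.
    """
    m = [math.log2(x / y) for x, y in zip(chip1, chip2)]
    a = [math.log2((x + y) / 2) for x, y in zip(chip1, chip2)]
    return m, a

# Hypothetical raw intensities for three probes on two chips.
chip1 = [100.0, 400.0, 50.0]
chip2 = [100.0, 200.0, 200.0]

m, a = ma_values(chip1, chip2)
for mi, ai in zip(m, a):
    print(f"M = {mi:+.2f}, A = {ai:.2f}")
```

If most genes are unchanged, the cloud of points should be centred on M = 0 across the whole range of A; a normalisation such as qspline aims to remove any intensity-dependent drift away from that line.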
Variation A/A B/B Two different amplifications of the same RNA applied to GeneChips
Fold change (log fold) [Plot axes: fold change, log2 fold] • Fold change = sample/control • Log transformation makes the scale symmetric around 0 • All data log2 transformed
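A small sketch of why the log transformation makes the scale symmetric: a two-fold increase and a two-fold decrease end up equally far from 0. The expression values and the helper name `log2_fold` are hypothetical.

```python
import math

def log2_fold(sample, control):
    """Log2 of the fold change (sample / control)."""
    return math.log2(sample / control)

up = log2_fold(200.0, 100.0)    # fold change 2.0
down = log2_fold(100.0, 200.0)  # fold change 0.5
print(up, down)                 # prints: 1.0 -1.0
```

On the untransformed scale the same two genes sit at 2.0 and 0.5, which are not equidistant from the no-change value of 1.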
Statistical testing Is the regulation significant? • Student and Welch’s t-test • ANOVA • SAM • Wilcoxon • Kruskal-Wallis • Westfall-Young • ………..
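To make one of the listed tests concrete, here is a minimal sketch of Welch's t statistic and its approximate degrees of freedom for a single gene, using only the standard library. The log2 expression values and the function name `welch_t` are invented for illustration. Turning t into a P-value requires the t distribution, which the standard library does not provide; in practice a library routine such as SciPy's `ttest_ind` with `equal_var=False` would normally be used instead.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical log2 expression values for one gene, three chips per group.
ventral = [8.0, 9.0, 10.0]
dorsal = [6.0, 6.5, 7.0]

t, df = welch_t(ventral, dorsal)
print(f"t = {t:.3f}, df = {df:.2f}")
```

With only three replicates per group the degrees of freedom are very low, which is one reason why a large t value does not automatically translate into a small P-value.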
Bonferroni correction At a P-value of 0.05 you expect: • 5 false positives if you look at 100 genes • 1200 false positives if you look at 24,000 genes. Testing many genes increases the likelihood of getting a significant result by chance alone. If you want at most a 25% chance of having even one false positive in the list of regulated genes, you should only consider P-values below the Bonferroni-corrected cutoff: • 2.5 × 10⁻³ (0.25/100) if you look at 100 genes • 1.0 × 10⁻⁵ (0.25/24,000) if you look at 24,000 genes
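The arithmetic on this slide can be reproduced in a few lines. The function names are hypothetical; the expected count assumes that none of the genes is truly regulated.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of tests passing the cutoff by chance alone."""
    return alpha * n_tests

def bonferroni_cutoff(familywise_alpha, n_tests):
    """Per-gene P-value cutoff that bounds the chance of any false positive."""
    return familywise_alpha / n_tests

for n_genes in (100, 24_000):
    expected = expected_false_positives(0.05, n_genes)
    cutoff = bonferroni_cutoff(0.25, n_genes)
    print(f"{n_genes} genes: {expected:.0f} expected at P < 0.05, "
          f"Bonferroni cutoff {cutoff:.1e}")
```

The correction is conservative: it controls the chance of even a single false positive, so with 24,000 genes only very strongly supported changes survive.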
Parkinson’s Disease (PD) • A fairly common neurodegenerative disorder (approx. 2 million in USA/Europe) • Due to loss of the dopamine-producing neurons in the Substantia Nigra • Cardinal motor symptoms: tremor, rigidity and bradykinesia • Conventional treatment does not halt the progression of nerve cell loss
Fetal Transplantation for PD • Cells from the developing midbrain (A) • are collected and dissociated (B) • and transplanted into the striatum (C) • The cells will integrate with the host brain and produce dopamine.
Stem cells in Parkinson disease • Langston JW., J Clin Invest. 2005 Jan;115(1):23-5.
Aim [Figure: TH IHC] • In the human fetus, DA neurons can be found in the ventral part of the tegmentum (VT) from approximately 6 weeks. • In contrast, no DA neurons can be found in the neighboring dorsal part (DT). • We aim to find genes associated with DA differentiation by using GeneChips to compare the expression profiles of VT and DT.
8wVT (A) 8wDT (A) 8wVT (B) 8wDT (B) High quality RNA from 8w GA human ventral midbrain
Experimental setup • Compare VT against DT (3x3) • Affymetrix Human Genome U133 Chip Set • HG-U133A: Well substantiated genes • HG-U133B: Mostly ESTs • Total: 45,000 probes (genome) • Samples: A DORSAL, B DORSAL, C DORSAL, A VENTRAL, B VENTRAL, C VENTRAL
U133A data permutations and filter • Red: VM versus DM: • VM (A1 VENTRAL, A2 VENTRAL, B VENTRAL) • DM (A1 DORSAL, A2 DORSAL, B DORSAL) • Other colors: Permutations • Low-stringency filter as dotted line: • Average expression > 50 • P-value < 0.04 • SLR > 0.5 (about 41% up-regulation in VM) • Arrange with descending fold change. [Plot axis: SLR]
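One way to read the permutation curves is as relabelled versions of the same six chips: if the real ventral/dorsal grouping yields many more genes than shuffled groupings, the result is unlikely to be noise. The sketch below only enumerates the possible groupings; it is an assumption about how the permutations were built, and the helper name `group_splits` is hypothetical.

```python
from itertools import combinations

# Sample labels as given on the slide.
samples = ["A1 VENTRAL", "A2 VENTRAL", "B VENTRAL",
           "A1 DORSAL", "A2 DORSAL", "B DORSAL"]

def group_splits(names, group_size):
    """Yield every way to split names into one group of group_size and the rest."""
    for group in combinations(names, group_size):
        rest = tuple(n for n in names if n not in group)
        yield group, rest

splits = list(group_splits(samples, 3))
true_split = splits[0]   # first combination is the real VM versus DM grouping
permuted = splits[1:]    # every other assignment of chips to the two groups

print(len(splits), "possible 3 versus 3 groupings")
print("real grouping:", true_split)
```

Each grouping also appears once with the two groups swapped, which only flips the sign of the log ratio, so there are ten distinct partitions including the real one.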
Genes up-regulated in VM on U133A Low-stringency filter: Average expression > 50, P-value<0.04, SLR>0.5 arranged with descending fold change. Total list 107 probes. Only SLR>1 displayed.
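The low-stringency filter can be sketched as a simple selection and sort. The probe records below are invented, and the `Probe` class and `low_stringency` function are hypothetical names; SLR is assumed to be a log2 ratio, so an SLR of 0.5 corresponds to roughly a 1.41-fold change.

```python
from dataclasses import dataclass

@dataclass
class Probe:
    probe_id: str
    avg_expression: float
    p_value: float
    slr: float  # log2 ratio, VM versus DM (assumed)

def low_stringency(probes, min_expr=50.0, max_p=0.04, min_slr=0.5):
    """Keep probes passing all three cutoffs, sorted by descending SLR."""
    kept = [
        p for p in probes
        if p.avg_expression > min_expr and p.p_value < max_p and p.slr > min_slr
    ]
    return sorted(kept, key=lambda p: p.slr, reverse=True)

# Hypothetical probe records, not taken from the study.
probes = [
    Probe("probe_a", 120.0, 0.01, 1.8),
    Probe("probe_b", 40.0, 0.01, 2.0),    # fails the expression cutoff
    Probe("probe_c", 300.0, 0.20, 1.2),   # fails the P-value cutoff
    Probe("probe_d", 90.0, 0.03, 0.6),
    Probe("probe_e", 500.0, 0.001, 0.3),  # fails the SLR cutoff
]

hits = low_stringency(probes)
for p in hits:
    print(f"{p.probe_id}: SLR {p.slr:.1f} (fold change {2 ** p.slr:.2f})")
```

Raising `min_slr` to 1.0 would reproduce the kind of shortened display list mentioned above, keeping only probes that are at least two-fold up-regulated.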
Literature verification • ALDH1A • DAT1 • VMAT2 • TH • Calbindin, 28kDa • HNF3a • 3x Nurr1 • 2x IGF • 4x SNCA • 4x DRD2 • KCNJ6 (Girk2) • Ret • PITX3 • BDNF • DLK1 (FA1) • SLC17A6 (VGLUT2) • EPHA5 • ERBB4
Verification of array data • Array data (100 candidate genes) • Validation on array material (confirmation) • Validation on new samples (universality) • Desk work: Statistics, Literature, Bioinformatics • RNA: Q-PCR, ISH, Northerns • Protein: IHC, ELISA, Westerns
ALDH1A1 RT-PCR [Gel figure: 299 bp product; samples cDNA#253 (VM), cDNA#257 (DM), cDNA#245 (DM), cDNA#254 (DM), cDNA#256 (VM), cDNA#244 (VM); panel labels 30x, 35x, 40]
TH Q-PCR on a developmental series of subdissected human embryonic and fetal brain material • OD260/280 ratios were measured at 1.88 ± 0.05 for all RNA samples
Q-PCR analysis and clustering • OD260/280 ratios were measured at 1.88 ± 0.05 for all RNA samples
Fold change in a mixed population 1.5 fold up-regulation from no expression 1.5 fold up-regulation from some expression
GeneChip verification with ISH ISH from: Vernay et al., J Neurosci. 2005 May 11;25(19):4856-67.
GeneChip verification with IHC Courtesy of Josephine Jensen
Conclusions • Using arrays one will get a snapshot of the expression profile under the conditions investigated. • Careful experimental design • RNA quantity and quality are important • Since a single array experiment generates thousands of data points, the primary challenge of the technique is to make sense of the data. • Calculations/Statistics (back and forth) • Literature mining • Independent methods are needed for verification • Q-PCR • In situ hybridization (ISH) • Immunohistochemistry (IHC)
Acknowledgements • NsGene, Ballerup, Denmark (http://www.nsgene.com/): Lars Wahlberg, Bengt Juliusson, Teit Johansen • Neurotech, Huddinge University Hospital, Sweden: Åke Seiger • Department of Medical Genetics, IMBG, Panum Institute, Denmark: Claus Hansen, Karen Friis • Wallenberg Neuroscience Center, Sweden: Anders Björklund, Josephine Jensen, Elin Andersson • CBS, DTU, Denmark: Søren Brunak, Steen Knudsen, Nikolaj Blom, Thomas Nordahl Petersen