Introduction to the Gene Expression Platform of the IVM
1. Principles and important terminology
2. RNA preparation and quality controls
3. Data handling
4. Costs
5. Protocols
6. Information for collaboration partners
7. Downloads
1. Principles and Terminology The human, murine, and other genome projects, together with the availability of robust hardware and software platforms for producing and evaluating microarrays, have enabled genome-wide gene expression analyses, i.e. the quantification of all mRNAs (> 30,000) in a total RNA extract relative to another RNA extract, within 48 hours. The platform used by the IVM (Affymetrix) comprises a hybridization oven, a washing station, a scanner, and advanced software. The software supports mathematical, statistical, and information technology-based evaluation of the arrays.
1. Principles and Terminology Available: Whole-Genome Arrays of Several Species Affymetrix produces expression arrays for several species (human, mouse, C. elegans, and others). These are available in different formats. Depending on format and protocol, 0.5 to 5 µg of total RNA is required per array.
1. Principles and Terminology Production of Arrays Through Photolithography So-called "Perfect Match" (PM) oligonucleotides (ONs), 25-mers whose sequences are derived from the genome projects, are synthesized on a glass slide. To subtract unspecific hybridization, a "Mismatch" (MM) ON is also synthesized; it differs from the PM ON by a single nucleotide exchange at position 13. This results in PM-MM ON pairs, i.e. "probe pairs". The signal of each MM ON is subtracted from that of the corresponding PM ON, thereby enhancing the sensitivity and specificity of each PM ON. Each mRNA sequence is represented by a "probe set" consisting of 11 probe pairs. This allows for statistical analysis and thus quality assessment of each measurement.
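The PM minus MM subtraction described above can be illustrated with a minimal Python sketch. It assumes a plain average of the PM - MM differences across the 11 probe pairs of one probe set; this is a simplification for illustration, not the algorithm the Affymetrix software actually applies. The function name `probe_set_signal` and all intensity values are hypothetical.

```python
from statistics import mean


def probe_set_signal(pm, mm):
    """Average PM - MM difference over the probe pairs of one probe set.

    Simplified illustration only: the vendor software uses a more robust
    estimator than a plain mean.
    """
    if len(pm) != len(mm):
        raise ValueError("PM and MM lists must have the same length")
    # One difference per probe pair: specific signal minus unspecific signal.
    differences = [p - m for p, m in zip(pm, mm)]
    return mean(differences)


# Hypothetical intensities for one probe set with 11 probe pairs.
pm = [1200, 1350, 980, 1500, 1100, 1250, 1400, 1020, 1330, 1180, 1270]
mm = [300, 320, 280, 410, 290, 310, 350, 260, 330, 300, 315]
signal = probe_set_signal(pm, mm)
print(f"Probe set signal (mean PM - MM): {signal:.1f}")
```

Because each probe set yields 11 independent differences, the spread of those differences can also be inspected, which is what makes a per-measurement quality assessment possible.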
1. Principles and Terminology The Principle: "Probe Set" [Figure: a probe set is made up of probe pairs; each probe pair consists of a Perfect Match (PM) feature and a Mismatch (MM) feature with a nucleotide exchange at position 13.]
1. Principles and Terminology Synthesis of the „Probes“
1. Principles and Terminology Washing and Scanning Fluidics Station
1. Principles and Terminology Internal "Built-In" Controls on the Array
• Background and "noise": scanner electronics and hybridization
• Percent present: probe quality and reproducibility
• Spiked oligo controls: hybridization and staining (efficiency and linearity)
• poly(A)-RNA spikes: quality of cRNA synthesis
• 3'-5' degradation pattern of housekeeping genes: quality of the cRNA probe (checks all procedures)
2. RNA Preparation and Quality Controls A high-quality RNA preparation is critical for generating a high-quality array. Degradation and contamination must be avoided. We recommend the Qiagen RNeasy Lipid Tissue Mini Kit. In addition, sample preparation, storage conditions, and homogenization prior to RNA extraction are important. Protocols need to be worked out for each sample type (cultured cells, tissue, type of organ).
2. RNA Preparation and Quality Controls RNA Preparation
2. RNA Preparation and Quality Controls RNA Integrity Using the “Agilent” System A 28S/18S ratio > 1.8 is required. No short RNA fragments should be visible; short fragments indicate degraded RNA. [Figure: example and stages of RNA degradation.]
3. Data Evaluation There are numerous approaches; which one to choose depends on the questions asked in the experiment. Data evaluation is essentially done in three steps:
1. Raw data screening, including a "report" on quality parameters.
2. Statistical evaluation and application of "filters".
3. Annotation of genes and functional evaluation.
3. Data Evaluation Tools We Use to Evaluate Data
• Quality control: GCOS (Report)
• Raw data analysis: GCOS
• Data bank: GCOS Manager
• Statistics: Excel, GeneSpring
• Functional evaluation: GeneSpring, NetAffx, Gene Ontology, GenMAPP, RefSeq, UniGene
• Clustering: GeneSpring
• Connecting raw data with software: Access, GeneSpring, Excel
• Displaying results: GeneSpring, Excel, FatiGO
3. Data Evaluation Raw Data Evaluation Using GCOS
• DAT file: 7 x 7 pixels per feature
• CEL file: one number per feature
• CHP file: signal intensity and detection p-value per probe set
• Report: for each array there is a "report" giving a quality check on the entire experiment
3. Data Evaluation The "Call" GCOS converts the statistics of the probe pairs of a probe set, i.e. of a gene/mRNA, into a "call":
• "Absent" call (not detectable): detection p-value > 0.065
• "Marginal" call (possibly detectable): detection p-value 0.05 to 0.065
• "Present" call (expressed): detection p-value < 0.05
Present means that the gene is significantly expressed; absent means that the gene is not expressed or that its expression is below the sensitivity of the probe set.
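The thresholds above can be expressed as a small Python function. This is only a sketch of the decision rule as stated on the slide; the function name `detection_call` and the parameter names `alpha1` and `alpha2` are chosen for illustration, and values falling exactly on a boundary are assigned to "Marginal" as an assumption.

```python
def detection_call(p_value, alpha1=0.05, alpha2=0.065):
    """Map a detection p-value to a Present / Marginal / Absent call.

    Thresholds follow the slide: < 0.05 Present, 0.05 to 0.065 Marginal,
    > 0.065 Absent. Boundary values are treated as Marginal (assumption).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < alpha1:
        return "Present"
    if p_value <= alpha2:
        return "Marginal"
    return "Absent"


# Hypothetical detection p-values for three probe sets.
for p in (0.002, 0.06, 0.31):
    print(p, detection_call(p))
```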
3. Data Evaluation The "Normalization" To compare data from different arrays, the data need to be adjusted, or "normalized". There are several ways to do this. We use:
• Standard: scaling to a target value of 500 at the mean.
• If saturated: scaling to a target value of 500 at the median.
• Tests in general: log transformation and scaling per gene to the 50th percentile.
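Scaling to a target value amounts to multiplying every signal on an array by one factor, so that the array's mean (or median) equals the target. The following Python sketch shows this under the assumption of a plain arithmetic mean or median; whether GCOS trims outliers before averaging is not stated here, and the function name `scale_array` is hypothetical.

```python
from statistics import mean, median


def scale_array(signals, target=500.0, use_median=False):
    """Scale all signals of one array so their mean (or median) equals target."""
    center = median(signals) if use_median else mean(signals)
    if center <= 0:
        raise ValueError("mean/median signal must be positive")
    # One scaling factor per array, applied to every probe set signal.
    factor = target / center
    return [s * factor for s in signals]


# Hypothetical probe set signals from one array.
raw = [100.0, 200.0, 300.0]
print(scale_array(raw))                   # scaled at the mean
print(scale_array(raw, use_median=True))  # scaled at the median
```

Scaling at the median is less sensitive to a few saturated features than scaling at the mean, which fits the "if saturated" rule above.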
3. Data Evaluation The "Scatter Plot" The easiest evaluation of a two-array experiment (control versus experimental) is the scatter plot, in which the results of the two arrays are plotted against each other on logarithmic axes. Color code: red: Present - Present; yellow: Absent - Absent; blue: Absent - Present. Fold-change lines are drawn at 2x, 3x, 10x, and 30x. [Figure: scatter plot highlighting a gene that is > 30-fold differentially expressed.]
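The fold-change lines in the scatter plot correspond to fixed ratios between the experimental and the control signal. As a rough Python sketch, the following computes the log2 ratio for one gene and reports the highest of the 2x, 3x, 10x, and 30x lines that the gene reaches, in either direction. The function names and signal values are hypothetical.

```python
import math

FOLD_LINES = (2, 3, 10, 30)  # fold-change lines drawn in the scatter plot


def log2_ratio(control, experiment):
    """log2 of experiment / control; positive means up-regulated."""
    if control <= 0 or experiment <= 0:
        raise ValueError("signals must be positive")
    return math.log2(experiment / control)


def highest_line_reached(control, experiment):
    """Return the largest fold-change line reached, or None if below 2x."""
    if control <= 0 or experiment <= 0:
        raise ValueError("signals must be positive")
    ratio = experiment / control
    # Treat up- and down-regulation symmetrically.
    magnitude = max(ratio, 1.0 / ratio)
    reached = [line for line in FOLD_LINES if magnitude >= line]
    return max(reached) if reached else None


# Hypothetical signals: control = 100, experiment = 450 (4.5-fold up).
print(log2_ratio(100.0, 450.0), highest_line_reached(100.0, 450.0))
```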
3. Data Evaluation Statistics and Filters To perform statistics, three repeated measurements are needed. This yields a p-value. Filters then reduce the amount of data. Filters: 1. Signal intensity value 2. Detection p-value 3. Fold change 4. p-value of the experiment. This results in a list of candidate genes that are, most likely, differentially expressed. The stringency of filters 1 to 4 determines the quality of the candidate list.
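The combination of the four filters can be sketched in Python as follows. All threshold values, field names, and probe set identifiers are hypothetical and chosen only for illustration; the experiment p-value is assumed to have been computed beforehand from the repeated measurements.

```python
from dataclasses import dataclass


@dataclass
class ProbeSetResult:
    probe_set_id: str    # hypothetical identifier
    mean_signal: float   # mean signal over the replicates
    detection_p: float   # detection p-value
    fold_change: float   # experiment / control ratio (> 0)
    experiment_p: float  # p-value of the experiment, computed elsewhere


def passes_filters(r, min_signal=100.0, max_detection_p=0.05,
                   min_fold_change=2.0, max_experiment_p=0.05):
    """Apply filters 1 to 4; all thresholds are illustrative assumptions."""
    # Count down-regulation (ratio < 1) the same way as up-regulation.
    magnitude = max(r.fold_change, 1.0 / r.fold_change)
    return (r.mean_signal >= min_signal
            and r.detection_p < max_detection_p
            and magnitude >= min_fold_change
            and r.experiment_p < max_experiment_p)


results = [
    ProbeSetResult("probe_A", 850.0, 0.002, 3.4, 0.01),
    ProbeSetResult("probe_B", 40.0, 0.20, 5.0, 0.03),
    ProbeSetResult("probe_C", 1200.0, 0.001, 1.2, 0.40),
    ProbeSetResult("probe_D", 600.0, 0.01, 0.25, 0.02),
]
candidates = [r.probe_set_id for r in results if passes_filters(r)]
print(candidates)
```

Tightening any of the four thresholds shortens the candidate list, which is the trade-off between stringency and list quality noted above.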
3. Data Evaluation Reduced "Scatter" Through Averaging
3. Data Evaluation Filter
3. Data Evaluation Combination of Filters: List of Genes
3. Data Evaluation The "Annotation"
• Problem: the investigator obtains a list of genes that are unfamiliar.
• Needed: a rapid procedure to identify the genes.
• Generate databases and structure your gene lists.
• Link the list of Affymetrix numbers, via Access, GeneSpring, or NetAffx, to database terminology: PubMed, UniGene, LocusLink / Entrez Gene, OMIM, Ensembl, ...
4. Costs The cost per array is €800 to €1,250, depending on array type, reagents, workload, and experiment.
6. Information for Collaboration Partners
• Contact by e-mail: andreas.habenicht@mti.uni-jena.de
• Discussion and advice
• Sample transfer with filled-in form (available at IVM)
• Generation of microarrays
• Transfer of raw data files and Excel files (software tools available at IVM)
7. Downloads • Contract • Excel scheme for evaluating data • Manual for Excel scheme • Sheet „Project form“