
Henrik Bengtsson hb@maths.lth.se Bioinformatics Group

cDNA Microarrays - an introduction. Henrik Bengtsson hb@maths.lth.se Bioinformatics Group Mathematical Statistics, Centre for Mathematical Sciences Lund University. Outline. The Genomic Code The Central Dogma of Biology The cDNA Microarray Technique


Presentation Transcript


  1. cDNA Microarrays - an introduction • Henrik Bengtsson, hb@maths.lth.se • Bioinformatics Group, Mathematical Statistics, Centre for Mathematical Sciences, Lund University

  2. Outline • The Genomic Code • The Central Dogma of Biology • The cDNA Microarray Technique • Data Analysis of cDNA Microarray Data • Statistical Problems • Take-home message

  3. The Genomic Code • 22+1 chromosome pairs • 3 180 000 000 bp • 120 000 genes? 80 000 genes? 35 000 genes? Or ...?

  4. The Central Dogma of Biology • DNA (CCTGAGCCAACTATTGATGAA) → transcription → RNA (CCUGAGCCAACUAUUGAUGAA) → translation → Protein (PEPTIDE)
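The example sequence on this slide can be checked with a few lines of Python. This is a minimal sketch, not part of the original presentation: the function names `transcribe` and `translate` are illustrative, the DNA is assumed to be the coding strand (so transcription is just T → U), and the codon table holds only the seven codons that occur in the slide's sequence rather than the full genetic code.

```python
# Partial codon table: only the codons that appear in the slide's example.
# (A real translator would need all 64 codons plus stop-codon handling.)
CODON_TO_AMINO_ACID = {
    "CCU": "P",  # proline
    "GAG": "E",  # glutamic acid
    "CCA": "P",  # proline
    "ACU": "T",  # threonine
    "AUU": "I",  # isoleucine
    "GAU": "D",  # aspartic acid
    "GAA": "E",  # glutamic acid
}


def transcribe(dna: str) -> str:
    """Write a coding-strand DNA sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Read the RNA three bases at a time and map each codon to one letter."""
    usable = len(rna) - len(rna) % 3  # ignore an incomplete trailing codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TO_AMINO_ACID[codon] for codon in codons)


dna = "CCTGAGCCAACTATTGATGAA"
rna = transcribe(dna)
protein = translate(rna)
print(rna)      # CCUGAGCCAACUAUUGAUGAA
print(protein)  # PEPTIDE
```

The one-letter amino acid codes spell out the word on the slide, which is why this particular sequence was chosen.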

  5. The cDNA Microarray Technique • High-throughput measurement: 5000-20000 gene expressions at the same time • Identify genes that behave differently in different cell populations: tumor cells vs healthy cells; brain cells vs liver cells; same tissue, different organisms • Time series experiments: gene expressions over time after treatment • ...

  6. Example of a cDNA Microarray

  7. Overview (figure) • Slide preparation: cDNA clones (probes) → PCR product amplification → purification → printing (0.1 nl / spot) • Samples: reference sample and tumor sample, each RNA → cDNA • Hybridize to the microarray • Scanning: excitation with red laser and green laser, emission • Overlay images and normalise • Analysis

  8. Creating the slides

  9. RNA Extraction & Hybridization • Reference sample: RNA → cDNA • Tumor sample: RNA → cDNA • Hybridize

  10. Scanning & Image Analysis

  11. Data Output

  12. Biological question (differentially expressed genes, sample class prediction, etc.) → Experimental design → Microarray experiment → 16-bit TIFF files → Image analysis → (Rfg, Rbg), (Gfg, Gbg) → Normalization → R, G → Estimation, Testing, Clustering, Discrimination → Biological verification and interpretation

  13. Data Transformation • "Observed" data $\{(R,G)\}_{n=1,\dots,5184}$: $R$ = red channel signal, $G$ = green channel signal (background corrected or not) • Transformed data $\{(M,A)\}_{n=1,\dots,5184}$: $M = \log_2(R/G)$ (ratio), $A = \log_2 (R \cdot G)^{1/2} = \tfrac{1}{2}\log_2(R \cdot G)$ (intensity signal) $\Leftrightarrow$ $R = (2^{2A+M})^{1/2}$, $G = (2^{2A-M})^{1/2}$
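The transformation and its inverse are easy to verify numerically. The following is a small sketch using only Python's standard library; the function names `to_ma` and `from_ma` and the three example spots are made up for illustration and do not come from the presentation.

```python
import math


def to_ma(r: float, g: float) -> tuple[float, float]:
    """Map red/green signals to (M, A): log-ratio and mean log-intensity."""
    m = math.log2(r / g)
    a = 0.5 * math.log2(r * g)
    return m, a


def from_ma(m: float, a: float) -> tuple[float, float]:
    """Invert the transform: R = (2^(2A+M))^(1/2), G = (2^(2A-M))^(1/2)."""
    r = math.sqrt(2.0 ** (2 * a + m))
    g = math.sqrt(2.0 ** (2 * a - m))
    return r, g


# Hypothetical (R, G) signals for three spots; both channels must be positive.
spots = [(1024.0, 256.0), (256.0, 256.0), (4096.0, 1024.0)]
for r, g in spots:
    m, a = to_ma(r, g)
    r_back, g_back = from_ma(m, a)
    print(f"R={r:.0f} G={g:.0f} -> M={m:.2f} A={a:.2f} -> R={r_back:.0f} G={g_back:.0f}")
```

A spot with equal signal in both channels gets M = 0, and a fourfold excess of red gives M = 2, which is what makes M convenient for spotting up- and down-regulation symmetrically around zero.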

  14. Normalization • Biased towards the green channel • Intensity-dependent artifacts

  15. Replicated measurements Scaled print-tip normalization Median Absolute Deviation (MAD) Scaling Averaging
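The slide names the steps without giving formulas, so the following is only a rough sketch of what MAD scaling across print-tip groups and averaging over replicates could look like. It makes several assumptions that are not in the original: the location step is simplified to subtracting each group's median (a real print-tip normalization would typically fit an intensity-dependent curve), the groups are rescaled to the geometric mean of their MADs, and the names `mad`, `scale_print_tip_groups`, and `average_replicates` as well as the example numbers are hypothetical.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)


def scale_print_tip_groups(groups):
    """Center each print-tip group at zero and give all groups the same MAD.

    groups: dict mapping a print-tip id to that group's list of M values.
    Assumes every group has a non-zero MAD.
    """
    centered = {}
    for tip, ms in groups.items():
        center = median(ms)
        centered[tip] = [m - center for m in ms]
    mads = {tip: mad(ms) for tip, ms in centered.items()}
    # Geometric mean of the group MADs, used as the common target spread.
    target = math.exp(sum(math.log(s) for s in mads.values()) / len(mads))
    return {tip: [m * target / mads[tip] for m in ms]
            for tip, ms in centered.items()}


def average_replicates(replicates):
    """Average M values spot by spot over equally long replicate lists."""
    return [sum(values) / len(values) for values in zip(*replicates)]


# Made-up M values for two print-tip groups with different spread.
groups = {
    "tip1": [-0.2, 0.0, 0.1, 0.4, 0.9],
    "tip2": [-1.0, -0.4, 0.2, 0.6, 2.2],
}
scaled = scale_print_tip_groups(groups)
print({tip: round(mad(ms), 4) for tip, ms in scaled.items()})
print(average_replicates([scaled["tip1"], scaled["tip2"]]))
```

The MAD is used instead of the standard deviation because a handful of truly differentially expressed genes should not inflate the estimate of a group's spread.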

  16. Identification of differentially expressed genes • Extreme in M values? • Extreme in T values? • ...or extreme in some other statistic?
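The presentation does not define T. One common choice, assumed here purely for illustration, is the one-sample t statistic of a gene's replicated M values: the mean divided by its standard error. The sketch below uses that assumption; the function name `t_statistic`, the gene labels, and the replicate values are all hypothetical, and the numbers reported elsewhere in the presentation were not necessarily computed this way.

```python
import math
from statistics import mean, stdev


def t_statistic(m_values):
    """Return (T, SE) for one gene's replicated M values.

    Assumption: T = mean(M) / SE with SE = sd(M) / sqrt(n).
    Needs at least two replicates and a non-zero standard deviation.
    """
    n = len(m_values)
    se = stdev(m_values) / math.sqrt(n)
    return mean(m_values) / se, se


# Made-up replicated M values for three genes.
replicates = {
    "gene_a": [-0.90, -0.80, -0.85, -0.95],  # moderate ratio, very consistent
    "gene_b": [-1.80, 0.10, -2.40, 0.30],    # larger average ratio, but noisy
    "gene_c": [0.05, -0.10, 0.02, 0.08],     # essentially unchanged
}

# Rank genes by |T|, so that consistency across replicates counts,
# not just the size of the average log-ratio.
ranked = sorted(replicates, key=lambda g: abs(t_statistic(replicates[g])[0]),
                reverse=True)
for gene in ranked:
    t, se = t_statistic(replicates[gene])
    print(f"{gene}: Mavg={mean(replicates[gene]):+.3f} T={t:+.2f} SE={se:.3f}")
```

Ranking by M alone would put the noisy gene first; ranking by T rewards the gene whose change is reproducible, which is the contrast the slide's two questions point at.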

  17. List of genes that the biologist can understand and verify with other experiments

      Gene   Mavg    Aavg   T       SE
      2341   -0.86   10.9   -18.0   0.125
      6412   -0.75   11.1   -14.7   0.102
      6123   -0.70    9.8   -12.2   0.121
       102    0.65   10.3   -14.5   0.136
      2020    0.64    9.3   -11.9   0.118
      3132    0.62    9.9   -14.4   0.090
      4439   -0.62    9.7   -14.6   0.088
      2031   -0.61   10.7   -13.7   0.087
       657   -0.60    9.2   -13.6   0.094
       502    0.58   10.0   -12.7   0.101
      1239   -0.58    9.8   -11.4   0.103
      5392   -0.57    9.9   -20.7   0.057
      3921    0.52   11.3    13.5   0.083
      ...

  18. Time Course Gene Expression Profiles

  19. Statistical Problems • Image analysis: what is foreground? what is background? • Quality: which spots can we trust? which slides can we trust? • Artifacts from preparing the RNA, the printing, the scanning, etc. • Data cleanup • Normalization within an experiment: when few genes change; when many genes change; dye-swap to minimize dye effects • Normalization between experiments: location and scale effects • What is noise and what is variability? • Which genes are actually up- and down-regulated? • P-values • Planning of experiments: what is the best design? what is an optimal sample size? • Classification: of samples; of genes • Clustering: of samples; of genes • Time course experiments • Gene networks: identification of pathways • ...

  20. Total microarray articles indexed in Medline (figure: number of papers per year, 1995 to 2001 (projected); vertical axis from 0 to 600)

  21. Acknowledgments/Collaborators • Statistics Dept, UC Berkeley: Sandrine Dudoit, Terry Speed, Yee Hwa Yang • Oncology Dept, Lund University: Pär-Ola Bendahl, Åke Borg, Johan Vallon-Christersson • Ernest Gallo Research Inst., California: Monica Moore, Karen Berger • Endocrinology, Lund University, Malmö: Leif Groop, Peter Almgren • Lawrence Berkeley National Laboratory: Saira Mian, Matt Callow • Mathematical Statistics, Chalmers University: Olle Nerman, Staffan Nilsson, Dragi Anevski • CSIRO Image Analysis Group, Melbourne: Michael Buckley

  22. Take-home message • Bioinformatics is the future! • More educated people are needed! • Statistics is fun when it is applied! • Master’s thesis project? Talk to us! http://www.maths.lth.se/matstat/bioinformatics/

  23. Finding genes in DNA sequence “This is one of the most challenging and interesting problems in computational biology at the moment. With so many genomes being sequenced so rapidly, it remains important to begin by identifying genes computationally.” – Terry Speed.

  24. The Central Dogma of Biology • DNA → transcription → RNA → translation → Protein • Challenges: Sequencing, Fragment assembly, Gene finding, Linkage analysis, etc., Homology searches, Annotation, Isolation, Sequencing, RNA structure prediction, Gene expression (microarrays, etc.), Protein structure prediction, Protein folding, Homology searches, Functional pathways, Annotation
