
Design and Analysis of Microarray Experiments at CSIRO Livestock Industries

Design and Analysis of Microarray Experiments at CSIRO Livestock Industries. Toni Reverter Bioinformatics Group CSIRO Livestock Industries Queensland Bioscience Precinct 306 Carmody Rd., St. Lucia, QLD 4067, Australia. SSAI – QLD Branch – 6 Apr. 2004.

Presentation Transcript


  1. Design and Analysis of Microarray Experiments at CSIRO Livestock Industries Toni Reverter Bioinformatics Group CSIRO Livestock Industries Queensland Bioscience Precinct 306 Carmody Rd., St. Lucia, QLD 4067, Australia SSAI – QLD Branch – 6 Apr. 2004

  2. CONTENTS
     • Introduction: 4 slides, 6 minutes
     • Technical Concerns: 2 slides, 7 minutes
     • Designs: 21 slides, 15 minutes
     • Analysis: 14 slides, 16 minutes
     • Coverage and Sensitivity: 5 slides, 7 minutes
     • Summary: 2 slides, 4 minutes

  3. 1. Introduction: 1.a - The Material. Image captions: This is a Cow; This is a Pig (female); This is a Sheep; This is a Chicken.

  4. 1. Introduction: 1.b - The Method. Workflow diagram: Tissue Samples (Treat A, Treat B) → mRNA Extraction & Amplification → cDNA “A” (Cy5) + cDNA “B” (Cy3) → Hybridization → Optical Scanner (Laser 1, Laser 2) → Image Capture → Analysis.

  5. 1. Introduction: 1.c - The Challenge
     • Challenge types: Data Dependent, Time Dependent, Human Dependent
     • Diagram keywords: Logical, cDNA Distribution, Source, Size, Historical, Excitement, Balance, Interdisciplinary, Chronology, Skill, Integration, Paradigm
     • Timeline: 1800s - DATA; 30-60s - METHODS; 50-70s - SOFTWARE; 1980s - COMPUTER
     • Quantitative: Computer Sci., Statisticians, Mathematicians, …
     • Non-Q: Biochemists, Physiologists, Pathologists, …
     • BANANA, EGG: “banana omelette”

  6. 1. Introduction: 1.c - Human-Dependent Challenge
     CLAIM: “The majority of microarray papers are analysed with substandard methods” (C Tilstone, citing D Allison, Nature 2003, 424:610)
     REASONS (P Value):
     • Biologists don’t care: 10
     • Statisticians are bad: 20
     • Unrealistic expectations: 70
     JOKE: The Biologist and the Statistician are being executed. They are both granted one last request. The Statistician asks that he/she be allowed to give one final lecture on his/her Grand Theory of Statistics. The Biologist asks that he/she be executed first.

  7. 2. Technical Concerns
     • Biochemist Level:
       • Preparation (Printing) of the Chip
       • RNA Extraction, Amplification and Hybridisation
       • Optical Scanner (Reading)
     • Quantitative Level:
       • Design
       • Image (data) Quality
       • Data Analysis
       • Data Storage
     • Replication:
       • Animal
       • Sample
       • Array
       • Spot
     Note: Randomisation intentionally neglected.

  8. 2. Technical Concerns: 2.a - Data Quality: GP3xCLI; 2.b - Storage: GEXEX

  9. 3. Experimental Designs
     Key Issues:
     • Identify/Prioritise Questions (put more arrays on key questions)
     • N of Available Samples (Pooling?)
     • N of Available Arrays ($)
     • Consider Dye Bias: Dye-Swap, Dye-Balancing, Self-Self
     Evaluation of Designs (diagrams of the Reference, Loop and All-Pairs designs over the nodes O, A, B, AB).
     Variance of Estimated Effects (Relative to the All-Pairs):
     Design      Main effect of A   Main effect of B   Interaction AB   Contrast A-B
     Reference   1                  1                  3                2
     Loop        4/3                1                  8/3              1
     All-Pairs   1                  1                  2                1

  10. 3. Experimental Designs: Samples vs Slides vs Configurations
      Samples (S):      3     4     12
      (S-1):            2     3     11
      Arrays S(S-1):    6     12    132
      N of Configurations?
      Glonek & Solomon, Factorial and Time Course Designs for cDNA Microarray Experiments.
      Definition: A design with a total of n slides and design matrix $X$ is said to be admissible if there exists no other design with n slides and design matrix $X^*$ such that $c_i^* \le c_i$ for all $i$, with strict inequality for at least one $i$, where $c_i^*$ and $c_i$ are respectively the diagonal elements of $(X^{*\prime}X^*)^{-1}$ and $(X'X)^{-1}$.
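The admissibility definition can be checked mechanically for small designs. The sketch below is only an illustration under stated assumptions: the helper names (`gram`, `inverse`, `variances`, `dominates`) and the three toy designs are hypothetical and do not come from the talk or from Glonek & Solomon. Each row of a design matrix is one slide, each column one effect to estimate (here B-A and C-A for three samples A, B, C), and exact fractions avoid rounding noise.

```python
from fractions import Fraction


def gram(X):
    """Return X'X for a design matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(Fraction(row[i]) * row[j] for row in X) for j in range(p)]
            for i in range(p)]


def inverse(M):
    """Invert a small square matrix by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (effects not estimable).
    """
    n = len(M)
    A = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def variances(X):
    """Diagonal elements c_i of (X'X)^-1."""
    inv = inverse(gram(X))
    return [inv[i][i] for i in range(len(inv))]


def dominates(c_star, c):
    """True if c_star <= c everywhere and strictly smaller at least once."""
    return (all(a <= b for a, b in zip(c_star, c))
            and any(a < b for a, b in zip(c_star, c)))


# Three hypothetical 3-slide designs; columns are the effects (B-A, C-A).
loop = [[1, 0], [-1, 1], [0, -1]]     # B vs A, C vs B, A vs C
reference = [[1, 0], [1, 0], [0, 1]]  # B vs A twice, C vs A once
wasted = [[1, 0], [0, 1], [0, 0]]     # third slide is self-self

print(variances(loop), variances(reference), variances(wasted))
print(dominates(variances(loop), variances(wasted)))
```

In this toy case the loop design dominates the design that spends a slide on a self-self hybridisation, while the loop and the reference-like design do not dominate each other, so neither rules the other out as inadmissible.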

  11. 3. Experimental Designs: N of Configurations? $S^{A-1}$

  12. 3. Experimental Designs: N of Configurations? Diagram labels: Pie-Bald black, Non-Pie-Bald black, Normal White Recessive. $S^{A-1} = 5^3 = 125$

  13. 3. Experimental Designs (diagram in which every element is labelled “x5”)

  14. 3. Experimental Designs: N of Configurations? 0 hr, 24 hr. $S^{A-1} = 10^9$ = 1 Billion!

  15. 3. Experimental Designs: Transitivity (Townsend, 2003) & Extendability (Kerr, 2003). Opt 1: 10 Slides; Opt 2: 10 Slides; Opt 3: 11 Slides; Opt 4: 9 Slides; Opt 5: 9 Slides.

  16. 3. Experimental Designs: N of Configurations? 0 hr, 24 hr. $S^{A-1} = 12^{10}$ = 62 Billion!

  17. 3. Experimental Designs: N of Configurations? 0 hr, 24 hr (diagram of arrays with R/G dye labels)

  18. 3. Experimental Designs: Handling Constraints (Samples & Arrays): Pooling & Replication
      • Pavlidis et al. (2003) The effect of replication on gene expression microarray experiments. Bioinformatics 19:1620. >= 5 Replicates; 10-15 Replicates.
      • Peng et al. (2003) Statistical implications of pooling RNA samples for microarray experiments. BMC Bioinformatics 4:26. Power: n9c9 ≈ 95%, n3c3 ≈ 50%, n9c3 ≈ 90%; n25c5 ≈ n20c20.

  19. 3. Experimental Designs: Pooling & Replication. N of Arrays? 24: 23 to 552; with pooling, 14: 13 to 182. (Diagram of F/M and HS/TM samples with R/G dye labels.)
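The ranges quoted on this slide match the S-1 and S(S-1) rows shown earlier in the talk. One reading, offered here as an assumption rather than the speaker's stated rationale, is that S-1 arrays are the fewest that can connect S samples, while S(S-1) covers every ordered pair, that is, all pairs in both dye orientations. A minimal sketch, with the hypothetical helper name `array_bounds`:

```python
def array_bounds(n_samples: int) -> tuple[int, int]:
    """Return (fewest, most) two-colour arrays for n_samples samples.

    fewest: S - 1 arrays, enough to link all samples in a connected design.
    most:   S * (S - 1) arrays, one per ordered pair of distinct samples.
    """
    if n_samples < 2:
        raise ValueError("need at least two samples to hybridise a pair")
    return n_samples - 1, n_samples * (n_samples - 1)


for s in (24, 14):
    fewest, most = array_bounds(s)
    print(f"{s} samples: {fewest} to {most} arrays")
```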

  20. 3. Experimental Designs: Pooling & Replication, Reference Design. Sum(ABS): 26.8, 26.8, 39.1, 23.1, 17.3, 7.1, 7.1, 14.3, 14.3

  21. 3. Experimental Designs: Another (NEW?) Constraint: Amount of RNA
      Group   Sample                      N    Amounts
      A       M avium slope 18 days       3    3-3-3
      B       M avium broth 18 days       10   1-2-2-1-2-1-2-1-2-1
      C       M para broth 10 weeks       5    1-2-2-1-1
      D       M para broth 12 weeks       6    1-1-4-5-2-1
      E       M para in-vivo              3    1-1-1

  22. 3. Experimental Designs: Another (NEW?) Constraint: Amount of RNA
      Contrasts (each rated for importance with tick marks on the slide): A B, A C, A D, A E, B C, B D, B E, C D, C E, D E.
      Note on the slide: Importance due to Transitivity of AB with BC and BD.
      Procedure: Five configurations will be proposed and the statistical optimality of each evaluated.

  23. Diagram of the samples labelled with their RNA amounts: 3 3 3; 1 2 2 1 2 1 2 1 2 1; 1 2 2 1 1; 1 1 4 5 2 1; 1 1 1

  24. Configuration 1 (same sample diagram, arrays drawn for this configuration)

  25. Configuration 2 (same sample diagram, arrays drawn for this configuration)

  26. Configuration 3 (same sample diagram, arrays drawn for this configuration)

  27. Configuration 4 (same sample diagram, arrays drawn for this configuration)

  28. Configuration 5 (same sample diagram, arrays drawn for this configuration)

  29. Evaluation of the five configurations
                       Weight (configuration)    Squared Error (configuration)
      Contrast   Imp   1   2   3   4   5         1   2   3   4   5
      A B        4     6   5   6   6   5         4   1   4   4   1
      A C        2     0   2   1   0   0         4   0   1   4   4
      A D        2     3   2   2   3   4         1   0   0   1   4
      A E        1     0   0   0   0   0         1   1   1   1   1
      B C        3     5   5   4   4   5         4   4   1   1   4
      B D        4     4   5   5   5   5         0   1   1   1   1
      B E        1     0   0   0   0   0         1   1   1   1   1
      C D        2     2   0   2   3   2         0   4   0   1   0
      C E        1     0   0   0   0   0         1   1   1   1   1
      D E        4     3   3   3   3   3         1   1   1   1   1
      SSE                                        17  14  11  16  18
      MSE                                        .74 .64 .48 .66 .75
      Noise (D D): 0 1 2 1 0 0
      Conclusion: Configuration 3
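The figures on this slide are consistent with a simple scoring rule: squared error = (weight - importance)², SSE = the sum over the ten contrasts, and MSE = SSE divided by the total weight of the configuration. That rule is inferred from the numbers, not stated on the slide, and the names below are hypothetical. A short sketch that recomputes the scores:

```python
# Importance of each contrast and its weight in configurations 1-5,
# transcribed from the slide.
IMPORTANCE = {"A-B": 4, "A-C": 2, "A-D": 2, "A-E": 1, "B-C": 3,
              "B-D": 4, "B-E": 1, "C-D": 2, "C-E": 1, "D-E": 4}
WEIGHTS = {
    "A-B": (6, 5, 6, 6, 5), "A-C": (0, 2, 1, 0, 0), "A-D": (3, 2, 2, 3, 4),
    "A-E": (0, 0, 0, 0, 0), "B-C": (5, 5, 4, 4, 5), "B-D": (4, 5, 5, 5, 5),
    "B-E": (0, 0, 0, 0, 0), "C-D": (2, 0, 2, 3, 2), "C-E": (0, 0, 0, 0, 0),
    "D-E": (3, 3, 3, 3, 3),
}


def score_configurations(importance, weights):
    """Return (sse, mse) lists with one entry per configuration."""
    n_configs = len(next(iter(weights.values())))
    sse = [0] * n_configs
    total_weight = [0] * n_configs
    for contrast, imp in importance.items():
        for k, w in enumerate(weights[contrast]):
            sse[k] += (w - imp) ** 2   # assumed squared-error rule
            total_weight[k] += w       # assumed MSE denominator
    mse = [s / t for s, t in zip(sse, total_weight)]
    return sse, mse


sse, mse = score_configurations(IMPORTANCE, WEIGHTS)
best = min(range(len(mse)), key=mse.__getitem__) + 1
print(sse, [round(m, 2) for m in mse], f"best: configuration {best}")
```

Under this reading the fourth MSE evaluates to 16/24 ≈ 0.667, which the slide shows as .66; the other four values agree after rounding to two decimals.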

  30. 4. Data Analysis: My (EDUCATED?) View
      • Relaxed data acquisition criteria
        • Signal to Noise > 1.00 (more relaxed criteria exist)
        • Mean to Median > 0.85 (Tran et al. 2002)
      • Moving away from
        • Ratios
        • “heavy-duty” normalisation techniques
      • Mixed-Model Equations
        • Check residuals
        • Check REML estimates of Variance Components
        • Proportion of Total V due to Gene x Variety
      • Process results Gene x Treatment
        • Mixtures of Distributions
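The two acquisition criteria amount to a per-spot filter. The sketch below assumes the two ratios have already been computed by the image-analysis step; the `Spot` record, its field names and the example values are hypothetical and only illustrate how the thresholds from the slide would be applied.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    gene_id: str
    signal_to_noise: float   # precomputed S2N ratio (assumed input)
    mean_to_median: float    # precomputed M2M ratio (assumed input)
    log2_intensity: float


def is_valid(spot: Spot, min_s2n: float = 1.00, min_m2m: float = 0.85) -> bool:
    """Keep a spot only if both ratios exceed the thresholds on the slide."""
    return spot.signal_to_noise > min_s2n and spot.mean_to_median > min_m2m


def filter_spots(spots: list[Spot]) -> list[Spot]:
    """Return the spots that pass both acquisition criteria."""
    return [s for s in spots if is_valid(s)]


# Made-up records for illustration only.
spots = [
    Spot("g1", signal_to_noise=2.4, mean_to_median=0.97, log2_intensity=10.2),
    Spot("g2", signal_to_noise=0.8, mean_to_median=0.95, log2_intensity=6.1),
    Spot("g3", signal_to_noise=1.6, mean_to_median=0.70, log2_intensity=8.3),
]
valid = filter_spots(spots)
print([s.gene_id for s in valid])
```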

  31. 4. Data Analysis: Mixed-Model Equations. Terms fitted to the Log2 Intensities:
      • Comparison Group: Array|Block|Dye (FIXED)
      • Main Gene Effect (RANDOM)
      • Gene x Array|Block (RANDOM)
      • Gene x Dye (RANDOM). Note: missing but (generally) unimportant.
      • Gene x Variety (RANDOM): DE Genes
      • Residual (RANDOM)

  32. 4. Data Analysis: Mixed-Model Equations
      Log2(Int.) = CG + Gene + G×Dye + G×Array + G×Variety + Error
      CLAIM: The proportion of the Total Variation accounted for by the G × Variety Interaction anticipates the proportion of DE Genes (Control of FDR).

  33. 4. Data Analysis: Observations and Comparison Groups
              Observations                              Comparison Groups
                                                                 Observations per level
              N         Mean    SD     Min    Max       Levels   Mean    Min   Max
      Y11     197,802   9.33    1.99   5.17   15.99     768      257.5   139   343
      Y12     74,030    10.82   1.91   4.95   15.99     576      128.5   22    243
      Y21     110,308   9.99    2.07   4.25   15.99     576      191.5   27    319
      Y22     116,409   9.89    2.09   5.17   15.99     576      202.1   19    318
      Y23     117,687   10.38   2.04   4.91   15.99     576      204.3   36    320
      Y31     106,591   10.11   1.77   6.60   15.99     672      158.6   37    278
      Y32     236,671   9.44    2.11   5.36   15.99     1,440    164.3   57    269

  34. 4. Data Analysis
      • 54 Array Slides
      • 959,498 Valid Intensity Records (S2N > 1, M2M > 0.85)
      • 7,638 Elements (genes)
      • 752,476 Equations
      • 56 (Co)Variance Components (REML)
      • BAYESMIX (Bayesian Mixtures of distributions)

  35. 4. Data Analysis: 56 (Co)Variance Components

  36. 4. Data Analysis: % Total Variance Due to:
      • Error:            3.0 – 3.6      5.1 – 6.7      3.0 – 3.7
      • Gene:             83.6 – 90.4    78.3 – 81.9    47.5 – 83.9
      • Gene x Array:     3.5 – 9.8      10.4 – 12.6    10.6 – 43.5
      • Gene x Variety:   2.4 – 3.7      2.1 – 2.6      2.5 – 5.4
      • Genetic Correlations: Moderate (EXP3) to Strong
      • Gene × Variety Corr: Strong (EXP1) to Moderate (EXP2)
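Percentages like these are each variance component divided by the sum of all components. The sketch below shows that arithmetic only; the function name and the component values are made up for illustration and are not estimates from the talk.

```python
def variance_percentages(components: dict[str, float]) -> dict[str, float]:
    """Express each variance component as a percentage of the total variance."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: 100.0 * value / total for name, value in components.items()}


# Hypothetical REML estimates on the log2 scale (illustration only).
estimates = {"Gene": 3.40, "Gene x Array": 0.32, "Gene x Variety": 0.12, "Error": 0.16}
shares = variance_percentages(estimates)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

With these made-up inputs the Gene x Variety share is 3%, the quantity the talk proposes as an early indicator of the proportion of DE genes.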

  37. 4. Data Analysis: Measures of (Possible) Differential Expression
      i = 1, …, 7,638 genes; j = 1, …, 7 variables; t = 0, …, 5 time points (EXP3 only)
      • Other measure definitions could also be valid

  38. 4. Data Analysis: Mixtures of Distributions

  39. 4. Data Analysis: Mixtures of Distributions

  40. 4. Data Analysis: Differentially Expressed Genes
                             Exp1          Exp2          Exp3
                             Up     Down   Up     Down   Up     Down
      High-Low (Exp1)  Up    409    0      26     13     36     11
                       Down         41     3      0      5      0
      HOL-JBL (Exp2)   Up                  68     0      0      8
                       Down                       319    10     6
      TSS-UTS (Exp3)   Up                                252    0
                       Down                                     109
      10 DE Elements across the 3 Exp (2 UP/DOWN/UP; 8 UP/UP/DOWN)

  41. 4. Data Analysis: Residuals Plots

  42. 4. Data Analysis: Allocation of 238 DE Genes (• Homologs • Orthologs • Paralogs). Diagram panels: 178 @ Day 82; 114 @ Day 105; 139 @ Day 120; 171 @ Inguinal. Labels: Bovine, Ovine, Up-Regulated, Down-Regulated.

  43. 4. Data Analysis: The “Real” Target: Molecular Interaction Maps. Adapted from Aladjem et al. 2004, Science’s STKE.

  44. 5. Coverage and Sensitivity: percentage of tags above each abundance threshold
      Sources: MPSS Paper (PNAS 03, 100:4702); MPSS Test Data (No. Tags = 25,503; S 1, S 2); cDNA Noise Paper (PNAS 02, 99:14031)
                       MPSS Paper           MPSS Test Data       cDNA Noise Paper
      tpm (log10)      N Tags    %          S 1 %     S 2 %      %
      > 1 (0.0)        27,965    100.00     100.00    100.00     100.00
      5 (0.7)          15,145    54.16      57.14     49.87      56.19
      10 (1.0)         10,519    37.61      36.11     33.66      36.79
      50 (1.7)         3,261     11.66      10.89     10.74      11.76
      100 (2.0)        1,719     6.15       5.73      5.67       6.95
      500 (2.7)        298       1.07       1.21      1.13       1.94
      1,000 (3.0)      154       0.55       0.57      0.55       1.11
      5,000 (3.7)      26        0.09       0.15      0.11       0.29
      10,000 (4.0)     7         0.02       0.05      0.05       0.16
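The % column for the MPSS paper is each tag count divided by the count at the lowest threshold, and the bracketed figure next to each tpm value is log10(tpm). A small sketch that recomputes both from the counts on the slide; the names `MPSS_COUNTS` and `coverage_table` are hypothetical.

```python
import math

# Tags exceeding each tpm threshold, transcribed from the MPSS Paper columns.
MPSS_COUNTS = {1: 27965, 5: 15145, 10: 10519, 50: 3261, 100: 1719,
               500: 298, 1000: 154, 5000: 26, 10000: 7}


def coverage_table(counts):
    """Return rows of (tpm, log10 tpm, n_tags, percent of all tags)."""
    total = counts[min(counts)]  # count at the lowest threshold = all tags
    return [(tpm, round(math.log10(tpm), 1), n, round(100 * n / total, 2))
            for tpm, n in sorted(counts.items())]


rows = coverage_table(MPSS_COUNTS)
for tpm, log_tpm, n_tags, pct in rows:
    print(f"{tpm:>6} ({log_tpm:.1f})  {n_tags:>6}  {pct:6.2f}")
```

Recomputed this way, the last row comes out marginally above the 0.02 printed on the slide, which may simply reflect truncation rather than rounding in the original.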

  45. 5. Coverage and Sensitivity

  46. 5. Coverage and Sensitivity
      Let $N_T$ = N of “Total” Genes and $N_D$ = N of “Differentially Expressed” Genes ($N_D \le N_T$).
      • The relevance of $f(x_i)$ is limited to the Concentration → Signal mapping.
      • At equilibrium the probability of an error either way is equal.
      (Plot of % against x: flat line, except at the Upper Bound.)

  47. 5. Coverage and Sensitivity: ~ 5 tpm; ~ 100 tpm

  48. 5. Coverage and Sensitivity
      • α < β: Not many DE genes; High Confidence; Few False +ve
      • α = β
      • α > β: Lots of DE genes; High Power; Few False -ve

  49. 6. Summary
      • General (i.e. not only CSIRO LI):
        • Still in its infancy (…possibly even embryonic stage)
        • Many decisions have a heuristic rather than a theoretical foundation
        • Prone to misconceptions:
          • Amount of Expression = Amount of Response
          • Same cut-off point to judge all genes
          • Over-emphasis on normalization (hence, despise “Boutique Arrays”)
          • Over-emphasis on variance stabilization
          • Over-emphasis on controlling false-positives
          • Over-emphasis on biological replicates (DANGER ☹)
        • No hope for a “One size fits all” software (even method)
        • Safer to aim towards “Tailor to individual’s needs”
        • Integration of interdisciplinary skills is a must

  50. 6. Summary
      • Livestock Species:
        • Tailing humans (…at the moment)
          • Andersson & Georges (2004) Domestic-animal genomics: Deciphering the genetics of complex traits. Nature Reviews Genetics, March 2004, Vol 5:202-212
        • Several key advantages
          • More relaxed ethical issues (…relative to R&D in humans)
          • Very strong similarities at the genome level with humans
          • The genome is (being) sequenced for several species
          • Strong background knowledge of genetics accumulated
            • Quantitative genetics
            • Mixed-Model equations
            • Computing expertise
        • Journals will soon be inundated
        • We have the opportunity to participate
