
Design & Analysis of Microarray Studies for Diagnostic & Prognostic Classification



Presentation Transcript


  1. Design & Analysis of Microarray Studies for Diagnostic & Prognostic Classification Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute http://linus.nci.nih.gov/brb

  2. http://linus.nci.nih.gov/brb • Powerpoint presentations • Reprints & Technical Reports • BRB-ArrayTools software • BRB-ArrayTools Data Archive • Sample Size Planning for Targeted Clinical Trials

  3. • Simon R, Korn E, McShane L, Radmacher M, Wright G, Zhao Y. Design and analysis of DNA microarray investigations. Springer-Verlag, 2003. • Radmacher MD, McShane LM, Simon R. A paradigm for class prediction using gene expression profiles. Journal of Computational Biology 9:505-511, 2002. • Simon R, Radmacher MD, Dobbin K, McShane LM. Pitfalls in the analysis of DNA microarray data. Journal of the National Cancer Institute 95:14-18, 2003. • Dobbin K, Simon R. Comparison of microarray designs for class comparison and class discovery. Bioinformatics 18:1462-69, 2002; 19:803-810, 2003; 21:2430-37, 2005; 21:2803-4, 2005. • Dobbin K, Simon R. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics 6:27-38, 2005. • Dobbin K, Shih J, Simon R. Questions and answers on design of dual-label microarrays for identifying differentially expressed genes. Journal of the National Cancer Institute 95:1362-69, 2003. • Wright G, Simon R. A random variance model for detection of differential gene expression in small microarray experiments. Bioinformatics 19:2448-55, 2003. • Korn EL, Troendle JF, McShane LM, Simon R. Controlling the number of false discoveries. Journal of Statistical Planning and Inference 124:379-08, 2004. • Molinaro A, Simon R, Pfeiffer R. Prediction error estimation: A comparison of resampling methods. Bioinformatics 21:3301-7, 2005.

  4. • Simon R. Using DNA microarrays for diagnostic and prognostic prediction. Expert Review of Molecular Diagnostics 3(5):587-595, 2003. • Simon R. Diagnostic and prognostic prediction using gene expression profiles in high dimensional microarray data. British Journal of Cancer 89:1599-1604, 2003. • Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004. • Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005. • Simon R. When is a genomic classifier ready for prime time? Nature Clinical Practice – Oncology 1:4-5, 2004. • Simon R. An agenda for clinical trials: clinical trials in the genomic era. Clinical Trials 1:468-470, 2004. • Simon R. Development and validation of therapeutically relevant multi-gene biomarker classifiers. Journal of the National Cancer Institute 97:866-867, 2005. • Simon R. A roadmap for developing and validating therapeutically relevant genomic classifiers. Journal of Clinical Oncology (In Press). • Freidlin B and Simon R. Adaptive signature design. Clinical Cancer Research (In Press). • Simon R. Validation of pharmacogenomic biomarker classifiers for treatment selection. Disease Markers (In Press). • Simon R. Guidelines for the design of clinical studies for development and validation of therapeutically relevant biomarkers and biomarker classification systems. In Biomarkers in Breast Cancer, Hayes DF and Gasparini G, eds. Humana Press (In Press).

  5. Myth • That microarray investigations should be unstructured data-mining adventures without clear objectives

  6. Good microarray studies have clear objectives, but generally not gene-specific mechanistic hypotheses • Design and analysis methods should be tailored to study objectives

  7. Good Microarray Studies Have Clear Objectives • Class Comparison • Find genes whose expression differs among predetermined classes • Class Prediction • Prediction of predetermined class (phenotype) using information from gene expression profile • Class Discovery • Discover clusters of specimens having similar expression profiles • Discover clusters of genes having similar expression profiles

  8. Class Comparison and Class Prediction • Not clustering problems • Global similarity measures generally used for clustering arrays may not distinguish classes • Clustering methods don't control multiplicity and don't distinguish data used for classifier development from data used for classifier evaluation • These are supervised methods • They require multiple biological samples from each class

  9. Levels of Replication • Technical replicates • RNA sample divided into multiple aliquots and re-arrayed • Biological replicates • Multiple subjects • Replication of the tissue culture experiment

  10. Biological conclusions generally require independent biological replicates. The power of statistical methods for microarray data depends on the number of biological replicates. • Technical replicates are useful insurance that at least one good-quality array of each specimen will be obtained.

  11. Class Prediction • Predict which tumors will respond to a particular treatment • Predict which patients will relapse after a particular treatment

  12. Class prediction methods usually have gene selection as a component • The criteria for gene selection differ between class prediction and class comparison • For class comparison, the false discovery rate is important • For class prediction, predictive accuracy is important

  13. Clarity of Objectives is Important • Patient selection • Many microarray studies developing classifiers are not “therapeutically relevant” • Analysis methods • Many microarray studies use cluster analysis inappropriately or misleadingly

  14. Microarray Platforms for Developing Predictive Classifiers • Single label arrays • Affymetrix GeneChips • Dual label arrays using common reference design • Dye swaps are unnecessary

  15. Common Reference Design

          Array 1   Array 2   Array 3   Array 4
  RED       A1        A2        B1        B2
  GREEN     R         R         R         R

  Ai = ith specimen from class A; Bi = ith specimen from class B; R = aliquot from reference pool

  16. The reference generally serves to control variation in the size of corresponding spots on different arrays and variation in sample distribution over the slide. • The reference provides a relative measure of expression for a given gene in a given sample that is less variable than an absolute measure. • The reference is not the object of comparison. • The relative measure of expression will be compared among biologically independent samples from different classes.

  17. Myth • For two color microarrays, each sample of interest should be labeled once with Cy3 and once with Cy5 in dye-swap pairs of arrays.

  18. Dye Bias • Average differences among dyes in label concentration, labeling efficiency, photon emission efficiency and photon detection are corrected by normalization procedures • Gene specific dye bias may not be corrected by normalization

  19. Dye swap technical replicates of the same two RNA samples are rarely necessary. • With a common reference design, dye swap arrays are not necessary for valid comparisons of classes, since specimens labeled with different dyes are never compared. • For two-label direct comparison designs comparing two classes, it is more efficient to balance the dye-class assignments across independent biological specimens than to perform dye swap technical replicates

  20. Balanced Block Design

          Array 1   Array 2   Array 3   Array 4
  RED       A1        B2        A3        B4
  GREEN     B1        A2        B3        A4

  Ai = ith specimen from class A; Bi = ith specimen from class B

  21. Detailed comparisons of the effectiveness of designs: • Dobbin K, Simon R. Comparison of microarray designs for class comparison and class discovery. Bioinformatics 18:1462-9, 2002 • Dobbin K, Shih J, Simon R. Statistical design of reverse dye microarrays. Bioinformatics 19:803-10, 2003 • Dobbin K, Simon R. Questions and answers on the design of dual-label microarrays for identifying differentially expressed genes, JNCI 95:1362-1369, 2003

  22. Common reference designs are very effective for many microarray studies. They are robust, permit comparisons among separate experiments, and permit many types of comparisons and analyses to be performed. • For simple two-class comparison problems, balanced block designs require many fewer arrays than common reference designs. • Their efficiency advantage decreases for more than two classes. • They are more difficult to apply to more complicated class comparison problems. • They are not appropriate for class discovery or class prediction. • Loop designs are less robust, are dominated by either common reference designs or balanced block designs, and are not suitable for class prediction or class discovery.

  23. What We Will Not Discuss Today • Image analysis • Normalization • Clustering methods • Class comparison • SAM and other methods of gene finding • FDR (false discovery rate) and methods for controlling the number of false positive genes

  24. Simple Procedures • If each gene is tested for significance at level α and there are k genes, then the expected number of false discoveries is kα. • To control E(FD) ≤ u • Conduct each of k tests at level α = u/k • Bonferroni control of familywise error (FWE) rate at level 0.05 • Conduct each of k tests at level 0.05/k • At least 95% confident that FD = 0
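The arithmetic behind these two simple adjustments can be sketched directly; the gene count and significance levels below are illustrative numbers, not taken from the slides:

```python
# Illustrative numbers (not from the slides): k genes tested at level alpha
k = 10_000
alpha = 0.001

# Testing every gene at level alpha yields k * alpha expected false discoveries
expected_fd = k * alpha          # 10.0

# To control E(FD) <= u, conduct each of the k tests at level u / k
u = 5
per_gene_level = u / k           # 0.0005

# Bonferroni control of the FWE rate at 0.05: test each gene at 0.05 / k,
# so with probability at least 0.95 there are no false discoveries at all
bonferroni_level = 0.05 / k      # 5e-06
```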

  25. False Discovery Rate (FDR) • FDR = expected proportion of false discoveries among the tests declared significant • Studied by Benjamini and Hochberg (1995)

  26. Controlling the Expected False Discovery Rate • Compare classes separately by gene and compute significance levels {pi} • Rank genes in order of significance • p(1) ≤ p(2) ≤ ... ≤ p(k) • Find the largest index i for which • p(i) k / i ≤ FDR • Consider the genes with the i smallest p values statistically significant
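The step-up procedure on this slide can be written in a few lines of Python; the function name and toy p values are ours, for illustration only:

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return the indices of genes declared significant by the
    Benjamini-Hochberg step-up procedure at the given FDR level."""
    k = len(p_values)
    # Rank p values in increasing order, remembering original gene indices
    order = sorted(range(k), key=lambda i: p_values[i])
    # Find the largest rank i (1-based) with p_(i) * k / i <= FDR
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] * k / rank <= fdr:
            cutoff = rank
    # The genes with the cutoff smallest p values are significant
    return sorted(order[:cutoff])

# Toy p values (illustrative, not from the slides)
pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(benjamini_hochberg(pvals, fdr=0.10))  # → [0, 1, 2, 3]
```

Note that the gene at index 3 (p = 0.041) is declared significant even though 0.041 exceeds its own threshold at rank 4 in a step-down reading; the step-up rule keeps everything up to the largest qualifying rank.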

  27. Problems With Simple Procedures • Bonferroni control of FWE is very conservative • p values based on normal theory are not accurate at extreme quantiles • It is difficult to achieve extreme quantiles with permutation p values for individual genes • Controlling the expected number or proportion of false discoveries may not provide adequate control, because the distributions of FD and FDP may have large variances • Multiple comparisons are controlled by adjusting univariate (single-gene) p values and so may not take advantage of correlation among genes

  28. Additional Procedures • “SAM” - Significance Analysis of Microarrays • Tusher et al., PNAS, 2001 • Estimate FDR • Statistical properties unclear • Empirical Bayes • Efron et al., JASA, 2001 • Related to FDR • Ad hoc aspects • Multivariate permutation tests • Korn et al., 2001 (http://linus.nci.nih.gov/brb) • Control number or proportion of false discoveries • Can specify confidence level of control

  29. Multivariate Permutation Procedures(Korn et al., 2001) Allows statements like: FD Procedure: We are 95% confident that the (actual) number of false discoveries is no greater than 5. FDP Procedure: We are 95% confident that the (actual) proportion of false discoveries does not exceed .10.

  30. Class Prediction Model • Given a sample with an expression profile vector x of log-ratios or log signals and unknown class • Predict which class the sample belongs to • The class prediction model is a function f that maps from the set of vectors x to the set of class labels {1,2} (if there are two classes) • f generally utilizes only some of the components of x (i.e. only some of the genes) • Specifying the model f involves specifying some parameters (e.g. regression coefficients) by fitting the model to the data (learning from the data)

  31. Problems With Many Diagnostic/Prognostic Marker Studies • Are not reproducible • Retrospective non-focused analysis • Multiplicity problems • Inter-laboratory assay variation • Have no impact • Not therapeutically relevant questions • Not therapeutically relevant group of patients • Black box predictors

  32. Components of Class Prediction • Feature (gene) selection • Which genes will be included in the model • Select model type • E.g. Diagonal linear discriminant analysis, Nearest-Neighbor, … • Fitting parameters (regression coefficients) for model • Selecting value of tuning parameters

  33. Do Not Confuse Statistical Methods Appropriate for Class Comparison with Those Appropriate for Class Prediction • Demonstrating statistical significance of prognostic factors is not the same as demonstrating predictive accuracy. • Demonstrating goodness of fit of a model to the data used to develop it is not a demonstration of predictive accuracy. • Statisticians are used to inference, not prediction • Most statistical methods were not developed for p>>n prediction problems

  34. Feature Selection • Genes that are differentially expressed among the classes at a significance level α (e.g. 0.01) • The α level is selected only to control the number of genes in the model

  35. t-test Comparisons of Gene Expression • xj ~ N(μj1, σj²) for class 1 • xj ~ N(μj2, σj²) for class 2 • H0j: μj1 = μj2
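Under the equal-variance model on this slide, each gene gets a pooled-variance two-sample t statistic. A minimal sketch, with toy expression values of our own invention:

```python
import math
from statistics import mean, variance

def two_sample_t(x1, x2):
    """Pooled-variance two-sample t statistic for one gene,
    testing H0: mu1 == mu2 under a common within-class variance."""
    n1, n2 = len(x1), len(x2)
    # Pooled estimate of the common within-class variance sigma_j^2
    s2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    return (mean(x1) - mean(x2)) / math.sqrt(s2 * (1 / n1 + 1 / n2))

# Toy log-expression values for one gene (illustrative numbers)
class1 = [2.1, 1.8, 2.4, 2.0]
class2 = [1.1, 0.9, 1.4, 1.2]
t = two_sample_t(class1, class2)
```

The significance level for each gene would then come from the t distribution with n1 + n2 − 2 degrees of freedom (or from a permutation distribution).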

  36. Estimation of Within-Class Variance • Estimate separately for each gene • Limited degrees of freedom • Gene list dominated by genes with small fold changes and small variances • Assume all genes have same variance • Poor assumption • Random (hierarchical) variance model • Wright G.W. and Simon R. Bioinformatics 19:2448-2455, 2003 • Inverse gamma distribution of residual variances • Results in exact F (or t) distribution of test statistics with increased degrees of freedom for error variance • For any normal linear model

  37. Feature Selection • Small subset of genes which together give most accurate predictions • Combinatorial optimization algorithms • Genetic algorithms • Little evidence that complex feature selection is useful in microarray problems • Failure to compare to simpler methods • Some published complex methods for selecting combinations of features do not appear to have been properly evaluated

  38. Linear Classifiers for Two Classes

  39. Linear Classifiers for Two Classes • Fisher linear discriminant analysis • Requires estimating correlations among all genes selected for model • y = vector of class labels • Diagonal linear discriminant analysis (DLDA) assumes features are uncorrelated • Naïve Bayes classifier • Compound covariate predictor (Radmacher) and Golub’s method are similar to DLDA in that they can be viewed as weighted voting of univariate classifiers

  40. Linear Classifiers for Two Classes • Compound covariate predictor: uses the t statistics ti as the gene weights, instead of the (mean difference)/variance weights used for DLDA
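A minimal sketch of the compound covariate predictor: weight each selected gene by its t statistic, sum to a single covariate y, and classify by which side of the midpoint between the two class means of y a new profile falls. The function names and toy data below are ours, for illustration:

```python
from statistics import mean

def compound_covariate_classifier(train1, train2, t_stats):
    """Build a compound covariate predictor.
    train1, train2: training expression vectors (selected genes only)
    t_stats: t statistic of each selected gene, used as its weight."""
    def cc(x):
        # Compound covariate y = sum_i t_i * x_i
        return sum(t * xi for t, xi in zip(t_stats, x))
    m1 = mean(cc(x) for x in train1)   # class 1 mean of y
    m2 = mean(cc(x) for x in train2)   # class 2 mean of y
    threshold = (m1 + m2) / 2          # midpoint cutoff
    def predict(x):
        # Assign the class whose mean lies on the same side of the cutoff
        return 1 if (cc(x) - threshold) * (m1 - threshold) > 0 else 2
    return predict

# Toy two-gene example with illustrative weights and profiles
predict = compound_covariate_classifier(
    train1=[[2.0, 1.0], [2.2, 1.1]],
    train2=[[1.0, 2.0], [0.9, 2.1]],
    t_stats=[3.0, -2.5],
)
print(predict([2.1, 1.0]))  # → 1
```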

  41. Linear Classifiers for Two Classes • Support vector machines with an inner product kernel are linear classifiers whose weights are determined to separate the classes with a hyperplane that minimizes the length of the weight vector (i.e., maximizes the margin)

  42. Support Vector Machine

  43. Perceptrons • Perceptrons are neural networks with no hidden layer and linear transfer functions between input and output • Number of input nodes equals number of genes selected • Number of output nodes equals number of classes minus 1 • Inputs may be major principal components of all genes or of the informative genes • Perceptrons are linear classifiers

  44. Naïve Bayes Classifier • Expression profiles for class j assumed normal with mean vector μj and diagonal covariance matrix D • Likelihood of expression profile vector x is l(x; μj, D) • Posterior probability of class j for a case with expression profile vector x is proportional to πj l(x; μj, D)
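The posterior on this slide can be computed directly from the diagonal-covariance normal likelihood. A minimal sketch with illustrative two-class, two-gene parameters of our own choosing:

```python
import math

def log_likelihood(x, mu, var):
    """Log-likelihood of profile x under a normal with mean vector mu
    and diagonal covariance diag(var), as in the naive Bayes model."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - mi) ** 2 / (2 * v)
        for xi, mi, v in zip(x, mu, var)
    )

def naive_bayes_posteriors(x, means, variances, priors):
    """Posterior P(class j | x), proportional to pi_j * l(x; mu_j, D)."""
    log_post = [
        math.log(p) + log_likelihood(x, mu, variances)
        for mu, p in zip(means, priors)
    ]
    # Normalize via the log-sum-exp trick for numerical stability
    m = max(log_post)
    w = [math.exp(lp - m) for lp in log_post]
    total = sum(w)
    return [wi / total for wi in w]

# Toy example: profile x matches the class 1 mean exactly
post = naive_bayes_posteriors(
    x=[2.0, 1.0],
    means=[[2.0, 1.0], [1.0, 2.0]],   # mu_1, mu_2
    variances=[0.25, 0.25],           # shared diagonal covariance D
    priors=[0.5, 0.5],                # pi_1, pi_2
)
```

Working in log space avoids the underflow that multiplying thousands of small per-gene densities would otherwise cause.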

  45. Compound Covariate Bayes Classifier • Compound covariate y = Σi ti xi • Sum over the genes selected as differentially expressed • xi = expression level of the ith selected gene for the case whose class is to be predicted • ti = t statistic for testing differential expression of the ith gene • Proceed as for the naïve Bayes classifier, but using the single compound covariate as the predictive variable • GW Wright et al., PNAS, 2005

  46. When p>>n The Linear Model is Too Complex • It is always possible to find a set of features and a weight vector for which the classification error on the training set is zero. • Why consider more complex models?

  47. Myth • Complex classification algorithms such as neural networks perform better than simpler methods for class prediction.

  48. Artificial intelligence sells to journal reviewers and peers who cannot distinguish hype from substance when it comes to microarray data analysis. • Comparative studies have shown that simpler methods work as well or better for microarray problems because they avoid overfitting the data.
