Statistics Applied to Bioinformatics
Multivariate analysis
Summary
Jacques van Helden
Jacques.van.Helden@ulb.ac.be
Typical questions in multivariate analysis
• No criterion variable
  • Can the objects be separated into distinct classes on the basis of the variables? → Cluster analysis
  • Which variables, or combinations of variables (factors), are the most explanatory for the differences between objects? → Factor analysis
• Quantitative criterion variable
  • Is the criterion variable correlated with the predictor variables? → Correlation analysis
  • Can we predict the value of the criterion variable on the basis of the predictor variables? → Regression analysis
• Nominal criterion variable
  • Can we predict the value of the criterion variable on the basis of the predictor variables? → Discriminant analysis
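The mapping from criterion variable to analysis family can be written down as a small lookup. This is only an illustrative sketch: the names `ANALYSES` and `suggest_analyses` are hypothetical and do not come from the slides, and the table simply restates the list above.

```python
from typing import List, Optional

# Hypothetical lookup table restating the slide:
# type of criterion variable -> families of analysis.
ANALYSES = {
    None: ["cluster analysis", "factor analysis"],
    "quantitative": ["correlation analysis", "regression analysis"],
    "nominal": ["discriminant analysis"],
}


def suggest_analyses(criterion: Optional[str]) -> List[str]:
    """Return the analysis families listed for a criterion variable type.

    `criterion` is None when there is no criterion variable,
    otherwise "quantitative" or "nominal".
    """
    if criterion not in ANALYSES:
        raise ValueError(f"unknown criterion type: {criterion!r}")
    return list(ANALYSES[criterion])


if __name__ == "__main__":
    for kind in (None, "quantitative", "nominal"):
        print(kind, "->", ", ".join(suggest_analyses(kind)))
```

In practice the choice also depends on the data and the question asked; the lookup only mirrors the classification on this slide.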
Flowchart of the approaches in multivariate analysis
[Figure: flowchart; labels as extracted.]
• Input: multivariate table
• Criterion variable?
  • none: principal component analysis; cluster analysis → clusters
  • quantitative: regression analysis → predicted value
  • nominal: discriminant analysis → predicted class
• Distance matrix: multidimensional scaling
Data display
Adapted from Gilbert et al. (2000). Trends Biotech. 18(Dec), 487-495.
[Figure: flowchart of processes and data, from raw data through processing to visualization; box labels as extracted.]
Raw data
• Multivariate data matrix: n objects, p variables
Processing
• Pairwise distance measurement
• Distance matrix: n x n distances, symmetrical
• Clustering
• Clusters, tree
• Ordering (optional): row swapping, column swapping
• Coloring (optional)
• Multidimensional scaling: PCoA, spring embedding
• Coordinates: n elements, d dimensions
• Normalization: mean, variance, covariance
• Normalized table: n elements, p dimensions
• Principal component analysis
• Reduction to significant dimensions
Visualization
• Matrix viewer. Matrix: n rows, p columns, coloring
• Tree drawing. Dendrogram: rooted, unrooted, n leaves
• Space explorer (VRML). Euclidian space: 1D to 3D, n dots, coloring, dot volume, interactive
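Two of the processing steps named in this figure, normalization and pairwise distance measurement, can be sketched in plain Python. The slides do not specify a distance metric or a normalization formula, so this sketch assumes column-wise standardization (mean 0, variance 1) and Euclidean distance; the function names `standardize` and `distance_matrix` are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(table):
    """Center each column (variable) on 0 and scale it to unit variance.

    Assumption: population standard deviation; constant columns map to 0.
    """
    columns = list(zip(*table))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in table
    ]


def distance_matrix(table):
    """Return the symmetrical n x n matrix of Euclidean distances between rows."""
    n = len(table)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(table[i], table[j])))
            dist[i][j] = dist[j][i] = d  # fill both halves: the matrix is symmetrical
    return dist


if __name__ == "__main__":
    # Toy multivariate data matrix: n = 3 objects, p = 2 variables.
    data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    for row in distance_matrix(standardize(data)):
        print([round(d, 3) for d in row])
```

Standardizing first keeps a variable with a large numeric range (here the second column) from dominating the distances. The resulting n x n matrix is the kind of input that clustering or multidimensional scaling would take.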