Analysis of Multiple Experiments TIGR Multiple Experiment Viewer (MeV)

Analysis of Multiple Experiments TIGR Multiple Experiment Viewer (MeV) Joseph White, DFCI, January 24, 2008

MeV • Stand-alone Java application for analysis • New version: 4.1 • Not database-centric; reads and writes TDMS files • Primarily for normalized data • MeV does not currently write MAGE-TAB • Download MeV from: tm4.org

Outline • Description of MeV • How MeV treats expression • Some essential concepts • Demo: basic operations in MeV • New file loader • ANOVA example • Demo of MeV new features • Affymetrix file reader • Non-parametric tests • CGH • GCOD

The Expression Matrix is a representation of data from multiple microarray experiments. [Figure: heat map with rows Gene 1 to Gene 6 and columns Exp 1 to Exp 6] Each element is a log ratio (usually log2(Cy5/Cy3)). • Red indicates a positive log ratio, i.e., Cy5 > Cy3 • Green indicates a negative log ratio, i.e., Cy5 < Cy3 • Black indicates a log ratio of zero, i.e., Cy5 and Cy3 are very close in value • Gray indicates missing data
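The mapping from intensities to a matrix element and its display color can be sketched in a few lines of Python. This is only an illustration of the slide's rules, not MeV's code: the function names and the `tolerance` used to decide what counts as "very close to zero" are assumptions, and a real heat map would presumably shade by magnitude rather than pick one of four flat colors.

```python
import math

def log_ratio(cy5, cy3):
    """Return log2(Cy5 / Cy3), or None when a value is missing or not positive."""
    if cy5 is None or cy3 is None or cy5 <= 0 or cy3 <= 0:
        return None
    return math.log2(cy5 / cy3)

def display_color(value, tolerance=0.1):
    """Pick the hue described on the slide (tolerance is a hypothetical cutoff)."""
    if value is None:
        return "gray"    # missing data
    if abs(value) <= tolerance:
        return "black"   # Cy5 and Cy3 are very close in value
    return "red" if value > 0 else "green"

# Hypothetical intensities for four spots.
for cy5, cy3 in [(200.0, 100.0), (100.0, 200.0), (100.0, 100.0), (None, 150.0)]:
    ratio = log_ratio(cy5, cy3)
    print(cy5, cy3, ratio, display_color(ratio))
```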

Expression Vectors • Gene expression vectors encapsulate the expression of a gene over a set of experimental conditions or sample types. • Example vector of log2(Cy5/Cy3) values: 1.5 -0.8 1.8 0.5 -0.4 -1.3 1.5 0.8

Expression Vectors As Points in 'Expression Space'

      Exp 1   Exp 2   Exp 3
G1    -0.8    -0.3    -0.7
G2    -0.7    -0.8    -0.4
G3    -0.4    -0.6    -0.8
G4     0.9     1.2     1.3
G5     1.3     0.9    -0.6

Similar expression: G1, G2 and G3. [Figure: 3-D scatter plot with axes Experiment 1, Experiment 2, Experiment 3]
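To make the "points in expression space" idea concrete, the sketch below treats each row of the slide's table as a point in three dimensions and finds each gene's nearest neighbor. Euclidean distance is used here only as one possible choice, and the helper name `nearest_neighbor` is hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math

# Log-ratio vectors copied from the slide's table (Exp 1, Exp 2, Exp 3).
vectors = {
    "G1": (-0.8, -0.3, -0.7),
    "G2": (-0.7, -0.8, -0.4),
    "G3": (-0.4, -0.6, -0.8),
    "G4": (0.9, 1.2, 1.3),
    "G5": (1.3, 0.9, -0.6),
}

def nearest_neighbor(name):
    """Return the other gene whose vector is closest in Euclidean distance."""
    others = (g for g in vectors if g != name)
    return min(others, key=lambda g: math.dist(vectors[name], vectors[g]))

nearest = {gene: nearest_neighbor(gene) for gene in vectors}
for gene, neighbor in nearest.items():
    distance = math.dist(vectors[gene], vectors[neighbor])
    print(f"{gene} -> {neighbor} (distance {distance:.2f})")
```

With these values G1, G2 and G3 pick one another as nearest neighbors, which matches the "similar expression" grouping, while G4 and G5 sit much farther away.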

Distance and Similarity • The ability to calculate a distance (or similarity, its inverse) between two expression vectors is fundamental to clustering algorithms. • Distance between vectors is the basis upon which decisions are made when grouping similar patterns of expression. • Selection of a distance metric defines the concept of distance.

Distance: a measure of similarity between genes. Gene A = (x1A, x2A, x3A, x4A, x5A, x6A) and Gene B = (x1B, x2B, x3B, x4B, x5B, x6B) are measured across Exp 1 to Exp 6. • Some distances (MeV provides 11 metrics): 1. Euclidean: $\sqrt{\sum_{i=1}^{6} (x_{iA} - x_{iB})^2}$ 2. Manhattan: $\sum_{i=1}^{6} |x_{iA} - x_{iB}|$ 3. Pearson correlation
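The Euclidean and Manhattan formulas translate directly into code. The following is a minimal sketch, not MeV's implementation; the two six-element vectors are made-up values chosen so the arithmetic is easy to check by hand.

```python
import math

def manhattan(a, b):
    """Sum of absolute differences between matching elements."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical expression vectors for Gene A and Gene B over six experiments.
gene_a = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
gene_b = [3.0, 4.0, 0.0, 0.0, 0.0, 0.0]

print("Manhattan:", manhattan(gene_a, gene_b))   # |3| + |4| = 7
print("Euclidean:", euclidean(gene_a, gene_b))   # sqrt(9 + 16) = 5
```

The same pair of genes is 7 units apart under one metric and 5 under the other, so distances are only comparable when they come from the same metric.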

Distance Is Defined by a Metric

Distance metric:   Euclidean   Pearson (r × -1)
D                  1.4         -0.90
D                  4.2         -1.00
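The point that the metric defines what "close" means can be illustrated with two profiles that have the same shape but different amplitude. The sketch below uses made-up values, not the slide's data, and writes the Pearson distance as r × -1 as the slide does; the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def pearson_distance(a, b):
    """Distance defined as r * -1, so perfectly correlated profiles score -1."""
    return -pearson_r(a, b)

# Hypothetical profiles: identical shape, but the second has twice the amplitude.
low = [1.0, 2.0, 3.0, 4.0]
high = [2.0, 4.0, 6.0, 8.0]

print("Euclidean:", euclidean(low, high))
print("Pearson (r * -1):", pearson_distance(low, high))
```

Under the Euclidean metric the two profiles are far apart, while under the Pearson metric they are as close as two profiles can be, because correlation ignores amplitude and compares only shape.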

Normal distribution • X = μ (mean of the distribution) • σ = standard deviation of the distribution

Current MeV Algorithms • Hierarchical Clustering • K-Means Clustering • Support Trees for HCL • EASE (annotation clustering) • Self-Organizing Maps • K-Nearest Neighbors • Support Vector Machines • Relevance Networks • Template Matching • PCA • CGH • Bayesian Networks • T-test • ANOVA (one- and two-factor) • SAM • Non-parametric tests: Wilcoxon, Fisher Exact Test, Mack-Skillings, Kruskal-Wallis • BRIDGE

Demos • File loaders • HTA data: ANOVA • Affymetrix data: SAM • Non-Parametric tests • CGH

GeneChip Oncology Database

GCOD statistics • Studies: 52 • Hybridizations: 4,591 • Analysis result sets: 12,637 • Signal values: 204,296,195 • Samples: 3,644 • Probesets: 160,817 (e.g., HG-U133A: 22,293; HG_U133_Plus_2: 54,684) • Array designs: 9 • Accessions: 54,414

MeV Team • Eleanor Howe • Sarita Nair • Raktim Sinha • mev@tigr.org
