
Gonzalo Gómez, PhD. ggomez@cnio.es

Course on Functional Analysis. ::: Gene Set Enrichment Analysis - GSEA -. Madrid, Feb 16th, 2009. Gonzalo Gómez, PhD. ggomez@cnio.es. Bioinformatics Unit CNIO. ::: Contents: Introduction, GSEA Software, Data Formats, Using GSEA, GSEA Output, GSEA Results, Leading Edge Analysis.


Presentation Transcript


  1. Course on Functional Analysis ::: Gene Set Enrichment Analysis - GSEA - Madrid, Feb 16th, 2009. Gonzalo Gómez, PhD. ggomez@cnio.es Bioinformatics Unit CNIO

  2. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  3. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  4. ::: Introduction. Gene Set Enrichment Analysis - GSEA - GSEA, MIT Broad Institute. v2.0 available since Jan 2007; v2.0.1 available since Feb 16th 2007. Version 2.0 includes BioCarta, Broad Institute, GenMAPP, KEGG annotations and more... Platforms: Affymetrix, Agilent, CodeLink, custom... (Subramanian et al. PNAS. 2005.)

  5. ::: Introduction. Gene Set Enrichment Analysis - GSEA - ::: How does GSEA work? GSEA applies a Kolmogorov-Smirnov test to find asymmetrical distributions of defined blocks of genes within the dataset's whole distribution. Is this particular gene set enriched in my experiment? Gene sets can be genes selected by the researcher, BioCarta pathways, GenMAPP sets, genes sharing a cytoband, genes targeted by common miRNAs … it is up to you …

  6. ::: Introduction. Gene Set Enrichment Analysis - GSEA - ::: K-S test. The Kolmogorov–Smirnov test is used to determine whether two underlying one-dimensional probability distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples. The one-sample KS test compares the empirical distribution function with the cumulative distribution function specified by the null hypothesis. The main applications are testing goodness of fit with the normal and uniform distributions. The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples. [Figure: dataset distribution, gene set 1 distribution and gene set 2 distribution; x-axis: Gene Expression Level; y-axis: Number of genes]
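As a rough illustration of the two-sample K-S test described on this slide, the sketch below computes the K-S statistic as the largest gap between two empirical cumulative distribution functions. The function name `ks_two_sample_statistic` and the sample values are made up for illustration; this shows the plain test only, not GSEA's weighted variant.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return the largest absolute gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d_max = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


# Hypothetical expression levels: whole dataset vs. one gene set.
dataset = [5.1, 5.8, 6.2, 6.9, 7.4, 8.0, 8.3, 9.1]
gene_set = [7.9, 8.2, 8.8, 9.0]
print(ks_two_sample_statistic(dataset, gene_set))
```

A statistic near 0 means the two distributions look alike; a value near 1 means they barely overlap.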

  7. ::: Introduction. Gene Set Enrichment Analysis - GSEA - ::: How does GSEA work? ClassA vs ClassB ...testing genes independently... [Figure: t-test cut-off at FDR<0.05 on each side of the ranked list.] Biological meaning?

  8. ::: Introduction. Gene Set Enrichment Analysis - GSEA - ::: How does GSEA work? [Figure: Gene Set 1, Gene Set 2 and Gene Set 3 placed along the ClassA to ClassB ranking with the t-test cut-off marked; ES/NES statistic from - to +.] Gene set 2 enriched in Class A. Gene set 3 enriched in Class B.

  9. ::: Introduction. Gene Set Enrichment Analysis - GSEA - ES examples :::

  10. ::: Introduction. Gene Set Enrichment Analysis - GSEA - The Enrichment Score ::: NES pval FDR Benjamini-Hochberg
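The slide names the Benjamini-Hochberg procedure next to the FDR. A minimal sketch of that procedure on a list of nominal p-values is given below; the function name `benjamini_hochberg` and the example p-values are assumptions for illustration, and this is not claimed to be how the GSEA software itself derives its FDR q-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values for four gene sets.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```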

  11. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  12. ::: GSEA software. Gene Set Enrichment Analysis - GSEA - Download ::: http://www.broad.mit.edu/gsea/

  13. ::: GSEA software. Gene Set Enrichment Analysis - GSEA - Main Window :::

  14. ::: GSEA software. Gene Set Enrichment Analysis - GSEA - Loading data ::: !!!

  15. ::: GSEA software. Gene Set Enrichment Analysis - GSEA - Running GSEA :::

  16. ::: GSEA software. Gene Set Enrichment Analysis - GSEA - Leading Edge Analysis :::

  17. ::: GSEA software. Gene Set Enrichment Analysis - GSEA - MSigDB ::: Chip to Chip Mapping :::

  18. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  19. ::: Data Formats. Gene Set Enrichment Analysis - GSEA -

  20. ::: Data Formats. Gene Set Enrichment Analysis - GSEA -

  21. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Expression datasets ::: *.gct
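For orientation, a `*.gct` file is a tab-delimited text file: a `#1.2` version line, a line with the number of rows and samples, a header line (`NAME`, `Description`, then sample names), and one row per feature. The sketch below reads that layout; the function name `parse_gct` and the sample data are invented for illustration.

```python
def parse_gct(text):
    """Parse GCT v1.2 text into (sample_names, {feature: [values]})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if lines[0].strip() != "#1.2":
        raise ValueError("expected '#1.2' on the first line")
    n_rows, n_cols = (int(x) for x in lines[1].split()[:2])
    samples = lines[2].split("\t")[2:]  # skip NAME and Description
    rows = {}
    for ln in lines[3:]:
        fields = ln.split("\t")
        rows[fields[0]] = [float(v) for v in fields[2:]]
    if len(rows) != n_rows or len(samples) != n_cols:
        raise ValueError("declared dimensions do not match the data")
    return samples, rows


# Made-up two-gene, three-sample dataset.
example = "\n".join([
    "#1.2",
    "2\t3",
    "NAME\tDescription\tS1\tS2\tS3",
    "GENE_A\tna\t1.0\t2.0\t3.0",
    "GENE_B\tna\t4.5\t5.5\t6.5",
])
print(parse_gct(example))
```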

  22. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Expression datasets ::: *.res

  23. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Expression datasets ::: *.pcl

  24. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Expression datasets ::: *.txt

  25. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Phenotype datasets ::: *.cls For categorical phenotypes (e.g. Tumor vs Control)
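A categorical `*.cls` file has three lines: the sample and class counts followed by `1`, a `#` line with the class names, and one label per sample. The sketch below reads such a file and returns the class name for each sample; `parse_cls` and the Tumor/Control example are illustrative assumptions.

```python
def parse_cls(text):
    """Parse a categorical CLS file into one class name per sample."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_samples, n_classes = (int(x) for x in lines[0].split()[:2])
    class_names = lines[1].lstrip("#").split()
    labels = lines[2].split()
    # Labels are matched to class names in order of first appearance.
    seen = []
    for label in labels:
        if label not in seen:
            seen.append(label)
    if (len(labels) != n_samples or len(class_names) != n_classes
            or len(seen) != n_classes):
        raise ValueError("header counts do not match the labels")
    mapping = dict(zip(seen, class_names))
    return [mapping[label] for label in labels]


# Made-up phenotype file: three tumor samples, then three controls.
example = "6 2 1\n# Tumor Control\n0 0 0 1 1 1\n"
print(parse_cls(example))
```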

  26. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Phenotype datasets ::: For continuous phenotypes (e.g. gene correlated to gene set). For continuous phenotypes (e.g. gene vs time series): time series (every 30 minutes), peak profile wanted.

  27. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Gene Set Database ::: *.gmx

  28. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Gene Set Database ::: *.gmt
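In the `*.gmt` layout each line is one gene set: name, description, then the member genes, all tab-separated. A minimal reader might look like the sketch below; `parse_gmt` and the set names are hypothetical.

```python
def parse_gmt(text):
    """Parse GMT text into {gene_set_name: [genes]}."""
    gene_sets = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # fields[1] is the free-text description and is skipped here.
        gene_sets[fields[0]] = [g for g in fields[2:] if g]
    return gene_sets


# Made-up gene sets.
example = "SET_ONE\tna\tTP53\tMYC\tEGFR\nSET_TWO\tna\tBRCA1\tBRCA2\n"
print(parse_gmt(example))
```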

  29. ::: Data Formats. Gene Set Enrichment Analysis - GSEA - Ranked list format ::: *.rnk
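A `*.rnk` file holds a pre-ranked gene list: one gene identifier and one score per line, tab-separated. The sketch below reads such a file and sorts it by score; `parse_rnk`, the handling of `#` lines as comments, and the scores are assumptions for illustration.

```python
def parse_rnk(text):
    """Parse RNK text into a list of (gene, score), highest score first."""
    ranked = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank and '#' lines carry no data
        gene, score = line.split("\t")[:2]
        ranked.append((gene, float(score)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Made-up ranked list.
example = "GENE_A\t2.5\nGENE_B\t-1.0\n# comment\nGENE_C\t0.3\n"
print(parse_rnk(example))
```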

  30. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  31. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Loading data :::

  32. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Loading data :::

  33. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Running GSEA :::

  34. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - ::: MSigDB. gsea_home

  35. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Running GSEA ::: 1. Choose true (default) to have GSEA collapse each probe set in your expression dataset into a single gene vector, which is identified by its HUGO gene symbol. In this case, you are using HUGO gene symbols for the analysis. The gene sets that you use for the analysis must use HUGO gene symbols to identify the genes in the gene sets. 2. Choose false to use your expression dataset "as is." In this case, you are using the probe identifiers that are in your expression dataset for the analysis. The gene sets that you use for the analysis must also use these probe identifiers to identify the genes in the gene sets.
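To make the collapse option concrete, the sketch below merges probe-level rows into one row per gene symbol. It assumes the per-sample maximum across probes is kept, which is one possible collapse rule and not necessarily the one configured in a given run; `collapse_to_symbols`, the probe IDs and the values are invented.

```python
def collapse_to_symbols(expression, probe_to_symbol):
    """Collapse {probe: [values]} into {symbol: [values]} by per-sample max."""
    collapsed = {}
    for probe, values in expression.items():
        symbol = probe_to_symbol.get(probe)
        if symbol is None:
            continue  # probes without a gene symbol are dropped
        if symbol in collapsed:
            # Assumed rule: keep the highest value seen for each sample.
            collapsed[symbol] = [max(a, b) for a, b in zip(collapsed[symbol], values)]
        else:
            collapsed[symbol] = list(values)
    return collapsed


# Hypothetical probes; probe_4 has no annotation.
expression = {
    "probe_1": [1.0, 5.0],
    "probe_2": [3.0, 2.0],
    "probe_3": [4.0, 4.0],
    "probe_4": [9.0, 9.0],
}
probe_to_symbol = {"probe_1": "TP53", "probe_2": "TP53", "probe_3": "MYC"}
print(collapse_to_symbols(expression, probe_to_symbol))
```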

  36. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Running GSEA ::: Phenotype Gene Sets (few samples)

  37. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Running GSEA :::

  38. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Chip2Chip mapping ::: Chip2Chip translates the gene identifiers in a gene set from HUGO gene symbols to the probe identifiers for a selected DNA chip.
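The translation that Chip2Chip performs can be pictured as a lookup from gene symbol to the probes of the chosen chip. The sketch below is only an illustration of that idea, not the tool's implementation; `chip2chip` and the probe identifiers are hypothetical.

```python
def chip2chip(gene_set, probe_to_symbol):
    """Translate a list of gene symbols into the chip's probe identifiers."""
    # Invert the chip annotation: symbol -> all probes measuring it.
    symbol_to_probes = {}
    for probe, symbol in probe_to_symbol.items():
        symbol_to_probes.setdefault(symbol, []).append(probe)
    probes = []
    for symbol in gene_set:
        # Symbols missing from the chip contribute nothing.
        probes.extend(sorted(symbol_to_probes.get(symbol, [])))
    return probes


# Hypothetical chip annotation and gene set.
annotation = {"probe_1": "TP53", "probe_2": "TP53", "probe_3": "MYC"}
print(chip2chip(["MYC", "TP53", "EGFR"], annotation))
```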

  39. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Enrichment statistic ::: To calculate the enrichment score, GSEA first walks down the ranked list of genes increasing a running-sum statistic when a gene is in the gene set and decreasing it when it is not. The enrichment score is the maximum deviation from zero encountered during that walk. This parameter affects the running-sum statistic used for the analysis.
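The walk described on this slide can be sketched as follows. Hits add a step weighted by |score|^p and misses subtract a constant step, so that the running sum returns to zero at the end of the list. The exponent `p` stands in for the enrichment statistic parameter (p = 0 gives the classic unweighted K-S style walk, p = 1 a weighted one); the function name and the toy data are assumptions.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return the maximum deviation from zero of the running-sum statistic."""
    members = set(gene_set)
    hit_weights = [abs(s) ** p if g in members else 0.0
                   for g, s in zip(ranked_genes, scores)]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g in ranked_genes if g not in members)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")
    running = 0.0
    best = 0.0
    for gene, weight in zip(ranked_genes, hit_weights):
        if gene in members:
            running += weight / total_hit  # step up on a hit
        else:
            running -= 1.0 / n_miss        # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list, sorted by a made-up ranking metric.
genes = ["A", "B", "C", "D"]
metric = [4.0, 3.0, -1.0, -2.0]
print(enrichment_score(genes, metric, ["A", "B"]))
print(enrichment_score(genes, metric, ["C", "D"]))
```

A positive score indicates that the set is concentrated at the top of the ranked list, a negative score that it sits at the bottom.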

  40. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Ranking Metric ::: Categorical phenotypes: Signal2Noise, tTest, Ratio of Classes, Diff of Classes, Log2_Ratio_of_Classes. Continuous phenotypes: Cosine, Euclidean, Manhattan, Pearson (time series).
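Two of the categorical metrics can be written out for a single gene as below. This is a simplified sketch: signal-to-noise is taken as the difference of class means divided by the sum of class standard deviations, using the sample standard deviation and without any adjustment for very small deviations. The function names and the values are illustrative.

```python
import math
import statistics


def signal2noise(class_a, class_b):
    """(mean_A - mean_B) / (sd_A + sd_B) for one gene, simplified."""
    mean_a, mean_b = statistics.mean(class_a), statistics.mean(class_b)
    sd_a, sd_b = statistics.stdev(class_a), statistics.stdev(class_b)
    return (mean_a - mean_b) / (sd_a + sd_b)


def log2_ratio_of_classes(class_a, class_b):
    """log2(mean_A / mean_B) for one gene; means must be positive."""
    return math.log2(statistics.mean(class_a) / statistics.mean(class_b))


# Made-up expression values of one gene in two classes.
class_a = [2.0, 4.0, 6.0]
class_b = [1.0, 2.0, 3.0]
print(signal2noise(class_a, class_b))
print(log2_ratio_of_classes(class_a, class_b))
```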

  41. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Ranking Metric :::

  42. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Ranking Metric :::

  43. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - More parameters ::: Parameter to determine whether to sort the genes in descending (default) or ascending order. abs: 8.2 8.1 8.0 7.9 7.7 7.5 … real: 8.2 8.1 8.0 … -7.5 -7.7 -7.9
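The abs and real orderings on this slide can be reproduced with a small sort. The values are the ones shown on the slide; the gene names g1..g6 and the function `rank_genes` are placeholders.

```python
def rank_genes(metric, mode="real", descending=True):
    """Order gene names by metric value ('real') or its magnitude ('abs')."""
    if mode == "abs":
        key = lambda gene: abs(metric[gene])
    else:
        key = lambda gene: metric[gene]
    return sorted(metric, key=key, reverse=descending)


# Slide values attached to placeholder gene names.
metric = {"g1": 8.2, "g2": -7.9, "g3": 8.1, "g4": -7.7, "g5": 8.0, "g6": -7.5}
print([metric[g] for g in rank_genes(metric, mode="abs")])
print([metric[g] for g in rank_genes(metric, mode="real")])
```

With abs sorting, strongly negative genes sit next to strongly positive ones; with real sorting they end up at the bottom of the list.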

  44. ::: Using GSEA. Gene Set Enrichment Analysis - GSEA - Launching Analysis :::

  45. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  46. ::: GSEA output. Gene Set Enrichment Analysis - GSEA - Results Accession ::: By default in gsea_home: C:\Documents and settings\username\gsea_home or /Users/yourhome/gsea_home

  47. :::Contents. Introduction. GSEA Software Data Formats Using GSEA GSEA Output GSEA Results Leading Edge Analysis

  48. ::: GSEA results. Gene Set Enrichment Analysis - GSEA - Index.html ::: Heat map of the top 50 features for each phenotype and a plot showing the correlation between the ranked genes and the phenotypes. In a heat map, expression values are represented as colors, where the range of colors (red, pink, light blue, dark blue) shows the range of expression values (high, moderate, low, lowest).

  49. ::: GSEA results. Gene Set Enrichment Analysis - GSEA - Enrichment results in html :::

  50. ::: GSEA results. Gene Set Enrichment Analysis - GSEA - Enrichment results in html :::
