
Presented by Rhee, Je-Keun Graduate Program in Bioinformatics

Identification of cell cycle-related regulatory motifs using a kernel canonical correlation analysis. Presented by Rhee, Je-Keun, Graduate Program in Bioinformatics, Center for Biointelligence Technology (CBIT), Biointelligence Laboratory, Seoul National University.


Presentation Transcript


  1. Identification of cell cycle-related regulatory motifs using a kernel canonical correlation analysis
     Presented by Rhee, Je-Keun
     Graduate Program in Bioinformatics, Center for Biointelligence Technology (CBIT), Biointelligence Laboratory, Seoul National University

  2. Contents
     • Introduction
     • Kernel canonical correlation analysis (kernel CCA)
     • Datasets & Experiments
     • Experimental results
     • Conclusion
     (c) 2009 Biointelligence Laboratory, Seoul National University

  3. Introduction
     • One of the major challenges in gene regulation studies is to identify regulators affecting the expression of their target genes in specific biological processes.
     • In the present study, we propose a kernel-based approach to efficiently identify core regulatory elements involved in specific biological processes using gene expression profiles.
     • Using yeast cell cycle data, we explored significant relationships between motifs and expression profiles, and searched for regulatory motifs and their pairs correlated with specific expression patterns.
     [Figure: cell cycle diagram with phases G1, S, G2, M]

  4. Kernel methods
     Φ: x → φ(x)
     • The kernel trick handles a non-linear problem by implicitly mapping the original observations into a higher-dimensional feature space, where a linear method can be applied. Only inner products in that space, given by the kernel function, need to be computed.
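The mapping Φ on this slide can be illustrated with a small example. The sketch below is not from the presentation: it assumes a degree-2 polynomial kernel on 2-D inputs (the slides do not say which kernel was used) and checks that evaluating the kernel gives the same number as an inner product of explicitly mapped vectors. The function names are illustrative.

```python
from math import sqrt

def poly2_features(x):
    """Explicit feature map phi(x) for a 2-D input under the degree-2
    homogeneous polynomial kernel: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, sqrt(2.0) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def poly2_kernel(x, y):
    """Kernel k(x, y) = (x . y)^2, computed in the original 2-D space."""
    return dot(x, y) ** 2

x = (1.0, 2.0)
y = (3.0, -1.0)

explicit = dot(poly2_features(x), poly2_features(y))  # inner product in feature space
implicit = poly2_kernel(x, y)                         # same value, no mapping needed
print(explicit, implicit)
```

The kernel evaluation never builds φ(x), which is what makes very high-dimensional (or infinite-dimensional) feature spaces practical.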

  5. Canonical correlation analysis (CCA)
     • Canonical correlation analysis (CCA) is a classical multivariate statistical method for finding linearly correlated features from a pair of datasets.
     • Given a pair of multivariate datasets xi and xj, CCA finds a pair of linear transformations such that the correlation coefficient between the extracted features is maximized.
     [Figure: xi and xj are mapped by ai and aj to the features ui and uj]
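A minimal NumPy sketch of linear CCA, added here for illustration and not taken from the presentation. It whitens each dataset with a Cholesky factor and takes the top singular vector pair of the whitened cross-covariance. The small ridge term `reg`, the function name, and the toy data are assumptions of this sketch.

```python
import numpy as np

def linear_cca(X, Y, reg=1e-8):
    """Return the first canonical correlation and the weight vectors (a, b)
    that maximize corr(X @ a, Y @ b). Rows are samples."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Covariance blocks; `reg` is a tiny ridge for numerical stability (assumption).
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whitening: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # M = Lx^{-1} Cxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M)
    a = np.linalg.solve(Lx.T, U[:, 0])   # back-transform to the original coordinates
    b = np.linalg.solve(Ly.T, Vt[0])
    return s[0], a, b

# Toy data: one shared latent signal z hidden in each dataset.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.column_stack([z + 0.1 * rng.normal(size=500), rng.normal(size=500)])
Y = np.column_stack([rng.normal(size=500), -z + 0.1 * rng.normal(size=500)])

rho, a, b = linear_cca(X, Y)
print("first canonical correlation:", rho)
```

The extracted features are `X @ a` and `Y @ b`; their sample correlation equals `rho` up to the ridge term.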

  6. Kernel canonical correlation analysis (kernel CCA)
     • Kernel CCA offers a solution for overcoming the linearity problem by projecting the data into a higher-dimensional feature space.
     • While CCA is limited to linear features, kernel CCA can capture nonlinear relationships.
     [Figure: kernel CCA diagram relating expression profiles (xexp, Φexp, fexp, uexp) and sequence data (xseq, Φseq, fseq, useq)]
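The contrast with linear CCA can be sketched on toy data. The code below is an illustration under stated assumptions, not the authors' implementation: it uses an RBF kernel and one common regularized formulation of kernel CCA (constraints of the form α^T (K + nκI)^2 α = 1), and the names `rbf_gram`, `center_gram`, `kernel_cca`, `gamma` and `kappa` are hypothetical. The slides do not state which kernel or regularization was used.

```python
import numpy as np

def rbf_gram(X, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - y||^2). Rows are samples."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center_gram(K):
    """Center a Gram matrix, i.e. center the data in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_cca(Kx, Ky, kappa=0.05):
    """First regularized kernel canonical correlation and dual weights.
    Regularized form (assumption): maximize alpha^T Kx Ky beta subject to
    alpha^T (Kx + n*kappa*I)^2 alpha = beta^T (Ky + n*kappa*I)^2 beta = 1."""
    n = Kx.shape[0]
    Kx, Ky = center_gram(Kx), center_gram(Ky)
    Rx = Kx + n * kappa * np.eye(n)
    Ry = Ky + n * kappa * np.eye(n)
    # M = Rx^{-1} Kx Ky Ry^{-1}; Ky and Ry commute, so Ky Ry^{-1} = Ry^{-1} Ky.
    M = np.linalg.solve(Rx, Kx) @ np.linalg.solve(Ry, Ky)
    U, s, Vt = np.linalg.svd(M)
    alpha = np.linalg.solve(Rx, U[:, 0])
    beta = np.linalg.solve(Ry, Vt[0])
    return s[0], alpha, beta

# Toy data with a purely nonlinear dependence: y is roughly x squared.
x = np.linspace(-2.0, 2.0, 200)
rng = np.random.default_rng(0)
y = x ** 2 + 0.1 * rng.normal(size=x.size)

Kx = rbf_gram(x[:, None], gamma=1.0)
Ky = rbf_gram(y[:, None], gamma=1.0)
rho, alpha, beta = kernel_cca(Kx, Ky, kappa=0.05)

u = center_gram(Kx) @ alpha          # extracted feature for the first dataset
v = center_gram(Ky) @ beta           # extracted feature for the second dataset
corr_kernel = np.corrcoef(u, v)[0, 1]
corr_linear = np.corrcoef(x, y)[0, 1]
print("linear correlation:", corr_linear)
print("correlation of kernel CCA features:", corr_kernel)
```

Without regularization, kernel CCA can find perfect correlations on any finite sample, so `kappa` has to be chosen with care, for example by cross-validation.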

  7. Preparation of datasets
     • Gene expression datasets
       • Expression profiles of all ORFs (open reading frames) during the yeast cell cycle, measured at 18 time points by Spellman et al.
     • Sequence datasets
       • Upstream sequences of ORFs scanned for the presence of 42 known motifs extracted by Pilpel et al. using the AlignACE program
       • Raw upstream sequences: ~1 kb of sequence upstream of each gene

  8. Experiments
     • Identification of the relationship between gene expression and known motifs using a set of motifs extracted by AlignACE
       • 42 motifs
     • Identification of cell cycle-related motifs from raw upstream sequence
       • A total of 1,024 features (window size l=5)
     • Combinatorial effects of regulatory motifs
       • Searching for motif pairs that have synergistic or co-regulatory effects in the yeast cell cycle
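The 1,024 features follow from the window size: there are 4^5 = 1,024 possible 5-mers over the alphabet A, C, G, T. A minimal sketch of such a feature vector is given below. It assumes the features are occurrence counts from a sliding window, which the slide does not state explicitly. The function names and the toy sequence are illustrative.

```python
from itertools import product

def kmer_index(k=5, alphabet="ACGT"):
    """Map every possible k-mer to a column index (4^5 = 1024 columns for k=5)."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}

def kmer_counts(seq, k=5, index=None):
    """Count k-mers in `seq` with a sliding window of size k.
    Using counts rather than presence/absence is an assumption of this sketch."""
    index = index or kmer_index(k)
    counts = [0] * len(index)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        pos = index.get(seq[start:start + k])
        if pos is not None:          # windows containing N or other symbols are skipped
            counts[pos] += 1
    return counts

index = kmer_index(5)
toy_upstream = "ACGCGTACGCGT"        # toy sequence, not real yeast data
features = kmer_counts(toy_upstream, k=5, index=index)
print(len(features), sum(features), features[index["ACGCG"]])
```

Applied to the ~1 kb upstream region of each gene, this would give one 1,024-dimensional row per gene for the sequence side of the analysis.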

  9. Known regulatory motifs in yeast

  10. Relationship between gene expression and sequence motifs

  11. The list of top ranked motifs by the kernel CCA

  12. Weight distributions for motifs derived from cell cycle-related and non-cell cycle-related datasets
     [Figure: panels labeled with the motifs MCB, SWI5 and SFF’]

  13. Correlation between expression profiles and motifs derived by using the raw upstream sequence data

  14. High-scored motifs in the first and the second components using 5-mer raw upstream sequences

  15. Measurement of the effect of motif pairs
     • ECRScore (Expression Coherence coRrelation Score)
     • It is based on the Pearson correlation coefficients of expression profiles for all possible pairs of genes whose upstream regions contain both motifs, mi and mj.
     • N(mi ∩ mj) is the number of all pairs of genes whose upstream regions have the two motifs.
     • Nτ(mi ∩ mj) is the number of gene pairs whose correlation coefficient is larger than the threshold τ.
     • The threshold was chosen based on the fifth percentile of the distribution of correlation coefficients of randomly sampled gene pairs.
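The formula on this slide did not survive in the transcript. The sketch below assumes the natural reading of the two counts defined above, namely ECRScore = Nτ(mi ∩ mj) / N(mi ∩ mj). That ratio, the function names, the threshold value, and the toy profiles are assumptions for illustration, not the authors' code or data.

```python
from itertools import combinations
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:       # constant profile: correlation undefined, treat as 0
        return 0.0
    return cov / sqrt(va * vb)

def ecr_score(profiles, tau):
    """Assumed form: fraction of gene pairs whose correlation exceeds tau.
    `profiles` holds the expression profiles of the genes whose upstream
    regions contain both motifs."""
    pairs = list(combinations(profiles, 2))      # N(mi ∩ mj) gene pairs
    if not pairs:
        return 0.0
    n_tau = sum(1 for p, q in pairs if pearson(p, q) > tau)   # Nτ(mi ∩ mj)
    return n_tau / len(pairs)

# Toy profiles for four genes carrying both motifs (made-up numbers).
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 2.0, 5.0],
    [4.0, 3.0, 2.0, 1.0],
]
score = ecr_score(profiles, tau=0.5)   # tau = 0.5 is an arbitrary illustrative threshold
print(score)
```

In practice τ would come from the correlation distribution of randomly sampled gene pairs, as the slide describes, rather than being fixed by hand.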

  16. Heat map of weight values of motif pairs related to cell cycle regulation

  17. Combinatorial effects of regulatory motifs

  18. Conclusion
     • We presented a novel method that can identify candidate condition-specific regulatory motifs by employing kernel-based methods.
     • In summary, given expression profiles, our method was able to identify regulatory motifs involved in specific biological processes.
     • The method could be applied to the elucidation of unknown regulatory mechanisms associated with complex gene regulatory processes.
     • In future research, we will apply the proposed method to diverse gene expression datasets, especially cancer-related datasets.
