Inference of gene regulatory networks using regression-based network method
2005. 08. 11
Ha Seong, Kim
Bioinformatics & Biostatistics Lab., SNU
Table of contents
• Introduction
  • Gene regulatory networks
  • Gene regulatory network inference methods
  • Drawbacks of previous network methods
  • Linear modeling of genetic networks
• Method
  • Data pre-processing
  • Regression-based network
• Result
  • Caulobacter crescentus
  • Mouse stem cell (Hanyang Univ.)
• Discussion
  • Advantages
  • Weakness
  • Discussion
Objective
Introduce a new method for constructing gene regulatory networks using multiple regression.
Gene regulatory networks
• Promoter sequence analysis
• Transcription factor analysis
• Gene expression level analysis
Gene regulatory network inference methods
• Boolean networks
  • Kauffman
  • Somogyi
  • Akutsu
  • Shmulevich
• Bayesian networks
  • Friedman
  • Miyano
  • Imoto
• Linear model
  • D’Haeseleer
  • Van Someren
• Genetic network
• Neural network
Drawbacks of previous network methods
• Boolean networks
  • Data binarization causes loss of information
• Bayesian networks
  • Heavy computing time
  • Cannot find self-regulated genes
• Linear model
  • Dimensionality problem
  • Inherent linearity
Linear modeling of genetic networks
• D’Haeseleer
• Van Someren
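For reference, linear network models of this kind are often written in the general form below. This is an assumed, generic formulation, not necessarily the exact equations used by D’Haeseleer or Van Someren; the symbols $x_i$, $w_{ij}$ and $b_i$ are introduced here only for illustration.

```latex
x_i(t + \Delta t) = \sum_{j=1}^{N} w_{ij}\, x_j(t) + b_i
```

In this notation $x_i(t)$ would be the expression level of gene $i$ at time $t$, $w_{ij}$ the influence of gene $j$ on gene $i$, and $b_i$ a bias term. With $N$ genes there are $N^2$ weights to estimate, which gives one way to read the "dimensionality problem" when only a few time points are available.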
Data pre-processing
• Discrete Cosine Transform algorithm
  • e.g. Select 553 cell-cycle genes from a total of 1500 genes (T. Laub, 2000)
  • Matlab
• Clustering gene expression profiles with self-organizing maps (SOMs)
  • e.g. Group the 553 genes by similar expression patterns (T. Laub, 2000)
• Known transcription factors (published papers)
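The idea of using a discrete cosine transform to pick out slowly varying, cycle-like expression profiles can be sketched as follows. This is only an illustration: the scoring rule `low_freq_score`, the cutoff of 0.5, the 12 time points and the two toy profiles are all assumptions made here, not the criterion used in the cited study or in the original Matlab code.

```python
import numpy as np

def dct2(x):
    """Unnormalized DCT-II: X_k = sum_n x_n * cos(pi/N * (n + 0.5) * k)."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    k = n.reshape(-1, 1)
    return np.cos(np.pi / len(x) * (n + 0.5) * k) @ x

def low_freq_score(profile, max_k=3):
    """Share of non-constant energy in DCT coefficients 1..max_k.

    Hypothetical selection criterion, used here only for illustration.
    """
    energy = dct2(profile)[1:] ** 2      # drop k = 0 (the mean level)
    return energy[:max_k].sum() / energy.sum()

N = 12                                        # assumed number of time points
t = np.arange(N)
smooth = np.cos(np.pi / N * (t + 0.5) * 2)    # slowly varying, cycle-like profile
jagged = (-1.0) ** t                          # flips sign at every time point

scores = {"smooth": low_freq_score(smooth), "jagged": low_freq_score(jagged)}
selected = [name for name, s in scores.items() if s > 0.5]   # assumed cutoff
```

Only the smooth profile passes the assumed cutoff, because nearly all of its energy sits in a low-frequency coefficient, while the alternating profile concentrates its energy at the highest frequencies.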
Regression-based network I
• Regression models for G4
• Gene expression data
• Time complexity
• Beta = 0 test
• Select significant beta.
• Cf.
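A minimal sketch of the per-gene regression step, assuming ordinary least squares and a t statistic for the beta = 0 test. The simulated data, the function name `fit_gene_model` and the |t| > 3 cutoff are illustrative assumptions, not values from the slides.

```python
import numpy as np

def fit_gene_model(y, X):
    """OLS fit of response y on the columns of X (an intercept is added).

    Returns the coefficients and the t statistics for H0: beta_j = 0.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])        # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - A.shape[1]                        # residual degrees of freedom
    sigma2 = resid @ resid / dof                # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)       # covariance of the estimates
    return beta, beta / np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 30                                          # assumed number of samples
g3 = rng.normal(size=n)
g5 = rng.normal(size=n)
g4 = 1.5 * g3 + rng.normal(scale=0.3, size=n)   # simulated: G4 depends on G3 only

beta, t = fit_gene_model(g4, np.column_stack([g3, g5]))
T_CUT = 3.0                                     # assumed significance cutoff
significant = np.abs(t[1:]) > T_CUT             # one flag per predictor (G3, G5)
```

In practice the cutoff would come from a t distribution with the residual degrees of freedom, possibly with a multiple-testing correction since one model is fitted per gene.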
Regression-based network II
• Calculate the effect between genes using the regression coefficients
• Determine the directions and the positive or negative effects
[Figure: two diagrams with predictor variables G3 and G5 and response variable G4, one where beta3 is not significant and one where beta3 is significant]
• SEM (Structural Equation Model)
• Path analysis
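Turning significant coefficients into directed, signed edges could look like the sketch below. It assumes that an edge runs from each predictor to the response and that the sign of the coefficient gives the type of effect; the helper names, the simulated data and the |t| > 3 cutoff are hypothetical.

```python
import numpy as np

def t_statistics(y, X):
    """Return OLS coefficients and t statistics (intercept at position 0)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return beta, beta / se

def edges_for_target(target, predictors, data, t_cut=3.0):
    """Directed, signed edges predictor -> target for significant coefficients."""
    X = np.column_stack([data[p] for p in predictors])
    beta, t = t_statistics(data[target], X)
    edges = []
    for name, b, tj in zip(predictors, beta[1:], t[1:]):
        if abs(tj) > t_cut:                       # keep significant betas only
            sign = "+" if b > 0 else "-"          # positive or negative effect
            edges.append((name, target, sign, round(float(b), 2)))
    return edges

rng = np.random.default_rng(1)
n = 40                                            # assumed number of samples
data = {"G3": rng.normal(size=n), "G5": rng.normal(size=n)}
# Simulated response: G3 activates G4, G5 represses G4.
data["G4"] = 1.2 * data["G3"] - 0.8 * data["G5"] + rng.normal(scale=0.2, size=n)

edges = edges_for_target("G4", ["G3", "G5"], data)
```

Each tuple holds the source gene, the target gene, the sign of the effect and the estimated coefficient, so the effect between genes is available as a quantitative value.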
Regression-based network III
Representation of the interaction term in the gene regulatory network structure
[Figure: two diagrams of G3, G5 and G4, one where the interaction is not significant and one where the interaction is significant; a "?" appears in the figure]
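Testing an interaction term can be sketched by adding the product of two predictors as an extra column of the design matrix. The data below are simulated so that G4 depends only on the product G3*G5; the sample size, the noise level and the |t| > 3 cutoff are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60                                            # assumed number of samples
g3 = rng.normal(size=n)
g5 = rng.normal(size=n)
# Simulated response: G4 responds to the product G3*G5 only.
g4 = 0.9 * g3 * g5 + rng.normal(scale=0.2, size=n)

# Design matrix: intercept, G3, G5 and the G3*G5 interaction.
A = np.column_stack([np.ones(n), g3, g5, g3 * g5])
beta, *_ = np.linalg.lstsq(A, g4, rcond=None)
resid = g4 - A @ beta
sigma2 = resid @ resid / (n - A.shape[1])         # residual variance estimate
t = beta / np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))

labels = ["intercept", "G3", "G5", "G3:G5"]
report = {lab: (round(float(b), 2), round(float(tj), 1))
          for lab, b, tj in zip(labels, beta, t)}
interaction_significant = abs(t[3]) > 3.0         # assumed cutoff
```

When the interaction coefficient is significant, the network would need a representation in which G3 and G5 act on G4 jointly rather than through two independent edges.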
Interpretation of the interaction term
• The mRNA level of HIS is regulated by Hybrid 1 and Hybrid 2.
• The effect of Hybrid 2 depends on the level of Hybrid 1.
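The interpretation on this slide can be written as a regression model with an interaction term. The formula is a generic sketch; the symbols $y$ (HIS mRNA level), $x_1$ (Hybrid 1), $x_2$ (Hybrid 2) and the $\beta$ coefficients are notation assumed here, not taken from the slide.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon,
\qquad
\frac{\partial y}{\partial x_2} = \beta_2 + \beta_3 x_1
```

If $\beta_3 \neq 0$, the change in $y$ per unit of $x_2$ varies with $x_1$, which is the sense in which the effect of Hybrid 2 depends on the level of Hybrid 1.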
Advantage
• The method directly uses the continuous gene expression data.
  • No loss of information
• The interaction term could improve network accuracy.
• The effect between genes is obtained as a quantitative value.
• Various statistical approaches can be applied to the method.
• Low time complexity
  • The method could be applied to large-scale data sets.
Weakness
• Treatment of models with high adjusted R-square
• Clustering
• With a higher indegree value, multicollinearity has to be considered.
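Both quantities mentioned here can be computed directly. The sketch below uses the standard definitions of adjusted R-square and of the variance inflation factor (VIF), one common multicollinearity diagnostic; whether the author used VIF is not stated, and the simulated predictors are assumptions for illustration.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def adjusted_r_squared(y, X):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n, p = X.shape
    return 1.0 - (1.0 - r_squared(y, X)) * (n - 1) / (n - p - 1)

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R_j^2)."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)          # regress column j on the rest
        out.append(1.0 / (1.0 - r_squared(X[:, j], others)))
    return np.array(out)

rng = np.random.default_rng(3)
n = 50                                            # assumed number of samples
g1 = rng.normal(size=n)
g2 = g1 + rng.normal(scale=0.05, size=n)          # nearly a copy of g1
g3 = rng.normal(size=n)                           # unrelated predictor
X = np.column_stack([g1, g2, g3])
y = g1 + g3 + rng.normal(scale=0.3, size=n)

vifs = vif(X)                                     # large for g1 and g2, small for g3
adj_r2 = adjusted_r_squared(y, X)
```

A model can reach a high adjusted R-square even when two of its predictors are nearly identical, in which case the individual coefficients of those predictors are unstable and hard to interpret as separate edges.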
Future work
• Select several models and apply a probabilistic approach to this method.
• Promoter sequence analysis
• Mouse stem cell data