Threshold selection in gene co-expression networks using spectral graph theory techniques • Andy D Perkins*, Michael A Langston • BMC Bioinformatics
Outline • Introduction • How to construct a gene co-expression network? • Steps and our criterion • Methods • Results & analysis
Introduction • In gene co-expression networks, nodes represent gene transcripts. • Two genes are connected by an edge if their expression values are highly correlated. • Defining “high” correlation is somewhat tricky. • One can use statistical significance… • Instead, we propose a criterion for picking the threshold parameter based on spectral graph theory.
Methods • Microarray data sets • Homo sapiens • Saccharomyces cerevisiae (baker’s yeast)
Methods • Network construction • Construct a complete graph. • Compute the Pearson correlation coefficient between each pair of nodes. • Apply a high-pass filter at thresholds from 0.70 to 0.95. • Network representation • Laplacian of the graph G
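The construction steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, filtering on the absolute value of the correlation is an assumption, and the Laplacian is taken in the common combinatorial form L = D − A because the slide does not show which form was used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def thresholded_adjacency(profiles, threshold):
    """High-pass filter: keep edge (i, j) only if |r| >= threshold.

    Using the absolute value is an assumption of this sketch.
    """
    n = len(profiles)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(profiles[i], profiles[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj

def laplacian(adj):
    """Combinatorial Laplacian L = D - A (assumed form)."""
    n = len(adj)
    return [[sum(adj[i]) if i == j else -adj[i][j] for j in range(n)]
            for i in range(n)]

# Toy expression profiles (made up for illustration), one row per gene.
profiles = [
    [1, 2, 3, 4],
    [2, 4, 6, 8.5],
    [4, 3, 2, 1],
    [1, 3, 2, 4],
]
adj = thresholded_adjacency(profiles, 0.9)
lap = laplacian(adj)
```

In this toy input the fourth gene correlates with the others only at about 0.8 in magnitude, so it is left without edges at a threshold of 0.9.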
Methods • Eigenvalue and eigenvector computation • Solve the eigenvalue problem for the Laplacian of G, yielding its eigenvalues and their associated eigenvectors. • The eigenvector associated with λ1 was extracted and sorted in increasing order.
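A minimal sketch of this step, assuming NumPy, a connected graph, and the combinatorial Laplacian L = D − A (the slide does not reproduce the exact eigenvalue problem). The helper name `fiedler_pair` and the two-triangle toy graph are illustrative, not taken from the paper.

```python
import numpy as np

def fiedler_pair(lap):
    """Return (lambda_1, v_1 sorted ascending, node order).

    Assumes a symmetric Laplacian of a connected graph, so index 0 is the
    zero eigenvalue and index 1 is the smallest nonzero one.
    """
    vals, vecs = np.linalg.eigh(np.asarray(lap, dtype=float))  # ascending order
    v1 = vecs[:, 1]
    order = np.argsort(v1)
    return vals[1], v1[order], order

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1
lap = np.diag(adj.sum(axis=1)) - adj

lam1, v1_sorted, order = fiedler_pair(lap)
```

After sorting, the nodes of one triangle occupy the first three positions and the nodes of the other the last three, which is the step-like pattern the cluster detection stage looks for.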
Example • Result: λ1 = 0.7216 • V1 = [eigenvector values not preserved in this transcript]
Methods • Cluster detection • Use a sliding window technique on the sorted eigenvector. • A difference is treated as significant when it exceeds m + s/2 (m: median; s: standard deviation). • Clusters with fewer than 10 nodes are discarded.
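The rule above can be illustrated with a short sketch. The slide gives neither the window width nor the quantity that m and s summarize, so this version makes two loud assumptions: the window spans two adjacent entries (consecutive gaps), and m and s are the median and population standard deviation of all gaps. The function name is hypothetical.

```python
from statistics import median, pstdev

def split_sorted_vector(values, min_size=10):
    """Split an ascending list where a gap exceeds median + stdev / 2.

    Returns clusters as lists of positions in the sorted vector and drops
    clusters with fewer than min_size entries.
    """
    gaps = [b - a for a, b in zip(values, values[1:])]
    cutoff = median(gaps) + pstdev(gaps) / 2
    clusters, start = [], 0
    for i, gap in enumerate(gaps):
        if gap > cutoff:  # significant step between positions i and i + 1
            clusters.append(list(range(start, i + 1)))
            start = i + 1
    clusters.append(list(range(start, len(values))))
    return [c for c in clusters if len(c) >= min_size]

# Made-up sorted eigenvector entries: two plateaus of 12 and a short tail of 3.
values = ([-0.30 + 0.001 * i for i in range(12)]
          + [0.30 + 0.001 * i for i in range(12)]
          + [0.90 + 0.001 * i for i in range(3)])
clusters = split_sorted_vector(values)
```

Here the two plateaus are kept as clusters and the three-entry tail is discarded by the size filter.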
Methods • Paraclique extraction [17.] • The base maximum clique size is 3.
Methods • Functional comparisons • Analyze selected paracliques in yeast and human, respectively. • Use the Saccharomyces Genome Database GO Slim Viewer and Ingenuity Pathways Analysis software.
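As a rough illustration of the idea, a paraclique can be grown from a maximum clique by "glomming" on vertices that miss at most g of the current members (g is the glom factor; the deck later uses g = 3). The sketch below is an assumption-laden toy, not the algorithm of [17.]: it finds the clique by brute-force Bron-Kerbosch, measures the glom condition against the growing member set, and reads the "base maximum clique size" of 3 as a lower bound on the starting clique.

```python
def maximum_clique(adj):
    """Largest clique via basic Bron-Kerbosch (suitable for small graphs only)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:          # r is a maximal clique
            if len(r) > len(best):
                best = sorted(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best

def paraclique(adj, glom=1, min_clique=3):
    """Grow a maximum clique by adding vertices missing at most `glom` members."""
    clique = maximum_clique(adj)
    if len(clique) < min_clique:     # assumed reading of the base clique size
        return []
    members = set(clique)
    grew = True
    while grew:
        grew = False
        for v in sorted(set(adj) - members):
            if len(adj[v] & members) >= len(members) - glom:
                members.add(v)
                grew = True
    return sorted(members)

# Toy graph: clique {0, 1, 2, 3}; node 4 misses only node 3; nodes 5, 6 dangle.
adj = {
    0: {1, 2, 3, 4, 5},
    1: {0, 2, 3, 4},
    2: {0, 1, 3, 4},
    3: {0, 1, 2},
    4: {0, 1, 2},
    5: {0, 6},
    6: {5},
}
dense = paraclique(adj, glom=1)
```

With g = 1 the near-clique on nodes 0 to 4 is recovered as one dense cluster, while g = 0 returns only a strict clique.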
Results and discussion • Nearly-disconnected components [10.] • Result: λ1 • The ability to find the nearly-disconnected pieces allows us to identify nodes sharing a well-connected, or dense, cluster.
Results and discussion • Spectral properties & algebraic connectivity • The multiplicity of the zero eigenvalue equals the number of connected components in the graph. • When only the spectrum of the largest component is analyzed, the smallest nonzero eigenvalue (λ1) is the algebraic connectivity.
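Both statements can be checked numerically on a small graph. The sketch below assumes NumPy and the combinatorial Laplacian L = D − A; the helper names and the toy graph (a triangle plus a separate edge) are illustrative only.

```python
from collections import deque
import numpy as np

def components(adj):
    """Connected components of a 0/1 adjacency matrix, found by BFS."""
    n, seen, comps = len(adj), set(), []
    for s in range(n):
        if s in seen:
            continue
        seen.add(s)
        comp, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    queue.append(v)
        comps.append(sorted(comp))
    return comps

def laplacian_spectrum(adj, nodes=None):
    """Ascending eigenvalues of L = D - A, optionally on a node subset."""
    a = np.asarray(adj, dtype=float)
    if nodes is not None:
        a = a[np.ix_(nodes, nodes)]
    return np.linalg.eigvalsh(np.diag(a.sum(axis=1)) - a)

# Triangle {0, 1, 2} plus a separate edge {3, 4}: two components.
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
comps = components(adj)
zero_count = int(np.sum(np.abs(laplacian_spectrum(adj)) < 1e-9))
largest = max(comps, key=len)
algebraic_connectivity = laplacian_spectrum(adj, largest)[1]
```

The count of near-zero eigenvalues matches the number of components found by the search, and restricting the spectrum to the largest component makes its second-smallest eigenvalue the algebraic connectivity.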
Results and discussion • Spectral properties & algebraic connectivity • [Figure: algebraic connectivity; panels: yeast, human]
Results and discussion • Spectral clustering: potential threshold • Resulting in a likely nearly-disconnected component. • Potential thresholds: 0.78 and 0.83
Results and discussion • Comparison with other results • Traditional methods
Results and discussion • Comparison with other results • Previous studies (1) [19.] • Based on an RMT approach to determine the correlation threshold • Result: 0.77, which corresponds approximately to 0.78
Results and discussion • Comparison with other results • Previous studies (2) [14.] • We select g = 3 to enumerate paracliques.
Results and discussion • Functional comparisons: SGD & IPA • Yeast: largely the same categories appeared within the three largest paracliques in both groups.
Results and discussion • Functional comparisons: SGD & IPA • Human
Conclusion • We presented a systematic threshold selection method that makes use of spectral graph theory. • The results are in agreement with a previous study. • At higher thresholds: • Fewer of these genes fail to be categorized based upon the gene ontology. • Fewer networks were identified as being enriched in the paracliques, making interpretation of the results easier.
References • [10.] Ding CHQ, He X and Zha H: A spectral method to separate disconnected and nearly-disconnected web graph components. Proceedings of the Seventh ACM International Conference on Knowledge Discovery and Data Mining: 26–29 August 2001; San Francisco 2001. • [14.] Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D and Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 1998, 9 • [17.] Chesler EJ and Langston MA: Combinatorial genetic regulatory network analysis tools for high throughput transcriptomic data. RECOMB Satellite Workshop on Systems Biology and Regulatory Genomics: 2–4 December 2005; San Diego 2005.