Introducing the TORQUE algorithm for topology-free querying of protein networks: limitations of previous work, key concepts, algorithms, and experiments.
Topology-Free Querying of Protein Interaction Networks Amit Ganz & Itai Yeshurun
Introduction How similar are they? Why do we need sequence-based searches? Homology relations Inferring gene function Protein structure Seen in last lecture… S. cerevisiae E. coli
Introduction Limitations of previous works: need the precise interaction pattern of the query pathway; for rat, mouse and bovine, most PPIs are unknown; can handle only small queries due to complexity. [Figure: query proteins a, b, c searched for in a network]
What is TORQUE? TOpology-free netwoRk QUErying
TORQUE Input: a set of proteins representing a protein complex or pathway; a network in which the search is to be conducted; parameters. Output: matching sets of proteins that span connected regions in the network with the highest score. [Figure: query proteins a, b, c; network; best match]
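Some definitions… PPI network $G = (V, E)$: $V$ – proteins, $E$ – PPIs. Coloring constraint $\Gamma : V \to 2^C$. A vertex set $H \subseteq V$ is C-colorful if $|H| = |C|$ and there is a function c that assigns each $v \in H$ a color from $\Gamma(v)$, such that there is exactly one vertex in H of each color in C. [Figure: schizophrenia PPI network]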
Main Problem C-colorful connected subgraph: given a graph $G = (V, E)$, a color set $C$ and a coloring constraint function $\Gamma : V \to 2^C$, is there a connected subgraph of G that is C-colorful?
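To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is only an illustration of the definition, not the method used by TORQUE: it tries every vertex subset of size |C|, so it is exponential and only usable on toy inputs. The names `adj` (adjacency sets), `gamma` (the coloring constraint as a dict of color sets) and the protein labels `p1`…`p4` are assumptions made for this example.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if `nodes` induce a connected subgraph of `adj`."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj.get(u, ()):
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def is_colorful(nodes, colors, gamma):
    """Return True if each node can take a distinct color from its own
    allowed set gamma[node], using every color in `colors` exactly once."""
    nodes = list(nodes)
    colors = set(colors)
    if len(nodes) != len(colors):
        return False

    def assign(i, used):
        # Backtracking over the allowed colors of node i.
        if i == len(nodes):
            return True
        for col in gamma.get(nodes[i], set()) & colors:
            if col not in used and assign(i + 1, used | {col}):
                return True
        return False

    return assign(0, frozenset())


def colorful_connected_subgraph(adj, colors, gamma):
    """Brute force: return a connected C-colorful vertex set, or None."""
    colors = set(colors)
    for subset in combinations(sorted(adj), len(colors)):
        if is_colorful(subset, colors, gamma) and is_connected(subset, adj):
            return set(subset)
    return None


# Toy network (hypothetical): path p1 - p2 - p3, plus an isolated p4.
adj = {"p1": {"p2"}, "p2": {"p1", "p3"}, "p3": {"p2"}, "p4": set()}
gamma = {"p1": {"a"}, "p2": {"b"}, "p3": {"c"}, "p4": {"c"}}
match = colorful_connected_subgraph(adj, {"a", "b", "c"}, gamma)
```

On this toy input the only connected set that uses each of the three colors once is the path p1, p2, p3; the isolated vertex p4 cannot replace p3 because the result must be connected.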
Indels Query a Insertions Adding colors to the solution that weren’t necessarily in the query Network b c Remember, we need connected regions… May have a solution if insertion of is allowed d
Indels Query a Deletions Removing colors from the query Network b c We may have a solution if allowing the removal of
Two Approaches: DP (Dynamic Programming) and ILP (Integer Linear Programming)
Dynamic Programming Single color constraints: $\Gamma$ associates each $v \in V$ with a single color
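Dynamic Programming Single color constraints: $\Gamma$ associates each $v \in V$ with a single color. Multiple color constraints: $\Gamma$ associates each $v \in V$ with a set of colors. Reducing the problem: connected subgraph → trees (every connected subgraph has a spanning tree, so it suffices to search for colorful trees)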
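Dynamic Programming • $B(v, S)$: is there an S-colorful tree with root $v$? • Initialization: for $v \in V$ and $\gamma \in C$, $B(v, \{\gamma\}) = \text{true}$ iff $\Gamma(v) = \{\gamma\}$ • Recurrence: $B(v, S) = \bigvee_{u \in N(v),\; S_1 \uplus S_2 = S,\; \Gamma(v) \in S_1,\; \Gamma(u) \in S_2} B(v, S_1) \wedge B(u, S_2)$ [Figure: example network with vertices a, b, c]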
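The table for the single-color case can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: color sets are encoded as bitmasks, every vertex in `adj` is assumed to have exactly one color in `color_of`, and the function and variable names are invented for this example. An S-colorful tree rooted at v with more than one vertex is built by splitting S into two disjoint parts: a smaller tree rooted at v, and a tree rooted at a neighbor u of v.

```python
def colorful_tree_table(adj, color_of, num_colors):
    """For each vertex v, collect the color sets S (as bitmasks) for which
    an S-colorful tree rooted at v exists.

    adj        : dict vertex -> iterable of neighbors (undirected graph)
    color_of   : dict vertex -> color index in range(num_colors)
    """
    full = (1 << num_colors) - 1
    # Initialization: a single vertex is a tree colored by its own color.
    table = {v: {1 << color_of[v]} for v in adj}
    # Visit color sets by increasing size so that smaller sets are final.
    masks = sorted(range(1, full + 1), key=lambda m: bin(m).count("1"))
    for s in masks:
        for v in adj:
            bit_v = 1 << color_of[v]
            if not s & bit_v:
                continue
            # Enumerate proper, non-empty submasks s1 of s.
            s1 = (s - 1) & s
            while s1:
                if s1 & bit_v and s1 in table[v]:
                    s2 = s ^ s1  # colors left for the subtree at a neighbor
                    if any(s2 in table[u] for u in adj[v]):
                        table[v].add(s)
                        break
                s1 = (s1 - 1) & s
    return table


def has_colorful_tree(adj, color_of, num_colors):
    """True if some vertex roots a tree that uses every color exactly once."""
    full = (1 << num_colors) - 1
    table = colorful_tree_table(adj, color_of, num_colors)
    return any(full in sets for sets in table.values())


# Toy path a - b - c with colors 0, 1, 2 (hypothetical example).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
color_of = {"a": 0, "b": 1, "c": 2}
table = colorful_tree_table(adj, color_of, 3)
```

The table has one entry per vertex and color subset, so its size grows as 2^|C| times the number of vertices, which is why this approach is only practical for queries with a modest number of colors.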
Dynamic Programming Example [Figure: example network with vertices a, b, c, d and the true/false table entries computed for each vertex] Indels: Deletions – native solution via DP; Insertions – requires a bit more work…
Integer Linear Programming (ILP) Goal: maximize the total weight of a connected subgraph that is C-colorful. Idea: use flow constraints to ensure the selected subgraph is connected
TORQUE Algorithm Connected Components Input DP/ILP BLAST/Weights thresholds Set of Proteins PPI Network
TORQUE Algorithm Connected Components Color Input DP/ILP Using BLAST
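The coloring step can be pictured as turning sequence-similarity hits into the constraint function: each network protein receives as colors the query proteins it is similar to. The sketch below assumes the BLAST results are already available as a dictionary of E-values and that a single E-value cutoff is used; the function name, the input layout and the cutoff value are all assumptions for illustration and are not taken from the slides.

```python
def color_network(evalues, max_evalue):
    """Build the coloring constraint from precomputed similarity hits.

    evalues    : dict (query_protein, network_protein) -> BLAST E-value
    max_evalue : hypothetical cutoff; hits with a larger E-value are ignored
    Returns a dict network_protein -> set of query proteins (its colors).
    """
    gamma = {}
    for (query, protein), evalue in evalues.items():
        if evalue <= max_evalue:
            gamma.setdefault(protein, set()).add(query)
    return gamma


# Hypothetical hits: p1 resembles query a; p2 resembles b strongly, a weakly.
evalues = {("a", "p1"): 1e-30, ("a", "p2"): 1e-3, ("b", "p2"): 1e-12}
gamma = color_network(evalues, max_evalue=1e-7)
```

Network proteins with no hit under the cutoff get no entry, which means they carry no color and can only appear in a match as insertions.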
TORQUE Algorithm Connected Components Color Input DP/ILP Search each component independently Consider only feasible components At least colors
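A minimal Python sketch of this step: split the network into connected components, then keep only the components that contain enough distinct colors to be worth searching. The threshold is left as a parameter named `min_colors`, which is an assumption of this example, as are the function names and the toy protein labels.

```python
def connected_components(adj):
    """Return the connected components of an undirected graph as sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        components.append(comp)
    return components


def feasible_components(adj, gamma, min_colors):
    """Keep components whose vertices cover at least `min_colors` colors."""
    feasible = []
    for comp in connected_components(adj):
        colors = set().union(*(gamma.get(v, set()) for v in comp))
        if len(colors) >= min_colors:
            feasible.append(comp)
    return feasible


# Toy network (hypothetical): one component with a single color,
# one component covering three colors.
adj = {
    "p1": {"p2"}, "p2": {"p1"},
    "p3": {"p4"}, "p4": {"p3", "p5"}, "p5": {"p4"},
}
gamma = {"p1": {"a"}, "p2": {"a"}, "p3": {"a"}, "p4": {"b"}, "p5": {"c"}}
candidates = feasible_components(adj, gamma, min_colors=3)
```

Because a match must be connected, it can never span two components, so each surviving component can be handed to the DP or ILP search independently.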
TORQUE Algorithm Connected Components Color Input DP/ILP Choose ILP if Calculate DP/ILP
TORQUE Algorithm Best match Connected Components Color Input DP/ILP Subgraph induced by a top-scoring match
Sample run Input Mouse DNA Synthesome Yeast network
Experiments Data Acquisition: PPI networks from up-to-date public databases; complexes from the CORUM website. Quality Evaluation: functional coherence; specificity. Comparison to QNet: yeast, fly and human, where large-scale networks are available; QNet may use the source network to infer topology. Topology-Free Queries: mouse, rat and bovine protein complexes, for which no large-scale data is available. Running time: depends on complex size, number of homologs for each query protein, and size of the connected component.
Reference Torque: topology-free querying of protein interaction networks Sharon Bruckner, Falk Hueffner, Richard M. Karp, Ron Shamir and Roded Sharan (2009)