
Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps

Authors: Nabieva et al. Source: Bioinformatics, 2005. Reviewed by BH Shen.


Presentation Transcript


  1. Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps • Authors: Nabieva et al. • Source: Bioinformatics, 2005 • Reviewed by BH Shen

  2. Goal of the Research Problem • To predict the functions of proteins of unknown function from a given set of protein interaction networks and some annotated proteins.

  3. Problem instance • A protein interaction network • Node: represents a protein, which is either annotated or unannotated. • An annotated node is labeled with a particular function. • Undirected edge: indicates an interaction between its two endpoints. Such data come from experimental or computational results.
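A problem instance of this kind can be sketched as an undirected adjacency map plus a partial annotation table. This is a minimal illustration only: the protein names, function labels, and the helper `build_network` are hypothetical and do not come from the slides.

```python
# Hypothetical toy instance: protein names and function labels are made up.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
annotations = {"P1": {"transport"}, "P4": {"metabolism"}}


def build_network(edges):
    """Return an undirected adjacency map: protein -> set of interaction partners."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


network = build_network(edges)

# Proteins that appear in the network but carry no annotation are the
# ones whose function is to be predicted.
unannotated = sorted(p for p in network if p not in annotations)
print(unannotated)
```

Storing each edge in both directions keeps neighbor lookups simple, which is what the prediction methods on the following slides rely on.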

  4. An Example Network

  5. Assumptions • Postulate: protein interaction networks provide hints about the higher-level organization of the cell. • The closer two proteins are in the network, the more likely they are to share a function. • Interaction networks can be partitioned into functional modules.

  6. Prior work • Grouping not based on functional annotations • Shared interactions (Brun et al., 2003; Schlitt et al., 2003; Strong et al., 2003; von Mering et al., 2003b; Lee et al., 2004) • Shortest-path vectors (Rives and Galitski, 2003) • Clique-based (Spirin and Mirny, 2003)

  7. Prior work • Neighboring interactions: assign an unknown protein the 3 most frequent annotations among its direct neighbors, by majority vote (Schwikowski et al., 2000). • Disadvantage: makes limited use of the underlying graph structure.
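The majority-vote idea can be sketched in a few lines. This is an illustrative reading of the slide, not the original implementation; the function name `majority_vote`, the `top_k` parameter, and the toy data are assumptions.

```python
from collections import Counter


def majority_vote(adj, annotations, protein, top_k=3):
    """Return up to top_k functions that occur most often among the
    annotated direct neighbors of `protein`."""
    counts = Counter()
    for neighbor in adj[protein]:
        # Unannotated neighbors contribute nothing to the vote.
        counts.update(annotations.get(neighbor, ()))
    return [function for function, _ in counts.most_common(top_k)]


# Hypothetical toy data: X is the protein of unknown function.
adj = {"X": {"A", "B", "C"}, "A": {"X"}, "B": {"X"}, "C": {"X"}}
annotations = {"A": {"f1"}, "B": {"f1", "f2"}, "C": {"f3"}}
print(majority_vote(adj, annotations, "X", top_k=1))
```

Only direct neighbors are consulted, which is why the slide describes the method as making limited use of the graph structure.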

  8. Prior work • Neighborhood: extends the neighboring method to all nodes within a certain radius (Hishigaki et al., 2001). • Disadvantage: does not consider the network topology within the neighborhood.
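Collecting annotations within a fixed radius amounts to a bounded breadth-first search. The sketch below only tallies raw counts; the slide does not say how annotations inside the radius are scored, so the counting step, the function name `neighborhood_counts`, and the toy data are assumptions.

```python
from collections import Counter, deque


def neighborhood_counts(adj, annotations, protein, radius):
    """Count annotations of all proteins within `radius` edges of `protein`,
    using breadth-first search. Every protein in range counts equally,
    regardless of how it is connected to the query protein."""
    seen = {protein}
    queue = deque([(protein, 0)])
    counts = Counter()
    while queue:
        node, dist = queue.popleft()
        if dist == radius:
            continue  # do not expand past the radius
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                counts.update(annotations.get(neighbor, ()))
                queue.append((neighbor, dist + 1))
    return counts


# Hypothetical path X - A - B - C.
adj = {"X": {"A"}, "A": {"X", "B"}, "B": {"A", "C"}, "C": {"B"}}
annotations = {"A": {"f1"}, "B": {"f1"}, "C": {"f2"}}
print(neighborhood_counts(adj, annotations, "X", radius=2))
```

Because a protein two steps away counts exactly as much as a direct neighbor, the tally ignores topology inside the neighborhood, which is the weakness the slide points out.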

  9. Prior work • Generalized multiway k-cut (Vazquez et al., 2003; Karaoz et al., 2004) • Minimizes the number of different annotations associated with neighboring proteins in a group. • A more general version of the multiway k-cut problem. • Assigns a unique function to each unannotated node so as to minimize the total cost of the edges joining nodes with no function in common. • Disadvantage: does not reward local proximity.
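The cut objective can be made concrete with a small cost function and an exhaustive search. This is a sketch under simplifying assumptions: each protein carries exactly one function, the search is brute force (exponential, usable only on toy inputs), and the names `cut_cost` and `best_assignment` are hypothetical.

```python
from itertools import product


def cut_cost(edges, assignment):
    """Sum the costs of edges whose endpoints were given different functions."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])


def best_assignment(edges, known, unknown, functions):
    """Try every labeling of the unknown proteins and keep the cheapest one."""
    best, best_cost = None, float("inf")
    for choice in product(functions, repeat=len(unknown)):
        assignment = dict(known)
        assignment.update(zip(unknown, choice))
        cost = cut_cost(edges, assignment)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


# Hypothetical chain A - X - Y - B with a heavier middle edge.
edges = [("A", "X", 1.0), ("X", "Y", 2.0), ("Y", "B", 1.0)]
known = {"A": "f1", "B": "f2"}
assignment, cost = best_assignment(edges, known, ["X", "Y"], ["f1", "f2"])
print(cost)
```

In this toy chain two different labelings reach the same minimum cost, and the objective has no way to prefer the one that keeps each unknown protein with its nearest annotated neighbor. That is one way to see the lack of a reward for local proximity.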

  10. Functional Flow – Nodes / Proteins • Uses the idea of network flow. • Each protein with a known functional annotation is treated as a ‘flow source’. • The amount of flow available at a source is infinite. • However, unlike the usual network flow problem, there are no sinks.

  11. Functional Flow – Edges / Interactions • The flow at a source propagates to neighboring unannotated nodes within a predetermined number of steps. • At each iteration, the flow from each node to its neighbors is updated. • Each edge has a ‘capacity’, which incorporates a distance effect. • Multiple paths between two proteins result in more flow.

  12. Functional Flow – Edge reliability as the weight/capacity • Integrates multiple sources of experimental and computational results (5 in this research). • The reliability r_i of an interaction source i is the fraction of its interactions that connect proteins with a known shared function. • The reliabilities from all sources that report an interaction are then combined.
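The combining formula did not survive in this transcript. The sketch below estimates a per-source reliability as described on the slide and then combines sources with a noisy-OR rule, 1 minus the product of (1 - r_i), which treats the sources as independent. Both the combination rule and the choice to count only interactions whose two endpoints are annotated are assumptions made here for illustration; the function names are hypothetical.

```python
import math


def source_reliability(interactions, annotations):
    """Fraction of a source's interactions that link proteins sharing a
    known function. Assumption: pairs with an unannotated endpoint are skipped."""
    shared = total = 0
    for u, v in interactions:
        if u in annotations and v in annotations:
            total += 1
            if annotations[u] & annotations[v]:
                shared += 1
    return shared / total if total else 0.0


def combine_reliabilities(reliabilities):
    """Noisy-OR combination: the edge is unreliable only if every
    supporting source is wrong (sources assumed independent)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical data for one source.
annotations = {"A": {"f1"}, "B": {"f1", "f2"}, "C": {"f3"}}
interactions = [("A", "B"), ("A", "C"), ("B", "D")]
r = source_reliability(interactions, annotations)
print(r, combine_reliabilities([r, r]))
```

Under this rule an interaction reported by several sources always gets a weight at least as high as its most reliable single source.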

  13. Functional Flow – Objective functions • The score of a function for an unannotated node is the total amount of flow entering the node. • The amount of flow leaving the node is irrelevant. • The locality effect is similar in some ways to the locally constrained diffusion kernel, but the flow in this proposed method is limited by capacities on edges.

  14. Functional Flow – Rule for flow • Initial flow of function a at a node u at time 0. • The flow of function a at node u at time t.

  15. Functional Flow – flow propagation and score • The flow of function a on the edge from u to v at time t. • The functional score of a for node u (the total amount of flow that enters the node over all time steps).
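The formulas for slides 14 and 15 are missing from this transcript, so the following is one plausible reading of the captions rather than the authors' exact definitions. Assumptions: each node holds a reservoir per function, infinite at annotated sources and zero elsewhere; at each step a node sends flow only to neighbors whose reservoir is not larger than its own; the amount sent is the node's reservoir split in proportion to edge weight and capped by the edge capacity; all edge weights are positive. The name `functional_flow` and the toy network are hypothetical.

```python
import math


def functional_flow(adj, sources, steps):
    """Propagate one function for `steps` iterations.

    adj:     node -> {neighbor: positive edge weight}, symmetric
    sources: set of nodes annotated with the function
    Returns node -> total inflow, used as the functional score.
    """
    reservoir = {u: (math.inf if u in sources else 0.0) for u in adj}
    score = {u: 0.0 for u in adj}
    for _ in range(steps):
        # Compute every edge flow from the previous step's reservoirs first,
        # so that the update is synchronous.
        flow = {}
        for u, neighbors in adj.items():
            total_weight = sum(neighbors.values())
            for v, w in neighbors.items():
                if reservoir[u] < reservoir[v]:
                    flow[(u, v)] = 0.0  # no flow uphill
                else:
                    share = reservoir[u] * w / total_weight
                    flow[(u, v)] = min(w, share)  # capped by capacity
        for (u, v), amount in flow.items():
            reservoir[u] -= amount
            reservoir[v] += amount
            score[v] += amount
    return score


# Hypothetical path A - B - C with unit capacities; A is the only source.
adj = {"A": {"B": 1.0}, "B": {"A": 1.0, "C": 1.0}, "C": {"B": 1.0}}
score = functional_flow(adj, sources={"A"}, steps=2)
print(score["B"], score["C"])
```

In the toy run, B receives a full unit of flow from the source at each step, while C receives only the half of B's reservoir that B can pass on at step 2, so the score falls off with distance from the source.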

  16. Metrics for Experimental Results • N-fold cross-validation • 2-, 3-, 5-, and 10-fold cases were tested. • The performance of an algorithm is evaluated by whether the top-scoring prediction above some threshold is a known functional annotation (true positive, TP) or not (false positive, FP). • For multiple predictions (a tricky situation), count a protein’s prediction as a TP if more than half of the predictions made for it are correct, and as an FP otherwise.
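The TP/FP rule for multiple predictions and the fold split can be sketched as follows. The helper names and the round-robin way of forming folds are assumptions for illustration; the slide does not say how proteins were partitioned.

```python
def classify_prediction(predicted, known):
    """Label a protein's set of predictions as TP when strictly more than
    half of them are known annotations, and FP otherwise."""
    correct = sum(1 for function in predicted if function in known)
    return "TP" if correct > len(predicted) / 2 else "FP"


def k_folds(proteins, k):
    """Split annotated proteins into k folds (round-robin, an assumption).
    Each fold in turn has its annotations hidden and then predicted."""
    return [proteins[i::k] for i in range(k)]


print(classify_prediction(["f1", "f2", "f3"], {"f1", "f2"}))
print(classify_prediction(["f1", "f2"], {"f1"}))
print(k_folds(["a", "b", "c", "d", "e"], 2))
```

Note that exactly half correct counts as an FP, since the rule requires more than half.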

  17. Experimental Results

  18. Experimental Results

  19. Experimental Results

  20. Conclusion • The proposed algorithm uses indirect network interactions, network topology, network distances, and edges weighted by reliability estimated from multiple data sources. • The simplest methods, such as Majority, perform well if there are enough direct neighbors with known function. • Only a simple reliability estimate was used. • The proposed algorithm was applied only to baker’s yeast in this research, but it is likely to be useful when analyzing less characterized proteomes.
