Protein-protein interaction: data sets and comparisons
Roland Krause
Outline
• Important data sets of protein-protein interaction
• Comparison of data sets in yeast
• Overview of current methods
• Suggested modifications
Biochemical purifications (yeast)
• Gavin et al. (2002) Nature
• Ho et al. (2002) Nature (HMS-PCI)
• Krogan et al. (2004) Mol Cell
Yeast two-hybrid
• Ito et al. (2000, 2001) PNAS; Uetz et al. (2000) Nature (yeast)
• Giot et al. (2004) Science (Drosophila)
• Li S. et al. (2004) Science (C. elegans)
Synthetic lethals
• Tong (2004)
In silico predictions
• Methods
  • Co-occurrence (phylogenetic profiling)
  • Neighborhood (operon)
  • Fusion (Rosetta)
  • Non-orthologous replacement
• See the review by Osterman and Overbeek in Curr Opin Chem Biol. (2003)
mRNA co-expression
• Eisen et al. (1998) PNAS
• Marcotte (1999) Nature
Comparison of different data sets
Reference databases
Interactions
• MIPS
• DIP
• YPD
• IntAct (EBI)
• BIND/Blueprint
• GRID
• MINT
Prediction servers
• Predictome (Boston U)
• Plex (UTexas)
• STRING (EMBL)
Protein complexes
• MIPS
• YPD
Comparison of protein-protein interaction screens
Differences between individual methods and reference sets
[Figure: interaction density (interactions per 1000 possible protein pairs) between functional categories, shown as two panels with single-letter category codes (E, G, M, P, T, B, F, O, A, R, D, C, U) on both axes. Categories: energy production, amino acid metabolism, other metabolism, translation, transcription, transcriptional control, protein fate, cellular organization, transport and sensing, stress and defense, genome maintenance, cellular fate/organization, uncharacterized.]
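As a rough illustration of the interaction density measure, the sketch below counts observed interactions for each pair of functional categories and scales by the number of possible protein pairs. It assumes that each protein belongs to exactly one category and that density means interactions per 1000 possible pairs; the function name, protein names, and toy data are hypothetical and not taken from the slides.

```python
from collections import Counter
from itertools import combinations_with_replacement

def interaction_density(category_of, interactions):
    """Interactions per 1000 possible protein pairs for each pair of categories.

    Assumption: every protein has exactly one functional category.
    """
    sizes = Counter(category_of.values())
    observed = Counter()
    seen = set()
    for a, b in interactions:
        edge = frozenset((a, b))
        # Skip self-interactions, duplicates, and proteins without a category
        if a == b or edge in seen or a not in category_of or b not in category_of:
            continue
        seen.add(edge)
        observed[tuple(sorted((category_of[a], category_of[b])))] += 1

    density = {}
    for c1, c2 in combinations_with_replacement(sorted(sizes), 2):
        if c1 == c2:
            possible = sizes[c1] * (sizes[c1] - 1) // 2  # pairs within one category
        else:
            possible = sizes[c1] * sizes[c2]             # pairs across two categories
        if possible:
            density[(c1, c2)] = 1000 * observed[(c1, c2)] / possible
    return density

# Toy data with made-up protein names
category_of = {
    "P1": "translation", "P2": "translation", "P3": "translation",
    "P4": "transport", "P5": "transport",
}
interactions = [("P1", "P2"), ("P2", "P1"), ("P3", "P4")]
density = interaction_density(category_of, interactions)
for pair, value in density.items():
    print(pair, round(value, 1))
```

Real annotation schemes often assign several categories to one protein, so a fuller version would need a rule for counting such proteins.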
Conclusions
• The overlap between the individual methods is surprisingly small
• Different methods complement each other
• Individual methods are not exhausted
• Single experimental methods can be as reliable as combined sets
• Integration
  • Bader, G. and Hogue, C. (2002) Nat. Biot.
  • Kemmeren H., et al. (2002) Mol. Cell
  • Von Mering C., Krause, R., et al. (2002) Nature
  • Edwards et al. (2002) Trends Genet.
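A minimal sketch of how the overlap between data sets could be quantified, assuming each data set is a list of protein pairs and that interactions are undirected. The Jaccard index is used here as one possible overlap measure; the slides do not say which measure was used, and the data set and protein names are made up.

```python
from itertools import combinations

def normalize(pairs):
    """Treat interactions as undirected: (A, B) and (B, A) are the same edge."""
    return {frozenset(p) for p in pairs if p[0] != p[1]}

def overlap_table(datasets):
    """Return {(name1, name2): (shared_edges, jaccard)} for every pair of data sets."""
    edges = {name: normalize(pairs) for name, pairs in datasets.items()}
    table = {}
    for a, b in combinations(sorted(edges), 2):
        shared = len(edges[a] & edges[b])
        union = len(edges[a] | edges[b])
        table[(a, b)] = (shared, shared / union if union else 0.0)
    return table

# Toy data with made-up protein names
datasets = {
    "purification": [("P1", "P2"), ("P2", "P3"), ("P3", "P4")],
    "two_hybrid": [("P2", "P1"), ("P4", "P5")],
    "in_silico": [("P3", "P2"), ("P5", "P4"), ("P1", "P5")],
}
table = overlap_table(datasets)
for pair, (shared, jaccard) in table.items():
    print(pair, shared, round(jaccard, 2))
```

In this toy example every pair of data sets shares only one interaction, which mirrors the small-overlap situation described above on a tiny scale.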
Motivation
• Modules are an important aspect of networks
• Delineation of functional groups/processes
• Transfer of information
• Structural principle
• Deconvolute the network
Proposal for the algorithms work group
• Identification of protein complexes in interaction networks to obtain shared components
• Modules/complexes have received some attention in the last year
• Current methods do not satisfactorily answer several of the questions posed
  • Coverage
  • Boundaries
  • Shared components
Early work
• Snel et al. (2000) PNAS: Modules of predicted functional interaction networks
• Date et al. (2003) Nat Biot.: Novel functional clusters
• Von Mering et al. (2003) PNAS
Finding protein complexes
• MCODE
  • Bader and Hogue, BMC Bioinformatics (2003)
  • Vertex weighting, dense regions are identified as clusters
• Spirin
  • Spirin and Mirny, PNAS (2003)
  • Monte Carlo simulation
• Clustering purifications
  • Krause et al., Bioinformatics (2003)
• Bayesian approach
  • Sharan et al. (2004)
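To make the idea of vertex weighting concrete, here is a deliberately simplified sketch. It is not the MCODE implementation: it weights each vertex by its plain local clustering coefficient, then grows clusters outward from the highest-weighted seeds, keeping neighbours whose weight stays within a cutoff of the seed weight. The function names, the `cutoff` parameter, and the toy network are assumptions made for illustration.

```python
from itertools import combinations

def build_graph(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    graph = {}
    for a, b in edges:
        if a != b:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

def clustering_coefficient(graph, v):
    """Fraction of possible edges that exist among the neighbours of v."""
    nbrs = graph[v]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return links / (len(nbrs) * (len(nbrs) - 1) / 2)

def dense_clusters(graph, cutoff=0.5):
    """Grow clusters from high-weight seeds (simplified stand-in, not MCODE itself)."""
    weight = {v: clustering_coefficient(graph, v) for v in graph}
    unassigned = set(graph)
    clusters = []
    for seed in sorted(graph, key=lambda v: (-weight[v], v)):
        if seed not in unassigned or weight[seed] == 0.0:
            continue
        threshold = weight[seed] * (1 - cutoff)
        cluster, frontier = {seed}, [seed]
        while frontier:
            v = frontier.pop()
            for n in graph[v]:
                if n in unassigned and n not in cluster and weight[n] >= threshold:
                    cluster.add(n)
                    frontier.append(n)
        unassigned -= cluster
        if len(cluster) > 2:
            clusters.append(cluster)
    return clusters

# Toy network: two fully connected groups of four, joined through protein X
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("E", "F"), ("E", "G"), ("E", "H"), ("F", "G"), ("F", "H"), ("G", "H"),
         ("D", "X"), ("X", "E")]
clusters = dense_clusters(build_graph(edges))
print(clusters)
```

Because every vertex is assigned to at most one cluster, a scheme like this cannot report components shared between complexes, which is one of the limitations raised in this talk.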
Suggestion
• Use weighted edges
  • Repetition
  • Estimate false negatives/false positives
• Use all available experimental information
  • Bait-prey direction
  • Existence of disproving experimental information (such as failure by reversal in Y2H or complex purification)
• Novel algorithms (rather than k-core or cliques)
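One way the suggested edge weights might be derived is sketched below. It assumes each experiment is recorded as a (bait, prey, detected) triple, so that repeated detections, bait-prey direction, and failed attempts can all contribute. The smoothed-fraction score and its `prior_pos`/`prior_neg` parameters are illustrative assumptions, not a method proposed in the slides.

```python
from collections import defaultdict

def edge_weights(observations, prior_pos=1.0, prior_neg=1.0):
    """Score each protein pair from repeated, directed experimental observations.

    observations: iterable of (bait, prey, detected) triples.
    Returns {frozenset({a, b}): (score, reciprocal)}, where score is a smoothed
    fraction of successful detections and reciprocal is True when the pair was
    detected in both bait-prey directions.
    """
    counts = defaultdict(lambda: {"pos": 0, "neg": 0, "directions": set()})
    for bait, prey, detected in observations:
        if bait == prey:
            continue  # self-interactions are ignored in this sketch
        entry = counts[frozenset((bait, prey))]
        if detected:
            entry["pos"] += 1
            entry["directions"].add((bait, prey))
        else:
            entry["neg"] += 1  # disproving evidence, e.g. a failed reversal

    weights = {}
    for pair, c in counts.items():
        total = c["pos"] + c["neg"] + prior_pos + prior_neg
        score = (c["pos"] + prior_pos) / total
        weights[pair] = (score, len(c["directions"]) == 2)
    return weights

# Toy observations with made-up protein names
observations = [
    ("A", "B", True), ("B", "A", True), ("A", "B", True),  # repeated and reciprocal
    ("A", "C", True), ("C", "A", False),                   # failed on reversal
    ("B", "D", False),                                     # never detected
]
weights = edge_weights(observations)
for pair, (score, reciprocal) in weights.items():
    print(sorted(pair), round(score, 2), reciprocal)
```

The resulting scores could serve as edge weights for a clustering method, while the pseudocounts keep a single observation from being scored as certain.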