230 likes | 360 Views
Interactions and more interactions. Rob Russell Cell Networks University of Heidelberg. Aloy & Russell Nature Rev Mol Cell Biol 2006.
E N D
Interactions and more interactions Rob Russell Cell Networks University of Heidelberg
But instead of a cell dominated by randomly colliding individual protein molecules, we now know that nearly every major process in a cell is carried out byassemblies of 10 or more protein molecules Bruce Alberts, Cell 1998
a Native GAL4 UASG GAL1-lacZ Y Y X X b Individual hybrids with GAL4 domains c Interaction between hybrids reconstitutes GAL4 activity GAL4 DNA- binding domain GAL4 DNA- binding domain UASG GAL1-lacZ UASG GAL1-lacZ GAL4 activating region UASG GAL1-lacZ Yeast two-hybrid system Fields & Song, Nature, 340, 245, 1989 Applied to whole Yeast genome Uetz et al, Nature, 403, 623, 2000. Ito et al, PNAS, 98, 4569, 2001.
Interaction discovery I The two-hybrid system Binary interactions: Bait Prey FUS3 DIG2 DIG2 FUS3 LSM2 PAT1 CKS1 CLB1 NPL4 UFD1 NPL4 CDC48 NPL4 FUC1 NPL4 SUA7 . . . . . . x1000s Uetz et al, Nature, 2000. (Yeast) Ito et al, PNAS, 2001. (Yeast) Rain et al, Nature, 2002. (H.pylori) Giot et al, Science, 2003 (D. melanogaster) Li et al, Science, 2004 (C. elegans)
Gal-4 (C) (hypothetical) Native GAL4 CDC28 Y X Cyclin A c Interaction between hybrids reconstitutes GAL4 activity S12 CKS GAL4 DNA- binding domain Gal-4 (N) L22 UASG GAL1-lacZ The system works, but how?
Two datasets in Yeast See: Ito et al, PNAS, 2001 (comparing to Uetzet al, Nature, 2000)
Complexes: Bait Co-purification partners FUS3 DIG2 DIG1 DIG3 DIG2 FUS3 DIG2 NPL4 UFD1 CDC48 FUC1… (Etc.) x1000s Gavin et al, Nature, 2002. (Yeast) Ho et al, Nature, 180, 2002. (Yeast) Interaction discovery II Affinity purification (e.g. TAP/MS)
Trying to define binary interactions from purification data Reality Purifications only report a collection of proteins and don’t provide any information about precisely who interacts with whom. There are thus two models for representing binary interactions from complexes, neither of which are real. Purification Spoke Matrix Hakes et al, Comp Funct Genomics, 2006
Different worlds Comparing interactions to known 3D structures shows that original yeast two-hybrid datasets contain more transient interactions, compared to affinity purification datasets that contain more stable complexes (e.g. of 25 Uetz et al interactions with structures, 23 are transient, 2 are dedicated or stable) Aloy & Russell, Trends Biochem Sci, 2003
Error rates in interaction discovery False negatives: interactions known to occur that are missed by a screen - To asses this one needs a reference set of positives (i.e. known interactions) among a set of proteins being screened. The fraction of these missed is the false-negative rate. Relatively simple - normally one has a set of previously determined interactions or “gold standard” False positives: interactions reported by a screen that are incorrect - To assess this one needs a set of interactions that are known not to occur that are seen in a screen. Very difficult to obtain – how can you know that two proteins definitely do not interact? - tricks include taking pairs of proteins presumed to never see each other (i.e. different cellular compartment, etc.) Von Mering et al, Nature, 2002
Error rates in interaction discovery: the old view Von Mering et al, Nature, 2002
Error rates in interaction discovery: the new view Yu et al, Nature, 2002
Sociological bias affects the perceived performance Interactions determined on a protein by protein basis are focused around what the investigator wants to study, and thus biased towards particular areas of biology that are hot. High-throughput techniques are used precisely to find new interactions. Thus using the previously determined networks as a “gold standard” is likely to be unfair. Braun et al, Nature Methods, 2008
Interaction data: predictions I Groups of proteins entirely absent in one or more organisms among a closely related set are often functionally/physically associated Proteins in the same bacterial operon are typically functionally associated, and often physically interacting. Proteins that are separate in some organisms and fused in others are likely interacting physically. Aloy & Russell, Nature Rev Mol Cell Biol2006
Interaction data: predictions II Pairs of proteins containing a pair of domains often seen in interacting proteins can be used to infer interactions in proteins where interactions have not been observed. Pairs of proteins homologous to pairs of proteins seen to interact in known 3D structures can interact in the same way. The presence of a linear motif can indicate interactions with proteins known to bind this motif.. Aloy & Russell, Nature Rev Mol Cell Biol 2006
Interaction databases Resources are very different in appearance and content Efforts are underway to make a unified search/view, but not complete Thus one needs currently to look at several sites to check if an interaction is known Some are content (e.g. IntAct, MINT) others are processed and augmented (e.g. STRING) with additional predicted/inferred interactions
Grb-2 Sos-1 Ga/q Node RGS-4 Edge Node RGS-3 Interaction networks
Biological interaction networks • Nodes: • Proteins • Genes • Chemicals • Effects(?) • Edges: • Physical interaction (e.g. yeast two-hybrid) • Co-expression (e.g. microarrays) • Same operon • Regulation of gene expression (protein to gene) • Catalysis (e.g. metabolic networks) Node Node Edge
Jeong et al, Nature, 2001. Interaction networks Biological networks tend to be scale free: most nodes (e.g. proteins) are connected to only a few others with a handful of “hubs” making manymore interactions. They are also “small-world” in that any pair of nodes tends to be connected via a relatively small number of intermediate nodes.
“Hubs” in networks Hubs are more likely to be lethal when deleted Jeong et al, Nature, 2001 Hubs are more likely to be disordered. Haynes et al, PLoS Comp Biol, 2006
USP7 MDM2 CYCLIN Linear motifs in p53 15:DNA-PK,RSK2,ATM P 18:CK1s P NES 20:CHK2 P 9:Unknown Tetramerization domain (323-356) P 386 S DNA binding domain (95-289) 33:GSK-3s,CDK7,CDKs 55:MAPKs P P P P P P 215:AuroraA 37:DNA-PK/ATM P 371,376,378:CDK7 46:HIPK2 P P P 315:AuroraA,CDKs IUPred disorder prediction 392:CDK2s,CDK7,EIF2AK2 Russell & Gibson, FEBS Lett. 2008