The Central Dogma-omics: DNA (Genomics), RNA (Genomics/Transcriptomics), Protein (Proteomics), Metabolites (Metabolomics)
Protein Machines: The Polyadenylation Machinery, The Proteasome. Key Concept: Biochemical functions are carried out by multi-protein machines. Key Concept: A protein’s function can be inferred from its binding partners. Key Concept: Knowledge of a machine’s components is required to understand how it works and how it is regulated.
Protein Machines Interact with Each Other in Higher-Order Networks. Key Concept: Highly clustered areas typically serve the same biological function.
Understanding the Network May Give Insights into Emergent Behaviors -Homeostasis -Robustness -Periodicity -Morphogenesis -Tumorigenesis Key Concept: Complex phenotypes can be understood in a network context
Proteins Are Organized in a “Small World” Network Key Concept: The proteome is HIGHLY Networked
The Small World Hypothesis: Six Degrees of Separation. Stanley Milgram study in 1967 -put ads in newspapers in Nebraska and Kansas asking for volunteers for an experiment. The volunteers were asked to contact a divinity student in Boston by going through people that they knew on a first-name basis, who would then contact their friends, and so on. -the number of people (degrees) between the volunteers and the target ranged between 2 and 10, with the mean being 6.
Properties of Small World Networks: -highly clustered: “my friends are also friends” -most nodes are not connected: “most people are strangers” -presence of hubs (nodes with a lot of connections): “Facebook Whales” -can find a short path between any two nodes: “Two strangers meet and realize they know some of the same people.” The length of this path is often referred to as the degree of separation -network should be resistant to perturbation: “Life goes on”
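Two of the properties listed above, clustering and short paths, can be measured directly on a graph. The following is a minimal sketch on a hypothetical toy network (the node names and edges are invented for illustration, not taken from the slides): it computes the clustering coefficient of a node and the number of hops between two nodes using a breadth-first search.

```python
from collections import deque

def shortest_path_lengths(graph, source):
    """Breadth-first search: number of hops from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def clustering_coefficient(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if neighbors[j] in graph[neighbors[i]])
    return 2.0 * links / (k * (k - 1))

# Hypothetical toy network: two tight clusters joined through a bridge node "H".
toy_graph = {
    "A": {"B", "C", "H"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "H": {"A", "X"},
    "X": {"H", "Y", "Z"},
    "Y": {"X", "Z"},
    "Z": {"X", "Y"},
}

hops_b_to_y = shortest_path_lengths(toy_graph, "B")["Y"]   # degree of separation
cluster_b = clustering_coefficient(toy_graph, "B")         # "my friends are also friends"
```

Here "B" sits in a fully connected cluster, yet it is only a few hops from "Y" in the other cluster, which is the combination the small world description refers to.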
Distribution of Connections [Plot: number of nodes with k links (y-axis) versus number of links, k (x-axis); 80/20 Law]
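The plotted quantity, the number of nodes with k links, is simple to tabulate from an adjacency list. A minimal sketch follows, assuming the network is stored as a dictionary of neighbor sets; the star-shaped example graph is hypothetical.

```python
from collections import Counter

def degree_distribution(graph):
    """Map each link count k to the number of nodes that have exactly k links."""
    return Counter(len(neighbors) for neighbors in graph.values())

# Hypothetical example: one hub linked to four otherwise unconnected nodes.
star_graph = {
    "hub": {"n1", "n2", "n3", "n4"},
    "n1": {"hub"},
    "n2": {"hub"},
    "n3": {"hub"},
    "n4": {"hub"},
}

distribution = degree_distribution(star_graph)
```

In this toy case most nodes have a single link and one node holds all of them, a small-scale picture of a distribution dominated by a few hubs.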
Shortest and Longest Pathways
Ten Best Centers: 1. CLU1 1.843; 2. CDC33 1.867; 3. TIF2 1.875; 4. MDH1 1.898; 5. SRP1 1.912; 6. YBL004W 1.914; 7. RPT3 1.914; 8. HAS1 1.914; 9. YGR090W 1.917; 10. PFK1 1.918
Ten Worst Centers: CAC2 3.803; PSR1 3.838; RAM2 3.840; RAM1 3.840; ORC2 3.863; UBA3 3.902; MAK10 3.975; YNL056W 4.003; YNR046W 4.089; VPS4 4.433
Median Degree of Separation: 2.38
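The slide does not say how the center scores were computed. One plausible reading, stated here as an assumption, is that each score is the mean shortest-path length from that protein to every other protein, so a lower score marks a better center. A minimal sketch of that ranking on a hypothetical five-node chain:

```python
from collections import deque

def mean_path_length(graph, source):
    """Mean number of hops from source to every other reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    others = [d for node, d in dist.items() if node != source]
    return sum(others) / len(others)

def rank_centers(graph):
    """Sort nodes from best center (lowest mean path length) to worst."""
    scores = [(node, mean_path_length(graph, node)) for node in graph]
    return sorted(scores, key=lambda item: (item[1], item[0]))

# Hypothetical chain A-B-C-D-E; the middle node should be the best center.
chain = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

ranked = rank_centers(chain)
```

This sketch assumes a connected network; nodes that cannot be reached are simply left out of the average.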
Is S. cerevisiae Robust? -Environmentally Robust -Robust to temperature (4-40 °C) -Robust to Nutrient Sources -Robust to Starvation -Robust to Osmolarity (0-1 M NaCl) -Is it Robust to Genetic Perturbation (mutation)? -The S. cerevisiae Genome Deletion Project has deleted 95% of all S. cerevisiae genes -18.7% of genes are essential -in a typical small world network you can lose ~20% of all nodes before the network crashes.
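Robustness to node loss can be explored by deleting nodes from a graph and checking how much of it stays connected. The sketch below is illustrative only: the network is a hypothetical toy, and "size of the largest connected component" is one assumed way to quantify whether the network has crashed.

```python
def largest_component_size(graph, removed=()):
    """Size of the largest connected group of nodes after deleting `removed`."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in graph:
        if start in removed or start in seen:
            continue
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in removed and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best

# Hypothetical network: one hub holding four peripheral nodes together.
hub_graph = {
    "hub": {"p1", "p2", "p3", "p4"},
    "p1": {"hub", "p2"},
    "p2": {"hub", "p1"},
    "p3": {"hub"},
    "p4": {"hub"},
}

intact = largest_component_size(hub_graph)
without_leaf = largest_component_size(hub_graph, removed=["p3"])
without_hub = largest_component_size(hub_graph, removed=["hub"])
```

Deleting a peripheral node barely changes the network, while deleting the hub fragments it.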
Is there any biology behind the network hypothesis? (Ten best and ten worst centers as listed above; Median Degree of Separation: 2.38.) Essential ORF deletions are only available as heterozygous diploids, while non-essential ORF deletions are available as haploids, homozygous diploids and heterozygous diploids. Key Concept: Connectivity and essentiality are correlated.
Evolutionary Effects of Connectedness -Connected genes are non-randomly distributed in the genome -Connected genes are less likely to undergo duplication -Connected genes are less likely to have close homologs -Connected genes are less likely to have introns
Is S. cerevisiae Robust? -Environmentally Robust -Robust to temperature (4-40 °C) -Robust to Nutrient Sources -Robust to Starvation -Robust to Osmolarity (0-1 M NaCl) -Is it Robust to Genetic Perturbation (mutation)? -The S. cerevisiae Genome Deletion Project has deleted 95% of all S. cerevisiae genes -18.7% of genes are essential Is Cancer a Robust Network? -Environmentally Robust -It lives under a constant state of genomic stress
Summary -Proteins are organized in functional units (machines) -these machines do virtually all the work in the cell -understanding the components of a machine is critical for functionally annotating the genome -understanding the components of a machine is critical for determining how a machine is regulated -the effects of mutation are great at this level -Protein Machines are organized into higher-order Networks -the Network architecture has left its imprint on evolution -the Network is likely to be rewired under pathological conditions -especially in the case of cancer -understanding the Network is important for understanding the complex behavior of the system Key Concept: High-throughput mapping of protein:protein interactions will provide important insights into human biology
Understanding the Network Requires a lot of Information -Direction of Information -Sign -Magnitude -Timing
Approaches for Mapping Protein:Protein Interactions -Mapping by Inference: -if two proteins interact in one organism, then they are inferred to interact in other organisms. -can be extended to domains/motifs as well -if two proteins are coregulated on microarrays, they are likely to interact -Direct Mapping: -In vitro binding experiment -Genetic Screen/Trap -Yeast 2-hybrid assay -Affinity Co-purifications -IP:Western blot -IP:Mass Spectrometry
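Mapping by inference across organisms can be sketched as a lookup: an interaction known in one organism is transferred to another whenever both partners have a counterpart there. The protein identifiers and the ortholog table below are hypothetical placeholders, and the sketch assumes a simple one-to-one ortholog mapping.

```python
def infer_interactions(known_pairs, orthologs):
    """Transfer interactions to a second organism when both partners map across."""
    inferred = set()
    for protein_a, protein_b in known_pairs:
        if protein_a in orthologs and protein_b in orthologs:
            inferred.add(frozenset((orthologs[protein_a], orthologs[protein_b])))
    return inferred

# Hypothetical interactions observed in organism 1.
known_pairs = [("yeastA", "yeastB"), ("yeastB", "yeastC")]

# Hypothetical one-to-one ortholog table (organism 1 -> organism 2).
# "yeastC" has no counterpart, so its interaction cannot be transferred.
orthologs = {"yeastA": "humanA", "yeastB": "humanB"}

predicted = infer_interactions(known_pairs, orthologs)
```

Predictions made this way are hypotheses that still need confirmation by one of the direct mapping methods.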
Interactomics by Genetic Screens Uetz et al 2001 Key Concept: Genetic Complementation allows the identification of direct (binary) interactions.
Interactomics by Genetic Screens Advantages of Genetic Complementation: -can do genome-scale screening -quick -cheap -adaptable -works best when the screen is based on selection Problems of Genetic Complementation: -sensitive to dynamic range -protein interaction may be incompatible with the complementation scheme -cannot perturb the system -more false positives than true positives Key Concept: No matter how good something is…there are always problems.
Affinity Governs the Formation of Protein Complexes. Affinity is determined by the shapes of the proteins and how well they fit together. -hydrophobic interactions -ionic interactions -hydrogen bonding Affinity is usually expressed as $K_d$, the concentration of free partner $[B]$ at which the concentrations of free $[A]$ and complexed $[AB]$ are equal. Implicitly, there is usually a mixture of free and complexed components, and this ratio is concentration dependent. For $A + B \rightleftharpoons AB$: $K_d = \frac{[A]\,[B]}{[AB]}$
Affinity Governs the Formation of Protein Complexes. A weak interaction may only form if the concentration is high enough.
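The concentration dependence can be made concrete with a small calculation. This is a sketch under stated assumptions: simple 1:1 binding, with the free concentration of one partner (called B here, a label introduced for illustration) taken as known. Under those assumptions the fraction of the other partner that is in complex is [B] / (Kd + [B]).

```python
def fraction_bound(free_partner_conc, kd):
    """Fraction of protein A in complex for 1:1 binding at a given free [B].

    Both arguments must use the same concentration units (for example, molar).
    """
    return free_partner_conc / (kd + free_partner_conc)

# Hypothetical numbers: a tight interaction (Kd = 10 nM) versus a weak one (Kd = 10 uM),
# each evaluated at 100 nM and at 100 uM free partner.
tight_low = fraction_bound(100e-9, 10e-9)
weak_low = fraction_bound(100e-9, 10e-6)
weak_high = fraction_bound(100e-6, 10e-6)
```

At 100 nM the weak interaction is almost entirely unbound, but raising the concentration to 100 µM drives most of it into complex.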
Interactomics by Co-purification. TAP Tagging: Rigaut et al., Nat. Biotech. 1999. Key Concept: Interacting proteins will co-purify
Interactomics by Co-purification Advantages of Co-purification: -proteins isolated from their native source -the system can be perturbed Problems of Co-purification: -sensitive to dynamic range -real interactions may be lost during purification -can be difficult to purify the target protein -no “amplification” -need a way to identify the co-purifying proteins
[Figure: IP on antibody beads followed by direct trypsin digest] Key Concept: Cutting out steps is one of the hallmarks of high-throughput approaches. This increases the throughput and usually also increases the sensitivity.
Affinity Purification - coIP [Figure: IP on antibody beads; trypsin digest directly from the beads]
[Figure: IP on antibody beads with many non-specific (NS) proteins bound] Key Concept: Complex mixtures cannot be manually interpreted. The average protein generates ~50 proteolytic fragments, so you will have 1000s and 1000s to interpret.
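To see where the proteolytic fragments come from, a digest can be simulated in a few lines. The sketch below uses the commonly cited simplified trypsin rule (cleave after K or R unless the next residue is P) and ignores missed cleavages; the input sequence is a made-up example, not a real protein.

```python
def trypsin_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides

# Hypothetical short sequence for illustration.
fragments = trypsin_digest("MKTAYRPLKGG")
```

Running the same function over every protein in an IP shows how quickly the number of peptides to interpret grows.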
Sources of “Non-Specific Binding” -Not enough washing. -Biofluids have a high dynamic range so you must wash away the super abundant stuff to see the less concentrated proteins -Proteins that stick to the beads -Proteins that stick to the antibodies on the beads -Proteins that stick to the wall of the tube -Proteins that stick to your complex of interest -Proteins that are real binders but are biologically irrelevant
Are All Protein Complexes Biologically Relevant? An interaction will be selected for if it is beneficial. An interaction will be selected against if it is detrimental. What happens if the interaction is neither beneficial nor detrimental? What would be the cost of allowing only beneficial interactions?
Comparison of Three Analysis Techniques on Lysates 10-100 Proteins (6 hours) 100-300 Proteins (2 hours) 1000-6000 Proteins (10 hours)
Comparison of Three Analysis Techniques on IPs 53 Proteins (6 hours) 76 Proteins (2 hours) 82 Proteins (10 hours)
Protein ID by Mass Spectrometry ~10,000 MS/MS per hour Key Concept: LC-MS/MS workflows cannot be manually interpreted
Spectra Matched [Figure: acquired spectrum (intensity, 0-100%) multiplied by the theoretical spectrum of y/b ions (1 or 0)]
Compute a Correlation Score [Figure: predicted y/b ions (1 or 0) multiplied by the acquired spectrum intensities (0-100%) gives the matched peaks]
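The idea of multiplying a 1/0 predicted spectrum by the acquired intensities can be sketched as follows. This is a simplified illustration, not the scoring function of any particular search engine: the tolerance, the normalization by total intensity, and the peak values are all assumptions made for the example.

```python
def match_score(acquired_peaks, theoretical_mz, tolerance=0.5):
    """Fraction of acquired intensity that falls on a predicted y/b ion.

    acquired_peaks: list of (m/z, intensity) pairs from the MS/MS spectrum.
    theoretical_mz: predicted fragment m/z values (each counts as 1, all else 0).
    """
    total = sum(intensity for _, intensity in acquired_peaks)
    if total == 0:
        return 0.0
    matched = sum(
        intensity
        for mz, intensity in acquired_peaks
        if any(abs(mz - predicted) <= tolerance for predicted in theoretical_mz)
    )
    return matched / total

# Hypothetical spectrum: two peaks line up with predicted ions, one does not.
acquired_peaks = [(100.0, 100), (200.2, 50), (350.0, 50)]
theoretical_mz = [100.1, 200.0, 300.0]

score = match_score(acquired_peaks, theoretical_mz)
```

Every spectrum gets a score against every candidate peptide, and the highest-scoring candidate is reported whether or not it is correct.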
The Truth about Spectral Matching -Spectral matching produces an “answer” for every spectrum, even those that are artifacts. -Experimental spectra always deviate from theoretical spectra. -A high correlation score is not a guarantee that the match is correct. -A peptide must be in the database in order to be found.