Proteomics, the next step • What does each protein do? • Where is each protein located? • What does each protein interact with, if anything? • What role does it play in the cell or tissue?
Gene Ontology • Biological process • Molecular function • Cellular component • Much like the Enzyme Commission (EC) numbering system, a mechanism for uniformity among disparate systems • http://www.geneontology.org/
Assigning ontology • Functional genomics using mini-Tn • Promoter-less lacZ with selectable markers, an epitope tag, and flanking lox recombination sites • Isolated >11,000 Tn mutants that turn blue during vegetative growth (transcription and translation occur) • In addition to ontology, you also get information on regulation (expression) via the lacZ handle • 1917 ORFs/6358 strains were mutated at least once, many multiple times
Macroarrays • How do you examine that many yeast strains? • Robots spot strains on agar plates, and the plates are incubated under various conditions • Limited by the number of conditions – 20 were used in this study, some quite inventive
Caveats • Different insertions in the same gene can have different effects • 11 distinct insertions were characterized for Imp2 • The insertions had differing effects on glycerol metabolism and cell wall synthesis
Clustering • Each column a condition, each row a mutant strain – which ones behave most similarly • Allows visualization of proteins involved in >1 process or observing a potential role for an uncharacterized ORF – but how do you get ontology?
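As a rough illustration of "which ones behave most similarly", the sketch below compares mutant phenotype profiles (one score per condition) by Pearson correlation and reports the most similar pair. The strain names and scores are hypothetical, and Pearson correlation is only one possible similarity measure; the study's actual clustering method is not given on the slide.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length phenotype profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no information about similarity
    return cov / (sd_x * sd_y)


def most_similar_pair(profiles):
    """Return the pair of strains whose profiles correlate most strongly."""
    return max(
        combinations(sorted(profiles), 2),
        key=lambda pair: pearson(profiles[pair[0]], profiles[pair[1]]),
    )


# Hypothetical data: rows are mutant strains, columns are growth conditions.
profiles = {
    "mutant_A": [1.0, 0.2, 0.1, 0.9],
    "mutant_B": [0.9, 0.1, 0.2, 1.0],
    "mutant_C": [0.1, 1.0, 0.9, 0.2],
}

print(most_similar_pair(profiles))
```

A full clustering would repeat this kind of comparison over all pairs and merge the closest groups step by step; the pairwise score is the building block.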
Everything you want • Phenotypic analysis will divine with certainty at least one biological process in which the gene is involved • Molecular function would have to be assessed biochemically • Use the epitope tag as a localization tool
Bar coding genes • Delete ORFs by homologous recombination and replace each with a selectable marker and two 20 bp DNA sequences (UPTAG and DOWNTAG) • Each TAG (barcode) is unique to an ORF • Can PCR amplify all TAGS using the same set of primers • Construct a DNA barcode microarray • Each mutant strain has two, and only two, spots that it will hybridize to
Contamination! • With this strategy, the investigators mixed 558 strains in the same flask and removed samples at different time points (PCR products from the first time point were labeled red; those from the second, 6 hrs later, were labeled green) • Use the microarray to determine which strains compete better
So… • Bar code methodology shows which proteins provide a selective advantage in a mixed population of cells • Two spots of different sequences provide an internal control for the experiment
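One way to read the two-color barcode array is as a log ratio of the later (green) signal to the earlier (red) signal for each strain, averaged over its UPTAG and DOWNTAG spots. The sketch below assumes that interpretation; the strain names and intensities are hypothetical, and real data would need background subtraction and normalization first.

```python
from math import log2


def fitness_scores(signals):
    """Average log2(green/red) over each strain's two barcode spots.

    signals maps strain -> {"UPTAG": (red, green), "DOWNTAG": (red, green)},
    where red is the first time point and green the later one.
    A positive score means the strain gained ground in the mixed culture.
    """
    scores = {}
    for strain, tags in signals.items():
        ratios = [log2(green / red) for red, green in tags.values()]
        scores[strain] = sum(ratios) / len(ratios)
    return scores


# Hypothetical spot intensities for two deletion strains.
signals = {
    "strain_1": {"UPTAG": (100.0, 200.0), "DOWNTAG": (100.0, 200.0)},
    "strain_2": {"UPTAG": (400.0, 100.0), "DOWNTAG": (400.0, 100.0)},
}

for strain, score in fitness_scores(signals).items():
    print(strain, score)
```

Because each strain has two independent spots, a large disagreement between the UPTAG and DOWNTAG ratios is a hint that one of the measurements is unreliable.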
How do you know you’ve sampled enough cells? • Binomial probability distribution • Binary outcome, either get a 1 or 0 • If p is the probability of getting a 1 (in a single trial), (1-p) is the probability of getting a 0, then the probability that k out of N tries gives a 1 is: • $P(k \text{ ones out of } N) = \binom{N}{k}\, p^k (1-p)^{N-k}$
$p^k (1-p)^{N-k}$ • This term is the probability of any one particular sequence of N trials with k successes (“1”s) and N-k failures (“0”s)
$\binom{N}{k}$ • This term counts the number of different sequences of N trials that contain exactly k “1”s • $\binom{N}{k}$ denotes the number of ways of choosing k objects from N, which is $N!/(k!\,(N-k)!)$ • This is also known as the binomial coefficient • Work through math minute 6.1
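The binomial formula can be checked numerically with a few lines of Python. The second function applies it to the sampling question under a simplifying assumption that is not from the slides: every strain is equally abundant, so each sampled cell has probability p = 1/558 of belonging to a given strain.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k ones in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least_one(n, p):
    """Probability of at least one 1 in n trials, i.e. 1 - P(0 ones)."""
    return 1 - binomial_pmf(0, n, p)


# Assumption for illustration: 558 equally abundant strains in the flask.
p_strain = 1 / 558

for n_cells in (558, 2_000, 10_000):
    chance = prob_at_least_one(n_cells, p_strain)
    print(n_cells, round(chance, 4))
```

Raising the number of sampled cells pushes the chance of seeing every strain at least once toward 1, which is the sense in which a sample can be "enough".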
Structural Genomics • Inferring function from structure (Archaea) • Aquaporin structure/function • Co-crystals as binding assays • Prion protein in yeast – Sup35 • Overproduction of prion protein in yeast leads to infectious particle
Protein interaction networks • Identified by comparative genomics analyses – neighbors interact • Yeast two-hybrid system • Various repositories for interaction data: • BIND www.bind.ca • DIP http://dip.doe-mbi.ucla.edu/hold/
Deciphering protein network graphs • Nodes, edges, degrees • Example in MM 6.2 • Calculate the mean and standard deviation of the degrees in any given network • If the degree of a node is greater than the mean degree + 2 S.D., it has “high” connectivity in that network
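The "mean degree + 2 S.D." rule can be sketched directly from an edge list. The protein names below are hypothetical, and the sketch uses the population standard deviation; the slide does not say whether the sample or population form is intended.

```python
from statistics import mean, pstdev


def degrees(edges):
    """Count how many edges touch each node in an undirected network."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


def high_connectivity_nodes(edges):
    """Nodes whose degree exceeds mean degree + 2 standard deviations."""
    deg = degrees(edges)
    values = list(deg.values())
    cutoff = mean(values) + 2 * pstdev(values)  # population S.D. (assumption)
    return sorted(node for node, d in deg.items() if d > cutoff)


# Hypothetical network: one hub protein interacting with ten partners.
edges = [("hub", f"p{i}") for i in range(10)]

print(degrees(edges)["hub"])
print(high_connectivity_nodes(edges))
```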
Schwikowski paper • www.uwp.edu/~barber/bioinformatics/benno.pdf • Book web-site • http://occawlonline.pearsoned.com/bookbind/pubbooks/bc_mcampbell_genomics_1/chapter1/deluxe.html • Discovery questions 44-49 for next week