11/4/05 Protein Structure & Function
D Dobbs ISU - BCB 444/544X: Protein Structure & Function
Announcements
Exam 2
• Has been graded
• Will be returned at end of class today
Grade statistics:
• 444 average = 81/100
• 544 average = 100/118
Questions?
Announcements
BCB 544 Projects - Important Dates:
• Nov 2 Wed noon - Project proposals due to David/Drena
• Nov 4 Fri PM - Approvals/responses & tentative presentation schedule to students
• Dec 2 Fri noon - Written project reports due
• Dec 5, 7, 8, 9 class/lab - Oral presentations (20')
• (Dec 15 Thurs = Final Exam)
Bioinformatics Seminars
• Nov 4 Fri 12:10 PM - BCB Faculty Seminar in E164 Lago
  How to do sequence alignments on parallel computers
  Srinivas Aluru, ECpE & Chair, BCB Program
  http://www.bcb.iastate.edu/courses/BCB691F2005.html
• Next week: Nov 10 Thurs 3:40 PM - ComS Seminar in 223 Atanasoff
  Computational Epidemiology
  Armin R. Mikler, Univ. North Texas
  http://www.cs.iastate.edu/~colloq/#t3
Bioinformatics Seminars
CORRECTION: Week after next - Baker Center/BCB Seminars (seminar abstracts available at above link):
• Nov 14 Mon 1:10 PM - Doug Brutlag, Stanford
  Discovering transcription factor binding sites
• Nov 15 Tues 1:10 PM - Ilya Vakser, Univ Kansas
  Modeling protein-protein interactions
Both seminars will be in Howe Hall Auditorium.
RNA Structure & Function/Prediction; Protein Structure & Function
• Mon - Review: promoter prediction; RNA structure & function
• Wed - RNA structure prediction: secondary (2°) & tertiary (3°) structure prediction; miRNA & target prediction - Lab 10
• Fri - A few more words re: Algorithms; Protein structure & function
Reading Assignment (for Fri/Mon)
• Mount, Bioinformatics
  • Chp 10: Protein classification & structure prediction
    http://www.bioinformaticsonline.org/ch/ch10/index.html
  • pp. 409-491
  • Check errata: http://www.bioinformaticsonline.org/help/errata2.html
• Other? That should be plenty…
Review last lecture: RNA Structure Prediction
miRNA and RNAi pathways (C Burge 2005)
[Figure: two parallel pathways, labels only.
 microRNA pathway: MicroRNA primary transcript, Drosha, precursor, Dicer, miRNA, RISC, target mRNA; outcome: "translational repression" and/or mRNA degradation.
 RNAi pathway: exogenous dsRNA, transposon, etc., Dicer, siRNAs, RISC; outcome: mRNA cleavage, degradation.]
Computational Prediction of MicroRNA Genes & Targets
miRNA challenges for computational biology:
• Find the genes encoding microRNAs
• Predict their regulatory targets
• Integrate miRNAs into gene regulatory pathways & networks
Need to modify the traditional paradigm of "transcriptional control" primarily by protein-DNA interactions to include miRNA regulatory mechanisms!
(C Burge 2005)
RNA structure prediction strategies
Secondary structure prediction
1) Energy minimization (thermodynamics)
2) Comparative sequence analysis (co-variation)
3) Combined experimental & computational
Secondary structure prediction strategies
1) Energy minimization (thermodynamics)
• Algorithm: Dynamic programming to find high probability pairs (also, some genetic algorithms)
• Software:
  - Mfold - Zuker
  - Vienna RNA Package - Hofacker
  - RNAstructure - Mathews
  - Sfold - Ding & Lawrence
(R Knight 2005)
Secondary structure prediction strategies
2) Comparative sequence analysis (co-variation)
• Algorithms:
  - Mutual information
  - Stochastic context-free grammars
• Software:
  - ConStruct
  - Alifold
  - Pfold
  - FOLDALIGN
  - Dynalign
(R Knight 2005)
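The mutual-information idea behind co-variation analysis can be sketched in a few lines. This is a minimal illustration, not the method used by any of the packages listed above: it assumes each alignment column is given as a string with one character per sequence, and the function name `mutual_information` and the toy columns are made up for this example.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Each column is a string with one character per aligned sequence.
    High values suggest the two positions vary together, as expected
    for positions that form a base pair.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must cover the same sequences")
    n = len(col_i)
    freq_i = Counter(col_i)                 # single-column counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))    # joint counts
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy columns (hypothetical data): the first pair co-varies perfectly,
# the second pair is fully conserved and so carries no co-variation signal.
covarying = mutual_information("GGCCAAUU", "CCGGUUAA")
conserved = mutual_information("GGGGGGGG", "CCCCCCCC")
```

Note the second case: a perfectly conserved base pair scores zero, which is one reason co-variation methods need alignments with enough sequence diversity.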
Secondary structure prediction strategies
3) Combined experimental & computational
• Experiment: Map single-stranded vs double-stranded regions in folded RNA
• How?
  - Enzymes: S1 nuclease, T1 RNase
  - Chemicals: kethoxal, DMS
(R Knight 2005)
Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
• Molecular genetic analyses
1) Energy minimization method
What are the assumptions?
• Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
• Gibbs free energy = ΔG in kcal/mol at 37 °C = equilibrium stability of structure
• Lower (negative) values are more favorable
Is this assumption valid?
• In vivo? This may not hold, but we don't really know
Free energy minimization: What are the rules? (C Staben 2005)
[Figure: two stacked base-pair examples.
 A=U stacked on A=U: ΔG = -1.2 kcal/mole
 A=U stacked on U=A: ΔG = -1.6 kcal/mole]
Why 1.2 vs 1.6? What gives here?
Energy minimization calculations: Base-stacking is critical - Tinoco et al.
(C Staben 2005)
Nearest-neighbor parameters
Most methods for free energy minimization use nearest-neighbor parameters (derived from experiment) to predict the stability of an RNA secondary structure (in terms of ΔG at 37 °C), and most available software packages use the same set of parameters: Mathews, Sabina, Zuker & Turner, 1999
Energy minimization - calculations:
Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for:
• helical stacking (sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)
Fig 6.3 Baxevanis & Ouellette 2005
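The "sum of incremental energy terms" can be illustrated with a toy calculation. This is only a sketch: the two stacking values are the ones quoted on the earlier base-stacking slide, the loop-initiation penalty is a made-up placeholder, and the names `STACK`, `HAIRPIN_LOOP_INITIATION`, `helix_energy` and `hairpin_energy` are hypothetical. Real packages use the full experimentally derived parameter tables instead.

```python
# Stacking increments in kcal/mol, keyed by (base pair, next base pair).
# Only the two values quoted on the base-stacking slide are included.
STACK = {
    ("AU", "AU"): -1.2,
    ("AU", "UA"): -1.6,
}

# Hypothetical placeholder for a loop-initiation penalty (unfavorable, > 0).
HAIRPIN_LOOP_INITIATION = 4.0


def helix_energy(pairs):
    """Sum the stacking increment of each pair of adjacent base pairs."""
    return sum(STACK[(a, b)] for a, b in zip(pairs, pairs[1:]))


def hairpin_energy(pairs):
    """Toy total: favorable stacking terms plus one loop-initiation term."""
    return helix_energy(pairs) + HAIRPIN_LOOP_INITIATION


# A three-base-pair helix closed by a hairpin loop (toy example).
total = hairpin_energy(["AU", "AU", "UA"])
```

With these illustrative numbers the short helix does not pay for its loop (the total is positive), which shows why several stacked pairs are usually needed before a hairpin becomes favorable.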
But how many possible conformations for a single RNA molecule?
Huge number: Zuker estimates 1.8^N possible secondary structures for a sequence of N nucleotides
• for 100 nts (small RNA…) = 3 × 10^25 structures!
Solution? Not exhaustive enumeration…
• Dynamic programming: O(N^3) in time, O(N^2) in space/storage iff pseudoknots excluded; otherwise O(N^6) in time, O(N^4) in space
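A quick check of the arithmetic makes the contrast concrete. The sketch below simply evaluates the 1.8^N estimate and the N^3 term of the dynamic programming bound for N = 100; the function name and variable names are made up for this example, and constant factors in the O(N^3) bound are ignored.

```python
def estimated_structures(n, base=1.8):
    """Estimated number of secondary structures for n nucleotides (1.8^n)."""
    return base ** n


n = 100
structure_count = estimated_structures(n)   # on the order of 10^25
dp_steps = n ** 3                           # cubic term only, constants ignored

print(f"structures to enumerate: {structure_count:.1e}")
print(f"dynamic programming steps (order of magnitude): {dp_steps:.1e}")
```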
Algorithms based on energy minimization
For an outline of the algorithm used in Mfold, including a description of the dynamic programming recursion, please visit Michael Zuker's lecture: http://www.bioinfo.rpi.edu/~zukerm/lectures/RNAfold-html
From this site, you may also download his lecture as either a PDF or PS file.
Hmmm, something based on this might make an interesting "Final Exam" question: how could one apply the dynamic programming approaches learned in the first half of the course to the RNA structure prediction problem?
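As one possible answer to that question, here is a minimal sketch of a dynamic programming recursion for RNA folding. It is deliberately NOT the Mfold energy model: it is the simpler Nussinov-style base-pair maximization, which swaps free energies for a count of base pairs but has the same O(N^3) time, O(N^2) space shape and also excludes pseudoknots. The function names and the `min_loop` default are assumptions made for this example.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    best[i][j] holds the optimum for the subsequence seq[i..j].
    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    # Fill the table by increasing subsequence length.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]               # case 1: j is unpaired
            for k in range(i, j - min_loop):     # case 2: j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# Toy hairpin: three stacked pairs closing a three-base loop.
hairpin_pairs = max_base_pairs("GGGAAAUCC")
```

Replacing the "+ 1" per pair with stacking and loop energy increments, and maximizing with minimizing, is the step from this toy recursion toward an energy-based one.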
2) Comparative sequence analysis (co-variation)
Two basic approaches:
• Algorithms constrained by initial alignment
  - Much faster, but not as robust as unconstrained
  - Base-pairing probabilities determined by a partition function
• Algorithms not constrained by initial alignment
  - Genetic algorithms often used for finding an alignment & set of structures
RNA secondary structure prediction: Performance?
• How to evaluate?
• Not many experimentally determined structures
  - currently, ~50% are rRNA structures
• So the "Gold Standard" (in absence of tertiary structure):
  - compare the predicted RNA secondary structure with that determined by comparative sequence analysis (!!??) using benchmark datasets
• NOTE: Base pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures! - Gutell, Pace
RNA secondary structure prediction: Performance?
1) Energy minimization (via dynamic programming)
   73% avg. prediction accuracy - single sequence
2) Comparative sequence analysis
   97% avg. prediction accuracy - multiple sequences (e.g., highly conserved rRNAs)
   much lower if sequence conservation is lower &/or fewer sequences are available for alignment
3) Combined - recent developments: combine thermodynamics & co-variation & experimental constraints? IMPROVED RESULTS
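One simple way to score a prediction against a reference structure is to count how many reference base pairs the prediction recovers. The sketch below assumes both structures are written in dot-bracket notation (a convention not used on these slides) and that "accuracy" means the fraction of reference pairs recovered; the benchmark figures quoted above may be defined differently. The function names are made up for this example.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def fraction_recovered(predicted, reference):
    """Fraction of reference base pairs that also appear in the prediction."""
    ref = pairs_from_dot_bracket(reference)
    pred = pairs_from_dot_bracket(predicted)
    return len(ref & pred) / len(ref) if ref else 0.0


# Toy example: the prediction misses the innermost reference pair.
score = fraction_recovered("((.....))", "(((...)))")
```

A fuller evaluation would also report how many predicted pairs are absent from the reference, since a method can recover every true pair while over-predicting.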
RNA structure prediction strategies
Tertiary structure prediction
Requires "craft" & significant user input & insight
• Extensive comparative sequence analysis to predict tertiary contacts (co-variation)
  e.g., MANIP - Westhof
• Use experimental data to constrain model building
  e.g., MC-SYM - Major
• Homology modeling using sequence alignment & reference tertiary structure (not many of these!)
• Low resolution molecular mechanics
  e.g., yammp - Harvey
New Today: Protein Structure & Function
Protein Structure & Function
• Protein structure - primarily determined by sequence
• Protein function - primarily determined by structure
• Globular proteins: compact hydrophobic core & hydrophilic surface
• Membrane proteins: special hydrophobic surfaces
• Folded proteins are only marginally stable
• Some proteins do not assume a stable "fold" until they bind to something = Intrinsically disordered
• Predicting protein structure and function can be very hard & fun!
4 Basic Levels of Protein Structure
Primary & Secondary Structure
• Primary
  - Linear sequence of amino acids
  - Description of covalent bonds linking amino acids
• Secondary
  - Local spatial arrangement of amino acids
  - Description of short-range non-covalent interactions
  - Periodic structural patterns: α-helix, β-sheet
Tertiary & Quaternary Structure
• Tertiary
  - Overall 3-D "fold" of a single polypeptide chain
  - Spatial arrangement of secondary (2°) structural elements; packing of these into compact "domains"
  - Description of long-range non-covalent interactions (plus disulfide bonds)
• Quaternary
  - In proteins with > 1 polypeptide chain, spatial arrangement of subunits
"Additional" Structural Levels
• Super-secondary elements
• Motifs
• Domains
• Foldons