Proteins
• Proteins control the biological functions of cellular organisms
  • e.g. metabolism, blood clotting, immune system
• Building blocks – amino acids
  • amino group (NH2), carboxyl group (COOH), side chain R
Protein sequence and structure
• The protein alphabet consists of 20 amino acids
• Sequence view: ADKELKFLVVDDFSTMRRIV.....
• Structure view: [figure]
Protein structure and function
• Function is determined by 3D shape/structure
  • Thrombin: facilitates blood clotting
  • Hirudin: anticoagulant (blocks the active site of thrombin)
Protein structure and function
• Structure preserves evolutionary information better than sequence does
• Myoglobin family
  • 1MBC: VLSEGEWQLVLHVWAKVE.....
  • 2FAL: XSLSAAEADLAGKSWAPV.....
Structural Bioinformatics
• Pairwise alignment algorithms
  • DALI (Holm and Sander, Journal of Molecular Biology, 1993)
  • LOCK (Singh and Brutlag, ISMB, 1997)
  • CE (Shindyalov and Bourne, Protein Engineering, 1998)
  • SSM (Krissinel and Henrick, Acta Cryst., 2004)
  • Ye et al., JBCB, 2004
• Multiple alignment algorithms
  • Gerstein and Levitt, ISMB, 1996: iterative dynamic programming
  • SSAP (Orengo and Taylor, Methods Enzymol., 1996): two-level DP
  • Leibowitz et al., ISMB, 1999: geometric hashing
  • CE-MC (Guda et al., PSB, 2001)
  • MAMMOTH (Lupyan et al., Bioinformatics, 2005)
  • MAPSCI (Ye et al., WABI, 2006)
Structural Bioinformatics
• Homology detection
  • Hidden Markov models (Jaakkola et al., JCB, 2000)
  • Spectrum, Mismatch kernel (Leslie et al., Bioinformatics, 2002)
  • Structure kernel (Qiu et al., Bioinformatics, 2007)
• Protein structure prediction
  • Jones and Hadley, Bioinformatics: Sequence, structure and databanks, 2000
  • FUGUE (Shi et al., J. Mol. Biol., 2001)
  • SCOP (Andreeva et al., Nucleic Acids Res., 2004)
• Protein docking
  • Shoichet et al., J. Comput. Chem., 1992
  • Choi et al., WABI, 2004
  • Wang et al., PSB, 2005
  • Sousa et al., Proteins, 2006
Pairwise Structure Alignment
• Given two proteins represented by their Cα atoms (backbone)
  • find a 3D transformation that superimposes a large number of the Cα atoms
  • ensure that the overall distance between matched pairs is as small as possible
• Trade-off between the number of matches and the total distance between matched pairs
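To make the two competing objectives concrete, here is a minimal sketch that scores one candidate superposition by counting matched Cα pairs and computing their RMSD. The function names, the 3.0 Å cutoff, and the toy coordinates are assumptions chosen for illustration. Finding the transformation itself is not shown.

```python
import math

def apply_transform(point, rotation, translation):
    """Apply a 3x3 rotation (nested lists) and then a translation to a 3D point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def score_superposition(coords_a, coords_b, pairs, rotation, translation, cutoff=3.0):
    """Return (number of matched pairs, RMSD over those pairs).

    `pairs` lists candidate correspondences (i, j) between Cα atoms of A and B.
    A pair counts as matched when the atoms lie within `cutoff` angstroms
    after protein B is transformed. The cutoff value is an assumed parameter.
    """
    kept = []
    for i, j in pairs:
        moved = apply_transform(coords_b[j], rotation, translation)
        d = math.dist(coords_a[i], moved)
        if d <= cutoff:
            kept.append(d)
    if not kept:
        return 0, float("inf")
    rmsd = math.sqrt(sum(d * d for d in kept) / len(kept))
    return len(kept), rmsd

# Toy data: B is A rotated by 90 degrees about z and shifted by (1, 1, 0),
# except for the last atom, which is displaced by a further 5 angstroms.
coords_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
coords_b = [(1.0, 1.0, 0.0), (1.0, 4.8, 0.0), (1.0, 8.6, 0.0), (6.0, 12.4, 0.0)]
rotation = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # undoes the 90 degree turn
translation = (-1.0, 1.0, 0.0)
pairs = [(0, 0), (1, 1), (2, 2), (3, 3)]

n_matched, rmsd = score_superposition(coords_a, coords_b, pairs, rotation, translation)
print(n_matched, rmsd)
```

Raising the cutoff would admit the fourth pair and increase the match count, but at the cost of a larger RMSD, which is the trade-off described above.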
Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• Uses an orientation-independent representation of proteins, based on the fact that consecutive Cα atoms are ~4 Å apart
Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• The protein is represented as a sequence of angle triplets {(α1, β1, γ1), (α2, β2, γ2), …, (αn, βn, γn)}
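The exact definition of the triplet (α, β, γ) is not given here, so the sketch below does not reproduce it. As a stand-in, it computes a different angle-based descriptor, the virtual bond angle and signed dihedral over consecutive Cα atoms, to illustrate why such representations do not depend on how the protein is oriented in space. All function names and coordinates are hypothetical.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def bond_angle(p0, p1, p2):
    """Angle at p1 (degrees) between the directions to p0 and to p2."""
    u, v = sub(p0, p1), sub(p2, p1)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral (degrees) defined by four consecutive points."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)
    return math.degrees(math.atan2(norm(b1) * dot(b0, n2), dot(n1, n2)))

def angle_descriptor(ca_coords):
    """One (bond angle, dihedral) pair per window of four consecutive Cα atoms."""
    return [
        (bond_angle(*ca_coords[i:i + 3]), dihedral(*ca_coords[i:i + 4]))
        for i in range(len(ca_coords) - 3)
    ]

def rotate_z_and_shift(p, degrees, shift):
    """Rigid motion used only to test invariance: rotate about z, then translate."""
    t = math.radians(degrees)
    x = p[0] * math.cos(t) - p[1] * math.sin(t) + shift[0]
    y = p[0] * math.sin(t) + p[1] * math.cos(t) + shift[1]
    return (x, y, p[2] + shift[2])

# Hypothetical Cα trace (angstroms) and a rigidly moved copy of it.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
         (8.0, 4.0, 2.3), (9.0, 7.0, 4.0)]
moved = [rotate_z_and_shift(p, 40.0, (10.0, -5.0, 2.0)) for p in trace]

original_descriptor = angle_descriptor(trace)
moved_descriptor = angle_descriptor(moved)
print(original_descriptor)
print(moved_descriptor)
```

The two printed descriptors agree up to floating-point error, even though the coordinates differ, so two structures can be compared on such a representation before any superposition is known.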
Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• Compute a local alignment based on the angle representation
• Find a maximal subset of runs with similar transformation matrices
Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• The main algorithm
  • Compute the angle-based representation
  • Align the angle-based representations
  • Identify runs with similar transformation matrices
  • Compute the initial structural alignment
  • Refine the alignment iteratively
• Running time is ~(m+n)², where m and n are the protein lengths
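The local-alignment step on the angle representation can be sketched with a standard Smith-Waterman dynamic program. This is an illustrative stand-in under assumptions, not the scoring scheme of Ye et al.: the +1/-1 match score, the 30 degree tolerance, and the gap penalty of -1 are all hypothetical choices.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def triplet_score(t1, t2, tolerance=30.0):
    """Assumed scoring: +1 if all three angles agree within `tolerance`, else -1."""
    ok = all(angle_diff(a, b) <= tolerance for a, b in zip(t1, t2))
    return 1.0 if ok else -1.0

def local_align(seq_a, seq_b, gap=-1.0):
    """Smith-Waterman local alignment of two sequences of angle triplets.

    Returns (best score, list of aligned index pairs (i, j)).
    """
    m, n = len(seq_a), len(seq_b)
    H = [[0.0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(
                0.0,
                H[i - 1][j - 1] + triplet_score(seq_a[i - 1], seq_b[j - 1]),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + triplet_score(seq_a[i - 1], seq_b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs

# Hypothetical angle triplets (degrees) for two short proteins.
protein_a = [(10, 20, 30), (50, 60, 70), (90, 100, 110), (130, 140, 150)]
protein_b = [(200, 200, 200), (52, 58, 71), (88, 102, 109), (300, 10, 40)]

best_score, aligned_pairs = local_align(protein_a, protein_b)
print(best_score, aligned_pairs)
```

Filling the table takes time proportional to m·n, which is within the ~(m+n)² bound quoted above. The later steps (grouping runs by transformation matrix and iterative refinement) are not covered by this sketch.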
Multiple Structure Alignment
• Given a set of proteins represented by their Cα atoms (backbone)
  • find a simultaneous alignment of all structures
  • find a consensus structure that represents all of them
Multiple Structure Alignment
• The main algorithm
  1. find an initial consensus structure (one of the given proteins)
  2. pairwise align the consensus and each of the proteins
  3. merge the pairwise alignments from the previous step
  4. recompute the consensus protein; repeat from step 2
• Merging the pairwise alignments is similar to the sequence case
  • Example: P1 = BBCA, P2 = CBBA, P3 = BCCA, with P1 as the consensus
  • Pairwise alignments:
      P1: -BBCA    P1: BBCA
      P2: CBB-A    P3: BCCA
  • Merged alignment:
      P1: -BBCA
      P2: CBB-A
      P3: -BCCA
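The merging step can be sketched on plain strings, following the sequence-case example above. This is a minimal center-star style merge under assumptions: the function name and input format are hypothetical, and residues inserted relative to the consensus are simply left-justified within their gap block rather than aligned with one another.

```python
def merge_pairwise_alignments(center, pairwise):
    """Merge pairwise alignments that all share the same center (consensus) sequence.

    `pairwise` is a non-empty list of (aligned_center, aligned_other) string
    pairs of equal length, with "-" marking gaps. Returns the merged rows,
    center first.
    """
    length = len(center)

    def gap_profile(aligned_center):
        # profile[k] = number of gap columns before center residue k
        # (profile[length] counts trailing gap columns).
        profile = [0] * (length + 1)
        k = 0
        for ch in aligned_center:
            if ch == "-":
                profile[k] += 1
            else:
                k += 1
        return profile

    profiles = [gap_profile(aligned_center) for aligned_center, _ in pairwise]
    # Each slot must be wide enough for the largest insertion seen in any alignment.
    width = [max(p[k] for p in profiles) for k in range(length + 1)]

    def expand(aligned_other, profile):
        out, pos = [], 0
        for k in range(length + 1):
            out.append(aligned_other[pos:pos + profile[k]])  # this row's insertions
            out.append("-" * (width[k] - profile[k]))        # pad to the slot width
            pos += profile[k]
            if k < length:
                out.append(aligned_other[pos])               # column of center residue k
                pos += 1
        return "".join(out)

    merged_center = "".join(
        "-" * width[k] + (center[k] if k < length else "") for k in range(length + 1)
    )
    rows = [merged_center]
    rows += [expand(other, profile) for (_, other), profile in zip(pairwise, profiles)]
    return rows

# The example from the slide: P1 is the consensus, aligned to P2 and to P3.
merged = merge_pairwise_alignments(
    "BBCA",
    [("-BBCA", "CBB-A"),   # P1 vs P2
     ("BBCA", "BCCA")],    # P1 vs P3
)
for row in merged:
    print(row)
```

For structures, the same bookkeeping would apply to residue indices instead of letters, so each merged column records which Cα atom of each protein (if any) occupies it.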
Multiple Structure Alignment
• Computation of consensus structure (after merging alignments)
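How the consensus is computed is not spelled out here. One plausible approach, assumed purely for illustration, is to average the superimposed Cα coordinates in each column of the merged alignment and skip gaps. The function name and data layout below are hypothetical.

```python
def consensus_structure(aligned_coords):
    """Column-wise average of superimposed Cα coordinates.

    `aligned_coords` holds one row per protein; each row has one entry per
    alignment column, either an (x, y, z) tuple or None for a gap. All rows
    are assumed to be already superimposed in a common frame.
    Returns one averaged point per column (None if the column is all gaps).
    """
    if not aligned_coords:
        raise ValueError("need at least one structure")
    n_cols = len(aligned_coords[0])
    if any(len(row) != n_cols for row in aligned_coords):
        raise ValueError("all rows must have the same number of columns")

    consensus = []
    for col in range(n_cols):
        present = [row[col] for row in aligned_coords if row[col] is not None]
        if not present:
            consensus.append(None)
            continue
        consensus.append(
            tuple(sum(p[k] for p in present) / len(present) for k in range(3))
        )
    return consensus

# Toy merged alignment of three structures over three columns.
rows = [
    [None,             (0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(-4.0, 0.0, 0.0), (0.0, 2.0, 0.0), (4.0, 0.0, 2.0)],
    [None,             (0.0, 1.0, 0.0), None],
]
consensus = consensus_structure(rows)
print(consensus)
```

A real implementation might also drop columns supported by too few structures before the next round of pairwise alignments. That choice is likewise an assumption, not something stated in the slides.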
Multiple Structure Alignment
• Algorithm flowchart