Predicting Protein Properties and Structure. Rui Alves. Organization of the Talk. From cDNA sequence to protein sequence. Analyzing the information in the protein sequence Predicting the fold (secondary structure) of a protein Predicting the (tertiary) structure of a protein.
Organization of the Talk • From cDNA sequence to protein sequence. • Analyzing the information in the protein sequence • Predicting the fold (secondary structure) of a protein • Predicting the (tertiary) structure of a protein
Predicting protein sequence from DNA sequence • Protein sequence can be predicted by translating the cDNA and using the genetic code.
Translating cDNA into protein sequence • ATG TCT CTT ATA TGA… → Met Ser Leu Ile Ter (universal genetic code) • The reading frame stops after four residues: no gene!
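Translating yeast mitochondrial cDNA into protein sequence • ATG TCT CTT ATA TGA…SECIS sequence • Universal genetic code: Met Ser Leu Ile Ter • Yeast mitochondrial code: Met Ser Thr Met Trp (CTT = Thr, ATA = Met, TGA = Trp) • With a SECIS sequence, TGA can instead be read as selenocysteine (sCys) • There is a gene, with a considerably different protein sequence from the one we would predict from the universal genetic code!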
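The two translations above can be reproduced with a short sketch. It assumes the codon tables follow NCBI translation table 1 (standard) and table 3 (yeast mitochondrial); the function name `translate` is illustrative, and SECIS-dependent selenocysteine insertion is not modeled.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in TCAG order (third base varies fastest); '*' is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL = {"".join(codon): aa
             for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Yeast mitochondrial code: the reassignments relevant to the slide example.
YEAST_MITO = dict(UNIVERSAL)
YEAST_MITO.update({"ATA": "M", "TGA": "W",
                   "CTT": "T", "CTC": "T", "CTA": "T", "CTG": "T"})


def translate(cdna, table):
    """Translate in frame 1, one-letter codes, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = table[cdna[i:i + 3].upper()]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


print(translate("ATGTCTCTTATATGA", UNIVERSAL))   # MSLI*  (Met Ser Leu Ile Ter)
print(translate("ATGTCTCTTATATGA", YEAST_MITO))  # MSTMW  (Met Ser Thr Met Trp)
```

Choosing the wrong table silently changes the predicted protein, so the source organism and compartment need to be known before translating.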
Organization of the Talk • From cDNA sequence to protein sequence. • Analyzing the information in the protein sequence • Predicting the fold (secondary structure) of a protein • Predicting the (tertiary) structure of a protein
Inferring function from sequence • Search your sequence against a protein sequence database • Homologues found: go to the Protein Data Bank to get the structure & live happily ever after • No known homologues in the database: Oh, $#!¥!!!
Analyzing the information in the protein sequence • Physical-Chemical Information
Analyzing the physical-chemical information in the protein sequence • Why are these properties useful? For example, they help identify your protein on an electrophoresis gel
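How to predict molecular mass • Add up the residue masses (amino acid mass − H2O, since each peptide bond releases one water), e.g. Ala 71.09, Cys 103.15 • Add one H2O (≈18) for the free N- and C-termini • Ala-Cys: 71.09 + 103.15 + 18 ≈ 192.2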
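A minimal sketch of the mass calculation, starting from free amino acid masses and removing one water per peptide bond. The table holds only the two amino acids from the slide example (average masses 89.09 for Ala and 121.16 for Cys, which are the slide's 71.09 and 103.15 plus one water); a usable table would cover all 20. The names `FREE_MASS` and `peptide_mass` are illustrative.

```python
WATER = 18.015  # average mass of H2O

# Average masses of the free amino acids (Da); only the slide's two residues.
FREE_MASS = {"A": 89.09, "C": 121.16}


def peptide_mass(sequence):
    """Sum the free amino acid masses and subtract one water per peptide bond."""
    total = sum(FREE_MASS[aa] for aa in sequence)
    return total - WATER * (len(sequence) - 1)


print(round(peptide_mass("AC"), 2))  # mass of the dipeptide Ala-Cys
```

Summing residue masses and adding a single water for the termini gives the same number; the two bookkeeping styles differ only in where the water is counted.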
How to predict isoelectric point • The isoelectric point is the pH at which the protein has no net charge • At each value of pH, calculate the protonation state of each residue and thus the charge of the whole protein • Amino acid pKa depends on the environment: buried amino acids do not gain/lose protons as easily as exposed amino acids, so the prediction does not work very well • [Plot: protein charge (+ to −) against pH. Example: Ala Cys … ~10 … Isoelectric Point: 9.3 …]
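The procedure can be sketched as a charge calculation followed by a search for the pH where the net charge crosses zero. The pKa values below are illustrative textbook-style numbers chosen for this sketch, not values from the talk, and they ignore the environment effects mentioned above, which is exactly why such estimates are rough.

```python
# Assumed side-chain and terminal pKa values (illustrative only).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(sequence, ph):
    """Net charge at a given pH from Henderson-Hasselbalch fractions."""
    charge = 1 / (1 + 10 ** (ph - N_TERM_PKA))    # protonated N-terminus (+)
    charge -= 1 / (1 + 10 ** (C_TERM_PKA - ph))   # deprotonated C-terminus (-)
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1 / (1 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1 / (1 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Bisection on pH: the net charge decreases monotonically as pH rises."""
    while high - low > tolerance:
        middle = (low + high) / 2
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2


print(round(isoelectric_point("ACDK"), 2))
```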
Analyzing the information in the protein sequence • Physical-Chemical Information • e.g. http://prowl.rockefeller.edu/prowl-cgi/sequence.exe/.fsa • Localization, modifications & secondary structure Information • E.g. http://seq.cbrc.jp/proteinLocalizationResources/localizationLinks.html
How is the localization of a protein predicted? • Search for homology to the relevant targeting sequence (TS) in your protein • Complications: • Small sequences, divergence, change between organisms • Signal peptides • Nuclear localization signals at the N-terminus • Mitochondrial TS • Peroxisomal TS • …
How are post-translational modifications to a protein predicted? • Signal sequences • Search for homology to pattern peptides
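As one concrete instance of searching for a pattern peptide, the sketch below scans for the N-glycosylation motif N-{P}-[ST]-{P} (Asn, any residue except Pro, Ser or Thr, any residue except Pro). The choice of this motif and the example sequence are assumptions made for illustration; a match marks only a candidate site, not a confirmed modification.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif_sites(sequence, pattern=N_GLYCOSYLATION):
    """Return 1-based positions where the motif starts."""
    return [match.start() + 1 for match in pattern.finditer(sequence)]


# Hypothetical sequence: N-G-S-A and N-V-T-Q match, N-P-T-L is excluded by the Pro.
print(find_motif_sites("MKNGSAANPTLNVTQ"))  # [3, 12]
```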
How is secondary structure predicted? • Start from a database of known structures and a database of the corresponding sequences • Split both into a training set and a test set • From the training set, derive the probability of each amino acid in each state (α-helix, coil, β-strand): p(aa1-coil), p(aa1-helix), p(aa1-strand), … • Predict the secondary structure of the test sequences (ACDEFGTYAEE…) and compare with the known structures • Bad predictions: reshuffle training set and test set and repeat until predictions are correct • Good predictions: method ready for new sequence
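The train, predict, and compare loop can be sketched with per-residue state frequencies. This is a deliberately simple stand-in: the training and test pairs below are toy, made-up data, and practical methods look at windows of neighbouring residues rather than single amino acids. H, E, and C stand for helix, strand, and coil.

```python
from collections import Counter, defaultdict

STATES = "HEC"  # helix, strand, coil


def train(pairs):
    """Estimate p(state | amino acid) from (sequence, states) pairs, add-one smoothed."""
    counts = defaultdict(Counter)
    for sequence, states in pairs:
        for aa, state in zip(sequence, states):
            counts[aa][state] += 1
    probabilities = {}
    for aa, counter in counts.items():
        total = sum(counter.values()) + len(STATES)
        probabilities[aa] = {s: (counter[s] + 1) / total for s in STATES}
    return probabilities


def predict(sequence, probabilities):
    """Pick the most probable state per residue; unseen residues default to coil."""
    result = []
    for aa in sequence:
        if aa in probabilities:
            result.append(max(STATES, key=lambda s: probabilities[aa][s]))
        else:
            result.append("C")
    return "".join(result)


def accuracy(predicted, known):
    return sum(p == k for p, k in zip(predicted, known)) / len(known)


# Toy training and test sets (hypothetical, for illustration only).
training_set = [("AAEELLKK", "HHHHHHHH"), ("VVIIYYTT", "EEEEEEEE"), ("GGPPSSNN", "CCCCCCCC")]
test_sequence, test_states = "AEVIGP", "HHEECC"

model = train(training_set)
prediction = predict(test_sequence, model)
print(prediction, accuracy(prediction, test_states))
```

If the accuracy on the test set is poor, the sets are reshuffled and the model retrained, as described above.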
How are transmembrane regions predicted? • Transmembrane segments are 17 residues long and hydrophobic • Example: two hydrophobic stretches of 17 aa residues indicate two transmembrane helices
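A sliding-window hydropathy scan is one common way to look for such hydrophobic stretches. The sketch below uses the Kyte-Doolittle hydropathy scale with the 17-residue window from the slide; the 1.6 cutoff and the test sequence are assumptions for illustration.

```python
# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def transmembrane_segments(sequence, window=17, threshold=1.6):
    """Merge all windows whose mean hydropathy reaches the threshold.

    Returns 0-based, end-exclusive (start, end) spans.
    """
    segments = []
    for start in range(len(sequence) - window + 1):
        values = [KYTE_DOOLITTLE[aa] for aa in sequence[start:start + window]]
        if sum(values) / window >= threshold:
            end = start + window
            if segments and start <= segments[-1][1]:
                segments[-1] = (segments[-1][0], end)  # extend the previous span
            else:
                segments.append((start, end))
    return segments


# Hypothetical protein: two 17-residue hydrophobic runs separated by charged linkers.
protein = "DEKR" * 3 + "L" * 17 + "DEKR" * 3 + "I" * 17 + "DEKR" * 3
print(transmembrane_segments(protein))
```

Because neighbouring windows are merged, the reported spans are somewhat wider than the hydrophobic core itself.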
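How is membrane orientation predicted? • [Figure: a signal peptide and a 17 aa transmembrane segment, drawn with the N-terminus (NH) either outside or in the cytosol; the 15 aa on each side of the segment are compared by charge (+++ vs ---)]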
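One reading of the diagram is a charge comparison of the 15 residues flanking the transmembrane segment, with the more positively charged side placed in the cytosol (the "positive-inside" heuristic). That interpretation, the function names, and the test sequence are assumptions of this sketch.

```python
def flank_charges(sequence, tm_start, tm_end, flank=15):
    """Net charge (K/R minus D/E) of the flanks before and after the segment.

    tm_start and tm_end are 0-based, end-exclusive coordinates of the segment.
    """
    def net(segment):
        return sum(aa in "KR" for aa in segment) - sum(aa in "DE" for aa in segment)

    n_side = sequence[max(0, tm_start - flank):tm_start]
    c_side = sequence[tm_end:tm_end + flank]
    return net(n_side), net(c_side)


def predict_orientation(sequence, tm_start, tm_end, flank=15):
    """Place the more positive flank in the cytosol."""
    n_charge, c_charge = flank_charges(sequence, tm_start, tm_end, flank)
    if n_charge > c_charge:
        return "N-side in cytosol"
    if c_charge > n_charge:
        return "N-side outside"
    return "undetermined"


# Hypothetical protein: positive N-flank, 17-residue segment, negative C-flank.
protein = "KRKRKRKRKRKRKRK" + "L" * 17 + "DEDEDEDEDEDEDED"
print(predict_orientation(protein, 15, 32))  # N-side in cytosol
```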
Organization of the Talk • From cDNA sequence to protein sequence. • Analyzing the information in the protein sequence • Predicting the fold (secondary structure) of a protein • Predicting the (tertiary) structure of a protein
What is a fold? • A fold can be roughly defined as the succession of α-helix, β-strand, and coil elements in a protein
How is fold predicted? • YOUR SEQUENCE is submitted to a fold prediction server, which draws on a database of known structures, a database of corresponding sequences, and a database of probabilities of aa in secondary structure • Strong homology: homology-based fold prediction • Weak/no homology: helix-coil-strand profile prediction, compared against a helix-coil-strand profile folds database
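For the weak/no homology branch, the comparison of a predicted helix-coil-strand profile against a folds database can be sketched as a string similarity search. The fold names and profiles below are hypothetical, and `difflib.SequenceMatcher` is used only as a simple stand-in for a real profile scoring scheme.

```python
from difflib import SequenceMatcher

# Hypothetical folds database: fold name -> helix (H) / strand (E) / coil (C) profile.
FOLD_PROFILES = {
    "all-alpha": "CHHHHHHCCHHHHHHC",
    "all-beta": "CEEEECCEEEECCEEE",
    "alpha/beta": "CEEEECHHHHHCEEEE",
}


def rank_folds(predicted_profile, database=FOLD_PROFILES):
    """Rank folds by similarity between their profile and the predicted one."""
    scores = {
        name: SequenceMatcher(None, predicted_profile, profile).ratio()
        for name, profile in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_folds("CHHHHHCCCHHHHHHC")
print(ranking[0][0])  # best-scoring fold name
```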
Organization of the Talk • From cDNA sequence to protein sequence. • Analyzing the information in the protein sequence • Predicting the fold (secondary structure) of a protein • Predicting the (tertiary) structure of a protein
Predicting protein structure • Homology Modeling • 3D-JIGSAW, SWISSMODEL • Ab initio Modeling • ROBETTA
How does homology modeling work? • Submit your sequence (…YDVRSEQVENCE…) to a server/program that searches a database of known structures and a database of corresponding sequences • Strong homologues are found and the best possible sequence alignment is built: …YDVR-SEQVENCE… against …YDVRMSD-VDNCD… • Thread the sequence to predict over the known structure according to the alignment • Optimization via energy minimization, etc…
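The threading step depends on knowing which query residue sits on which template residue. A minimal sketch of that bookkeeping, using the alignment shown on the slide, is given below; it covers only the residue mapping, not coordinate transfer or energy minimization, and the function names are illustrative.

```python
def aligned_pairs(query_alignment, template_alignment, gap="-"):
    """Return (query_index, template_index) for every column without a gap.

    Indices are 0-based positions in the ungapped sequences.
    """
    if len(query_alignment) != len(template_alignment):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    query_index = template_index = 0
    for q, t in zip(query_alignment, template_alignment):
        if q != gap and t != gap:
            pairs.append((query_index, template_index))
        if q != gap:
            query_index += 1
        if t != gap:
            template_index += 1
    return pairs


def percent_identity(query_alignment, template_alignment, gap="-"):
    """Identical residues as a percentage of the aligned (gap-free) columns."""
    columns = [(q, t) for q, t in zip(query_alignment, template_alignment)
               if q != gap and t != gap]
    return 100 * sum(q == t for q, t in columns) / len(columns)


query = "YDVR-SEQVENCE"
template = "YDVRMSD-VDNCD"
pairs = aligned_pairs(query, template)
print(pairs)
print(round(percent_identity(query, template), 1))
```

Query residues that face a gap in the template (here the Q at query index 6) have no template coordinates and must be built separately, which is one reason the optimization step is needed.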
Predicting protein structure • Homology Modeling • 3D-JIGSAW, SWISSMODEL • Ab initio Modeling • ROBETTA
Predicting protein structure by ab initio methods • NO homologues are found for your sequence (…YDVRSEQVENCE…) • The server/program splits the sequence into smaller amino acid runs (…YDVR-SEQ…, …VENCE…) and matches each against a database of structures for smaller amino acid runs and the database of corresponding sequences (e.g. …YDVRMSD-…, …YPVRMSD-…; …YDNCD…, …VEQCE…) • Assemble the matched pieces • Energy minimization & optimization
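The fragment lookup step can be sketched as picking, for each short run of the query, the most similar entry in a fragment library. The library below is hypothetical (ungapped versions of the fragments on the slide, with placeholder structure labels), and the sketch stops before assembly and energy minimization.

```python
# Hypothetical fragment library: fragment sequence -> placeholder structure label.
FRAGMENT_LIBRARY = {
    "YDVRMSD": "fragment-structure-1",
    "YPVRMSD": "fragment-structure-2",
    "YDNCD": "fragment-structure-3",
    "VEQCE": "fragment-structure-4",
}


def identity(a, b):
    """Number of identical positions between two equal-length fragments."""
    return sum(x == y for x, y in zip(a, b))


def best_fragment(query_fragment, library=FRAGMENT_LIBRARY):
    """Return the same-length library fragment with the most identical positions."""
    candidates = [f for f in library if len(f) == len(query_fragment)]
    return max(candidates, key=lambda f: identity(query_fragment, f), default=None)


# The query from the slide, cut into two runs.
query_fragments = ["YDVRSEQ", "VENCE"]
matches = [best_fragment(fragment) for fragment in query_fragments]
print(matches)  # ['YDVRMSD', 'VEQCE']
```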
Summary • From cDNA sequence to protein sequence. • Analyzing the information in the protein sequence • Predicting the fold (secondary structure) of a protein • Predicting the (tertiary) structure of a protein