
Domains, their prediction and domain databases

An introduction to bioinformatics and the study of sequence-structure-function relationships, with a focus on prediction methods, domain databases, and functional genomics.

eodegaard


Presentation Transcript


  1. Center for Integrative Bioinformatics VU. Lecture 16: Domains, their prediction and domain databases. Introduction to Bioinformatics

  2. Sequence-Structure-Function. Sequence → Structure → Function. • Ab initio prediction and folding: impossible but for the smallest structures • Threading • Function prediction from structure: very difficult • Homology searching (BLAST)

  3. Functional Genomics – Systems Biology. Genome; Expressome; Proteome; Tertiary structure (fold); Metabolome; Metabolomics; Fluxomics

  4. Systems Biology is the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behaviour of that system (for example, the enzymes and metabolites in a metabolic pathway). The aim is to understand the system quantitatively and to be able to predict its behaviour over time. • The interactions are nonlinear • The interactions give rise to emergent properties, i.e. properties that cannot be explained by the individual components of the system alone • Biological processes span many time-scales, many compartments and many interconnected network levels (e.g. regulation, signalling, expression)

  5. Understanding in Systems Biology is often achieved through modeling and simulation of the system’s components and interactions. Often, the ‘four Ms’ cycle is adopted: Measuring, Mining, Modeling, Manipulating

  6. ‘The silicon cell’ (some people think ‘silly-con’ cell)

  7. A system response Apoptosis: programmed cell death Necrosis: accidental cell death

  8. ‘Comparative metabolomics’: human versus yeast. We need to be able to do automatic pathway comparison (pathway alignment). Important difference with human pathway. This pathway diagram shows a comparison of pathways in Homo sapiens (human, left) and Saccharomyces cerevisiae (baker’s yeast, right). Changes have occurred in the controlling enzymes (square boxes in red) and in the pathway itself (yeast has one altered (‘overtaking’) path in the graph)

  9. Experimental • Structural genomics • Functional genomics • Protein-protein interaction • Metabolic pathways • Expression data

  10. Issues when elucidating function experimentally • Partial information (indirect interactions) and subsequent filling of the missing steps • Negative results (elements that have been shown not to interact, enzymes missing in an organism) • Putative interactions resulting from computational analyses

  11. Protein function categories • Catalysis (enzymes) • Binding – transport (active/passive) • Protein-DNA/RNA binding (e.g. histones, transcription factors) • Protein-protein interactions (e.g. antibody-lysozyme) (experimentally determined by yeast two-hybrid (Y2H) or bacterial two-hybrid (B2H) screening) • Protein-fatty acid binding (e.g. apolipoproteins) • Protein – small molecules (drug interaction, structure decoding) • Structural component (e.g. crystallin) • Regulation • Signalling • Transcription regulation • Immune system • Motor proteins (actin/myosin)

  12. Catalytic properties of enzymes. Michaelis-Menten equation: V = (Vmax × [S]) / (Km + [S]). Reaction scheme: E + S ⇌ ES → E + P (labelled Km on the binding step and kcat on the catalytic step). • E = enzyme • S = substrate • ES = enzyme-substrate complex (transition state) • P = product • Km = Michaelis constant • kcat = catalytic rate constant (turnover number) • kcat/Km = specificity constant (useful for comparison). [Plot: rate V (moles/s) against [S]; V approaches Vmax and equals Vmax/2 at [S] = Km]
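The Michaelis-Menten relation on slide 12 can be checked numerically. The following is a minimal sketch, not part of the lecture: the function names and the values of Vmax and Km are illustrative assumptions, chosen only to show that the rate equals Vmax/2 when [S] = Km and saturates towards Vmax at high [S].

```python
def michaelis_menten_rate(substrate_conc, vmax, km):
    """Return the reaction rate V = Vmax * [S] / (Km + [S])."""
    if substrate_conc < 0:
        raise ValueError("substrate concentration must be non-negative")
    if km <= 0:
        raise ValueError("Km must be positive")
    return vmax * substrate_conc / (km + substrate_conc)


def specificity_constant(kcat, km):
    """Return kcat/Km, the specificity constant used for comparisons."""
    return kcat / km


# Illustrative values only (assumptions, not measured data).
VMAX = 10.0  # moles/s
KM = 2.0     # same concentration units as [S]

for s in (0.0, KM, 10 * KM, 100 * KM):
    rate = michaelis_menten_rate(s, VMAX, KM)
    print(f"[S] = {s:6.1f}   V = {rate:.3f}")
```

At [S] = Km the function returns exactly half of Vmax, which is the usual way Km is read off the saturation curve.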

  13. Protein interaction domains http://pawsonlab.mshri.on.ca/html/domains.html

  14. Energy difference upon binding. Examples of protein interactions (and of functional importance) include: • Protein – protein (pathway analysis); • Protein – small molecules (drug interaction, structure decoding); • Protein – peptides, DNA/RNA. The change in Gibbs free energy of the protein-ligand binding interaction can be monitored and expressed by the following equation: ΔG = ΔH – TΔS (H = enthalpy, S = entropy and T = temperature)
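As a worked illustration of the relation ΔG = ΔH – TΔS from slide 14, the sketch below computes ΔG for a hypothetical binding event. The numbers and function names are assumptions made for this example, not data from the lecture; the only point is that an enthalpically favourable interaction with an entropic cost can switch from favourable (ΔG < 0) to unfavourable as the temperature rises.

```python
def gibbs_free_energy_change(delta_h, delta_s, temperature):
    """Return dG = dH - T * dS.

    delta_h in kJ/mol, delta_s in kJ/(mol*K), temperature in kelvin.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return delta_h - temperature * delta_s


def is_binding_favourable(delta_g):
    """Binding is thermodynamically favourable when dG is negative."""
    return delta_g < 0


# Hypothetical values for illustration only.
DELTA_H = -50.0  # kJ/mol, enthalpy gained on binding
DELTA_S = -0.1   # kJ/(mol*K), entropy lost on binding

for temperature in (298.0, 600.0):
    delta_g = gibbs_free_energy_change(DELTA_H, DELTA_S, temperature)
    print(f"T = {temperature:.0f} K   dG = {delta_g:.1f} kJ/mol   "
          f"favourable: {is_binding_favourable(delta_g)}")
```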

  15. Protein-protein interaction networks

  16. Protein function • Many proteins combine functions • Some immunoglobulin structures are thought to have more than 100 different functions (and active/binding sites) • Alternative splicing can generate (partially) alternative structures

  17. Protein function & Interaction Active site / binding cleft Shape complementarity

  18. Protein function evolution Chymotrypsin

  19. How to infer function • Experiment • Deduction from sequence • Multiple sequence alignment – conservation patterns • Homology searching • Deduction from structure • Threading • Structure-structure comparison • Homology modelling

  20. Cholesterol Biosynthesis: Cholesterol biosynthesis primarily occurs in eukaryotic cells. It is necessary for membrane synthesis, and is a precursor for steroid hormone production as well as for vitamin D. While the pathway had previously been assumed to be localized in the cytosol and ER, more recent evidence suggests that many of the enzymes in the pathway exist largely, if not exclusively, in the peroxisome (the enzymes listed in blue in the pathway to the left are thought to be at least partly peroxisomal). Patients with peroxisome biogenesis disorders (PBDs) have a variable deficiency in cholesterol biosynthesis.

  21. Cholesterol Biosynthesis: from acetyl-CoA to mevalonate. Mevalonate plays a role in epithelial cancers: it can inhibit EGFR

  22. Epidermal Growth Factor as a Clinical Target in Cancer. A malignant tumour is the product of uncontrolled cell proliferation. Cell growth is controlled by a delicate balance between growth-promoting and growth-inhibiting factors. In normal tissue the production and activity of these factors results in differentiated cells growing in a controlled and regulated manner that maintains the normal integrity and functioning of the organ. The malignant cell has evaded this control; the natural balance is disturbed (via a variety of mechanisms) and unregulated, aberrant cell growth occurs. A key driver for growth is the epidermal growth factor (EGF), and the receptor for EGF (the EGFR) has been implicated in the development and progression of a number of human solid tumours including those of the lung, breast, prostate, colon, ovary, head and neck.

  23. Energy housekeeping: Adenosine diphosphate (ADP) – Adenosine triphosphate (ATP)

  24. Chemical Reaction

  25. Add Enzymatic Catalysis

  26. Add Gene Expression

  27. Add Inhibition

  28. Metabolic Pathway: Proline Biosynthesis Proline as end product effects a negative feedback loop
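The effect of a negative feedback loop such as the one on slide 28 can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of proline biosynthesis: the rate law (synthesis divided by 1 + P/Ki), the function name and all parameter values are hypothetical, and simple Euler integration is used for brevity.

```python
def simulate_end_product(k_syn, k_deg, k_i=None, dt=0.01, steps=2000):
    """Euler integration of dP/dt = synthesis - k_deg * P, starting at P = 0.

    With k_i set, the end product P inhibits its own synthesis:
        synthesis = k_syn / (1 + P / k_i)
    With k_i=None there is no feedback and synthesis = k_syn.
    Returns the level of P after `steps` time steps.
    """
    p = 0.0
    for _ in range(steps):
        synthesis = k_syn if k_i is None else k_syn / (1.0 + p / k_i)
        p += dt * (synthesis - k_deg * p)
    return p


# Hypothetical parameters in arbitrary units.
without_feedback = simulate_end_product(k_syn=2.0, k_deg=1.0)
with_feedback = simulate_end_product(k_syn=2.0, k_deg=1.0, k_i=1.0)

print(f"end product level without feedback: {without_feedback:.3f}")
print(f"end product level with feedback:    {with_feedback:.3f}")
```

With these toy parameters the end product settles at a lower level when it inhibits its own synthesis, which is the qualitative behaviour the slide describes.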

  29. Transcriptional Regulation

  30. Methionine Biosynthesis in E. coli

  31. Shortcut Representation

  32. High-level Interaction representation

  33. Levels of Resolution

  34. SREBP Pathway

  35. Signal Transduction. Important signalling pathways: MAP kinase (MAPK) signalling pathway, or TGF-β pathway

  36. Transport

  37. Phosphate Utilization in Yeast

  38. Multiple Levels of Regulation • Gene expression • Protein posttranslational modification • Protein activity • Protein intracellular location • Protein degradation • Substrate transport

  39. Graphical Representation – Gene Expression

  40. Protein interaction domains http://pawsonlab.mshri.on.ca/index.php?option=com_content&task=view&id=30&Itemid=63

  41. Domain function Active site / binding cleft

  42. Protein-protein (domain-domain) interaction Shape complementarity

  43. A domain is a: • Compact, semi-independent unit (Richardson, 1981). • Stable unit of a protein structure that can fold autonomously (Wetlaufer, 1973). • Recurring functional and evolutionary module (Bork, 1992). “Nature is a tinkerer and not an inventor” (Jacob, 1977). • Smallest unit of function

  44. Delineating domains is essential for: • Obtaining high resolution structures (x-ray but particularly NMR – size of proteins) • Sequence analysis • Multiple sequence alignment methods • Prediction algorithms (SS, Class, secondary/tertiary structure) • Fold recognition and threading • Elucidating the evolution, structure and function of a protein family (e.g. ‘Rosetta Stone’ method) • Structural/functional genomics • Cross genome comparative analysis

  45. Domain connectivity linker

  46. Structural domain organisation can be nasty… Pyruvate kinase (phosphotransferase): • β barrel regulatory domain • α/β barrel catalytic substrate binding domain • α/β nucleotide binding domain. 1 continuous + 2 discontinuous domains
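One way to make the continuous versus discontinuous distinction on slide 46 concrete is to store each domain as a list of residue segments. The sketch below is an assumed representation for illustration: the class, the domain names and the residue boundaries are placeholders and do not correspond to the real domain boundaries of pyruvate kinase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    segments: tuple  # ((start, end), ...) inclusive residue ranges

    @property
    def is_continuous(self):
        """A continuous domain is formed by a single stretch of the chain."""
        return len(self.segments) == 1

    @property
    def length(self):
        """Total number of residues over all segments."""
        return sum(end - start + 1 for start, end in self.segments)


# Placeholder boundaries: one continuous and two discontinuous domains.
domains = [
    Domain("domain_A", ((1, 40), (121, 200), (261, 300))),
    Domain("domain_B", ((41, 120),)),
    Domain("domain_C", ((201, 260), (301, 360))),
]

for domain in domains:
    kind = "continuous" if domain.is_continuous else "discontinuous"
    print(f"{domain.name}: {domain.length} residues, {kind}")
```

Domain prediction methods that assume one contiguous sequence stretch per domain cannot express the second kind, which is part of why such organisations are hard to delineate from sequence alone.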

  47. Domain size • The size of individual structural domains varies widely • from 36 residues in E-selectin to 692 residues in lipoxygenase-1 (Jones et al., 1998) • the majority (90%) having less than 200 residues (Siddiqui and Barton, 1995) • with an average of about 100 residues (Islam et al., 1995). • Small domains (less than 40 residues) are often stabilised by metal ions or disulphide bonds. • Large domains (greater than 300 residues) are likely to consist of multiple hydrophobic cores (Garel, 1992).
