How to use computational tools to maximize the coverage of protein sequence/structure/function space


Presentation Transcript


  1. How to use computational tools to maximize the coverage of protein sequence/structure/function space. PSI bottlenecks: 1) not enough connection between modeling and biology/experiment; 2) “modelability” not used in defining families or in a dynamic target selection strategy; 3) incomplete use of functional information in model building. Murray Lab: Nebojsa Mirkovic, Tonya Silkov, Hunjoong Lee, Frank Indiviglio, Janey Li. Honig Lab: Markus Fischer and Donald Petrey.

  2. Phosphoinositide signaling processes. Figure legend: the marker symbol denotes a phosphoinositide headgroup.

  3. Biophysical properties of cellular protein/membrane interactions. Intracellular membranes contain distinct lipid compositions and carry different charge densities. Figure: binding behavior of a +8e peptide to membranes carrying different negative charge densities.

  4. Proteins that function in phosphoinositide pathways contain multiple membrane binding motifs.
     Motif 1      Motif 2      Protein
     C1/DAG       C2/Ca2+      Protein kinase C isoforms
     PH/PIP2      C2/Ca2+      Phospholipase C
     PH/PIP2      PX/PI3P      Phospholipase D
     FYVE/PI3P    PH/PI        FGD1 (a Rho/Rac GEF)
     Basic/PS     PH/PIP2      GPCR kinase
     C2/Ca2+      Nonpolar     Cytosolic phospholipase A2
     ENTH/PIP2    Prot/prot    Epsin1, AP180
     Myristate    Basic/PS     Src, MARCKS, (HIV-1 Gag)
     Multiple inputs: temporal and spatial control of subcellular targeting through coincidence counting.

  5. Many peripheral proteins, especially those involved in subcellular targeting, are either highly basic or charge polarized. Figure labels: +25 mV, -25 mV.

  6. Quantitative physical theory for the interaction of proteins with membrane surfaces

  7. Connection among biophysical properties, membrane binding behavior, and subcellular localization. Figure panels: no calcium versus calcium. Phospholipase C C2 domains: homology models of all isoforms. 5-lipoxygenase C2 domain: homology model.

  8. Structural genomics and proteomics-level studies of lipid-interacting domains: Northeast Structural Genomics and Arabidopsis 2010. Apply what we have learned to whole families: BAR, C1, C2, ENTH, FERM, FYVE, GRAM, PDZ, PH, PHD, PX, Sec14, START, and VHS domains. High-throughput comparative modeling: leverage structure information.

  9. All lipid-binding domains in all model genomes. Use what we have learned computationally and experimentally to: 1. compile more complete lists of peripheral proteins of known structure from the PDB; 2. detect and model all instances of peripheral proteins in sequence databases; 3. discover new instances, novel functionalities, and new families; 4. create databases to house this information; 5. use this information to annotate protein sequences of unknown function.

  10. SkyLine: high-throughput comparative modeling (Nebojsa Mirkovic; Proteins 66:766). “Modelability”: create “reliable” models using known structures as templates. Pipeline components, as labeled in the diagram: PDB structure; sequence; secondary structure (DSSP); multiple alignments (ClustalW); PSI-BLAST; homologues and homologous structures; family analysis; non-redundant and unsolved sequences; specialized databases; modeling alignments; model building (Modeller or Nest); data on homologues (species, IDs, coverage, length, e-value, sequence identity); models; model quality (PROSA, pG score); web-accessible models database; function annotation (MarkUs); target reprioritization. Leverage: unique models with pG > 0.7.
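The quality filter and the “leverage” count named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the SkyLine code: the record fields (`target_id`, `template`, `pg`) and the sample values are hypothetical, and only the pG > 0.7 cutoff is taken from the slide.

```python
from collections import defaultdict

PG_CUTOFF = 0.7  # threshold quoted on the slide; everything else here is assumed


def reliable_models(models, cutoff=PG_CUTOFF):
    """Keep only models whose pG score is strictly above the cutoff."""
    return [m for m in models if m["pg"] > cutoff]


def leverage(models, cutoff=PG_CUTOFF):
    """Count the unique target sequences reliably modeled from each template."""
    targets_by_template = defaultdict(set)
    for m in reliable_models(models, cutoff):
        targets_by_template[m["template"]].add(m["target_id"])
    return {tmpl: len(ids) for tmpl, ids in targets_by_template.items()}


# Hypothetical records, for illustration only.
models = [
    {"target_id": "seqA", "template": "tmpl1", "pg": 0.95},
    {"target_id": "seqA", "template": "tmpl1", "pg": 0.80},  # same target twice
    {"target_id": "seqB", "template": "tmpl1", "pg": 0.65},  # below the cutoff
    {"target_id": "seqC", "template": "tmpl2", "pg": 0.71},
]

print(leverage(models))
```

Counting unique targets rather than raw models keeps alternative models of the same sequence from inflating the leverage figure.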

  11. NESG Models Database Frank Indiviglio

  12. Models Database (Hunjoong Lee): http://156.145.102.40/nesg3/nesg.php. “Leverage”: number and quality of 3D models produced from a set of structures as templates. PSI1 and PSI2: NESG leverage ~220 sequence-unique models.

  13. Alternative models based on different PDB templates, reliability measures and sequence coverage

  14. Additional search mechanisms: Expand methodology to the entire PDB, create specialized family and genome databases

  15. C2 domains from phospholipase C isoforms: comparative functionality. Kd 2.3 × 10^-9 M; Kd 2.6 × 10^-9 M.

  16. C2 domains from phospholipase C isoforms: comparative functionality. Kd 8.9 × 10^-8 M → 6.2 × 10^-9 M; Kd 4.0 × 10^-8 M.

  17. Differences between δ1 and δ4: detection of specificity determinants leads to hypotheses for differential regulation. Kd 2.3 × 10^-9 M versus Kd 8.9 × 10^-8 M → 6.2 × 10^-9 M.
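To put the dissociation constants on slides 15 to 17 on a common scale, one can convert them to fold differences and approximate binding free energies. The sketch below assumes the standard relation ΔG = RT ln(Kd / 1 M) at 298.15 K; the temperature, the 1 M reference state, and the function names are assumptions made here, not details from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(kd_weak, kd_tight):
    """How many times weaker the first Kd is than the second."""
    return kd_weak / kd_tight


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Approximate binding free energy in kcal/mol, relative to a 1 M reference."""
    return R_KCAL * temperature_k * math.log(kd_molar)


# Kd values quoted on the slides, in mol/L.
kd_tight = 2.3e-9
kd_before, kd_after = 8.9e-8, 6.2e-9  # the two values joined by the arrow

print(f"fold difference: {fold_change(kd_before, kd_tight):.1f}")
print(f"fold change across the arrow: {fold_change(kd_before, kd_after):.1f}")
ddg = binding_free_energy(kd_before) - binding_free_energy(kd_tight)
print(f"free energy difference: {ddg:.2f} kcal/mol")
```

Because ΔG depends on the logarithm of Kd, a roughly 40-fold difference in affinity corresponds to only about 2 kcal/mol, which is the scale on which a few residue substitutions could plausibly act.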

  18. Whole family modeling: FYVE domains. Electrostatic properties of models correlate with in vitro binding measurements and subcellular localization. Comparison of different family members.

  19. FYVE domain family: electrostatic properties of models correlate with in vitro binding measurements and subcellular localization. Residue substitution of a single family member.

  20. Dynamic target re-prioritization is an important strategy. Diagram nodes: Model/Computation, Experiment, Structure. There is no straightforward prescription: each family has to be dealt with individually. “Modelability”: create “reliable” models using known structures as templates.
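One way to picture dynamic re-prioritization is as a greedy loop that re-ranks the unsolved targets every time a new structure changes which sequences can already be modeled. The slide states that no straightforward prescription exists, so the following is only an illustrative sketch: the scoring rule (number of newly modelable sequences), the function names, and the sample data are all hypothetical.

```python
def reprioritize(candidates, modeled):
    """Rank candidate targets by how many not-yet-modeled sequences each would cover."""
    ranked = [(len(set(covers) - modeled), target)
              for target, covers in candidates.items()]
    ranked.sort(key=lambda item: (-item[0], item[1]))  # highest gain first
    return [(target, gain) for gain, target in ranked]


def select_targets(candidates, modeled=frozenset(), n=2):
    """Pick up to n targets, re-ranking after each pick as coverage changes."""
    modeled = set(modeled)
    remaining = dict(candidates)
    chosen = []
    while remaining and len(chosen) < n:
        target, gain = reprioritize(remaining, modeled)[0]
        if gain == 0:  # nothing left would add new coverage
            break
        chosen.append(target)
        modeled |= set(remaining.pop(target))
    return chosen, modeled


# Hypothetical candidates: target -> sequences that could be modeled from it.
candidates = {
    "T1": {"s1", "s2", "s3"},
    "T2": {"s2", "s3"},
    "T3": {"s4"},
}

chosen, covered = select_targets(candidates)
print(chosen, sorted(covered))
```

In this toy case T2 looks attractive at first, but once T1 is solved it adds nothing new, so T3 moves ahead of it. That reordering is the point of re-ranking after every new structure.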

  21. START domain leverage: modelability (7378 sequences) versus 30% sequence identity (2767 sequences). Figure: diagram with per-group counts; the group labels are not recoverable from the transcript.
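The contrast on this slide is between two membership criteria: a fixed sequence identity threshold and the ability to build a reliable model. The sketch below applies both criteria to the same set of hits. The hit records are hypothetical, and equating “modelable” with pG > 0.7 is an assumption carried over from the SkyLine slide. The final ratio uses only the two totals reported here.

```python
def members_by_identity(hits, min_identity=30.0):
    """Family members under a sequence-identity threshold (percent)."""
    return [h["id"] for h in hits if h["identity"] >= min_identity]


def members_by_modelability(hits, min_pg=0.7):
    """Family members for which a reliable model could be built (assumed pG cutoff)."""
    return [h["id"] for h in hits if h["pg"] > min_pg]


# Hypothetical hits against a template family.
hits = [
    {"id": "a", "identity": 45.0, "pg": 0.90},
    {"id": "b", "identity": 22.0, "pg": 0.85},  # distant, but still modelable
    {"id": "c", "identity": 18.0, "pg": 0.40},
    {"id": "d", "identity": 31.0, "pg": 0.75},
]

print(members_by_identity(hits))
print(members_by_modelability(hits))

# Totals reported on the slide for START domains.
ratio = 7378 / 2767
print(f"modelability reaches {ratio:.2f}x as many sequences")
```

Hit `b` is the interesting case: it falls below the identity threshold yet yields a reliable model, which is how the modelability criterion picks up distantly related members.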

  22. Collaborations with experimental groups. Goals: characterize different START domains based on structural information; discriminate whether START domains bind cholesterol, PC (PI), or other ligands; provide leads for chemical library studies of function-interfering compounds. Computation: detailed computational analysis and function annotation; fine-grain structure analysis in the absence and presence of potential ligand. Experimental characterization: protein production, SPR analysis, cellular studies. Cho Lab: high-throughput analysis of human and Arabidopsis START domains. Clark Lab: docking studies of ubiquinone into a nematode START domain, electron transport.

  23. START domains in the Arabidopsis thaliana genome. SkyLine produces quality models for 58 non-redundant sequences, versus 35 Arabidopsis START domains detected by sequence searches (Genome Biology 5:R41). Key findings (Tonya Silkov):
     • 45 sequences are of the Birch antigen class.
     • Two sequences correspond to AHA1 domains (Activator of Hsp90 ATPase); SCOP classifies AHA domains as belonging to the Birch antigen superfamily.
     • Two sequences are predicted in databases as integral membrane proteins of unknown function.
     • Five sequences for related models apparently represent a group of uncharacterized plant START domains.

  24. Cross-genomic studies: structure similarity among lipid-binding domains (Tonya Silkov). Fig. 1 panels: ENTH domain, ANTH domain, VHS domain; PIP2 is marked on two of the panels.

  25. ENTH and ANTH: similar topology, different membrane binding mechanism (J Biol Chem. 278:28993, with Cho Lab). Figure panels: ENTH and ANTH, with helix 0 labeled.

  26. Arabidopsis domain with novel dual ENTH and ANTH functionality (Tonya Silkov). Figure panels: ENTH and ANTH, including a view from above, with helix 0 labeled. Cho Lab: the first 25 amino acids are required for both PIP2 binding and membrane penetration. Produce enough protein to obtain crystals.

  27. A novel functional subclass of VHS domains (Tonya Silkov). Fig. 1 panels: ENTH domain, ANTH domain, VHS domain.

  28. A new VHS-related family, “VR domains”, found in other genomes (Tonya Silkov): KIAA1530 (Homo sapiens), XP_420852 (Gallus gallus), CAB71110 (Arabidopsis thaliana), XP_747424 (Strongylocentrotus purpuratus).

  29. VR domain family of membrane-binding VHS domains (Tonya Silkov). Among this subset of VHS domains, the basic surface patch is conserved. Hypothesis: it constitutes a phosphoinositide-specific binding site. Human and Arabidopsis constructs are being examined in the Cho lab.

  30. The ability to construct a quality model of a sequence is a more strategic definition of a protein family member. It allows for the discovery of distantly related members and, with function annotation, for the discovery of new sub-groups. Structures + sequences -> models + function annotation (MarkUs), giving more comprehensive coverage of protein sequence/structure/function space. By constantly updating resources as new information becomes available, we produce a more relevant (dynamic) target selection strategy.
