
Identifying Functional signatures in Proteins - a computational design approach

David Bernick, Rohl group, 16-Mar-2005.





Presentation Transcript


  1. Identifying Functional signatures in Proteins - a computational design approach • David Bernick • Rohl group • 16-Mar-2005

  2. The big picture • what is function? • hinges • substrate/DNA/protein binding/alignment/recognition • catalytic sites • what isn’t function? (structure) • secondary structures • fold architecture • thermodynamically required elements • nature selects for function (structure is implicit) • computational methods select for structure • can we predict…quickly?

  3. Some terms • pssm - position-specific score matrix • a [20 x length] model of residue frequencies for every position of a sequence family • homolog - natural sequences evolved from a common parent • morpholog - computationally derived sequence generated from a parent structure • ortholog - common ancestor, derived by speciation (constrained functional divergence) • paralog - common ancestor, same species (unconstrained functional divergence)

  4. pssm from an alignment
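
A minimal sketch of how a frequency pssm might be built from an alignment, following the [20 x length] definition on the "Some terms" slide. The slide's own figure did not survive extraction, so everything here is an assumption for illustration: the function name `pssm_from_alignment`, the choice to ignore gap characters, the optional `pseudocount`, and the toy alignment, which is made up and is not the talk's data.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes (row order of the matrix).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_alignment(alignment, pseudocount=0.0):
    """Return a 20 x length table of per-column residue frequencies.

    `alignment` is a list of equal-length aligned sequences. Characters
    outside the 20 standard residues (for example '-') are ignored, which
    is an assumption of this sketch rather than the talk's stated method.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    pssm = {aa: [0.0] * length for aa in AMINO_ACIDS}
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        for aa in AMINO_ACIDS:
            # Count-normalize each column so that it sums to 1.
            pssm[aa][col] = (counts[aa] + pseudocount) / total if total else 0.0
    return pssm


# Hypothetical toy alignment: three sequences, three columns, one gap.
toy_alignment = ["ACD", "ACE", "A-E"]
toy_pssm = pssm_from_alignment(toy_alignment)
```

Each column of `toy_pssm` is count-normalized, so a fully conserved column puts all of its weight on a single residue.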

  5. structure ensembles • Larson (2003) - Improved homology searches • Pei (2003) - Homology detection and active site searches • Kuhlman (2000) - Structural optimality of natural sequences

  6. Results - SH3 domain • 11 structures • 62 additional sequences

  7. Results - S100 domain • Ca++ loop 1 not detected: backbone-coordinated residues • Ca++ loop 2 not detected: insufficient homolog depth • 11 structures • 30 additional sequences

  8. the protocol (flowchart; box labels as extracted) • Sequence • CE + SCOP • Taylor Doms • flexible design • cogs, pfam, reverse blast • blast • representative structure • homolog alignment • paralog structures • fixed design • score • pssmH • pssmM • statistical • geometric

  9. genome scale • high cost step - producing pssmM • precalculate pssmM for every domain

  10. morpholog pssms, genome scale • Data Sources • Taylor parsed Domain database • CE all-to-all + SCOP • Precompute pssms for every domain • ~8000 domains • 100 sequences ~90% diversity • 1000 sequences ~99% diversity • ~4-8 wks, 70p cluster for initial set

  11. scoring • compare PSSMh to PSSMm • PSSMm contains only structure signal • PSSMh contains both function and structure • each alignment position is a count-normalized point in 20-space (H or M) • R-position - average aa position • RH and RM define 20-space vectors • ‘function vector’ • ‘structure vector’
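
One way the comparison above could be sketched, treating each pssm column as a point in 20-space. The slide does not give the scoring formula, so this is an assumption: the per-position difference RH - RM is used as a stand-in for what might remain once the structure signal is subtracted, and its Euclidean length is the score. The function name `difference_scores` and the toy matrices are hypothetical and are not the talk's data or its actual statistical or geometric score.

```python
import math

N_RESIDUES = 20  # dimensionality of the residue-frequency space


def difference_scores(pssm_h, pssm_m):
    """Score each position by the length of (R_H - R_M) in 20-space.

    Both inputs are lists of columns, where each column is a list of 20
    count-normalized residue frequencies. Assumption of this sketch: a
    large difference marks a position where the homolog pssm carries
    signal that the structure-only (morpholog) pssm does not.
    """
    if len(pssm_h) != len(pssm_m):
        raise ValueError("pssms must cover the same number of positions")
    scores = []
    for r_h, r_m in zip(pssm_h, pssm_m):
        if len(r_h) != N_RESIDUES or len(r_m) != N_RESIDUES:
            raise ValueError("each position needs 20 frequencies")
        diff = [h - m for h, m in zip(r_h, r_m)]
        scores.append(math.sqrt(sum(d * d for d in diff)))
    return scores


# Hypothetical two-position example (made-up numbers).
uniform = [1.0 / N_RESIDUES] * N_RESIDUES
conserved = [1.0] + [0.0] * (N_RESIDUES - 1)

toy_pssm_h = [conserved, uniform]  # homologs: position 0 fully conserved
toy_pssm_m = [uniform, uniform]    # morphologs: no preference at either position
toy_scores = difference_scores(toy_pssm_h, toy_pssm_m)
```

In this toy case position 0 scores higher than position 1, because only the homolog pssm shows a preference there. Other distance measures could be substituted for the Euclidean length.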

  12. next steps • complete this set of domains - verification • full domain pssmM generation

  13. acknowledgements • Carol Rohl • Kevin Karplus • Craig Lowe • Rohl group • HP
