Explore the concept of recursive domains in proteins, folding rules, super-secondary structures, and the generation of protein-like structures. Learn about folding pathways, graph theoretical tools, and the complexity of protein folds.
Recursive domains in proteins Teresa Przytycka NCBI, NIH Joint work with G. Rose & Raj Srinivasan; JHU
Domain: “Polypeptide chain (or a part of it) that can independently fold into stable tertiary structure” (Branden & Tooze; Introduction to Protein Structure) Two-domain protein.
The 3D structure of a protein domain can be described as a compact arrangement of secondary structures Alpha helix Beta strand
There are not so many of them: PDB contains about 17000 structures and fewer than 1000 different folds. Proportion of "new folds" (light blue) and "old folds" (orange) for a given year. (fold = fold domain)
Possible sources of restricted number of folds: • Evolutionary history. • Given enough time would domains look “more random”? • Existence of general restrictions/rules which render some (compact) arrangements of secondary structures non-feasible. • Can real protein domains be seen as sentences in a language, which can be generated by an underlying grammar?
Can protein domains be described using a set of folding rules? • We restrict our attention to all-beta domains: • they admit a variety of topologies • they are difficult to predict from sequence
Understanding β-folds • Patterns in β-sheets • Richardson 1977 • Folding rules for β-sheets • Zhang and Kim 2000 • Hydrogen bonding pattern • Polypeptide chain seems to avoid “complications” • Properties of β-sandwiches • Woolfson D. N., Evans P. A., Hutchinson E. G., and Thornton J. M. 1993 Figure labels: parallel, anti-parallel, mixed, “forbidden” crossed conformation
Expectations for good folding rules • We need to look at fold properties that occur in non-homologous proteins. • Preferably: they provide a model for the folding process.
Super-secondary structures as precursors of folding rules • Super-secondary structure – frequently occurring arrangements of a small number of secondary structures • The occurrence of super-secondary structures in unrelated families supports the possibility of their independent formation.
Example 2: Greek key and suggested folding pathway for it Folding pathway for Greek key proposed by Ptitsyn. Pattern from a Greek vase
Two levels of folding rules: • Primitive folding rules – based on super-secondary structures • Closure operation – allows for hierarchical application of the primitive rules
Super-secondary structures as primitive folding rules: hairpin (hairpin rule), bridge, Greek key
Indirect wind Direct wind
Closure-composite rules • Super-secondary structures are composed of secondary structures that are neighboring in the chain sequence • However, the presence of a super-secondary structure such as a hairpin in a protein structure implies that residues that are non-consecutive in the chain become neighbors in space. Closure - “short cut” in the sequence due to a folding rule
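The “short cut” idea can be illustrated with a minimal sketch. It rests on an assumption that the slides do not state formally: when a folding rule is applied to a contiguous run of strands, the two strands flanking that run are treated as new neighbors. The function name `closure_shortcut` and the strand labels are hypothetical.

```python
def closure_shortcut(strands, first, last):
    """Return the pair of strands that become neighbors via closure.

    Assumption (not from the slides): a folding rule applied to the
    contiguous strands strands[first..last] creates a short cut between
    the strand just before and the strand just after that run.
    Returns None when the run touches either end of the chain.
    """
    if not (0 <= first <= last < len(strands)):
        raise IndexError("rule must cover a valid contiguous run of strands")
    if first == 0 or last == len(strands) - 1:
        return None  # nothing on one side, so no short cut is created
    return (strands[first - 1], strands[last + 1])


# Hypothetical 4-strand chain; a hairpin rule is applied to B and C.
chain = ["A", "B", "C", "D"]
shortcut = closure_shortcut(chain, 1, 2)  # A and D become neighbors
```

Under this reading, a primitive rule can next be applied to A and D as if they were adjacent in the chain, which is what makes hierarchical application possible.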
Recursive domains A recursive domain is a part of a protein fold that can be generated using folding rules supported by the closure operation. A protein that can be fully generated using folding rules has one recursive domain.
Examples • Example 1 • Example 2 • Example 3 • Example 4
Graph theoretical tools and recursive domains Fold graph: Vertices – strands Edges – two types: Neighbor edges: directed edges between strands that are neighbors in the chain or via the closure operation. Domain edges: edges between strands used in the same folding rule. Recursive domains = connected components of the fold graph without neighbor edges.
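The definition above can be sketched as a connected-components computation over domain edges only. This is a minimal illustration, not the authors' implementation: strands are assumed to be numbered 0..n-1 in chain order, the function name `recursive_domains` is hypothetical, and the example edge list is made up rather than taken from a real protein.

```python
def recursive_domains(n_strands, domain_edges):
    """Group strands into connected components using domain edges only.

    Neighbor edges are deliberately ignored, following the definition
    'connected components of the fold graph without neighbor edges'.
    """
    parent = list(range(n_strands))  # union-find forest, one node per strand

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in domain_edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # smaller index becomes root

    groups = {}
    for s in range(n_strands):
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Hypothetical fold: rules link strands 0-1, 1-2 and 4-5; strand 3 is
# not used by any rule, so it forms a component by itself.
domains = recursive_domains(6, [(0, 1), (1, 2), (4, 5)])
# domains == [[0, 1, 2], [3], [4, 5]]
```

A fold that the rules generate completely would yield a single component containing every strand.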
Can the rules generate all known folds? • Partition into recursive components for small (<=10 strands) proteins • Comparison with the partition for a computer-generated set of all possible 8-strand sandwiches Figure labels: control, protein data, one recursive fold
Offenders Hedgehog intein domain
Given a fold, is there a unique sequence of folding steps leading to it? Usually no. Usually there are alternative sequences of folding steps leading to the construction of the same domain. Do such alternative folding sequences correspond to alternative folding pathways?
Are the rules complete? Probably not. E.g., for a propeller, each blade is in one recursive domain, but we do not have a rule that will put the blades together.
Conclusions: We are getting some idea how things work... It is so nice outside. It would be nice to take the dog for a walk! Nice… dog… walk
Conclusions • Protein folds can be described by simple folding rules. • The folding rules capture at least some aspects of fold simplicity and regularity. • The sequence of folding steps leading to a given fold is usually not unique. • The folding rules generate protein-like structures.
Future directions • Can folding rules guide fold prediction? • Would the hierarchical description of a fold provided by folding rules be useful for fold classification / comparison? • Adding statistical evaluation of a recursive domain.
Acknowledgments George Rose Raj Srinivasan Rohit Pappu Venk Murthy NIH, K01 grant