Learn about the classification of proteins into families based on sequence, structure, function, and evolution. Understand the terms motif, domain, signature, profile, seed, family, and cluster. Explore the importance of protein families in function prediction, evolutionary research, finding new protein folds, and similarity-based functional research. Discover the role of domains in gene evolution and the definition of a protein family. Explore resources like Pfam, Prosite, SMART, PRINTS, tigrFam, and ProDom for protein family integration.
Classification into Families We can classify proteins into families by: • A. Sequence • B. Structure • C. Function (annotation) • D. Evolution
Used Terms: • Motif = Domain = Signature = Profile = Seed • Family = Cluster • These terms are used interchangeably; they are very (too) flexible.
Motif = Domain = Function ??? • A motif is a sequence signature. • Structural definition of a domain: an independently folding structural unit. • A protein family is not well-defined. • Protein function is not well-defined (some proteins can have several functions). • Conclusion: these terms are used interchangeably, but they are very flexible.
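To make "a motif is a sequence signature" concrete, here is a minimal sketch that scans a sequence for a PROSITE-style pattern. The example pattern N-{P}-[ST]-{P} is the N-glycosylation site signature. The helper names (prosite_to_regex, find_motif) and the toy sequence are assumptions made for illustration, and only a subset of the pattern syntax is handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supported subset (an assumption of this sketch): single residues,
    x (any residue), [..] (allowed residues), {..} (forbidden residues),
    and repeat counts such as x(2) or x(2,4). Anchors are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat count.
        m = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, count = m.groups()
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # forbidden residues
        # [..] and single letters are already valid regex fragments.
        parts.append(core + ("{" + count + "}" if count else ""))
    return "".join(parts)

def find_motif(pattern: str, sequence: str):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

N_GLYCOSYLATION = "N-{P}-[ST]-{P}"
toy_sequence = "MKNGSAANPTLNVTQ"  # hypothetical sequence, not a real protein
hits = find_motif(N_GLYCOSYLATION, toy_sequence)
print(hits)
```

A pattern like this says nothing about folding or function, which is one reason the slide treats motif, domain, and function as distinct ideas even though the terms are often mixed.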
Protein folds Examples: toxin binding protein (TolB), glucose dehydrogenase, di-isopropyl-fluorophosphatase.
Dominant domain fold types. Holm and Sander. PROTEINS: Structure, Function, and Genetics 33:88–96 (1998)
Why Research Protein Families? • Function prediction and annotation. • Evolutionary research - finding orthologs and paralogs. • Search for new protein folds. • Functional research by similarity in characteristics.
Domains are the building blocks of evolution: some facts • Each domain occurs in a diverse set of protein families. • The number of domains in a protein ranges from 1 up to tens. • Structure-based domains are typically ~150 aa long. • Length varies: some domains are very short (30-40 aa), others are long (> 500 aa). • The definition of a domain is somewhat blurred. • Determining domain boundaries is an unsolved problem. • Example: pyruvate kinase (PDB: 1pkn), 3 domains.
How is a novel gene born? • Domains are the evolutionary units of sequence that comprise the gene coding regions. • Most genes are built from more than one domain. • Novel genes can be created by recombination of domains into new domain arrangements.
Correspondence between functional associations and genes linked by the fusion method [Figure: example from glycolysis. Pathway steps shown: Glycerone-P, Glyceraldehyde-3P, Glycerate-1,3P2, Glycerate-3P, with enzymes TIM, GAPDH, and PGK.] • M. genitalium: PGK, TIM, and GAPDH are separate genes. • Thermotoga maritima: fused PGK+TIM. • Phytophthora infestans: fused TIM+GAPDH.
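The fusion idea on this slide can be sketched in a few lines: if two genes that are separate in one organism appear fused into a single gene in another organism, the pair is predicted to be functionally associated. The sketch below uses the three glycolysis enzymes named on the slide. Matching components by label is a simplifying assumption; a real analysis would detect the fused components by sequence similarity. The function names are hypothetical.

```python
from itertools import combinations

# Toy input taken from the slide. Each gene is a tuple of component labels;
# a tuple with more than one label stands for a fused gene.
genomes = {
    "M. genitalium": [("PGK",), ("TIM",), ("GAPDH",)],
    "Thermotoga maritima": [("PGK", "TIM")],
    "Phytophthora infestans": [("TIM", "GAPDH")],
}

def fusion_links(genomes):
    """Map each pair of components to the organisms where they are fused."""
    links = {}
    for organism, genes in genomes.items():
        for gene in genes:
            for a, b in combinations(sorted(set(gene)), 2):
                links.setdefault((a, b), []).append(organism)
    return links

def predicted_associations(genomes, query_organism):
    """Pairs of separate genes in the query organism that are fused elsewhere."""
    separate = {gene[0] for gene in genomes[query_organism] if len(gene) == 1}
    result = {}
    for (a, b), organisms in fusion_links(genomes).items():
        evidence = [o for o in organisms if o != query_organism]
        if a in separate and b in separate and evidence:
            result[(a, b)] = evidence
    return result

associations = predicted_associations(genomes, "M. genitalium")
for pair, evidence in associations.items():
    print(pair, "linked by fusion in", evidence)
```

In this toy case the predicted links (PGK with TIM, TIM with GAPDH) connect enzymes of the same pathway, which is the correspondence the slide title refers to.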
What is a Protein Family? • Protein family: a group of proteins that have a common protein ancestor. • Is it that simple? • Domains: non-linear evolution. • Who is in this family?
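One way to see why "who is in this family?" is hard for multi-domain proteins is to group proteins by shared domains and follow the chains. The sketch below is only an illustration: the protein names and domain labels are invented, and single-linkage grouping is an assumed stand-in for whatever clustering a real resource uses.

```python
# Hypothetical proteins and domain labels, invented for illustration.
proteins = {
    "P1": {"A"},
    "P2": {"A", "B"},
    "P3": {"B"},
    "P4": {"C"},
}

def single_linkage_families(proteins):
    """Group proteins connected through chains of shared domains."""
    unvisited = set(proteins)
    families = []
    while unvisited:
        start = min(unvisited)  # deterministic starting point
        family = set()
        stack = [start]
        while stack:
            current = stack.pop()
            if current in family:
                continue
            family.add(current)
            # Any protein sharing at least one domain joins the family.
            for other in proteins:
                if other not in family and proteins[current] & proteins[other]:
                    stack.append(other)
        unvisited -= family
        families.append(sorted(family))
    return families

families = single_linkage_families(proteins)
print(families)
print("P1 and P3 share domains:", bool(proteins["P1"] & proteins["P3"]))
```

P1 and P3 land in the same group although they share no domain, because the two-domain protein P2 bridges them. Whether such a group counts as one family depends on the definition chosen.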
A protein can contain several copies of the same domain or several different domains. Example: fibronectin (PDB: 1fnf).
The Power of Integration • SCOP, CATH, FSSP, GO, KEGG • Pfam, Prosite, SMART, PRINTS, tigrFam, ProDom • InterPro