Pfam (Protein families). Pfam 27.0 (March 2013, 14831 families). http://pfam.sanger.ac.uk/
Protein family
• A protein family is a group of evolutionarily related proteins.
• Proteins in a family descend from a common ancestor (homology) and typically have similar three-dimensional structures, similar functions, and significant sequence similarity. While it is difficult to evaluate the significance of functional or structural similarity, there is a fairly well-developed framework for evaluating the significance of similarity between a group of sequences using sequence alignment methods.
• Proteins that do not share a common ancestor are very unlikely to show statistically significant sequence similarity, which makes sequence alignment a powerful tool for identifying the members of protein families.
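One simple way to put a number on "statistically significant sequence similarity" is a shuffle test: score the alignment of two sequences, then see how often a shuffled copy of one sequence scores at least as well. The sketch below is only an illustration under stated assumptions. The scoring values (match +2, mismatch -1, gap -2), the function names, and the toy sequences are all hypothetical, and real tools use substitution matrices and analytical statistics rather than this brute-force estimate.

```python
import random


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not a standard matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def shuffle_p_value(a, b, trials=200, seed=0):
    """Estimate how often a shuffled copy of b aligns to a as well as b does.

    Shuffling keeps the composition of b but destroys its order, so the
    shuffled scores approximate what unrelated sequences would give.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)
        if local_alignment_score(a, "".join(letters)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (trials + 1)


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    query = "MKVLAAGIVGLSACDEFGHIK"
    candidate = "MKVLSAGIVGLTACDEYGHIK"
    score, p = shuffle_p_value(query, candidate)
    print(f"score={score}, estimated p-value={p:.4f}")
```

A small estimated p-value suggests the similarity is unlikely to arise by chance from composition alone, which is the kind of evidence used to argue that two sequences belong to the same family.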
Superfamily – family – subfamily
• A common usage is that superfamilies contain families, which in turn contain subfamilies.
• Many proteins comprise multiple independent structural and functional units, or domains. Due to evolutionary shuffling, different domains in a protein have evolved independently. This has led, in recent years, to a focus on families of protein domains. A number of online resources are devoted to identifying and cataloging such domains.
Superfamily – family – subfamily
• Superfamily: The domains in a fold are grouped into superfamilies, which have at least a distant common ancestor.
• Family: The domains in a superfamily are grouped into families, which have a more recent common ancestor.
• Protein domain: The domains in families are grouped into protein domains, which are essentially the same protein.
• Species: The domains in "protein domains" are grouped according to species.
• Domain: part of a protein.
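The nesting described above (fold, superfamily, family, protein domain, species) can be pictured as a tree in which each node sits exactly one level below its parent. The following is a minimal sketch of that idea; the `Node` class, the `paths` helper, and every name in the example are hypothetical placeholders and are not taken from any real classification database.

```python
from dataclasses import dataclass, field

# Levels from the slide, ordered from most general to most specific.
LEVELS = ["fold", "superfamily", "family", "protein domain", "species"]


@dataclass
class Node:
    level: str
    name: str
    children: list = field(default_factory=list)

    def add(self, name):
        """Create a child one level below this node and return it."""
        index = LEVELS.index(self.level)
        if index + 1 >= len(LEVELS):
            raise ValueError(f"'{self.level}' is the lowest level")
        child = Node(LEVELS[index + 1], name)
        self.children.append(child)
        return child


def paths(node, prefix=()):
    """Yield the full (level, name) lineage of every leaf under node."""
    here = prefix + ((node.level, node.name),)
    if not node.children:
        yield here
    for child in node.children:
        yield from paths(child, here)


if __name__ == "__main__":
    # Placeholder names only; they do not refer to real entries.
    fold = Node("fold", "example fold")
    superfamily = fold.add("example superfamily")
    family_a = superfamily.add("example family A")
    superfamily.add("example family B")
    family_a.add("example protein domain").add("example species")
    for lineage in paths(fold):
        print(" > ".join(f"{level}: {name}" for level, name in lineage))
```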
An example
The human cyclophilin family, as represented by the structures of the isomerase domains of some of its members.
Protein family resources
• There are many biological databases that record examples of protein families and allow users to check whether newly identified proteins belong to a known family. Here are a few examples:
• Pfam: protein families database of alignments and HMMs
• PROSITE: database of protein domains, families and functional sites
• PIRSF: superfamily classification system
• PASS2: Protein Alignment as Structural Superfamilies v2
• SUPERFAMILY: library of HMMs representing superfamilies, and database of (superfamily and family) annotations for all completely sequenced organisms
• SCOP and CATH: classifications of protein structures into superfamilies, families and domains
Pfam
• The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
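To give a feel for how a multiple sequence alignment can be turned into a model that scores new sequences, here is a deliberately simplified sketch: a position-specific log-odds profile. This is not a hidden Markov model and not Pfam's method. It has no insert or delete states, it assumes a uniform background, and the function names, pseudocount, and toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Per-column log-odds scores (base 2) against a uniform background.

    Gap characters and other non-standard symbols in a column are ignored.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all alignment rows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, sequence):
    """Sum the column scores for an ungapped sequence of the profile's length."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match the profile length")
    return sum(column[aa] for column, aa in zip(profile, sequence))


if __name__ == "__main__":
    # Toy alignment, made up for illustration only.
    toy_alignment = ["ACDE", "ACDE", "ACDF", "GCDE"]
    profile = build_profile(toy_alignment)
    for candidate in ("ACDE", "WWWW"):
        print(candidate, round(score(profile, candidate), 2))
```

A sequence that matches the conserved columns scores above zero, while one built from residues never seen in the alignment scores below zero. Profile HMMs extend this idea with explicit states for insertions and deletions, so they can score sequences of varying length.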
SUPERFAMILY
• SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes.
• The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent structural protein domains at the SCOP superfamily level.
• The Structural Classification of Proteins (SCOP) database is a largely manual classification of protein structural domains based on similarities of their structures and amino acid sequences.
SUPERFAMILY
• SUPERFAMILY classifies amino acid sequences into known structural domains, especially into SCOP superfamilies. The superfamilies are groups of proteins that have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology.