PRINTS Overview • PRINTS - a database of protein family “fingerprints” • Fingerprints - groups of motifs excised from alignments • used to provide diagnostic signatures for protein families • Used in gene family analysis, genome annotation, etc. • PRINTS forms the basis of derived resources • e.g., BLOCKS, eMOTIF, InterPro
Advantages of PRINTS • Several other similar resources - PROSITE, Pfam, SMART… • multiple motif matching allows increased sensitivity • PRINTS approach allows diagnosis of protein subfamilies • PROSITE & PRINTS unique in providing substantial amounts of annotation, aiming to • document the constituent families • rationalise conserved regions in structural & functional terms
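As a toy illustration of why matching several motifs helps, the sketch below counts how many motifs of a fingerprint occur, in order, along a sequence. It is only a sketch under loud assumptions: the motifs are written as regular expressions, two of the three motifs and both sequences are made up, and `matched_motifs` is a hypothetical helper. PRINTS itself scores aligned motif blocks rather than regular expressions.

```python
import re

# Hypothetical three-motif fingerprint. WGQPHGGG is the prion octapeptide
# repeat; the other two patterns are invented for illustration only.
FINGERPRINT = [r"WGQPHGGG", r"C[ST]A..H", r"KK[DE]W"]

def matched_motifs(sequence, motifs):
    """Return the 1-based numbers of the motifs found in order along the sequence."""
    hits = []
    position = 0
    for number, pattern in enumerate(motifs, start=1):
        found = re.compile(pattern).search(sequence, position)
        if found:
            hits.append(number)
            position = found.end()  # later motifs must lie downstream
    return hits

# Synthetic sequences (not real proteins): one full match, one partial match.
full = "MAWGQPHGGGLLCSAVVHPPKKDWQ"
partial = "MAWGQPHGGGLLPPKKDWQ"

full_hits = matched_motifs(full, FINGERPRINT)
partial_hits = matched_motifs(partial, FINGERPRINT)
print(len(full_hits), "of", len(FINGERPRINT), "motifs matched:", full_hits)
print(len(partial_hits), "of", len(FINGERPRINT), "motifs matched:", partial_hits)
```

A sequence that misses one motif is still reported with the motifs it does match, which is the kind of partial match a single-motif pattern cannot express.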
Limitations • The annotation process is entirely manual • hence the rate-limiting step in the growth of the database • PROSITE & PRINTS therefore small by comparison with their automatically-derived counterparts
‘Naked’ fingerprint
gc;
gx;
gn;
ga;
gt;
gp;
bb;
gr;
bb;
gd;
bb;
si; SUMMARY INFORMATION
si; -------------------
sd; 37 codes involving 8 elements
sd; 0 codes involving 7 elements
sd; 0 codes involving 6 elements
sd; 0 codes involving 5 elements
sd; 0 codes involving 4 elements
sd; 1 codes involving 3 elements
sd; 0 codes involving 2 elements
bb;
ci; COMPOSITE FINGERPRINT INDEX
ci; ---------------------------
cr;
cd; 8| 37 37 37 37 37 37 37 37
cd; 7|  0  0  0  0  0  0  0  0
cd; 6|  0  0  0  0  0  0  0  0
cd; 5|  0  0  0  0  0  0  0  0
cd; 4|  0  0  0  0  0  0  0  0
cd; 3|  1  0  0  0  1  1  0  0
cd; 2|  0  0  0  0  0  0  0  0
cd; --+-----------------------------------------
cd;  |  1  2  3  4  5  6  7  8
bb;
tp; PRIO_COLGU PRIO_MACFA PRIO_CEREL PRIO_ODOHE
KA; P40251 M1 P40254 M1 P79142 M1 P47852 M1
tp; PRIO_GORGO PRIO_PANTR PRIO_HUMAN O46648
KA; P40252 M1 P40253 M1 P04156 M1 O46648 M1
tp; PRIO_SHEEP PRIO_CALJA PRIO_BOVIN PRP2_BOVIN
KA; P23907 M1 P40247 M1 P10279 M1 Q01880 M1
bb;
tt; PRIO_COLGU MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - COLOBUS GUEREZA.
tt; PRIO_MACFA MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - MACACA FASCICULARIS (CRAB EATING MACAQUE)
tt; PRIO_CEREL MAJOR PRION PROTEIN PRECURSOR (PRP) - CERVUS ELAPHUS (RED DEER).
tt; PRIO_ODOHE MAJOR PRION PROTEIN PRECURSOR (PRP) - ODOCOILEUS HEMIONUS (MULE DEER) (BLACK-TAILED DEER).
tt; PRIO_GORGO MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - GORILLA GORILLA GORILLA (LOWLAND GORILLA)
tt; PRIO_PANTR MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - PAN TROGLODYTES (CHIMPANZEE)
tt; PRIO_HUMAN MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) (ASCR) - HOMO SAPIENS (HUMAN).
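Each line of the entry above starts with a two-letter code and a semicolon, so the record can be split mechanically. The following is a minimal sketch, assuming only the layout visible on this slide (the full PRINTS format is not specified here); `group_by_code` and `index_rows` are hypothetical helper names.

```python
from collections import defaultdict

# A few lines copied from the entry above. The "code; body" layout is
# inferred from the slide alone, not from a format specification.
ENTRY = """\
si; SUMMARY INFORMATION
sd; 37 codes involving 8 elements
sd; 1 codes involving 3 elements
cd; 8| 37 37 37 37 37 37 37 37
cd; 3| 1 0 0 0 1 1 0 0
"""

def group_by_code(text):
    """Map each two-letter line code to the list of its line bodies."""
    fields = defaultdict(list)
    for line in text.splitlines():
        code, separator, body = line.partition(";")
        if separator:  # ignore lines that carry no code
            fields[code.strip()].append(body.strip())
    return dict(fields)

def index_rows(cd_lines):
    """Turn 'n| c1 c2 ...' rows of the composite index into {n: [c1, c2, ...]}."""
    rows = {}
    for body in cd_lines:
        left, separator, right = body.partition("|")
        if separator and left.strip().isdigit():
            rows[int(left)] = [int(value) for value in right.split()]
    return rows

fields = group_by_code(ENTRY)
rows = index_rows(fields["cd"])

# Columns are motif numbers 1..8; non-zero cells show which motifs were hit.
motifs_in_partial_match = [i + 1 for i, count in enumerate(rows[3]) if count]
print(motifs_in_partial_match)
```

Read this way, the row for 3-element matches shows that the single partial match hit motifs 1, 5 and 6.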
Fingerprint Annotation • Structured fields: • identifier, accession number, title, creation date, etc. • literature references • database cross-links • Free text: • function and structure of the protein family • associated diseases • family relationships • Technical information: • fingerprint details • motif locations, etc.
Annotated Fingerprint
gc; PRION
gx; PR00341
gt; Prion protein signature
gp; INTERPRO; IPR000817
gp; PROSITE; PS00291 PRION_1; PS00706 PRION_2
gp; BLOCKS; BL00291
gp; PFAM; PF00377 prion
bb;
gr; 1. STAHL, N. AND PRUSINER, S.B.
gr; Prions and prion proteins.
gr; FASEB J. 5 2799-2807 (1991).
gr;
gr; 2. BRUNORI, M., CHIARA SILVESTRINI, M. AND POCCHIARI, M.
gr; The scrapie agent and the prion hypothesis.
gr; TRENDS BIOCHEM.SCI. 13 309-313 (1988).
gr;
gr; 3. PRUSINER, S.B.
gr; Scrapie prions.
gr; ANNU.REV.MICROBIOL. 43 345-374 (1989).
bb;
gd; Prion protein (PrP) is a small glycoprotein found in high quantity in the brain of animals infected with
gd; certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE),
gd; and the human dementias Creutzfeldt-Jakob disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP is
gd; encoded in the host genome and is expressed both in normal and infected cells. During infection, however, the
gd; PrP molecules become altered and polymerise, yielding fibrils of modified PrP protein.
gd;
gd; PrP molecules have been found on the outer surface of plasma membranes of nerve cells, to which they are
gd; anchored through a covalent-linked glycolipid, suggesting a role as a membrane receptor. PrP is also
gd; expressed in other tissues, indicating that it may have different functions depending on its location.
gd;
gd; The primary sequences of PrP's from different sources are highly similar: all bear an N-terminal domain
gd; containing multiple tandem repeats of a Pro/Gly rich octapeptide; sites of Asn-linked glycosylation; an
gd; essential disulphide bond; and 3 hydrophobic segments. These sequences show some similarity to a chicken
gd; glycoprotein, thought to be an acetylcholine receptor-inducing activity (ARIA) molecule. It has been
gd; suggested that changes in the octapeptide repeat region may indicate a predisposition to disease, but it is
gd; not known for certain whether the repeat can meaningfully be used as a fingerprint to indicate susceptibility.
gd;
gd; PRION is an 8-element fingerprint that provides a signature for the prion proteins. The fingerprint was
gd; derived from an initial alignment of 5 sequences: the motifs were drawn from conserved regions spanning
gd; virtually the full alignment length, including the 3 hydrophobic domains and the octapeptide repeats
gd; (WGQPHGGG). Two iterations on OWL18.0 were required to reach convergence, at which point a true set comprising
gd; 9 sequences was identified. Several partial matches were also found: these include a fragment (PRIO_RAT)
gd; lacking part of the sequence bearing the first motif, and the PrP homologue found in chicken - this matches
gd; well with only 2 of the 3 hydrophobic motifs (1 and 5) and one of the other conserved regions (6), but has an
gd; N-terminal signature based on a sextapeptide repeat (YPHNPG) rather than the characteristic PrP octapeptide.
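The database cross-links in the annotated entry above (the `gp;` lines) are regular enough to extract mechanically. The sketch below assumes only the layout visible on the slide: a database name, then semicolon-separated items, each an accession optionally followed by an entry name. `parse_cross_links` is a hypothetical helper, not part of any PRINTS software.

```python
# The four gp; lines of the PRION entry, copied from the slide.
GP_LINES = [
    "gp; INTERPRO; IPR000817",
    "gp; PROSITE; PS00291 PRION_1; PS00706 PRION_2",
    "gp; BLOCKS; BL00291",
    "gp; PFAM; PF00377 prion",
]

def parse_cross_links(lines):
    """Return {database: [(accession, name_or_None), ...]} from gp; lines."""
    links = {}
    for line in lines:
        parts = [part.strip() for part in line.split(";")]
        if len(parts) < 3 or parts[0] != "gp":
            continue  # not a cross-link line
        database = parts[1]
        for item in parts[2:]:
            if not item:
                continue
            tokens = item.split(None, 1)  # accession, then optional name
            accession = tokens[0]
            name = tokens[1] if len(tokens) > 1 else None
            links.setdefault(database, []).append((accession, name))
    return links

links = parse_cross_links(GP_LINES)
for database, entries in links.items():
    print(database, entries)
```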
Fingerprint Types • Family: • e.g., 5-HT2A receptors • similar structure/function • Super-family: • e.g., GPCRs • common structural framework (7 transmembrane helices) • transduce extracellular signals by coupling to G-proteins • often different physiological functions • Domain family: • e.g., sequences containing SH2 domains • different structure, functions, etc. • (non-receptor protein tyrosine kinases; structural proteins such as tensin; small adaptor molecules such as oncoprotein Crk; etc.)
gc; OPSD
gx; PR
gn; COMPOUND(12)
ga; 21-MAR-2002
gt; Rhodopsin signature
gp; PRINTS; PR00237 GPCRRHODOPSN; PR00238 OPSIN; PR00579 RHODOPSIN
gp; PROSITE; PS00237 G_PROTEIN_RECEP_F1_1; PS50262 G_PROTEIN_RECEP_F1_2
gp; PROSITE; PS00238 OPSIN
gp; PFAM; PF00001 7tm_1
gp; INTERPRO; IPR000276; IPR001760
gp; PDB; 1BOJ; 1BOK; 1EDS; 1EDV; 1EDW; 1EDX; 1F88; 1FDF
gp; SCOP; 1BOJ; 1BOK; 1EDS; 1EDV; 1EDW; 1EDX; 1F88; 1FDF
gp; CATH; 1BOJ; 1BOK; 1EDS; 1EDV; 1EDW; 1EDX; 1F88; 1FDF
gp; MIM; 180380; 268000; 163500
bb;
gr; 1. HUNT, D.M., FITZGIBBON, J., SLOBODYANYUK, S.J., BOWMAKER, J.K. AND DULAI, K.S.
gr; Molecular evolution of the cottoid fish endemic to Lake Baikal deduced from nuclear DNA
gr; evidence.
gr; MOL.PHYLOGENET.EVOL. 8 415-422 (1997).
gr;
gr; 2. FYHRQUIST, N., DONNER, K., HARGRAVE, P.A., MCDOWELL, J.H., POPP, M.P.
gr; AND SMITH, W.C.
gr; Rhodopsins from three frog and toad species: sequences and functional comparisons.
gr; EXP.EYE RES. 66 295-305 (1998).
gr;
gr; 3. SUNG, C.H., DAVENPORT, C.M., HENNESSEY, J.C., MAUMENEE, I.H., JACOBSON,
gr; S.G., HECKENLIVELY, J.R., NOWAKOWSKI, R., FISHMAN, G., GOURAS, P. AND NATHANS, J.
gr; Rhodopsin mutations in autosomal dominant retinitis pigmentosa.
gr; PROC.NATL.ACAD.SCI.USA 88 6481-6485 (1991).
gr;
gr; 4. PALCZEWSKI, K., KUMASAKA, T., HORI, T., BEHNKE, C.A., MOTOSHIMA, H., FOX, B.A., LE, TRONG,
gr; I., TELLER, D.C., OKADA, T., STENKAMP, R.E., YAMAMOTO, M. AND MIYANO, M.
gr; Crystal structure of rhodopsin: a G protein-coupled receptor.
gr; SCIENCE 289 739-745 (2000).
gr;
gr; 5. YEAGLE, P.L., DANIS, C., CHOI, G., ALDERFER, J.L. AND ALBERT, A.D.
gr; Three dimensional structure of the seventh transmembrane helical domain of the G-protein
gr; receptor, rhodopsin.
gr; MOL.VISION 6 125-131 (2000).
gd; Function:
gd; Visual pigments are the light-absorbing molecules that mediate vision. They consist of
gd; an apoprotein, opsin, covalently linked to cis-retinal.
gd;
gd; Additional Information:
gd; Integral membrane protein.
gd;
gd; Disease:
gd; Defects in rho are one of the causes of autosomal dominant retinitis pigmentosa (adrp).
gd; Patients typically have night vision blindness and loss of midperipheral visual field;
gd; as their condition progresses, they lose their far peripheral visual field and
gd; eventually central vision as well. (OPSD_HUMAN; P08100; Q16414)
gd;
gd; Defects in rho are one of the causes of autosomal recessive retinitis pigmentosa (arrp).
gd; (OPSD_HUMAN; P08100; Q16414)
gd;
gd; Defects in rho are also one of the causes of congenital stationary night blindness (csnb4).
gd; (OPSD_HUMAN; P08100; Q16414)
gd;
gd; Family and structural information:
gd; The structure has been determined, e.g. "Crystal structure of rhodopsin: a G
gd; protein-coupled receptor" [4] and "Three dimensional structure of the seventh
gd; transmembrane helical domain of the G-protein receptor, rhodopsin" [5].
gd;
gd; Belongs to family 1 of g-protein coupled receptors. Opsin subfamily.
gd;
gd; Keywords: Retinal protein; Transmembrane; G-protein coupled receptor; Phosphorylation;
gd; Lipoprotein; Palmitate; Acetylation; Retinitis pigmentosa; Disease mutation;
gd; 3D-structure; Glycoprotein; Vision; Photoreceptor.
gd;
gd; OPSD is a 12-element fingerprint that provides a signature for the rhodopsin proteins.
gd; The fingerprint was derived from an initial alignment of 23 sequences: the motifs were
gd; drawn from conserved regions spanning virtually the full alignment length. Two iterations
gd; on SPTR40_18f were required to reach convergence, at which point a true set comprising 79
gd; sequences was identified.
Limitations • Annotation SWISS-PROT dependent • only well-structured, annotated sequence database available • inherits SWISS-PROT errors
gd; Polymorphism at position 171 may be related to the
gd; alleles of scarpie incubation-control (sic) gene in this species.
• fails where SWISS-PROT annotation limited or inconsistent
gd; The muscarinic acetylcholine receptor mediates various cellular responses, including
gd; inhibition of adenylate cyclase, breakdown of phosphoinositides & modulation of potassium
gd; channels through the action of g proteins. Primary transducing effect is inhibition of
gd; adenylate cyclase.

gd; The muscarinic acetylcholine receptor mediates various cellular responses, including
gd; inhibition of adenylate cyclase, breakdown of phosphoinositides & modulation of potassium
gd; channels through the action of g proteins. Primary transducing effect is adenylate
gd; cyclase inhibition.
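The two muscarinic receptor comments above say the same thing in slightly different words, which is what defeats a naive exact-match merge of annotation lines. As a rough sketch (not the method used by the annotation software, which the slides do not describe), Python's standard `difflib` can flag such near-duplicates and show where they diverge; `word_differences` is a hypothetical helper.

```python
import difflib

# The two comment texts from the slide, with the gd; prefixes removed.
first = (
    "The muscarinic acetylcholine receptor mediates various cellular responses, including "
    "inhibition of adenylate cyclase, breakdown of phosphoinositides & modulation of potassium "
    "channels through the action of g proteins. Primary transducing effect is inhibition of "
    "adenylate cyclase."
)
second = (
    "The muscarinic acetylcholine receptor mediates various cellular responses, including "
    "inhibition of adenylate cyclase, breakdown of phosphoinositides & modulation of potassium "
    "channels through the action of g proteins. Primary transducing effect is adenylate "
    "cyclase inhibition."
)

def word_differences(a, b):
    """List (operation, words_in_a, words_in_b) for the spans where a and b differ."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words)
    return [
        (tag, a_words[i1:i2], b_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

similarity = difflib.SequenceMatcher(None, first.split(), second.split()).ratio()
differences = word_differences(first, second)

print(f"exact match: {first == second}, word-level similarity: {similarity:.2f}")
for operation, old_words, new_words in differences:
    print(operation, old_words, new_words)
```

A high similarity score with a non-empty difference list marks the pair as a candidate for manual merging rather than as two distinct comments.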
BioMinT • Augmentation of core PRECIS annotation • biological literature abstracts • full texts online • PRECIS as 1st pass in iterative annotation process • Concise SWISS-PROT information with respect to families • Synonyms, gene names, etc. • Exploitable database cross-links (e.g., OMIM) • Literature abstracts