Tao Ma Georgia Institute of Technology 29 June, 2006 PIR: Protein Information Resource
Content • Overview • Major Modules • Search and Analysis Tools • Demo • Pros and Cons
Overview • An integrated public resource of functional annotation of protein data • Support genomic/proteomic research and scientific discovery • Provide PIRSF family classification system • Provide iProClass integrated database of protein family, function, and structure
Major Modules • UniProt: Universal Protein Resource • iProClass: Integrated Protein Classification • iProLINK: Integrated Protein Literature, Information and Knowledge • PIRSF: PIR SuperFamily
UniProt Overview • The world’s most comprehensive catalog of information on proteins • Created by joining the information contained in Swiss-Prot, TrEMBL, and PIR • Retrieve curated, reliable, comprehensive information on proteins
UniProt Structure http://pir.georgetown.edu/pirwww/about/brochure.pdf
iProClass Overview • Provide summary descriptions of protein family, function and structure for UniProt sequences • Link to over 90 biological databases • Comprise reports for all UniProtKB proteins • Present comprehensive up-to-date information on proteins and protein data mapping • Retrieve thorough information about a protein
iProClass Structure http://pir.georgetown.edu/pirwww/about/brochure.pdf
iProLINK Overview • Provide annotated literature, a protein name dictionary, and other information • Facilitate Natural Language Processing technology development • Obtain literature sources that describe protein entries • Literature mining of protein phosphorylation • Mapping of protein/gene names to UniProtKB entries • Text mining algorithm development using an annotated data set
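The name-mapping bullet above can be illustrated with a minimal sketch. This is not iProLINK's implementation; it assumes a simple normalized dictionary lookup. The function names (`normalize`, `build_index`, `map_name`) and the two-entry `TOY_DICTIONARY` are hypothetical and exist only for illustration; the accessions P04637 (human p53) and P68871 (human hemoglobin subunit beta) are real UniProtKB entries.

```python
import re

# Toy name dictionary (assumption: hand-written for illustration, not iProLINK data).
TOY_DICTIONARY = {
    "P04637": ["TP53", "p53", "Cellular tumor antigen p53"],
    "P68871": ["HBB", "Hemoglobin subunit beta", "beta-globin"],
}


def normalize(name):
    """Lowercase a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def build_index(dictionary):
    """Map each normalized name to the set of accessions that use it."""
    index = {}
    for accession, names in dictionary.items():
        for name in names:
            index.setdefault(normalize(name), set()).add(accession)
    return index


def map_name(name, index):
    """Return the sorted accessions whose dictionary names match `name`."""
    return sorted(index.get(normalize(name), set()))


index = build_index(TOY_DICTIONARY)
print(map_name("P-53", index))
print(map_name("Beta globin", index))
```

Normalizing before lookup lets spelling variants such as "P-53" and "p53" resolve to the same entry; a real system would also have to handle ambiguous names that map to several entries.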
iProLINK Structure http://pir.georgetown.edu/pirwww/about/brochure.pdf
PIRSF Overview • A network with multiple levels of sequence diversity • From superfamilies to subfamilies • The primary PIRSF classification unit is the homeomorphic family, whose members are both homologous (evolved from a common ancestor) and homeomorphic (sharing full-length sequence similarity and a common domain architecture) • Manual curation for membership, etc. • Retrieve reliable curated information for your protein sequence
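The multi-level classification above can be pictured with a small data-structure sketch. It is a simplification under stated assumptions: PIRSF is described as a network, while this sketch uses a plain tree; the `Node` class, the `LEVELS` ordering, and all node and member names are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed ordering of levels, from most to least diverse.
LEVELS = ("superfamily", "homeomorphic family", "subfamily")


@dataclass
class Node:
    name: str
    level: str
    members: list = field(default_factory=list)   # sequences assigned directly here
    children: list = field(default_factory=list)  # more specific groupings

    def add_child(self, child):
        """Attach a child that sits at a strictly more specific level."""
        if LEVELS.index(child.level) <= LEVELS.index(self.level):
            raise ValueError("child must be at a more specific level than its parent")
        self.children.append(child)
        return child

    def all_members(self):
        """Collect members of this node and of every node below it."""
        found = list(self.members)
        for child in self.children:
            found.extend(child.all_members())
        return found


superfamily = Node("example superfamily", "superfamily")
family = superfamily.add_child(
    Node("example family", "homeomorphic family", members=["seqA", "seqB"])
)
family.add_child(Node("example subfamily", "subfamily", members=["seqC"]))
print(superfamily.all_members())
```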
Search and Analysis Tools • Peptide Match • Pattern Match • Multiple Alignment • Pairwise Alignment • Text Search • Batch Retrieval • BLAST Search • FASTA Search • Related Sequence • ID Mapping • Composition/Molecular Weight Calculation
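As an illustration of the last tool in the list, the following is a minimal sketch of a composition and molecular weight calculation. It does not reproduce the PIR tool; it assumes standard average residue masses for the 20 common amino acids and adds one water molecule for the free termini. The function names are hypothetical.

```python
from collections import Counter

# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal H and OH


def composition(sequence):
    """Count how often each residue letter occurs in the sequence."""
    return Counter(sequence.upper())


def molecular_weight(sequence):
    """Sum the average residue masses and add one water for the termini."""
    seq = sequence.upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not seq:
        return 0.0
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in seq) + WATER_MASS


print(dict(composition("MKKA")))
print(round(molecular_weight("MKKA"), 2))
```

The sketch rejects ambiguous codes such as B, Z, and X instead of guessing a mass for them; a production tool would need an explicit policy for those.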
Pros & Cons • Powerful: provides many databases and tools • Convenient: a web-based retrieval system • PIRSF family classification system: based on evolutionary relationships of full-length proteins • Weak in supporting visualization of data
References • http://pir.georgetown.edu/pirwww/about/brochure.pdf • Wu CH, et al. The Protein Information Resource: an integrated public resource of functional annotation of proteins. Nucleic Acids Research 30: 35-37, 2002. • Huang H, et al. The PIR integrated protein databases and data retrieval system. Data Science Journal 3: 163-174, 2004.