iProClass Protein Knowledgebase
• Integration of Protein Family, Function, Structure
• Rich Links to >90 Databases
• Value-Added Reports for UniProtKB Proteins
Ways to get to iProClass text search

iProClass Text Search
Search tips:
1- Use “not null” or “null” to search for entries that do or do not contain information in the selected search field, respectively. In the present example, we search for proteins that have the enzymatic activity corresponding to EC 1.14.16.1 and have a 3D structure (PDB ID not null).
2- Use the and/or/not logical operators to combine search terms.
Form controls:
• Select field
• Add (+)/delete (-) input boxes
• Search!
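The search logic in the tips above can be sketched in a few lines of Python. This is only an illustration of how "not null" combines with an "and" operator; the record layout, field names, and entry values are hypothetical and are not taken from iProClass.

```python
from typing import Optional

# Hypothetical records: field names and values are placeholders for illustration.
records = [
    {"id": "entry_1", "ec": "1.14.16.1", "pdb_id": "placeholder_pdb"},
    {"id": "entry_2", "ec": "1.14.16.1", "pdb_id": None},
    {"id": "entry_3", "ec": "1.14.16.2", "pdb_id": "placeholder_pdb"},
]


def is_not_null(value: Optional[str]) -> bool:
    """Mimic the "not null" search term: the field holds some information."""
    return value is not None and value.strip() != ""


def search(entries, ec_number):
    """Return entries whose EC number matches AND whose PDB ID is not null."""
    return [
        e for e in entries
        if e["ec"] == ec_number and is_not_null(e["pdb_id"])
    ]


hits = search(records, "1.14.16.1")
print([e["id"] for e in hits])
```

Only the first placeholder entry satisfies both conditions; the second fails the "not null" test and the third has a different EC number.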
iProClass Text Search Result (I)
Things you can do from the result table:
1. Add search terms or start over
2. Customize the table columns
3. Save your results as table or FASTA format
4. Select entries using check boxes and perform analysis using tool bar options
5. Links to protein records, to protein names (BioThesaurus), to protein families (PIRSF)
iProClass Text Search Result (II)
2. How to customize the table columns: display the PDB ID column
a- Select PDB ID in the “Fields not in display” box.
b- Use the > button to add the item to the “Fields in display” box.
c- PDB ID should now be in the “Fields in display” box. Press the apply button for the changes to take effect.
iProClass Text Search Result (III)
3. Save your results as table or FASTA format
a- Select entries using the check boxes in the Protein AC/ID column. To select all, check the box in the column heading.
b- Click on “Save Result As: Table” to store the information in the result table. This file can be opened in Excel.
c- Click on FASTA to save the protein sequences.
iProClass Text Search Result (IV)
4. Select entries using check boxes and perform analysis using tool bar options
a- Select entries using the check boxes in the Protein AC/ID column. To select all, check the box in the column heading. Then select a tool, e.g., Domain Display.
Domain Display shows the Pfam domains present in the selected proteins.
iProClass Text Search Result (V)
5. Links to protein records, to protein names (BioThesaurus), to protein families (PIRSF)
• Link to protein reports
• Link to PIRSF report
• Link to pre-computed BLAST
• Link to taxonomy
• Link to protein names
iProClass Protein Report (I)
Rich links & extensive cross-references
• Shows ID correspondence to other databases
• See protein synonyms
• Pre-computed BLAST
iProClass Protein Report (II) Integrated added-value information from other databases
iProClass Protein Report (III)
• Links to different protein family classification databases
• Interactive Domain and Sequence Display
iProClass Text Search Result (VII)
See protein synonyms and the source attribution
iProClass Text Search Result (VII)
Related Sequences (pre-computed BLAST) shows proteins similar to the query significantly faster than running BLAST in real time. It may also reveal tight protein clusters (a low number of related sequences).
Batch Retrieval in iProClass
Due to the diversity of databases and the lack of consistency in protein/gene names and/or identifiers in the literature, it can be difficult to retrieve multiple entries when protein and gene identifiers come from different sources. The batch retrieval tool overcomes this problem and provides high flexibility, allowing the retrieval of multiple entries from the iProClass database by selecting a specific identifier or a combination of them.
If possible, specify the type of ID.
3979833 304131 24660393
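Before pasting identifiers into a batch form, it can help to normalize the list. The sketch below splits pasted text on whitespace, commas, or semicolons and removes duplicates while keeping the original order. The helper name `parse_id_list` and the accepted separators are assumptions made for illustration; the slides do not state which input format the batch retrieval tool expects.

```python
import re


def parse_id_list(text):
    """Split pasted text into unique identifiers, preserving first-seen order."""
    tokens = re.split(r"[\s,;]+", text.strip())
    seen = set()
    ids = []
    for token in tokens:
        # Skip empty tokens (e.g. from empty input) and repeated identifiers.
        if token and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


# The three example IDs from the slide, with one deliberately repeated.
raw = "3979833 304131\n24660393, 304131"
ids = parse_id_list(raw)
print(ids)
```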
Batch Retrieval Result Page
• Retrieve more sequences
• Choose columns to be displayed
• Links to iProClass and UniProtKB reports
Search a Pattern in iProClass
Pattern search at PIR allows:
1- The search of a specific PROSITE or user-defined pattern against one of the following sequence databases:
(i) UniProtKB, the central hub for the collection of functional information on proteins, with accurate, consistent, and rich annotation. It consists of two sections: a section containing manually annotated records (UniProtKB/Swiss-Prot), and a section with computationally analyzed records that await full manual annotation (UniProtKB/TrEMBL).
(ii) A subset of UniProtKB entries belonging to a certain organism or taxon group.
(iii) UniRef100, which provides clustered sets of sequences at 100% identity from UniProtKB (including splice variants and isoforms) and selected UniParc records.
A pattern is a formula (regular expression) that represents the conserved region of a group of related proteins.
Enter pattern: P-D-x(2)-H-[DE]-[LIVF]-[LIVMFY]-G-H-[LIVMC]-[PA]
PROSITE is a database that contains patterns and profiles specific for more than a thousand protein families or domains.
Enter PROSITE ID
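Because a PROSITE pattern is a regular expression in its own notation, it can be translated into a standard regex and matched locally. The sketch below is a minimal converter, not the PIR implementation: it handles single residues, `x`, `[...]`, `{...}`, repeat counts such as `(2)` or `(2,4)`, and the `<` / `>` anchors at the ends of the pattern. The function names and the test sequence are hypothetical.

```python
import re

# One pattern element: a residue, x, [allowed] or {forbidden}, plus optional (n) or (n,m).
ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern into a Python regular expression string."""
    body = pattern.strip().rstrip(".")
    prefix = suffix = ""
    if body.startswith("<"):      # pattern anchored at the N-terminus
        prefix, body = "^", body[1:]
    if body.endswith(">"):        # pattern anchored at the C-terminus
        suffix, body = "$", body[:-1]
    parts = []
    for element in body.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core, lo, hi = m.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        if lo and hi:
            core += "{%s,%s}" % (lo, hi)
        elif lo:
            core += "{%s}" % lo
        parts.append(core)
    return prefix + "".join(parts) + suffix


def find_ranges(pattern, sequence):
    """Return 1-based inclusive (start, end) ranges of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end()) for m in regex.finditer(sequence)]


PATTERN = "P-D-x(2)-H-[DE]-[LIVF]-[LIVMFY]-G-H-[LIVMC]-[PA]"
print(prosite_to_regex(PATTERN))

# Hypothetical test sequence constructed to contain one match.
print(find_ranges(PATTERN, "AAPDKKHDLLGHLPAA"))
```

The ranges are reported as 1-based inclusive positions, which is the usual convention for sequence coordinates. `re.finditer` reports non-overlapping matches only, so overlapping occurrences would need a lookahead-based search.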
Search a Pattern Result in iProClass
• Display the query pattern
• Sequence range where pattern is found
Search a Pattern in iProClass
Pattern search at PIR allows:
2- The search of PROSITE patterns (note that profiles are excluded) in a query sequence, entered either as a sequence in single-letter amino acid code or by its unique ID.
Enter sequence: MNDRADFVVPDITTRKNVGLSHDANDFTLPQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFMEGLERLEVD
Enter ID
Link to PROSITE documentation
Protein ID Mapping Service
Maps between UniProtKB and more than 30 other data sources, supporting data interoperability among disparate data sources and allowing integration and querying of data from heterogeneous molecular biology databases.
• Enter IDs
• Load file with ID list
Protein ID Mapping Service
Example: we want to obtain a list of Entrez Gene IDs for a group of UniProtKB proteins.
Enter IDs: P04176 P16331 P00439 P17276
Select ID type for source database
Select ID type for target database
Mapping
IDs can be cut and pasted if needed or saved as a text file using the "save as" option provided by your web browser.
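Conceptually, ID mapping is a lookup from a source identifier to zero or more target identifiers. The sketch below shows that lookup against a small in-memory table and how unmapped IDs can be reported separately. It does not call the PIR service; the table contents are placeholders rather than real Entrez Gene IDs, and leaving one query ID out of the table only demonstrates the unmapped case and says nothing about the actual data.

```python
# Hypothetical mapping table: target values are placeholders, NOT real Entrez Gene IDs.
mapping_table = {
    "P04176": ["GENE_ID_A"],
    "P16331": ["GENE_ID_B"],
    "P00439": ["GENE_ID_C"],
}


def map_ids(source_ids, table):
    """Return (mapped, unmapped): a dict of source -> targets, and IDs with no entry."""
    mapped = {}
    unmapped = []
    for source_id in source_ids:
        targets = table.get(source_id)
        if targets:
            mapped[source_id] = list(targets)
        else:
            unmapped.append(source_id)
    return mapped, unmapped


# The UniProtKB accessions from the slide example.
query = ["P04176", "P16331", "P00439", "P17276"]
mapped, unmapped = map_ids(query, mapping_table)

# Tab-delimited output, one source/target pair per line.
for source_id, targets in mapped.items():
    for target in targets:
        print(source_id + "\t" + target)
print("unmapped:", unmapped)
```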
iProClass Protein Knowledgebase
Cite iProClass: The iProClass integrated database for protein functional analysis. Wu CH, Huang H, Nikolskaya A, Hu Z, Yeh LS, Barker WC. Computational Biology and Chemistry, 28: 87-96, 2004.
iProClass Distribution: iProClass is freely available for academic institutions. Vendors and commercial entities who want to use and/or redistribute iProClass need to contact PIR to request a license (pirmail@georgetown.edu).