
Protein Structure Exercise


Presentation Transcript


  1. Protein Structure Exercise Bioinformatics Tools and Databases Foothill College

  2. Protein Sequences • From NCBI, search either proteins or genomes – using keywords etc. • From Genome, type in HIV-1 • What links do you have from there? • Choose NC_001802, then coding region • From that entry, save FASTA protein • Identify the gag-pol and env sequence
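Once the FASTA protein file is saved, picking out the gag-pol and env entries can be done by searching the header lines. The following is a minimal sketch, not part of the original exercise. The function names `read_fasta` and `find_records` are illustrative, and the `EXAMPLE_FASTA` records are placeholders: only the Gag-Pol residues are copied from slide 3, while the other headers and sequences are dummy values standing in for a real download.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def find_records(records, keyword):
    """Return the records whose header contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [(h, s) for h, s in records if needle in h.lower()]


# Placeholder input. The Gag-Pol residues are the first line of slide 3;
# the other two records are dummy data, NOT real HIV-1 sequences.
EXAMPLE_FASTA = """\
>example_1 Gag-Pol polyprotein [Human immunodeficiency virus 1]
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERF
AVNPGLLETSEGCRQILGQLQPSLQT
>example_2 Envelope surface glycoprotein gp160 [Human immunodeficiency virus 1]
MAAAKKKLLL
>example_3 Vif [Human immunodeficiency virus 1]
MGGGSSSTTT
"""

if __name__ == "__main__":
    records = read_fasta(EXAMPLE_FASTA)
    for keyword in ("gag-pol", "envelope"):
        for header, seq in find_records(records, keyword):
            print(f"{keyword}: {header} ({len(seq)} residues)")
```

With a real download, replace `EXAMPLE_FASTA` with the contents of the saved file. The keyword match depends on how NCBI words each header, so check the headers by eye first.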

  3. HIV-1 Gag-Pol AA Sequence >gi|28872819|ref|NP_057849.4| Gag-Pol; Gag-Pol polyprotein [Human immunodeficiency virus 1] MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQT GSEELRSLYNTVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQG QMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAA EWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPT SILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPAATLEEMMTAC QGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGHTARNCRAPRKKGCWKCGKEG HQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTRRELQVWGRDNNSPSEAGADRQGTVSFNFPQ VTLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAI GTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEK EGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGD AYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQY MDDLYVGSDLEIGQHRTKIEELRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWT VNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSK DLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPI QKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGR QKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDQSESELVNQIIEQLIKKEKVYL AWVPAHKGIGGNEQVDKLVSAGIRKVLFLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKC QLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTI HTDNGSNFTGATVRAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIH NFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRNPLWKGPAKLLWKGEGAVVIQD NSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
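The header line on this slide packs several identifiers into one string: a GI number, a source database tag, an accession with version, a description, and the organism in square brackets. As a rough sketch, the fields can be pulled apart with a regular expression. The pattern below is an assumption based only on the header shown here and will not cover every NCBI header style; `parse_defline` is an illustrative name.

```python
import re

# Matches headers of the form shown on the slide:
#   >gi|<number>|<db>|<accession>| <description> [<organism>]
DEFLINE_RE = re.compile(
    r"^>?gi\|(?P<gi>\d+)\|(?P<db>\w+)\|(?P<accession>[\w.]+)\|\s*"
    r"(?P<description>.*?)\s*(?:\[(?P<organism>[^\]]+)\])?$"
)


def parse_defline(defline):
    """Split a gi-style FASTA header into its named fields."""
    match = DEFLINE_RE.match(defline.strip())
    if match is None:
        raise ValueError(f"unrecognized header format: {defline!r}")
    return match.groupdict()


if __name__ == "__main__":
    header = (
        ">gi|28872819|ref|NP_057849.4| Gag-Pol; Gag-Pol polyprotein "
        "[Human immunodeficiency virus 1]"
    )
    for field, value in parse_defline(header).items():
        print(f"{field}: {value}")
```

The accession (here `NP_057849.4`) is the field to paste into the search boxes of the tools on the later slides that accept an accession number.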

  4. BLASTing PDB • Open two browsers • Open the URL to NCBI BLAST P • BLAST the PDB database with the amino acid sequence from gag-pol and then env • Go to http://us.expasy.org/tools/blast/ • BLAST the PDB database as above • What are the top structures at each site?

  5. Digging Deeper into Sequence • From the ExPASy PDB BLAST results: • Choose another sequence (close or far) and do a Multiple Sequence Alignment • Choose BLOSUM or PAM matrices • View the alignments in HTML format
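To see what the choice of substitution matrix does, it may help to score a short alignment by hand. The sketch below sums per-column scores for two already-aligned sequences. It uses only a small subset of BLOSUM62 entries, enough for the toy example, and a linear gap penalty of -4 that is an arbitrary assumption rather than a setting from any of the tools named above. The function names are illustrative.

```python
# A small subset of BLOSUM62 scores. A real alignment needs the full
# 20 x 20 matrix; this table only covers the toy example below.
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("I", "I"): 4, ("L", "L"): 4, ("V", "V"): 4,
    ("K", "K"): 5, ("R", "R"): 5, ("D", "D"): 6, ("E", "E"): 5,
    ("W", "W"): 11,
    ("I", "L"): 2, ("I", "V"): 3, ("K", "R"): 2, ("D", "E"): 2,
}


def pair_score(a, b, matrix):
    """Look up a symmetric substitution score; KeyError if the pair is missing."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def score_alignment(seq1, seq2, matrix, gap_penalty=-4):
    """Sum column scores for two aligned sequences of equal length.

    A '-' in either sequence is scored with gap_penalty (assumed linear).
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            total += gap_penalty
        else:
            total += pair_score(a, b, matrix)
    return total


if __name__ == "__main__":
    # K/R = 2, I/V = 3, L/L = 4
    print(score_alignment("KIL", "RVL", BLOSUM62_SUBSET))
```

Conservative substitutions such as K/R or I/V still score positively, which is why two sequences with modest percent identity can align well under BLOSUM or PAM matrices.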

  6. NCBI BLASTp of PDB • After doing the BLAST P of PDB: • Click on related structures to see more • Follow the PDB links to the MMDB • Hint: you can use some of these structures at VAST for structure comparisons • How can you display the structure? • RasMol, SPDBV, and Cn3D viewers

  7. BLAST P – Other Data • From NCBI BLAST P – what are the conserved domains that are detected? • Click on each to find the Pfam entries • Show domain relatives (CDART) • (The next two images show results for gag-pol and env proteins – try both) • Path is from CDD to CDART – explore!

  8. Conserved Domain Databases • NCBI contains a database of conserved domains. These are linked, by sequence to BLAST and other tools. • Conserved domains represent “functional folds” in nature’s playbook. • You can compare your sequence by alignment (Pfam) to other protein folds. • Use CDART for graphical domain display.

  9. ExPASy Proteomics Tools • http://us.expasy.org/tools/ • Protein identification and characterization • DNA -> Protein • Similarity searches, pattern and profile searches • Post-translational modifications • Topology prediction • Primary structure analysis • Secondary structure prediction • Tertiary structure • Sequence alignment • Biological text analysis

  10. ExPASy ScanProsite • Go to ExPASy ScanProsite • http://www.expasy.ch/tools/scanprosite/ • Enter either HIV sequence (gag-pol or env) into the search box • You can choose to have the results returned by email here • What are the post-translational modifications? Click on the references.
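Many of the post-translational modification hits in a PROSITE scan come from short sequence patterns. As an illustration of the idea, and not a description of how ScanProsite itself is implemented, the sketch below converts a simple PROSITE-style pattern into a Python regular expression and reports every match. The example pattern is the N-glycosylation site motif `N-{P}-[ST]-{P}`. The converter handles only plain residues, `x`, `[..]`, `{..}` and `(n)` / `(n,m)` repeats; anchors and other syntax are not supported. The function names are illustrative.

```python
import re

N_GLYCOSYLATION = "N-{P}-[ST]-{P}"


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split each element into its core and an optional (n) or (n,m) repeat.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        # "[ST]"-style sets and single letters are already valid regex.
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)


def scan(sequence, pattern):
    """Return (1-based start, matched text) for every match, overlaps included."""
    # The lookahead lets matches that overlap each other all be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Fragment taken from the Gag-Pol sequence on slide 3.
    fragment = "KIEEEQNKSKKKAQQ"
    print(prosite_to_regex(N_GLYCOSYLATION))
    print(scan(fragment, N_GLYCOSYLATION))
```

A pattern match only says the motif is present. Whether the site is actually modified in the protein is a separate question, which is why the slide points to the references.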

  11. PIR – Georgetown University • Go to http://pir.georgetown.edu/ • Choose the iProClass database http://pir.georgetown.edu/iproclass/ • Paste in the gag-pol sequence • Look at the BLAST hits • Try the links to domain display and pattern match. What do you see?

  12. Pfam • Go to the Pfam home page at: http://www.sanger.ac.uk/Software/Pfam • Choose Protein search • Enter the HIV-1 gag-pol sequence • The search may take 3 to 5 minutes • The results page will show protein families and conserved domains

  13. SMART • Simple Modular Architecture Research Tool • Sequence analysis • Architecture analysis • Search with sequence or accession • Don’t forget to check a database: • Pfam • Signal peptides • Internal peptides

  14. NCBI Structure Tools • http://www.ncbi.nlm.nih.gov/Structure/ • Modeling tools for the MMDB and PDB • MMDB – Molecular Modeling Database • PDB – Protein Data Bank • Search by keyword (HIV-1 or gag-pol) • Follow links in and out of MMDB / PDB • RasMol, Chime, Cn3D structure viewers

  15. MMDB • Molecular Modeling Database • http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml • Contains weekly updates from PDB • “The structure database is considerably smaller than Entrez's protein or nucleotide databases, but a large fraction of all known protein sequences have homologs in this set”

  16. Cn3D Structure Viewer • Structure viewer • PC, Unix / Linux, Mac OSX etc. • Helper application • Structure view / sequence view • Can align and show multiple sequences • Has a great online tutorial • (read carefully) and try it out! • Exports files as PNG for great presos too!

  17. 1RTH Using Cn3D Saved as PNG

  18. VAST and VAST Search • Vector Alignment Search Tool • VAST Search is a service that allows searching for structural neighbors starting with a set of 3D coordinates specified by the user. • Type in a structure code (PDB), view similar alignments, and click to import. • (Start by BLASTing the PDB database).

  19. CDD and CDD Search • http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml • Enter a sequence, accession number, or search by keyword • CDD is linked from BLAST so you may enter it while doing sequence analysis • Where sequences share conserved domains there is often homology – or close cousins (UniGene)

  20. The Protein Machine • http://www2.ebi.ac.uk/translate/ • For translating nucleotide sequences into protein in three different modes • You can choose the sense strand or complement or any reading frame • You can start and end at any position • You can select any translation table • Or enter an accession number
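The translation step itself is simple enough to sketch locally. The code below is an illustration of the idea and not the EBI tool's implementation: it hard-codes the standard genetic code only, so the "any translation table" option is not covered, and it handles start and end positions only through the reading-frame offset. The names `translate` and `reverse_complement` are illustrative. The test DNA string was chosen so that its frame 1 translation is MGARASV, the first seven residues of the Gag-Pol sequence on slide 3.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna, frame=1, reverse=False):
    """Translate DNA (or RNA) in reading frame 1, 2 or 3.

    With reverse=True the reverse complement strand is translated.
    Stop codons become '*'; codons with unknown bases become 'X'.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = dna.upper().replace("U", "T")
    if reverse:
        seq = reverse_complement(seq)
    seq = seq[frame - 1:]
    # Walk complete codons only; a trailing partial codon is ignored.
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    dna = "ATGGGTGCGAGAGCGTCAGTA"
    for frame in (1, 2, 3):
        print(f"frame +{frame}: {translate(dna, frame)}")
        print(f"frame -{frame}: {translate(dna, frame, reverse=True)}")
```

Printing all six frames side by side makes it easy to see which one gives an open reading frame without early stop codons.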
