Annotation Presentation Week 3

Annotation Presentation Week 3. Sequence-based Similarity Module (BLAST & CDD only) & Horizontal Gene Transfer Module (Ortholog Neighborhood & GC content only). Phylogenetic tree of Bacteria. Insert Figure 1 from Handelsman (2004) Microbiol. Mol. Biol. Rev. 68: 669-685.

Presentation Transcript


  1. Annotation Presentation Week 3 • Sequence-based Similarity Module (BLAST & CDD only) & Horizontal Gene Transfer Module (Ortholog Neighborhood & GC content only)

  2. Phylogenetic tree of Bacteria • Insert Figure 1 from Handelsman (2004) Microbiol. Mol. Biol. Rev. 68: 669-685. • Recall: Planctomycetes are among the GEBA genomes, chosen to cover an under-represented phylum within domain Bacteria • GEBA: Genomic Encyclopedia of Bacteria & Archaea

  3. Recent phylogenetic analysis using the 23S rRNA gene supports the monophyletic grouping and branch order for these four bacterial phyla • Insert Figure 4A from Pilhofer et al. (2008) Characterization and Evolution of Cell Division and Cell Wall Synthesis Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes and Phylogenetic Comparison with rRNA Genes. J. Bacteriol. 190: 3192-3202.

  4. Members of the Planctomycetaceae Family http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=126

  5. The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between two sequences. • Conserved Domain Database Search (CDD) finds sequence similarity with genes in conserved orthologous groups (COGs).

  6. Verifying Function Based on Sequence Conservation • Different types of BLAST searches: blastp, blastn, blastx, tblastn, tblastx • >35% identity to an experimentally characterized protein (especially in conserved regions) can be considered good evidence for function • E-value: less than 10^-3 is significant; equal to or less than 10^-15 may indicate a good match • http://www.ncbi.nlm.nih.gov/ • Beware of mindless BLAST: similarity score and E-value do not tell the whole story! You must also consider length of match (query coverage) & biological function (organismal context) • Be cautious of auto-annotated gene function: GenBank is not a curated database
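
The rules of thumb on this slide can be written down as a small checklist. The sketch below is illustrative only: the function name `assess_blast_hit` and the 70% query-coverage cutoff (`min_coverage`) are assumptions that do not come from the slides, which give thresholds only for identity and E-value.

```python
def assess_blast_hit(percent_identity, e_value, query_coverage, min_coverage=70.0):
    """Apply the slide's rules of thumb to one BLAST hit.

    percent_identity and query_coverage are percentages (0-100).
    min_coverage is an assumed cutoff; the slides only say that
    query coverage must be considered, not what value to use.
    """
    flags = {
        # E-value less than 10^-3 is significant
        "significant": e_value < 1e-3,
        # E-value equal to or less than 10^-15 may indicate a good match
        "good_match": e_value <= 1e-15,
        # >35% identity to a characterized protein supports function
        "identity_supports_function": percent_identity > 35.0,
        # assumed coverage cutoff (not from the slides)
        "coverage_ok": query_coverage >= min_coverage,
    }
    # Any failed check means the hit deserves a closer manual look.
    flags["needs_manual_review"] = not all(flags.values())
    return flags


if __name__ == "__main__":
    print(assess_blast_hit(percent_identity=48.0, e_value=2e-40, query_coverage=95.0))
```

Passing every check does not verify function by itself; the hit's annotation may still be auto-assigned, so the organismal context still has to be judged by hand.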

  7. Follow this link from the lab notebook • BLAST: Altschul et al. (1997) Nucleic Acids Research 25: 3389-3402. • GenBank: Benson et al. (2006) Nucleic Acids Research 35: D21-D25.

  8. Retrieve query sequence from first module in imgACT Lab Notebook

  9. Copy amino acid sequence in FASTA format from the imgACT Lab Notebook
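
For reference, a FASTA record is a header line starting with ">" followed by one or more lines of sequence. The following minimal sketch shows how such text can be read into header/sequence pairs; the function name `parse_fasta` and the example record are hypothetical and are not part of imgACT or BLAST.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    example = ">example_gene hypothetical protein\nMKTAYIAKQR\nQISFVKSHFS\n"
    print(parse_fasta(example))
```

When pasting into BLAST, the header line and the sequence lines should be copied together so the query keeps its identifier.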

  10. Paste query sequence into box “Click”

  11. WHAT YOU SHOULD SEE. . . BLAST RESULTS Scroll down

  12. Accession ID Top significant hit • Start with first hit. . . • Click on Accession ID

  13. NOTE: Top hit is from the class organism; do not include results from P. limnophilus in lab notebook

  14. Accession ID Next significant hit • Click on Accession ID

  15. Copy/paste this information into imgACT notebook • NOTE: Function assigned by automatic Gene Caller (not experimentally verified)

  16. Reminder: Make sure you are in EDIT mode when making changes to imgACT notebook and SAVE your work along the way • Return to BLAST results for this information

  17. “Click” on Bit score

  18. Sequence length of database hit (not alignment length) • Pair-wise alignment with statistics (including E-value) • Copy/paste into imgACT notebook: • Length of alignment • Score • Expect (E-value) • Identities • Positives • Gaps • Pair-wise alignment between “Query” and “Sbjct” sequences.
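
To make the statistics on this slide concrete, the sketch below recomputes alignment length, identities, and gaps from a pair of aligned "Query" and "Sbjct" strings. It is a hedged illustration, not how BLAST itself is implemented: the function name `alignment_stats` is hypothetical, "-" is assumed to mark a gap, and "Positives" is left out because it depends on a substitution matrix.

```python
def alignment_stats(query_aln, sbjct_aln):
    """Count length, identities, and gaps for one pairwise alignment.

    Both strings must be the aligned rows (same length), with "-"
    marking gap positions.
    """
    if len(query_aln) != len(sbjct_aln):
        raise ValueError("aligned sequences must have the same length")
    if not query_aln:
        raise ValueError("alignment is empty")

    length = len(query_aln)
    pairs = list(zip(query_aln, sbjct_aln))
    # Identity: same residue in both rows, and not a gap.
    identities = sum(1 for q, s in pairs if q == s and q != "-")
    # Gap: a "-" in either row at that column.
    gaps = sum(1 for q, s in pairs if q == "-" or s == "-")

    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "percent_identity": round(100.0 * identities / length, 1),
    }


if __name__ == "__main__":
    # Toy alignment, for illustration only.
    print(alignment_stats("MKT-AILV", "MKTQAI-V"))
```

Note that the percentage is taken over the alignment length, which is why the slide distinguishes alignment length from the full sequence length of the database hit.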

  19. NOTE: You need to modify your notebook for requested info (statistics include E-value) 725 • REPEAT procedure with second BLAST hit.

  20. “Click” on Accession ID “Click” on Bit score Copy/paste requested information in lab notebook 733

  21. CDD: Conserved Domain Database • COG 1: ion transport • COG 2: energy production • COG 3: cell division • etc. • COG genes have sequence similarity & functional conservation • Bi-directional best hit in curated database • Figure from Sanders-Lorenz and Miller (2010)
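
The "bi-directional best hit" idea can be sketched in a few lines: gene A in one genome and gene B in another form a pair only if B is A's best hit and A is B's best hit. The example assumes the best hits have already been computed (for instance from BLAST results); the function name and gene identifiers are hypothetical, and the actual COG construction procedure involves more than this single check.

```python
def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Return (gene_a, gene_b) pairs that are each other's best hit.

    best_a_to_b maps each gene in genome A to its best hit in genome B;
    best_b_to_a maps each gene in genome B to its best hit in genome A.
    """
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a  # the hit must point back to the same gene
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    a_to_b = {"geneA1": "geneB7", "geneA2": "geneB3"}
    b_to_a = {"geneB7": "geneA1", "geneB3": "geneA9"}
    print(bidirectional_best_hits(a_to_b, b_to_a))
```

In the toy data, geneA2 is dropped because its best hit, geneB3, points back to a different gene.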

  22. Return to top of BLAST Results page • CDD: Marchler-Bauer et al. (2006) Nucleic Acids Research 35: D237-D240.

  23. “Click” on Conserved Domain image “Click”

  24. If there are no hits, write “no significant hits” in notebook • If there are hits, scroll down & click the + sign next to the top hit • Click here

  25. Copy top COG hit and COG name into notebook Modify BOX to include length, bit score, and E-value COG description COG hit COG name Length, bit score, and E-value

  26. Change headings and enter COG information as shown for top hit • If you obtain more than one significant hit, record this info for at least the top 2 hits • Hint: Look at Score & E-value

  27. Retrieve from Gene Detail page

  28. How do I return to the Gene Detail page for my proposed gene? “Click” on URL saved for your gene during first module (week 2)

  29. Then what? Keep the Gene Detail page open in a separate tab while working on imgACT Lab Notebook modules • Scroll down

  30. “Click” here on Gene Detail page

  31. Change to 40

  32. Note the red arrow corresponds to your gene • Plus strand genes on top (left to right) • Minus strand genes on bottom (right to left) • Is your gene a stand-alone ORF or is it clustered with other genes on the same DNA strand and in the same orientation? • Could be evidence that your gene is part of an operon • What are the functions of adjacent genes? Do they have related functions? • How conserved is the gene neighborhood? • Are there similar patterns in other organisms that contain a gene from the same orthologous group? • If considerably different, may be evidence for HGT
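
The "stand-alone ORF or cluster" question can be sketched as a simple check on gene coordinates: walk outward from your gene and collect neighbors that sit on the same strand with only a short gap between them. This is a rough illustration, not an operon predictor; the `Gene` record, the function name, and the 150 bp gap cutoff (`max_gap`) are all assumptions that do not come from the slides or from IMG.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # leftmost coordinate on the contig
    end: int     # rightmost coordinate on the contig
    strand: str  # "+" or "-"


def same_strand_cluster(genes, target_name, max_gap=150):
    """Return names of contiguous same-strand neighbors around the target.

    max_gap is an assumed intergenic distance cutoff in base pairs.
    A result containing only the target suggests a stand-alone ORF.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    idx = next(i for i, g in enumerate(ordered) if g.name == target_name)
    target = ordered[idx]
    cluster = [target]

    # Extend to the left while strand matches and the gap stays small.
    i = idx - 1
    while (i >= 0 and ordered[i].strand == target.strand
           and cluster[0].start - ordered[i].end <= max_gap):
        cluster.insert(0, ordered[i])
        i -= 1

    # Extend to the right under the same conditions.
    i = idx + 1
    while (i < len(ordered) and ordered[i].strand == target.strand
           and ordered[i].start - cluster[-1].end <= max_gap):
        cluster.append(ordered[i])
        i += 1

    return [g.name for g in cluster]


if __name__ == "__main__":
    # Hypothetical neighborhood, for illustration only.
    neighborhood = [
        Gene("a", 100, 900, "+"),
        Gene("b", 950, 1800, "+"),
        Gene("c", 1850, 2500, "-"),
        Gene("d", 2600, 3000, "+"),
    ]
    print(same_strand_cluster(neighborhood, "b"))
```

A cluster found this way is only a hint; whether the adjacent genes have related functions still has to be checked by reading their annotations.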

  33. Need to save individual panels as JPEG or PNG files. Include P. limnophilus as well as 4-5 different organisms in imgACT notebook.

  34. “Click” here to insert images into notebook Delete ‘gene neighborhood images’ and place cursor in the box

  35. 1- Click “Browse” to find image file. 2- Press “Attach” button. Thumbnail image should appear in window. 3- Repeat for each individual neighborhood panel until all are loaded in the window prompt.

  36. 4- Next, select one image at a time and press [OK] to insert them into imgACT notebook at cursor position. NOTE: The images should be inserted in the same order that the organisms were listed in img/edu • Insert next image

  37. Results: Ortholog Neighborhood Scroll down

  38. Enter comments about homology & context: • Is your gene a stand-alone ORF or is it clustered with other genes on the same DNA strand and in the same orientation? • Could be evidence that your gene is part of an operon • What are the functions of adjacent genes? Do they have related functions? • How conserved is the gene neighborhood? • Are there similar patterns in other organisms that contain a gene from the same orthologous group? • If considerably different, may be evidence for HGT

  39. Retrieve from Organism Details page Retrieve from Gene Detail page

  40. On Gene Detail page, you will find the GC content for your gene.

  41. To find GC content for the entire P. limnophilus genome, select “Find Genomes” tab from the Gene Detail page.

  42. Search for Planctomyces limnophilus and click on the corresponding hyperlink.

  43. WHAT YOU SHOULD SEE. . . Scroll down

  44. GC content will be listed under Genome Statistics.

  45. NOTE: A gene with a GC content that is more than a few percentage points above or below the average GC content of the genome may have originated from another organism by HGT. Add a comment box & make note of this if your gene meets this criterion.
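
The GC comparison can be sketched as follows. In the module both values are read directly from the Gene Detail and genome pages, so computing GC from sequence is shown only to make the definition explicit. The function names are hypothetical, and the 5-point cutoff is an assumed stand-in for "more than a few percentage points", which the slides do not quantify.

```python
def gc_content(seq):
    """Return the GC content of a DNA sequence as a percentage."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore N and other symbols
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def gc_deviates(gene_gc, genome_gc, threshold=5.0):
    """True if the gene's GC% differs from the genome average by more
    than `threshold` percentage points (assumed cutoff, not from the slides)."""
    return abs(gene_gc - genome_gc) > threshold


if __name__ == "__main__":
    # Toy sequence and placeholder genome value, for illustration only.
    gene_gc = gc_content("ATGGCCGCGTTAACGGGCTGA")
    print(round(gene_gc, 1), gc_deviates(gene_gc, genome_gc=54.0))
```

A deviation flagged this way is a prompt to add the comment box, not proof of HGT; it should be weighed together with the ortholog neighborhood results.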
