Computational Analysis of Transcript Identification Using GenBank

Computational Analysis of Transcript Identification Using GenBank. Slides by Terry Clark. Differentiation of hematopoietic cells. Genome-wide gene expression. SAGE (Serial Analysis of Gene Expression). Figure 1 Schematic illustration of the SAGE process.

Presentation Transcript


  1. Computational Analysis of Transcript Identification Using GenBank. Slides by Terry Clark

  2. Differentiation of hematopoietic cells

  3. Genome-wide gene expression

  4. SAGE (Serial Analysis of Gene Expression)

  5. Figure 1. Schematic illustration of the SAGE process. Jes Stollberg et al. Genome Res. 2000; 10: 1241-1248

  6. SAGE & GLGI Overview

  7. What is the chance of duplicate tags? • We can assume we are drawing randomly from the set of all sequences of the given tag length over the 4-letter alphabet (A, C, G, T) • This is the same problem as having unique overlaps in the contig matching problem for shotgun sequencing

  8. Random Model
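The random model from slide 7 can be sketched numerically. The following is a minimal illustration, assuming tags are drawn independently and uniformly from all 4^L sequences of tag length L (a birthday-problem calculation). The function names are hypothetical and are not taken from the slides.

```python
import math


def prob_any_duplicate(n_tags: int, tag_len: int) -> float:
    """Probability that at least two of n_tags uniform random tags coincide."""
    space = 4 ** tag_len  # number of possible tags of this length
    if n_tags > space:
        return 1.0  # pigeonhole: more tags than possible sequences
    # log P(all distinct) = sum over i of log(1 - i / space)
    log_p_distinct = sum(math.log1p(-i / space) for i in range(n_tags))
    return 1.0 - math.exp(log_p_distinct)


def expected_distinct(n_tags: int, tag_len: int) -> float:
    """Expected number of distinct sequences seen after n_tags uniform draws."""
    space = 4 ** tag_len
    # space * (1 - (1 - 1/space)^n), computed stably for large space
    return space * -math.expm1(n_tags * math.log1p(-1.0 / space))


if __name__ == "__main__":
    # Tag length and tag count as quoted on the later slides.
    print(prob_any_duplicate(32851, 10))
    print(expected_distinct(32851, 10))
```

Under these assumptions the result depends only on the number of tags and the tag length, which makes the model a convenient baseline to compare against counts observed in real sequence data.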

  9. Random model does not reflect biological process • Genes evolve by duplication as well as point mutation • Many motifs are repeated • Function widgets at work? • Result is a strong bias in observed biological sequences, not the uniform distribution that the simple model assumes. • Here are some numbers…

  10. SAGE tags match to many genes (Tags from Hashimoto S, et al. Blood 94:837, 1999)

  11. Tag Frequency Groups for 10-base Tag Set Containing 878,938 Tags for UniGene Human

  12. Unique Tags among 878,938 EST Derived Tags

  13. Unique Tags among 32,851 Gene Derived Tags

  14. Converting tag into longer 3’ sequence

  15. Generation of Longer 3'cDNA for Gene Identification (GLGI)

  16. UniGene Human 3’ Part Length Distribution

  17. Myeloid Tag Matches with UniGene Human SAGE Tag Reference Database

  18. SAGE Tag Processing with GIST

  19. k-mer tree
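The slide gives only the title, so the structure GIST actually uses is not shown here. As a generic illustration of the idea, a k-mer tree can be sketched as a trie over the bases A, C, G, T in which each stored tag ends at a node holding the labels of the sequences it came from. The class name, the `"$"` marker key, and the labels below are all assumptions made for this sketch.

```python
class KmerTree:
    """Hypothetical trie keyed by nucleotide; not the GIST implementation."""

    LABELS = "$"  # marker key that cannot collide with a base letter

    def __init__(self) -> None:
        self.root: dict = {}

    def insert(self, kmer: str, label: str) -> None:
        """Store label under kmer, creating one node per base as needed."""
        node = self.root
        for base in kmer:
            node = node.setdefault(base, {})
        node.setdefault(self.LABELS, []).append(label)

    def lookup(self, kmer: str) -> list:
        """Return all labels stored for kmer, or an empty list if absent."""
        node = self.root
        for base in kmer:
            node = node.get(base)
            if node is None:
                return []
        return list(node.get(self.LABELS, []))


if __name__ == "__main__":
    tree = KmerTree()
    tree.insert("ACGTACGTAC", "geneA")  # made-up tag and labels
    tree.insert("ACGTACGTAC", "geneB")
    print(tree.lookup("ACGTACGTAC"))
```

A lookup walks one node per base, so its cost depends on the tag length rather than on how many tags are stored, and a tag that matches several genes simply returns several labels.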

  20. GIST Performance with Improved IO

  21. Conspirators: Terry Clark, Andrew Huntwork, Josef Jurek, L. Ridgway Scott, Sanggyu Lee, Janet D. Rowley, San Ming Wang