
Searching for Novel Fusion Genes

Noura Dabbouseh, Genís Parra. 1 April 2004. Outline: review of fusion genes; study of currently known fusion gene possibilities in the human and mouse genomes; searching for possible novel fusion transcripts in geneid predictions in ENCODE regions.


Presentation Transcript


  1. Searching for Novel Fusion Genes • Noura Dabbouseh, Genís Parra • 1 April 2004

  2. Outline • Review of Fusion Genes • Study of currently known fusion gene possibilities • Human Genome • Mouse Genome • Searching for possible novel fusion transcripts in geneid predictions in ENCODE regions • Future work – along the way

  3. Fusion Genes In a chimeric transcript, two adjacent genes are transcribed (and spliced) into a single mRNA molecule. This mRNA molecule can contain exons from both genes, encoding domains from the two different proteins. Regulated chimerism could be a mechanism to generate additional protein diversity in the human genome.

  4. Kua and UEV

  5. Motivation - Known Fusion Genes • Thomson et al. (2000) – Fusion of the human gene for the polyubiquitination coeffector UEV1 with Kua, a newly identified gene • Poulin et al. (2003) – Gene fusion in the mammalian genes for 4E-BP3 and MASK • Hernandez – Insulin precursor and Tyrosine hydroxylase

  6. Motivation, cont’d. • With at most a handful of known examples, it is worth searching for novel fusion genes • Search in multiple organisms and examine conservation • In theory, if new cases are found, their conservation should be higher than that of other proteins (Veeramachaneni et al. showed otherwise in their paper, which dealt with genes with overlapping UTRs) • The number of genes we think exist in the human genome is like the value of the U.S. dollar: it keeps shrinking; fusion genes could add complexity

  7. Study of currently “known” fusion gene possibilities

  8. Looking for Already Annotated Fusion Genes • Find clusters of mRNAs with overlapping exons following one of these two patterns:

  9. Searching Algorithm • Find pairs of overlapping genes • For genes with more than one overlap, “process” these Gene A types: • Cluster the set: group all genes (Gene B types) that cross the same Gene A • Keep only those Gene B’s in a cluster whose coding regions overlap Gene A’s coding regions • Filter for clusters where there are at least two non-overlapping Gene B’s
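
The steps on this slide can be sketched roughly as follows. This is an illustration only, not the authors' scripts: the `Gene` record, the half-open coordinates, and the assumption that all genes lie on the same chromosome and strand are hypothetical choices made for the example.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical record: transcript span plus a list of coding-exon intervals.
# Coordinates are half-open; all genes are assumed to share chromosome and strand.
Gene = namedtuple("Gene", "name start end cds")

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def cds_overlap(a, b):
    """True if any coding exon of a overlaps any coding exon of b."""
    return any(overlaps(s1, e1, s2, e2)
               for s1, e1 in a.cds for s2, e2 in b.cds)

def candidate_fusions(genes):
    """Map each Gene A to the Gene B's it spans, keeping only clusters
    that contain at least two mutually non-overlapping Gene B's."""
    results = {}
    for a in genes:
        # Cluster: every other gene that crosses A and shares coding sequence with it.
        cluster = [b for b in genes
                   if b is not a
                   and overlaps(a.start, a.end, b.start, b.end)
                   and cds_overlap(a, b)]
        # Keep the cluster only if some pair of B's does not overlap each other.
        if any(not overlaps(x.start, x.end, y.start, y.end)
               for x, y in combinations(cluster, 2)):
            results[a.name] = [b.name for b in cluster]
    return results
```

The quadratic scan over all gene pairs is acceptable for a sketch; a real run over a whole genome would normally sort by position or use an interval index first.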

  10. Results: Human vs. Mouse Genome

  11. Examples

  12. Human I: INPP5F, inositol polyphosphate-5-phosphatase F isoform

  13. Human II: PCDHA1, protocadherin alpha 1 isoform 1 precursor

  14. Human III: SPAG11

  15. Mouse I: Igsf1, immunoglobulin superfamily member 1 long

  16. Mouse II: Serpina1d, serine (or cysteine) proteinase inhibitor clade

  17. Conclusions • Many cases found where the definition of fusion gene can be applied • In the database these cases are mostly annotated as gene variants or isoforms • Is the functional annotation correct? All those cases have two promoter regions and two non-overlapping transcripts: are they really variants of the same gene?

  18. Next Step • Search for homologous sequences in Mouse and Human result sets

  19. Searching for Novel Fusion Genes using ESTs

  20. Methods • Search for adjacent non-overlapping genes • Keep those pairs where at least one EST crosses both genes • Further filter set by looking for overlapping coding regions • Use spidey to map ESTs back to mRNA sequences: keep those pairs where there are overlapping coding regions • ORF must be maintained in both genes • Of result set, check which ones exist in NR database
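
The first two steps (adjacent non-overlapping genes, then pairs bridged by at least one EST) might look like the sketch below. The tuple layouts for genes and EST alignments are assumptions made for illustration; the Spidey mapping, the ORF check, and the NR lookup are not covered here.

```python
def adjacent_pairs(genes):
    """genes: list of (name, start, end) on one strand of one chromosome
    (hypothetical layout, half-open coordinates).
    Returns neighbouring pairs, in start order, that do not overlap."""
    ordered = sorted(genes, key=lambda g: g[1])
    return [(g, h) for g, h in zip(ordered, ordered[1:]) if g[2] <= h[1]]

def bridged_pairs(genes, ests):
    """ests: list of (est_id, [(block_start, block_end), ...]) giving the
    aligned blocks of each EST on the genome (hypothetical layout).
    Keeps gene pairs where at least one EST has a block in both genes."""
    hits = []
    for g, h in adjacent_pairs(genes):
        support = [est_id for est_id, blocks in ests
                   if any(s < g[2] and g[1] < e for s, e in blocks)
                   and any(s < h[2] and h[1] < e for s, e in blocks)]
        if support:
            hits.append((g[0], h[0], support))
    return hits
```

Only neighbours in sorted order are compared, so a gene nested inside another can hide a valid adjacent pair; a fuller version would need to handle that case.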

  21. Results: Human vs. Mouse Genome

  22. Examples in Mouse Genome

  23. Next steps • Experimental Verification • Send mouse possibilities to Switzerland to see if any hybrid transcripts check out • Look for conservation between Mouse and Human fusion genes

  24. Future Work • Revise scripts to make sure we didn’t lose any data • Increase data set beyond just RefSeqs to include known genes and VEGA Annotations • Maybe develop scripts into more contained software to be run on other genomes rapidly

  25. Using geneid to predict fusion genes

  26. Methods • Build a more complete dataset using known genes, VEGA annotations and refseq UCSC mapped annotations • Search for adjacent non-overlapping genes • Get the genomic sequence containing the two transcripts • Force geneid to predict one gene in the genomic regions • Further filter set by looking for overlapping coding regions
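
As a rough sketch of the bookkeeping around the geneid run (the geneid call itself is deliberately left out, since its options are not given on the slide): one helper computes the genomic window that contains both transcripts, and another checks whether a single predicted gene has exons overlapping the coding regions of both annotated genes. The function names, the `flank` parameter, and the interval layout are hypothetical.

```python
def spanning_region(gene_a, gene_b, flank=0):
    """gene_a, gene_b: (start, end) transcript spans on the same chromosome.
    Returns the window covering both, padded by an optional flank."""
    start = max(0, min(gene_a[0], gene_b[0]) - flank)
    end = max(gene_a[1], gene_b[1]) + flank
    return start, end

def joins_both(predicted_exons, cds_a, cds_b):
    """predicted_exons: exons of ONE predicted gene, as (start, end) pairs.
    cds_a, cds_b: coding exons of the two annotated genes.
    True if the prediction overlaps coding sequence of both genes."""
    def hits(cds):
        return any(ps < ce and cs < pe
                   for ps, pe in predicted_exons
                   for cs, ce in cds)
    return hits(cds_a) and hits(cds_b)
```

A prediction that passes `joins_both` is only a candidate chimeric transcript; the later slides validate such candidates against RT-PCR results.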

  27. Chimeric transcript prediction geneid

  28. Validation of the method • We use the 11 genes with positive RT-PCR to estimate the possible success rate of this protocol. * taking the last predicted exons overlapping the real refGenes

  29. No amplification (5/11)

  30. Amplification (2/11)

  31. Perfect prediction (4/11)

  32. Recovering mouse regions • Homologene – We only recover 3/11 cases • Using tight BLAT alignments (200 bp average) we can recover the corresponding region in mouse for 10/11 cases

  33. Homologous regions

  34. Homologous regions 2

  35. Predictions (encode region)

  36. Future Work • Obtain the homologous genomic regions in mouse and compare geneid predictions in mouse (?) • Send this data to Switzerland

  37. Acknowledgments • Roderic Guigo Lab • The Fulbright Commission • U.S. Congress and State Department • Spanish Fulbright Commission
