
Genomic ORFans: Past, Present and Future

Genomic ORFans: Past, Present and Future. Naomi Siew and Daniel Fischer, Ben-Gurion University, Be’er-Sheva, Israel. 1995: The Genomic Revolution. Dozens of genomes were fully sequenced; dozens more are underway. ORF (Open Reading Frame): start codon ……… stop codon.


Presentation Transcript


  1. Genomic ORFans: Past, Present and Future. Naomi Siew and Daniel Fischer, Ben-Gurion University, Be’er-Sheva, Israel

  2. 1995: The Genomic Revolution • Dozens of genomes were fully sequenced • Dozens more are underway ORF – Open Reading Frame start codon ……… stop codon
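The "start codon ……… stop codon" definition of an ORF can be illustrated with a minimal sketch. It assumes the standard genetic code (ATG as the only start codon; TAA, TAG and TGA as stops) and scans the forward strand only; the function name `find_orfs` and its `min_codons` parameter are illustrative and do not come from the presentation. Real gene finders also handle the reverse strand and alternative start codons.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=1):
    """Return (start, end, frame) for each ORF on the forward strand.

    start/end are 0-based, end-exclusive, and the span includes the stop codon.
    min_codons counts the codons before the stop, start codon included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                      # open a reading frame
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3    # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # close it and keep scanning
    return orfs

if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG"))
```

On the toy input, the only ORF is `ATGAAATAG`, found in the third reading frame.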

  3. Descent With Modification (Divergent Evolution) ..KSMEDQRRIMIRPID.. ..QSMEQIRRIMLRPTD.. ..KSLDDIRRIPIRPID..
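To make the divergence between the three fragments on this slide concrete, here is a small sketch that counts identical positions between equal-length, already-aligned sequences. The helper name `identical_positions` is an assumption for illustration; the slide itself does not describe any particular scoring method.

```python
from itertools import combinations

def identical_positions(a, b):
    """Count positions at which two equal-length aligned sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x == y)

# The three fragments shown on the slide (flanking dots removed).
fragments = ["KSMEDQRRIMIRPID", "QSMEQIRRIMLRPTD", "KSLDDIRRIPIRPID"]

if __name__ == "__main__":
    for a, b in combinations(fragments, 2):
        n = identical_positions(a, b)
        print(f"{a} vs {b}: {n}/{len(a)} identical")
```

As sequences keep diverging, this count falls; an ORFan may be the extreme case in which similarity is no longer detectable at all.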

  4. M. genitalium T. volcanium S. cerevisiae C. elegans E. coli M. tuberculosis S. solfataricus H. influenzae E. coli B. subtilis B. subtilis M. pneumoniae B. halodurans B. subtilis B. halodurans ORF

  5. Orphan ORFs = ORFans (Fischer and Eisenberg, Bioinformatics, 15(9), 1999) Singleton ORFan: An ORF that has no sequence similarity to any other sequence in the databases. Little can be inferred about ORFans using bioinformatic tools.

  6. 20-30% of ORFs in each new genome are singleton ORFans.

  7. ORFans May Be… • New, previously unseen proteins, (with new function, new structure) unique to one organism (species-specific). • Distant relatives of known families (similar function, similar 3D structure) whose sequence diverged beyond recognition by sequence comparison tools.

  8. The Puzzle of ORFans • If new ORFs, where did they come from? How did they evolve? • If distant relatives, why aren’t there similar sequences? Where are the intermediates?

  9. Census and Dynamics of ORFans • Built a database of fully sequenced genomes. • Added genomes one by one in chronological order of publication. • For each ORF, ran BLAST: if there is a match → non-ORFan; if there is no match → ORFan. Previous ORFans can become non-ORFans.
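The census procedure on this slide can be sketched as follows. This is only an illustration under stated assumptions: `shares_kmer` is a toy stand-in for a BLAST hit (two sequences "match" if they share any 5-residue substring), and the function names, data layout and sequences are hypothetical, not the authors' pipeline.

```python
def shares_kmer(a, b, k=5):
    """Toy stand-in for a BLAST hit: True if a and b share any k-mer."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))

def orfan_census(genomes, is_match=shares_kmer):
    """Add genomes in the given (chronological) order and track ORFans.

    genomes: list of (name, [protein sequences]).
    Returns one (name, total_orfs, n_orfans, percent_orfans) tuple per step.
    """
    database = []   # entries are [genome_name, sequence, is_orfan]
    history = []
    for name, seqs in genomes:
        first_new = len(database)
        database.extend([name, s, True] for s in seqs)
        # Compare every new ORF with everything that precedes it.
        for i in range(first_new, len(database)):
            for j in range(i):
                if is_match(database[i][1], database[j][1]):
                    database[i][2] = False
                    database[j][2] = False   # a previous ORFan becomes a non-ORFan
        n_orfans = sum(1 for e in database if e[2])
        pct = 100.0 * n_orfans / len(database) if database else 0.0
        history.append((name, len(database), n_orfans, pct))
    return history

if __name__ == "__main__":
    toy = [("A", ["MKTAYIAKQR", "GGGGGHHHHH"]),
           ("B", ["XXMKTAYXX", "PLPLPLPLPL"])]
    for row in orfan_census(toy):
        print(row)
```

In the toy run, adding genome B turns one of genome A's ORFans into a non-ORFan while contributing a new ORFan of its own, so the count stays at two while the percentage drops from 100% to 50%.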

  10. The number of ORFans is growing, while their percentage is declining.

  11. Each new genome contains ORFs that match previous ORFans, but also new ORFans

  12. Addition of a closely related organism causes a large drop in the percentage of ORFans of the relative

  13. Future Trends: the number of ORFans may start dropping, and their percentage may keep declining?

  14. Length Distribution

  15. Length Bias • Bias among short sequences for ORFans. • (almost half of short sequences are ORFans) • Bias among ORFans for short sequences. • (half of ORFans are short)
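The two biases on this slide are two different conditional fractions: the share of short sequences that are ORFans, and the share of ORFans that are short. A minimal sketch, assuming each ORF is given as a `(length, is_orfan)` pair; the slide does not state what length counts as "short", so `short_cutoff` is a required, caller-chosen parameter and the value used below is arbitrary.

```python
def length_bias(orfs, short_cutoff):
    """Return (fraction of short ORFs that are ORFans,
               fraction of ORFans that are short).

    orfs: iterable of (length, is_orfan) pairs.
    short_cutoff: lengths strictly below this value count as short.
    """
    orfs = list(orfs)
    short = [o for o in orfs if o[0] < short_cutoff]
    orfans = [o for o in orfs if o[1]]
    short_orfans = [o for o in short if o[1]]
    nan = float("nan")
    orfan_given_short = len(short_orfans) / len(short) if short else nan
    short_given_orfan = len(short_orfans) / len(orfans) if orfans else nan
    return orfan_given_short, short_given_orfan

if __name__ == "__main__":
    toy = [(80, True), (90, False), (300, True), (400, False)]
    print(length_bias(toy, short_cutoff=100))
```

The two fractions share a numerator (short ORFans) but have different denominators, which is why the slide reports them separately.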

  16. Separate dynamics analyses of short and long ORFans show different behaviors • Percentage of short ORFans is declining more slowly. Possible explanations: not expressed; frame shifts; wrong stop codons; technical limitations. • Percentage of long ORFans is declining faster. Possible explanations: more conserved; ORFan modules.

  17. ORFan Modules MGTGDKFCKDKIECAPL KFSRDKIECAFLHGRFCGRFCGDGSP GEISFLIGGRYL ORFan Module: A segment of a sequence that has no matches with other sequences.
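One way to picture an ORFan module is as the part of a sequence left uncovered by any match. The sketch below assumes the matched regions are already known as half-open `(start, end)` intervals (for example, taken from hit coordinates) and returns the uncovered segments; the function name, the interval convention and the `min_len` filter are illustrative assumptions, not the authors' definition of how modules are detected.

```python
def orfan_modules(seq_len, matched, min_len=1):
    """Return uncovered segments of a sequence as (start, end) intervals.

    seq_len: length of the query sequence.
    matched: list of half-open (start, end) intervals covered by matches;
             intervals may overlap and may be given in any order.
    min_len: shortest uncovered segment to report.
    """
    modules = []
    pos = 0                                  # end of the region covered so far
    for start, end in sorted(matched):
        if start - pos >= min_len:
            modules.append((pos, start))     # gap before this match
        pos = max(pos, end)
    if seq_len - pos >= min_len:
        modules.append((pos, seq_len))       # uncovered tail
    return modules

if __name__ == "__main__":
    print(orfan_modules(50, [(0, 17), (30, 42)]))
    print(orfan_modules(50, [(0, 17), (30, 42)], min_len=10))
```

With no matched intervals at all, the whole sequence is returned as a single segment, which corresponds to a singleton ORFan.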

  18. Interim Conclusions • Evolution has left us with two types of sequences: homologs and ORFans. • The number of singleton ORFans has been growing. • Their percentage is diminishing.

  19. Interim Conclusions II • There is a bias towards short sequences among singleton ORFans, and vice versa. • Most longer singleton ORFans may disappear with time. • New genomes of closely related organisms will have fewer singleton ORFans.

  20. A Broader ORFan Perspective (figure labels: ORF, B. subtilis, B. halodurans) Orthologous ORFan: An ORF with matches in a family of closely related genomes only and none outside this family.

  21. Currently orthologous ORFans are counted as non-ORFans. • Family-specific? • Most probably expressed proteins.

  22. Paralogous ORFan: An ORF with matches in the same genome only and none outside the genome.

  23. Currently paralogous ORFans are counted as non-ORFans. • Species-specific? • Most probably expressed proteins.
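The three ORFan types defined so far (singleton, orthologous, paralogous) differ only in where an ORF's matches are found, which can be sketched as a small classifier. It assumes the genomes of all matching sequences are already known (self-hits excluded) and that a genome-to-family mapping is available; `classify_orf`, `family_of` and the family labels below are hypothetical names used for illustration.

```python
def classify_orf(own_genome, hit_genomes, family_of):
    """Classify an ORF by where its matches occur.

    own_genome:  genome containing the ORF.
    hit_genomes: genomes of all other sequences that match it.
    family_of:   dict mapping each genome to a family of close relatives.
    """
    hits = set(hit_genomes)
    if not hits:
        return "singleton ORFan"        # no matches anywhere
    if hits == {own_genome}:
        return "paralogous ORFan"       # matches in the same genome only
    own_family = family_of[own_genome]
    if all(family_of[g] == own_family for g in hits):
        return "orthologous ORFan"      # matches confined to close relatives
    return "non-ORFan"

if __name__ == "__main__":
    # Illustrative grouping only; the family labels are assumptions.
    family_of = {"B. subtilis": "Bacillus",
                 "B. halodurans": "Bacillus",
                 "E. coli": "Escherichia"}
    print(classify_orf("B. subtilis", [], family_of))
    print(classify_orf("B. subtilis", ["B. subtilis"], family_of))
    print(classify_orf("B. subtilis", ["B. halodurans"], family_of))
    print(classify_orf("B. subtilis", ["B. halodurans", "E. coli"], family_of))
```

Note that in a singleton-only census, as described on the slides, the second and third cases are counted as non-ORFans.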

  24. Future and Ongoing Work • Study the other types of ORFans (orthologous, paralogous, modules). • Try to assign distantly related ORFans to known families: * in silico: using more sensitive bioinformatic tools such as fold recognition. * in the lab: determining the 3D structure of selected ORFans. • However, even if all ORFans were assigned to known families, the puzzle of their evolution would still remain.

  25. Ongoing in silico/experimental ORFan studies at BGU • Mini-structural genomics project to study selected paralogous ORFans in the archaeon Halobacterium NRC-15. • Bioinformatics (our group) • Archaea biology (Dr. Gerry Eichler) • Crystallography (Prof. Boaz Sha’anan)

  26. Acknowledgements Prof. Joel Bernstein Department of Chemistry, BGU
