
Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome

Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome. Brooke Peterson-Burch Voytas Laboratory Iowa State University. Beyond genes. Most DNA in eukaryotes doesn’t code for anything necessary for the survival and replication of the organism.


Presentation Transcript


  1. Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome Brooke Peterson-Burch Voytas Laboratory Iowa State University

  2. Beyond genes • Most DNA in eukaryotes doesn’t code for anything necessary for the survival and replication of the organism. • How did that sequence get there? • Why isn’t it eliminated? • Genome sequences can teach us about genome evolution and the part that retroelements play

  3. What’s a retroelement? • Type of transposable element • An mRNA copy of the parental element ‘genome’ is reverse transcribed into DNA and inserted into a new location in the host • Transposition is replicative

  4. Retroelement genomes [Figure: schematic genome maps for retroposons, Dirs, Pseudoviridae, BEL, Metaviridae, and Retroviridae (HIV-1). Labeled features include gag (MA, CA, NC), PR, RT, RH, IN, EN, λ recombinase, env (SU, TM), the poly(A) tail (AAAn), LTRs, and the HIV-1 labels pol, tat, rev, nef, vif, vpr, vpu, and p6.]

  5. Retro living… [Figure: life-cycle step showing Transcription (element to mRNA) and Translation, drawn above the HIV-1 and Pseudoviridae genome maps.]

  6. Retroelement life cycle • Particle packaging • Only viruses escape the host cell [Figure: packaging of the element into a particle, drawn above the HIV-1 and Pseudoviridae genome maps.]

  7. Retroelement life cycle • Reverse transcription [Figure: the element is reverse transcribed into cDNA, drawn above the HIV-1 and Pseudoviridae genome maps.]

  8. Retroelement life cycle • Integration [Figure: IN integrates the cDNA to give a new copy alongside the original element, drawn above the HIV-1 and Pseudoviridae genome maps.]

  9. Retroelements play a major role in the structure and evolution of many genomes • Genome sequences provide a great resource for diversity, distribution, and element identification studies

  10. Retroelements and Genomes • Genome data-mining can help answer questions about: • Number of Elements • Types of Elements • Diversity • Physical distribution • Impact on host • Odd or interesting elements • Evolutionary history • Element sequence and domain characteristics

  11. Diversity of the Pseudoviridae

  12. A retroelement family tree [Figure: tree with branches labeled Retroviridae, Metaviridae, Dirs, Retroposons, BEL, and Pseudoviridae.]

  13. A. thaliana captures all plant Pseudoviridae diversity [Figure: Pseudoviridae tree with node support values and a 0.1 scale bar, shown beside the retroelement family tree (Retroviridae, Metaviridae, Dirs, Retroposons, BEL, Pseudoviridae). Named taxa: Lueckenbuesser (G), Osser (G), Endovir1-1, PREM 2, SIRE-1, Opie-2, ToRTL1, Art1, Tpv2-6, Evelknievel, Hopscotch, Retrofit, AtRE1, copia (I), X66399, Tst1, RIRE1, BARE-1, Sto 4, Tnt1-94, Tto1, Panzee, Tgmr, Ta1-3, Melmoth, 1731, Mosqcopia (I), Ty5-6p (F), Tca5 (F), Ty4 (F), Ta11, Tca2 (F), Ty1 (F). Taxa labeled only by numeric identifier: 8080198, 16648808, 14977057, 21307623, 2904626, 8783861.]

  14. Mapping proteases to HIV-1 structure helps explain patterns of conservation [Figure: element map labeled LTR, MA, CA, NC, PR, IN, RT, RH, LTR.]

  15. Integrase: what’s happening in the back? [Figure: element map above an alignment of integrase sequences, marking a proline-rich region and conserved H, C, D, and E residues. Pseudoviridae panel (column labels: +/- common region, ILGD motif present, GKGY): 1731, BARE-1, copia, Endovir1-1, Melmoth, Mosqcopia, Opie-2, Osser, Retrofit, Tnt1-94, Ty1, Ty5. (Meta/Retro)viridae panel (column labels: +/- chromodomain, chromodomain present, GPF/Y): Del, MMLV, SnRV, Tf1, Ty3-2, Athila5-1, gypsy, HIV1, osvaldo, RSV, WDSV.]

  16. Putative env gene is conserved across species

  17. Retroviruses independently evolved at least twice in plants [Figure: tree of Retroviridae, Metaviridae, and Pseudoviridae with the putative retroviruses marked; scale bar 0.1 changes. Taxa: HIV-1, Rousv, MoMLV, Ty5-6p, Evelknievel, Osser, Ty3, Gypsy, Hopscotch, Del1, Retrofit, Tst1, Reina, Cyclops, SIRE-1, Calypso, Endovir1-1, Fababean, Opie-2, Athila4-6, ToRTL1, Grande, Art1, Tat4-1, Ta1-3, Tnt1-94, Tto1, Cinful-1, copia, MAG, Ty1, SURL.]

  18. Retrovirus env-like coding regions show a bipartite structural organization [Figure: env-like ORFs of ToRTL1 (668 aa), Endovir1-1 (476 aa), and SIRE-1 (648 aa), annotated with pairwise identities of 31% and 24%, drawn above the HIV-1 genome map with its SU and TM env domains.]

  19. Gag surprises… • Gag is much larger in the retroviral lineage • Sequence and structural conservation is evident [Figure: Gag regions A, B, and C compared between the putative retrovirus group and the (Hemi/Pseudo)virus elements, drawn above an element map.]

  20. Diversity of the Pseudoviridae family summary • Enzymatic regions appear to be highly constrained other than the IN C-terminus. • Arabidopsis LTR retrotransposons are representative of plant elements in the family. • The putative retroviruses represent a uniquely evolving Pseudoviridae lineage bearing numerous changes in the retrotransposon genome. • Sub-lineage differences suggest areas to focus experimental efforts for functional studies. • Gag shows greater sequence conservation than previously thought.

  21. Summary continued… • envlike-coding regions have been evolutionarily conserved indicating a functional role for the ORF • features suggestive of viral env proteins have been identified in all LTR retrotransposon envlike ORFs • putative env proteins have evolved in at least two independent plant LTR retrotransposon lineages, giving credence to the hypothesis that retroviruses evolved from retrotransposons

  22. Organization of the retroelement populations of the Arabidopsis genome

  23. Do retroelements of higher eukaryotes choose where they integrate? • Is yeast a good model? • Multicellular organism genome projects have noted that transposable element numbers are markedly increased near centromeres. • This project quantitatively documents these anecdotal observations for the Arabidopsis genome

  24. Completed genome? [Figure: chromosome maps drawn against a megabase (MB) scale marked from 10 to 90; other labels: 2, 28.0, 3, 4, X.]

  25. RetroMap: a graphical tool for simplifying whole-genome analysis of retroelements

  26. RetroMap Features • RetroMap provides the following tools to work with genome data: • Parse BLAST results • Assign lineages or arbitrary groupings to retroelements • View chromosomal locations • Identify and extract LTRs • Identify and extract full-length elements • Assign ages to complete LTR retroelements • Extract sequence(s) for hits • Visualize hit open reading frames • Generate information about neighboring annotated features (Arabidopsis thaliana only) • Generate tab-delimited data files of retroelement information for direct import into statistical software packages

  27. Overview of how RetroMap generates retroelement data for a genome

  28. Starting eprobe sequences [Figure: tree of the probe sequences with a 0.1 scale bar and node values 946, 988, 861, 996, 1000, 1000. Taxa: WDSV, MMLV, SnRV, Cer1 Ce, Osvaldo Db, RSV, Athila At con, HIV1, Ty3 Sc, sushi Fr, PAT Pred, Tf1 Spom, Dirs1 Dd, TAtRL, ta11, Prt1 Pbla, L1 Hs, Roo Dm, R2 Dm, Mazi Dm, R1 Dm, BEL Dm, Jockey Dm, Pao Bm, SIRE1 Gm, Tca2 Ca, Ty5 Sp, Endovir1 1 At, Art1 At, copia Dm.]

  29. A. thaliana LTR retrotransposon genome overview [Figure: rooted trees for the Pseudoviridae and the Metaviridae, with scale bars of 0.1 and 0.2; Metaviridae lineages are labeled Tat, Metavirus, and Athila.]

  30. A. thaliana retroelements consist of retroposons and only two LTR families • Pseudoviridae elements are significantly shorter (p = 0.0001)

  31. Dating LTR retrotransposons • Relative ages can be estimated from the sequence divergence (genetic distance) of the LTRs, e.g. $T = \frac{d}{2k}$, where $d$ is the genetic distance, $d = 1 - (\%\,\text{identity} \div 100)$, and $k$ is the nucleotide substitution rate for the genome [Figure: element with gag and pol flanked by two LTRs, labeled "identical at time of insertion".]
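
The dating formula on this slide can be written out as a small calculation. The sketch below is illustrative only: the function names are invented here, and `EXAMPLE_K` is a placeholder substitution rate chosen for the example, not a value taken from the presentation.

```python
def ltr_divergence(percent_identity: float) -> float:
    """Genetic distance between the two LTRs: d = 1 - (% identity / 100)."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    return 1.0 - percent_identity / 100.0


def ltr_age(percent_identity: float, k: float) -> float:
    """Estimated time since insertion, T = d / (2k).

    The factor of 2 reflects that both LTRs accumulate substitutions
    independently after insertion. T is in the time unit of k.
    """
    if k <= 0:
        raise ValueError("substitution rate k must be positive")
    return ltr_divergence(percent_identity) / (2.0 * k)


# Placeholder rate for illustration only (substitutions per site per year).
EXAMPLE_K = 1.5e-8

if __name__ == "__main__":
    for identity in (100.0, 99.0, 95.0):
        print(f"{identity:5.1f}% identity -> {ltr_age(identity, EXAMPLE_K):,.0f} years")
```

Because T scales linearly with d, the ranking of elements by age does not depend on the value chosen for k; only the absolute ages do.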

  32. Pseudoviridae elements are younger than Metaviridae elements; the Athila sublineage is the oldest tested

  33. A. thaliana RT distributions

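  34. Going solo • Homologous recombination between the LTRs of a full-length element loops out and deletes the retroelement’s internal sequences, leaving a solo LTR in the host DNA [Figure: full-length element flanked by host DNA, and the resulting solo LTR flanked by host DNA.]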

  35. Where have they been?

  36. No family distribution is random • Metaviridae Athila and Tat are found preferentially inside heterochromatic regions; other groups are not • Pseudoviridae and retroposon distributions are not significantly different • Solo LTRs show the same distributions as full-length family members

  37. Hypotheses • Retroelement lineages show ‘universal’ organizational characteristics on the family level • General retroelement abundance at centromeres is due to reduced elimination…the ‘graveyard scenario’ • Metaviridae in Arabidopsis are targeted to heterochromatin

  38. Conclusions • Heterochromatic regions DO appear to act as graveyards, at least in the case of the Pseudoviridae (and presumably the retroposons) • Younger Pseudoviridae elements tend to be found outside of heterochromatin • Solo LTR distributions indicate that homologous recombination between LTRs is not greatly inhibited in heterochromatin • The Metaviridae lineages appear to use targeting in their interactions with the host genome

  39. Acknowledgements So many people helped make this research happen; I couldn’t have done it without their support and input. Special thanks go to the many members of the Voytas lab, past and present, undergrads too! I’ve been lucky to have good collaborators who are interesting and fun to work with. These have included Dr. Nettleton, Dr. Wright, Dr. Laten from Loyola University, and always Dr. Voytas. To the head honcho: no one can say it hasn’t been a crazy, crazy ride. Thanks. :o)

  40. Basic Hit Redundancy Elimination Scheme • Case 1: simple match, no overlap with the nearest hit, no compression • Case 2: overlapping hits; both hits are merged into one representing their combined maximum extent on the database sequence • Case 3: two non-overlapping hits that should be combined. The left hit checks its boundary position on its query sequence and determines whether the other hit falls within that range; if so, the hits are merged. The right hit repeats the procedure if the left hit did not indicate a merge • Case 4: an example of a merge case that may lead to false positives [Figure: cases 1 to 4 drawn against the query sequence.]
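
Cases 1 and 2 of this scheme amount to an interval merge on the database sequence. The sketch below is a minimal illustration of that idea, not RetroMap’s code: the `Hit` type and function name are invented here, coordinates are assumed to be inclusive, and the query-coordinate check of case 3 is left out.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) on the database sequence, inclusive.
Hit = Tuple[int, int]


def merge_overlapping_hits(hits: List[Hit]) -> List[Hit]:
    """Collapse overlapping hits into their combined maximum extent.

    Hits with no overlap (case 1) are passed through unchanged;
    overlapping hits (case 2) are replaced by a single merged hit.
    """
    merged: List[Hit] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous hit: extend it to the furthest end.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    print(merge_overlapping_hits([(400, 500), (100, 200), (150, 300)]))
```

Sorting first means each hit only needs to be compared with the most recently merged one, so the pass is linear after the sort.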

  41. BLAST false-positive amplification problem [Figure: LTR and RT hits recovered in BLAST round 1 and in BLAST round 2.]

  42. LTR prediction • Works only for hits of a sequence interior to LTRs • Blast2Sequences is used to detect repeats • 10 kb of sequence upstream and downstream are compared • Innermost matching repeats are taken to be the LTRs [Figure: genome sequence with a hit flanked by 10 kb windows that are compared with Blast2Sequences.]
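
The last step, choosing the innermost matching repeats, can be sketched as a selection over candidate repeat pairs. This is an illustration under stated assumptions: the repeat pairs are taken as already found by a separate comparison of the two flanks (Blast2Sequences in the presentation), "innermost" is interpreted here as the pair with the smallest gap between the upstream and downstream copies, and the `RepeatPair` type and `predict_ltrs` function are invented names.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of one matching repeat, in genome coordinates:
# (upstream_start, upstream_end, downstream_start, downstream_end)
RepeatPair = Tuple[int, int, int, int]


def predict_ltrs(
    hit_start: int,
    hit_end: int,
    repeat_pairs: List[RepeatPair],
    flank: int = 10_000,
) -> Optional[RepeatPair]:
    """Pick the innermost repeat pair flanking an internal hit.

    Only pairs with one copy inside the upstream flank window and the
    other inside the downstream flank window are considered.
    """
    candidates = [
        pair
        for pair in repeat_pairs
        if hit_start - flank <= pair[0]
        and pair[1] < hit_start
        and hit_end < pair[2]
        and pair[3] <= hit_end + flank
    ]
    if not candidates:
        return None
    # Innermost pair: smallest distance from upstream copy end
    # to downstream copy start.
    return min(candidates, key=lambda pair: pair[2] - pair[1])


if __name__ == "__main__":
    pairs = [(12_000, 12_500, 25_000, 25_500), (18_000, 18_400, 22_000, 22_400)]
    print(predict_ltrs(20_000, 21_000, pairs))
```

Taking the innermost pair is also what makes the approach sensitive to repeats inside the element itself, since any internal repeat pair that flanks the hit would be chosen ahead of the true LTRs.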

  43. LTR Identification Errors • Tandem elements • Nested elements • Degenerate or simple internal repeat elements [Figure: for each case, the hit or hits with their 10 kb flanking windows and the resulting predicted element; the third case is drawn with pA labels.]

  44. Sample distribution data Sample hit neighbors annotation data
