Somatic Activity Of Retrotransposons In Human Hani Ben Shmuel Project advisor: Dr. Erez Levanon
Scientific Background • Repetitive elements comprise 30–50% of mammalian genomes.
Transposable elements • Transposable elements (TEs) are discrete pieces of DNA that can move within genomes. • TEs can be separated into two major classes: DNA transposons and retrotransposons. • DNA transposons can excise themselves from the genome, move as DNA and insert themselves into new genomic sites. • Retrotransposons duplicate through RNA intermediates that are reverse transcribed and may integrate back into the genome.
Retrotransposons • Retrotransposons can be subdivided into two groups distinguished by the presence or absence of long terminal repeats (LTRs): • Non-LTR retrotransposons - LINE-1 (L1), Alu and SVA elements. • LTR retrotransposons - Human LTR elements are endogenous retroviruses (HERVs).
Non-LTR Retrotransposons • L1, Alu and SVA non-LTR retrotransposons collectively account for approximately one-third of the human genome, and are the only TEs currently active in humans. • Previous methods for identifying the activity of retrotransposons: • Comparison between human and chimpanzee genomes. • Genomic comparison between two different people. • Insertions which cause diseases (hemophilia). • Injecting an element into bacteria and detecting its expression.
SVA element • non-LTR retrotransposon. • ~3,000 copies in the human genome. • A typical full length SVA element is ~2 kb. • Made up of a short interspersed element (SINE) region, a variable number of tandem repeats (VNTR) region and an Alu-like region.
Effects on the human genome • Generating insertion mutations and genomic instability. • Altering gene expression. • Contributing to genetic innovation. • Increasing genome size. • Shaping the evolution of primate genomes in terms of both structure and function.
Project Goal Explore Retrotransposons’ somatic activity in the human genome.
Project Objectives • Building the infrastructure for the project (software, data, formats, choosing parameters etc). • Finding the most active element of each retrotransposon. • Comparing expression level of retrotransposons from brain vs. cell line. • Identifying recent insertion events of retrotransposons in human. • Finding evidence for DNA editing in retrotransposons.
Project Importance • Retrotransposon activity can cause dramatic changes in the human genome. Therefore, activity of these elements carries a risk of many somatic mutations. • Insertion events in protein-coding or regulatory regions can alter genome function and influence genome evolution.
Next Generation Sequencing • Revolutionary improvements in the cost and speed of data generation. • Shorter reads are produced. • Increased data generation. • Platforms shown: Helicos, Illumina.
The Project input data • RNA-seq (transcriptome) from 12 people. • 4 billion sequences (~0.5 TB). • Sequenced with SOLiD, Illumina, 454 and Helicos. • New data just arrived: • 5 billion sequences. • HiSeq 2000, Illumina. • Better quality.
Tools • Bowtie - An ultrafast, memory-efficient short read aligner. • UCSC - Genome browser website. • Perl - Scripting language.
Bowtie • Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. • Bowtie indexes the genome with a Burrows-Wheeler index. The idea: work hard once to create a compact version of the reference ('indexing') against which the short reads can be searched quickly.
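To make the indexing idea concrete, here is a minimal Python sketch of a Burrows-Wheeler index with exact-match backward search. It is a toy illustration of the general technique, not Bowtie's actual implementation: the function names (`bwt_index`, `count_occ`, `find`) are hypothetical, the suffix array is built by naive sorting, and mismatches and quality values are ignored.

```python
def bwt_index(text):
    """Build a toy Burrows-Wheeler index of `text` (illustrative only)."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Suffix array by naive sorting: fine for a demo, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # first[c]: number of characters in the text that sort before c.
    first = {}
    total = 0
    for c in sorted(set(text)):
        first[c] = total
        total += text.count(c)
    return sa, bwt, first


def count_occ(bwt, c, i):
    """Occurrences of c in bwt[:i]; a real index precomputes these counts."""
    return bwt[:i].count(c)


def find(read, index):
    """Return the sorted start positions of exact matches of `read`."""
    sa, bwt, first = index
    lo, hi = 0, len(bwt)  # half-open range of matching rows in the suffix array
    for c in reversed(read):  # extend the match one character to the left
        if c not in first:
            return []
        lo = first[c] + count_occ(bwt, c, lo)
        hi = first[c] + count_occ(bwt, c, hi)
        if lo >= hi:
            return []
    return sorted(sa[k] for k in range(lo, hi))


index = bwt_index("GATTACAGATTA")  # the expensive step, done once per reference
print(find("ATTA", index))         # each read is then a cheap lookup
```

The index is built once for the reference, and every read is matched by narrowing a range of rows, which is why the up-front work pays off when there are billions of reads.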
UCSC • The site contains the reference sequence and working draft assemblies for a large collection of genomes. • Tools for retrieving data associated with repeats.
Project steps (1) 1. Find the most active element of SVA: • Align reads to SVA sequences using Bowtie. • Adjust parameters. • Select the relevant reads. • Align them to the human genome. • Filter out reads with more than 3,000 alignments. • Find the maximal number of hits for each chromosome. • Select the position with maximal hits.
Initial Results (1) • Alignment to SVA dataset: • 24,427 reads (total: 58,578,322). • 10,969,837 alignments. • Alignment to human genome: • 24,388 reads (99.95%). • ~200 million alignments. • After filtering out reads with more than 3,000 alignments: 12,619,951 alignments.
Project steps (2) 2. Compare expression level of retrotransposons from brain vs. cell line: • Select parameters. • Align UHR (cell line) reads to SVA sequences. • Align brain reads to SVA sequences. • Filter the results. • Statistical calculations and graphs.
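The filtering and hit-counting steps above can be sketched as follows. This is a Python sketch under stated assumptions, not the project's own scripts (the slides name Perl): the input format (tuples of read ID, chromosome, position), the function name `most_hit_positions` and the `bin_size` window used to group nearby positions are all hypothetical.

```python
from collections import Counter


def most_hit_positions(alignments, max_alignments=3000, bin_size=2000):
    """Return {chrom: (bin_start, hits)} for the most-hit bin on each chromosome.

    alignments: iterable of (read_id, chrom, pos) tuples (assumed format).
    max_alignments: reads with more alignments than this are discarded.
    bin_size: window width in bp used to group positions (an assumption).
    """
    alignments = list(alignments)
    # Count how many times each read aligned anywhere in the genome.
    per_read = Counter(read_id for read_id, _, _ in alignments)
    # Count hits per (chromosome, bin), skipping overly repetitive reads.
    hits = Counter(
        (chrom, pos // bin_size * bin_size)
        for read_id, chrom, pos in alignments
        if per_read[read_id] <= max_alignments
    )
    # Keep the bin with the most hits for each chromosome.
    best = {}
    for (chrom, bin_start), n in hits.items():
        if chrom not in best or n > best[chrom][1]:
            best[chrom] = (bin_start, n)
    return best


def top_position(best):
    """Pick the chromosome and bin with the most hits overall."""
    chrom, (bin_start, n) = max(best.items(), key=lambda kv: kv[1][1])
    return chrom, bin_start, n
```

With real data the tuples would be parsed from the aligner's output; the threshold of 3,000 comes from the slide, while the 2,000 bp window is only a placeholder.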
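One common way to compare expression between two samples of different sequencing depth is to normalize the number of SVA-aligned reads to reads per million and then take a log2 ratio. The sketch below assumes that approach; the slides do not say which statistic was actually used, and the function names and the pseudocount are hypothetical.

```python
import math


def reads_per_million(mapped_reads, total_reads):
    """Normalize a read count by sequencing depth."""
    return mapped_reads * 1_000_000 / total_reads


def log2_fold_change(rpm_a, rpm_b, pseudocount=1.0):
    """log2 ratio of two normalized values; the pseudocount avoids division by zero."""
    return math.log2((rpm_a + pseudocount) / (rpm_b + pseudocount))


# Counts reported on the "Initial Results (1)" slide: SVA-aligned reads and total reads.
sva_rpm = reads_per_million(24_427, 58_578_322)
print(round(sva_rpm, 1))
```

A log2 fold change near zero between the brain and UHR values would be consistent with no difference in expression level; a proper significance test would additionally need replicates.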
Initial Results (2) • 100bp • 50bp
Summary • Building infrastructure for the project. • Using next generation sequencing as a research platform to identify the most active element of SVA in the human genome. • Indication that there is no significant difference between the expression levels of retrotransposons in brain and cell line.
What is next? • Running the alignments and scripts on the new data. • Getting results for other retrotransposons. • Identifying recent insertion events of retrotransposons in human. • Finding evidence for DNA editing in retrotransposons.
Thanks to: Dr. Erez Levanon