Describe the various elements associated with genetic variation and for each of these include, with examples, how they may be associated with genetic disease. Jayne Duncan, MRCPath Self Help Course 2010.
DNA Integrity • The DNA molecule is large, complex and fragile. • Integrity is constantly challenged by environmental factors such as chemicals and radiation. • Integrity can also be challenged by the conformations adopted by the DNA molecule itself. • DNA adopts the right-handed B form with Watson-Crick base pairing for the majority of the time. • At least 10 non-B DNA conformations can form transiently at specific sequence motifs as a result of the free energy derived from negative supercoiling.
DNA Supercoiling • DNA exists in cells as circular molecules or as topologically constrained loops. • When it is coiled upon itself it is supercoiled. • In vivo it is generally in a right-handed supercoiled form; this is termed negative supercoiling. • Negative supercoiling can be generated by factors such as transcription and protein binding. • The free energy derived can be used to allow DNA to adopt high-energy non-B DNA conformations. • Examples include: cruciforms, triplexes, tetraplexes, slipped strand (hairpin) structures and Z DNA. • These contribute to genetic instability.
Non-B DNA Conformations Taken from Bacolla et al 2004.
Cruciforms • Cruciforms form at inverted repeats. • The structure consists of two hairpin loop arms and a four-way junction similar to the Holliday junction recombination intermediate. • Two inverted repeats as short as 7bp can be sufficient for the formation of hairpin structures. • Cruciforms are thought to be responsible for non-random translocations and gross deletions.
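The idea that an inverted repeat with arms as short as 7bp can fold into hairpins can be illustrated with a small sequence scan. This is a minimal sketch, not a published method: the function names, the `max_loop` spacer limit and the example sequence are all assumptions made for illustration, and real cruciform prediction also depends on supercoiling and sequence context.

```python
# Sketch: find perfect inverted repeats (an arm followed, after a short
# spacer, by its own reverse complement). Names and defaults are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=7, max_loop=10):
    """Return (start, arm_sequence, loop_length) for each perfect inverted repeat.

    arm      -- arm length in bp (7 follows the minimum quoted on the slide)
    max_loop -- largest spacer allowed between the arms (an assumed cut-off)
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + arm + loop          # start of the candidate right arm
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, left, loop))
    return hits


# Toy sequence: GATTACA, a 3 bp spacer, then its reverse complement TGTAATC.
example = "CC" + "GATTACA" + "CCC" + "TGTAATC" + "CC"
hits = find_inverted_repeats(example)
```

Each hit marks a site where the two arms could pair within one strand to give a hairpin; a hit on both strands at the same position corresponds to the two-hairpin cruciform described above.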
Cruciforms and Human Disease • Chromosome 22q11 is characterised by four low copy repeat sequences (LCRs) (A-D) that encompass 350kb of DNA over a 3Mb region. • Sequence analysis of the constitutional breakpoints between chromosome 22 and chromosomes 1, 4, 11 and 17 shows the hallmarks of cruciform DNA. • For example, the breakpoints on LCR-B clustered in a narrow AT-rich region that, in conjunction with flanking NF-1 like sequence, is predicted to form a 600bp cruciform. • Clustering occurred within 15bp of the inverted repeat centre and involved the symmetric deletion of a few bases on either side of the inverted repeat apex. • This indicates that cruciform loops may be susceptible to double strand breaks and repair. • In the case of translocations this could take the form of Non-Homologous End Joining (NHEJ). • Translocation carriers are usually asymptomatic but come to attention because of chromosomally unbalanced offspring.
Cruciforms and Human Disease • Recombination events between the highly homologous sequences (97-98% identity) within LCRs A-D at 22q11 are associated with the recurrent deletions and duplications seen in DiGeorge syndrome and Velocardiofacial syndrome. • All patients have either a common recurrent 3Mb deletion or a less common 1.5Mb deletion. • Homologous recombination between non-allelic low copy flanking repeats may be triggered by the formation of cruciforms.
Triplex DNA (H-DNA) • Triplex DNA structures form at homopurine:homopyrimidine (R:Y) sequences with mirror symmetry. • A single stranded region binds the major groove of the underlying DNA duplex to form a three stranded helix. • Triplex DNA can be classified according to the orientation and composition of the third strand. • The third strand can be either pyrimidine rich and parallel to the purine strand of the duplex (favoured at acidic pH, which protonates cytosine) or purine rich and antiparallel to the purine strand (forms at neutral pH).
Triplex DNA and Human Disease • Homopurine:homopyrimidine (R:Y) sequences have been proposed to be involved in the regulation and expression of several genes. • An oligo R:Y tract with mirror repeat symmetry is present in the promoter of the c-myc proto-oncogene. • 80% of Burkitt’s lymphoma cases carry a t(8;14)(q24;q32) translocation resulting in the juxtaposition of the c-myc gene on chromosome 8 with IgH enhancer elements on chromosome 14, leading to c-myc mRNA overexpression and cancer. • Experiments carried out in vitro implicate triplex-induced double strand breaks in c-myc translocation and disease. • Vasquez et al showed that by using the endogenous triplex forming sequences found in the c-myc promoter, mutation frequencies in a reporter gene were increased 20-fold compared to background in mammalian COS-7 cells. • Triplex-induced double strand breaks were detected near the triplex locus and micro-homologies were also present at the breakpoints, indicating that repair proceeded by NHEJ.
Tetraplex (G-Quadruplex) DNA • This is a four stranded structure built from square co-planar arrays of 4 guanine bases, formed by a stretch of guanine rich DNA. • Each guanine acts as a donor and acceptor of hydrogen bonds. • Quadruplexes may be formed by one, two or four interacting strands and exist in a variety of conformations (either parallel or anti-parallel) depending on the polarity of the strands. • Quadruplexes are found in the immunoglobulin switch regions, telomeric DNA, poly (dG) runs and promoter regions. • They may be involved in the regulation of transcription, homologous recombination and telomere maintenance.
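Guanine-rich stretches that could fold into a single-stranded quadruplex can be flagged with a simple pattern search. The sketch below uses a commonly quoted heuristic (four runs of at least three guanines separated by loops of 1 to 7 bases); that pattern is an assumption brought in for illustration and does not come from the slides, and a match only indicates potential to fold, not that a quadruplex forms in vivo.

```python
import re

# Heuristic motif (assumed, not from the source): G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_putative_quadruplexes(seq):
    """Return (start, matched_sequence) for each non-overlapping motif match."""
    seq = seq.upper()
    return [(m.start(), m.group(0)) for m in G4_PATTERN.finditer(seq)]


# Four copies of the human telomeric repeat TTAGGG contain four G-runs.
telomeric = "TTAGGG" * 4
matches = find_putative_quadruplexes(telomeric)
```

Only the G-rich strand is scanned here; to cover both strands, the same function could be applied to the reverse complement.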
Tetraplex DNA and Human Disease • Fanconi Anaemia (FA) is characterised by bone marrow failure and a predisposition to cancer as a result of chromosome instability. • Some patients also show congenital abnormalities, growth and endocrine abnormalities, infertility and haematologic manifestations. • FA results from mutations in one or more of the 13 complementation genes. One of these genes, FANCJ, encodes a helicase that binds to the BRCT repeats of BRCA1. • Studies suggest that FANCJ deficient cells accumulate genomic deletions close to sequences that form G-quadruplex structures. • This is because during replication the fork stalls at G-quadruplex structures and, because FANCJ is not present to unwind them, the accumulation of stalled replication forks leads to genetic instability.
Telomeric Shortening and Human Disease • In humans telomeres are hexameric repetitive DNA sequences (TTAGGG in the leading strand and CCCTAA in the lagging strand) capped by specific proteins (the Shelterin complex). • The leading strand ends in a 3’ overhang, generated by processing of the lagging strand. • The overhang folds back into the telomeric DNA, invades the double helix and anneals with the C-rich strand, forming a T-loop that hides the ends of the DNA. • Telomeres are coated by the Shelterin complex of at least 6 proteins, which allows the cellular repair machinery to distinguish telomeres from double strand breaks. Taken from Calado et al 2009
Telomeric Shortening and Human Disease • Telomeres cannot be fully duplicated during cell division, leading to shortening. • To counter this, highly proliferative cells such as lymphocytes express telomerase, a reverse transcriptase (RT). • This enzyme uses TERC (Telomerase RNA component) as the template to extend the 3’ end of the leading strand by adding TTAGGG repeats. • The telomerase complex is composed of TERT (RT), TERC and the proteins NOP10, NHP2 and GAR1, which stabilise the complex. Taken from Calado et al 2009
Telomere Shortening and Human Disease • Dyskeratosis Congenita is a constitutional aplastic anaemia caused by excessive telomere shortening. • Characterised by bone marrow failure, mucocutaneous abnormalities (nail dystrophy, leukoplakia (white spots on mucosal membranes) and skin hyperpigmentation) and increased susceptibility to cancer. • X-linked form caused by mutations in DKC1, which encodes Dyskerin. • Autosomal forms caused by mutations in TERT, TERC, NHP2, NOP10 or TINF2. • Mutations dramatically reduce the ability of the telomerase complex to extend telomeres or lead to inappropriate capping (TINF2). • Leads to excessive telomere shortening and failure of stem cells to replicate, resulting in bone marrow failure. • In most patients telomere lengths in granulocytes and lymphocytes are 3-5kb as opposed to 7-8kb in a healthy 20 year old.
Left-Handed Z DNA • Left-handed helix with 12 base pairs per turn. • Forms at GT:AC repeat sequences, which account for more than 0.25% of the human genome. • Located preferentially in the 5’UTR of genes, suggesting that formation depends on the negative supercoiling associated with transcription. • Z DNA is also found at the chromosomal breakpoints in human tumours, suggesting that Z DNA causes genomic instability by inducing double strand breaks and large deletions. • Z DNA can also have a protective effect in Myotonic Dystrophy type 2 by preventing the formation of slipped strand structures and hence repeat expansion.
Z DNA and Human Disease • Myotonic dystrophy type 2 is caused by expansion of a tetranucleotide CCTG:CAGG repeat in intron 1 of the ZNF9 gene. • Clinical symptoms include: myotonia, proximal muscle weakness and atrophy, cardiac conduction defects and cataracts. • The tetranucleotide repeat forms slipped strand DNA structures in a length dependent fashion on reduplexing (e.g. following transcription). • The threshold for non-B DNA structure formation is between 36 and 42 repeats. • Z DNA forms in the TG:CA tract which lies 5’ of the tetranucleotide repeat. • This may cause a relaxation in negative supercoiling and hence have a protective effect against the formation of slipped strand DNA structures that can lead to pathogenic tetranucleotide repeat expansion.
Z DNA and Human Disease A. Protective effect for Z DNA against the formation of slipped strand DNA structures at the DM2 locus. 1. Representation of a region of the DM2 allele with the DNA organised in nucleosomes (shaded circles). Z DNA is denoted by light shading and the tetranucleotide repeat by darker shading. 2. Unrestrained supercoiling present in the DNA as a result of loss of nucleosomes. 3. Z DNA formation leads to relaxation of negative supercoiling which prevents supercoiling dependent formation of slipped strand DNA structures.
Z DNA and Human Disease • B. CCTG Tetranucleotide Repeat expansion present. • In longer CCTG:CAGG repeat tracts slipped strand DNA structures form easily. • Relaxation of negative supercoiling by Z DNA may not be sufficient to prevent slipped strand DNA structures from forming. • Structures form at a decreased level compared to a tract lacking a Z DNA forming sequence.
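The 36 to 42 repeat threshold quoted for slipped strand structure formation at the DM2 locus can be applied to a sequence by counting the longest uninterrupted CCTG run. This is a minimal sketch under stated assumptions: the function names and the three category labels are hypothetical, and the cut-offs are simply the range given on the slides, not diagnostic reference ranges.

```python
import re


def longest_repeat_run(seq, unit="CCTG"):
    """Return the number of units in the longest uninterrupted tandem run."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


def structure_threshold_status(n_repeats, low=36, high=42):
    """Place a repeat count relative to the 36-42 repeat range from the slides."""
    if n_repeats < low:
        return "below threshold range"
    if n_repeats <= high:
        return "within threshold range"
    return "above threshold range"


# Toy allele: 40 uninterrupted CCTG units with arbitrary flanking bases.
allele = "AA" + "CCTG" * 40 + "TT"
n = longest_repeat_run(allele)
status = structure_threshold_status(n)
```

Counting only the longest uninterrupted run is a deliberate simplification; an interrupted tract is scored by its longest pure segment.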
Slipped Strand DNA Structures • Slipped strand DNA structures are associated with repeat expansion disorders such as myotonic dystrophy types 1 and 2. • Formed by (CTG)n, (CGG)n and (CAG)n repeat stretches that can fold into hairpin structures comprising both Watson-Crick and non-Watson-Crick base pairs. • These contain different mismatches, contributing to their stability in the following order: CGG>CCG~CTG>CAG. • Denaturing and renaturing repeat-containing duplexes leads to the formation of unusually stable slip-stranded DNA molecules in which the loop-outs form hairpin structures. • These hairpins kinetically trap repetitive DNA in an otherwise unfavourable slip-stranded conformation.
Slipped Strand Structures and Human Disease • In all cases repeats are stably inherited until they exceed a threshold of approximately 100-200bp. • To help prevent expansion, triplet repeat runs typically contain stabilising interruptions that make the formation of unusual DNA structures less likely, but in the largest alleles these are lost, creating lengthy homogeneous runs. • The structural imperfections of mismatched bases in slip-stranded structures and the significant supercoiling required for their formation also help repeat lengths to remain stable. • Only when repeats become excessively long and lose the interruptions will they expand. • This can lead to anticipation, which is associated with triplet repeat expansion disorders.
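The threshold above is quoted in base pairs, while repeat alleles are usually reported as repeat counts, so a quick conversion helps relate the two. The helper names below are hypothetical and the calculation is plain arithmetic on the figures from the slides.

```python
def bp_to_repeat_units(length_bp, unit_len=3):
    """Whole repeat units that fit in a tract of the given length."""
    return length_bp // unit_len


def repeat_units_to_bp(n_units, unit_len=3):
    """Tract length in bp for a given number of repeat units."""
    return n_units * unit_len


# The 100-200 bp stability threshold expressed as triplet repeat counts.
triplet_low = bp_to_repeat_units(100)    # 100 // 3
triplet_high = bp_to_repeat_units(200)   # 200 // 3

# The 36-42 CCTG repeats quoted for DM2, expressed in bp (4 bp per unit).
dm2_low_bp = repeat_units_to_bp(36, unit_len=4)
dm2_high_bp = repeat_units_to_bp(42, unit_len=4)
```

On these numbers a 100-200bp threshold corresponds to roughly 33 to 66 triplet units, and the 36 to 42 CCTG repeats quoted for DM2 span 144 to 168bp, which falls inside the same window.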
Slipped Strand DNA Structures and Human Disease: Mechanism of Repeat Expansion, Lagging Strand Hypothesis • One study found that the stability of CTG repeats in bacterial plasmids undergoing DNA replication depended on their orientation relative to the origin of replication. • When expansion prone CTG repeats were situated in the lagging strand template, repeats frequently contracted. • When the CTG repeats were in the nascent lagging strand, expansions were detectable. • Hairpin structures formed by the repeats in the lagging strand template or in the nascent lagging strand caused contractions or expansions respectively.
Fork Stalling Hypothesis of Repeat Expansion. Repeat Contraction: • Leading strand DNA polymerase runs into an expandable repeat. • The single stranded part of the lagging strand therefore becomes repetitive and forms a stable secondary structure, stalling the leading strand polymerase and ultimately the replication fork. • Lagging strand synthesis could continue after skipping an Okazaki fragment. This would leave a gap in the nascent lagging DNA strand around the repeat. Repair of this gap would cause repeat contraction as the lagging strand polymerase skips the structure on its template. Repeat Expansion: • Replication stalling can cause fork reversal; a four-way junction forms and single stranded repeat extension occurs at the 3’ end of the leading strand. • This folds into a secondary structure. • To restart replication the reversed fork is flipped back and, if the repetitive structure holds, extra repeats will be added to the leading strand.
Repeat Expansion in Non-Dividing Cells • Mechanism of Expansion in Non-Dividing Cells: Flap Model of Repeat Expansion Hypothesis • Triplet repeat expansions such as that in DM1 are known to be somatically unstable in non-dividing tissues such as brain and skeletal muscle. • Oxygen radicals or other environmental agents generate nicks and small gaps in repetitive tracts. • In the process of DNA synthesis in gap repair the non-template DNA strand is displaced, forming a flap. • This flap is normally removed by FEN1 (flap endonuclease), which loads onto the 5’ end of the flap, migrates to its junction with the duplex and cleaves the flap. • If the flap contains a repetitive run it can form a hairpin structure and prevent FEN1 loading. • Slipped strand intermediates form and can be converted into expanded products.
Splicing and Human Disease • Splicing is the post transcriptional modification of RNA, during which introns are removed and exons are joined together. • Constitutive splicing includes exons that are essential and common to the various transcripts of a gene. • Alternative splicing allows the inclusion of additional exons, elongated or shortened exons, the retention of introns or the exclusion of exons (exon skipping). • Alternative splicing is often tissue specific or developmentally regulated.
Splicing and Human Disease • Changes in splicing regulatory elements can lead to exon skipping, intron retention, the creation of ectopic splice sites or activation of cryptic splice sites. • Aberrantly spliced transcripts may lead to protein isoforms with altered properties, or to the inclusion of exons with premature termination codons, which activate nonsense mediated decay. • Alternative splicing of the untranslated regions of transcripts may lead to changes in mRNA stability or localisation or control of translation.
Genetic Variation in cis Motifs and Human Disease • Mutations in cis-acting motifs such as exon splicing enhancers and repressors have been linked to disorders such as Spinal Muscular Atrophy (SMA). • Spinal Muscular Atrophy is caused by the homozygous loss of the Survival of Motor Neurone 1 (SMN1) gene located at 5q11.2-13.3. • SMN1 lies within the telomeric portion of a complex 500kb inverted duplication containing repetitive sequences, pseudogenes and retrotransposable elements. • SMN2 is a paralog that is almost identical to SMN1 and lies in the centromeric portion of the inverted duplication. • SMN2 has a C>T transition at the 6th nucleotide of exon 7. This leads to production of non-functional protein due to skipping of exon 7. • The Exon Splicing Enhancer model suggests that this change in SMN2 disrupts an Exon Splicing Enhancer (ESE) site bound by the SR protein SF2/ASF.
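Because SMN1 and SMN2 differ at the 6th nucleotide of exon 7 (C in SMN1, T in SMN2), an exon 7 sequence can be assigned to one paralog by inspecting that single position. The sketch below is illustrative only: the function name is hypothetical, the example sequences are synthetic placeholders rather than real exon 7 sequence, and it assumes the input starts at the first nucleotide of exon 7 on the coding strand.

```python
def classify_smn_exon7(exon7_seq):
    """Classify an exon 7 sequence by its 6th nucleotide (C or T).

    Assumes exon7_seq begins at the first base of exon 7 (coding strand).
    """
    if len(exon7_seq) < 6:
        raise ValueError("need at least the first 6 nucleotides of exon 7")
    base = exon7_seq.upper()[5]   # 6th nucleotide, zero-based index 5
    if base == "C":
        return "SMN1-like"
    if base == "T":
        return "SMN2-like"
    return "unknown"


# Synthetic placeholder sequences, NOT real SMN exon 7 sequence.
print(classify_smn_exon7("AAAAACAAAA"))
print(classify_smn_exon7("AAAAATAAAA"))
```

A check like this says nothing about copy number, which is what determines SMA status and modifies its severity.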
Genetic Variation affecting Trans-Splicing Factors and Human Disease • Trans-splicing factors are proteins that bind to cis-acting motifs. • Examples include the SR proteins, which bind to Exon Splicing Enhancer sequences, and heterogeneous ribonucleoproteins (hnRNP), which bind to Exon Specific Suppressor sequences. • Expression is often tissue specific and developmentally regulated. • Genetic variation that affects trans factors is demonstrated in Myotonic Dystrophy. • Myotonic dystrophy type 1 (DM1) is caused by a CTG trinucleotide repeat expansion in the 3’UTR of the DMPK gene. Expansion causes the generation of toxic RNA molecules which have trans-dominant effects. • Expanded RNA transcripts accumulate in nuclear foci and recruit CUG binding protein 1 (CUGBP1), possibly through the action of Muscle-blind protein, which binds double stranded RNA and is down regulated in DM1 cells. • Sequestering of these proteins leads to the disruption of developmentally regulated splicing events, leading to the production of fetal transcripts and the symptoms of DM1. • For example, retention of intron 2 and inclusion of exons 6b and 7a in the adult CLCN1 gene causes insertion of a premature termination codon and targets the transcript for nonsense mediated decay. • This results in decreased chloride channel conduction in striated muscle and the characteristic myotonia. Aberrant splicing also occurs in the insulin receptor and cardiac troponin T.
Genetic Variation affecting Alternative Splicing and Human Disease • Almost every human gene can produce different mRNA isoforms through alternative splicing. • Half of all alternative splicing events in ovarian and breast tissues are altered in tumours. • Sequence analysis has shown that binding sites (UGCAUG) for the RNA binding protein FOX2 are located downstream of one third of exons skipped in breast and ovarian cancer. • FOX2 itself is down-regulated in ovarian cancer and is alternatively spliced in breast cancer. • This suggests that decreased expression of FOX2 in cancer tissues modulates the aberrant alternative splicing pattern and controls proliferation. • Decreased expression of FOX2 in cancer tissues leads to a ‘signature’ of splice variants compared to normal tissue. • It has been suggested that tissue-specific alternative splicing reverts to a common default pattern in cancer, or that the change may reflect differences in the composition of tumour tissue compared to normal. • In both cases alternative splicing is proposed to aid malignant growth by encoding proteins such as those associated with actin filaments, myosin trafficking and microtubule binding.
References • Bacolla, A., Wells, R.D. (2009) Mol Carcin. 48. 273-285. • Bacolla, A., Wells, R.D. (2004) Non-B DNA conformations, genomic rearrangements and human disease. J. Biol Chem. 279 (46) 47411-47414. • Calado, R.T. (2009) Telomeres and Marrow Failure. Am Soc Haem. 338-343. • Edwards et al. (2009) A Z-DNA sequence reduces slipped-strand structure formation in the myotonic dystrophy type 2 (CCTG)(CAGG) repeat. PNAS. 106 (9) 3270-3275. • Jensen et al. (2009) Splicing, cis genetic variation and disease. Biochemical Society Transactions. 37 (6) 1311-1315. • Venables et al. (2009) Cancer-associated regulation of alternative splicing. Nature Structural & Molecular Biology. 16. 670-676. • Wells, R.D. (2006) Non-B DNA conformations, mutagenesis and disease. Trends in Biochem Sci. 32 (6) 271-278. • Wu, Y., Brosh, R.M. (2010) G-Quadruplex nucleic acids and human disease. FEBS J. 277. 3470-3488.