Chromosome Structure and DNA Sequence Organization Timothy G. Standish, Ph. D.
Eukaryotes Have Large Complex Genomes • The human genome is ≈ 3 x 10^9 bp • 3 x 10^9 bp x 0.34 nm/bp x 1 m/10^9 nm ≈ 1 m • Because humans are diploid, each nucleus contains 6 x 10^9 bp or ≈ 2 m of DNA • That is a lot to pack into a little nucleus! • Eukaryotic DNA is highly packaged
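The length estimate above can be checked with a few lines of Python. This is a minimal sketch using only the figures quoted on the slide (3 x 10^9 bp per haploid genome, 0.34 nm per bp); the function and constant names are illustrative, not from the original.

```python
# Contour length of DNA per human nucleus, using the slide's figures.
BP_PER_HAPLOID_GENOME = 3e9   # base pairs (value quoted on the slide)
NM_PER_BP = 0.34              # nm per base pair (value quoted on the slide)
NM_PER_M = 1e9                # unit conversion

def dna_length_m(base_pairs: float) -> float:
    """Return the length in metres of a given number of base pairs."""
    return base_pairs * NM_PER_BP / NM_PER_M

haploid_m = dna_length_m(BP_PER_HAPLOID_GENOME)
diploid_m = dna_length_m(2 * BP_PER_HAPLOID_GENOME)
print(f"haploid: {haploid_m:.2f} m, diploid: {diploid_m:.2f} m")
```

The result is roughly 1 m per haploid genome and roughly 2 m per diploid nucleus, matching the rounded values on the slide.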
Eukaryotic DNA Must be Packaged • Eukaryotic DNA exhibits many levels of packaging • The fundamental unit is the nucleosome, DNA wound around histone proteins • Nucleosomes arrange themselves together to form higher and higher levels of packaging.
Nucleosomes • Nucleosome - Nucle - kernel, some - body • The lowest DNA packaging level • Can be thought of as a length of thread wound around a spool, the thread representing DNA and the spool being histone proteins
Nucleosome Structure • Approximately 200 bp of DNA: • Core DNA - 146 bp associated with the histone octamer • 19 bases complete the two turns around the histone octamer • Linker DNA - 8 to 114 bp linking nucleosomes together
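To give a sense of scale, the sketch below estimates how many nucleosomes one nucleus would need. It assumes, purely for illustration, that the whole diploid genome (6 x 10^9 bp, from the earlier slide) is packaged at a uniform repeat of about 200 bp per nucleosome; real linker lengths vary, as the 8 to 114 bp range above shows.

```python
# Rough nucleosome count per diploid nucleus.
# Assumption (not from the slides): every 200 bp of DNA forms one nucleosome.
BP_PER_NUCLEUS = 6 * 10**9      # diploid human genome, from the earlier slide
BP_PER_NUCLEOSOME = 200         # approximate repeat length quoted above

nucleosomes = BP_PER_NUCLEUS // BP_PER_NUCLEOSOME
print(f"about {nucleosomes:.1e} nucleosomes per nucleus")
```

Under that assumption a single nucleus holds on the order of 3 x 10^7 nucleosomes.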
The Histone Octamer • Four proteins: H2A, H2B, H3, and H4 • H3 and H4 are arginine rich and highly conserved • H2A and H2B are slightly enriched in lysine • Both arginine and lysine are basic amino acids, making the histone proteins both basic and positively charged • The octamer is made of two copies of each protein
The Fifth Histone, H1 • A fifth protein, H1, is part of the nucleosome, but seems to be outside the octamer • H1 varies between tissues and organisms and seems to stick to the 19 bases attached to the end of the core sequence • Ausio (2000) discusses data showing that, at least in fungi, survival is possible without H1 • Lack of H1 does not impact cell viability but shortens the lifespan of the organism • This raises the question of how H1 evolved in single-celled organisms
Packaging DNA [Figure: successive levels of DNA packaging. B DNA helix (2 nm) • DNA wound around histone octamers to form nucleosomes, “beads on a string” (11 nm), with histone H1 • Tight helical fiber (30 nm) • Looped domains attached to a protein scaffold (200 nm) • Metaphase chromosome (700 nm)]
Highly Packaged DNA Cannot be Expressed • The most highly packaged form of DNA is “heterochromatin” • Heterochromatin cannot be transcribed, therefore expression of genes is prevented • Constitutive heterochromatin - Permanently unexpressed DNA e.g. satellite DNA • Facultative heterochromatin - DNA that could be expressed if it was not packaged
Junk DNA • During the late 1960s papers began to appear that showed eukaryotic DNA contained large amounts of repetitive DNA that did not appear to code for proteins (e.g., Britten and Kohne, 1968). • By the early 1970s, the term Junk DNA had been coined to refer to this non-coding DNA (e.g., Ohno, 1972).
Evidence • Conservation of protein (and DNA) sequences is commonly interpreted to indicate functionality • Significant variation in non-coding DNA is evident between relatively closely related species and even within species (e.g., Zeyl and Green, 1992). • Mutation of some non-coding DNA does not produce significant changes in phenotype (Nei, 1987).
What is Junk DNA? • “Junk DNA” is DNA that does not code for proteins. This is the definition that we will use. • The meaning of “junk DNA” has become significantly more restricted in recent years as the functionality of much of what was once considered junk has become obvious. Most modern genetics texts avoid the term. Even when junk DNA is mentioned, it may be given significantly different definitions. For example, Lodish et al. (1995) called it “Extra DNA for which no function has been found.”
Types of Junk DNA • Nine different types of DNA were listed as junk DNA by Nowak (1994) • These nine types can be grouped into three larger groups: • Repetitive DNA sequences • Untranslated parts of RNA transcripts (pre-mRNA) • Other non-coding sequences
Repetitive DNA • Repeated sequences seem too short to code for proteins and are not known to be transcribed. • Five major classes of repetitive DNA: • Satellites - Up to 10^5 tandemly repeated short DNA sequences, concentrated in heterochromatin at the ends (telomeres) and centers (centromeres) of chromosomes. • Minisatellites - Similar to satellites, but found in clusters of fewer repeats, scattered throughout the genome • Microsatellites - Shorter still than minisatellites. • Short (300 bp) and Long (up to 7,000 bp) Interspersed Elements (SINEs and LINEs), the fourth and fifth classes - Units of DNA found distributed throughout the genome
Untranslated Parts of mRNA • Not all of the pre-mRNA transcribed from DNA actually codes for the protein. These non-coding parts are never translated. • Three non-coding parts of eukaryotic mRNA: • 5' untranslated region • Introns - Segments of DNA that are transcribed into RNA, but are removed from the RNA transcript before the RNA leaves the nucleus as mRNA • 3' untranslated region
A “Simple” Eukaryotic Gene [Figure: gene diagram drawn 5’ to 3’, with labels: Promoter/Control Region, Transcription Start Site, 5’ Untranslated Region, Exon 1, Int. 1, Exon 2, Int. 2, Exon 3, 3’ Untranslated Region, Terminator Sequence, and the RNA Transcript covering the exons and introns]
Other Non-coding Sequences • Pseudogenes - DNA that resembles functional genes, but is not known to produce functional proteins. Two types: • Unprocessed pseudogenes • Processed pseudogenes • Heterogeneous Nuclear RNA - A mixture of RNAs of varying lengths found in the nucleus. Approximately 25 % of the hnRNA is pre-mRNA that is being processed; the source and role of the remainder are unknown.
Problems With Junk DNA • Junk DNA makes up a significant portion of total genomic DNA in many eukaryotes. • 97 % of human DNA is “junk” • If this DNA is functionless, this phenomenon presents interpretation problems for both naturalism and intelligent design theory.
The Problem for ID • It is hard to imagine a designer creating so elegantly and efficiently at higher levels, but leaving a lot of junk at the DNA level. • This calls into question the intelligent design argument that organisms are so complex and efficient that they must be the result of design rather than the result of random events. • Darwinists have eagerly proclaimed junk DNA to be molecular debris left behind in the genome as organisms have changed over time - The pot shards of evolution.
Straw Gods • This argument is based on assumptions about the way the designer/God must be • God is God and He can create in any way He wants. If He wants to create organisms with lots of unnecessary DNA, then He can do that if He wants • In other words, God can’t be defined, then argued against on the basis of a faulty definition
Darwinists Jumped on the Data • Dawkins (1993) and Orgel and Crick proposed that successful genes are selfish in that they “care” only about perpetuation of their own sequence. Thus repetitive DNA represents successful selfish genes. • Brosius and Gould (1992) suggested nomenclature assuming junk DNA was once functional DNA, currently functionless, and is raw material for future functional genes. • Walter Gilbert and others (Gilbert and Glynias, 1993; Dorit and Gilbert, 1991; Dorit et al., 1990) suggested exons are the nuts and bolts of evolution while introns are the space between them. Thus, to make a functional protein, standard parts can be used, just as we use standard nuts, bolts and other parts to make a bridge or bicycle
The Problem for Darwinists • Darwinism predicts at least some degree of efficiency as natural selection should select against less “fit” or efficient members of a population. • Only the most efficient organisms would be expected to survive in a selective environment. The large amount of junk DNA in some eukaryotes’ genomes seems very inefficient. • One would think that a trend would be evident in organisms going from less to more efficient use of DNA. In fact, if junk DNA really is junk, then the trend is almost the opposite, with the most primitive organisms having the least junk DNA.
Changes in the Quantity of DNA • The amount of non-coding DNA can vary significantly between closely related organisms (e.g., salamanders), indicating that a change in the amount of non-coding DNA is an easy evolutionary step. • If change is easy, why are those with more than the average not less fit? • If DNA is junk, it would be an added burden, but the burden might not be significant, thus change would be neutral in terms of fitness
Do Changes in Junk DNA Quantity Impact Fitness? • Making DNA requires significant input of energy as dNTPs, along with production of enzymes to produce and maintain the DNA. Factor all that into the human average of 75 trillion cells with 6 x 10^9 bp/nucleus and the cost seems significant. • Unneeded DNA presents a danger to the cell. • Mutations could result in the production of junk RNA, wasting resources and potentially interfering with production of needed RNAs and consequently proteins. • Junk proteins could be made that would waste cell resources at best, or, at worst, may alter the activity of other proteins
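The scale of that cost can be sketched numerically. The calculation below multiplies the two figures on this slide (75 trillion cells, 6 x 10^9 bp per nucleus) and reuses the 0.34 nm/bp value from the opening slide. It assumes, as a simplification not made in the slides, that every cell carries one diploid nucleus.

```python
# Total DNA in a human body, from the slide's figures.
# Simplifying assumption: every cell has exactly one diploid nucleus.
CELLS_PER_HUMAN = 75 * 10**12   # 75 trillion cells (slide value)
BP_PER_NUCLEUS = 6 * 10**9      # diploid genome (slide value)
NM_PER_BP = 0.34                # from the opening slide

total_bp = CELLS_PER_HUMAN * BP_PER_NUCLEUS
total_length_m = total_bp * NM_PER_BP / 1e9   # nm -> m
print(f"{total_bp:.2e} bp, about {total_length_m:.2e} m of DNA")
```

That is about 4.5 x 10^23 base pairs in total, each of which has to be synthesized from dNTPs and then maintained.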
Non-coding DNA has a Significant Impact • Sessions and Larson (1987) showed that in salamanders larger amounts of genomic DNA correlate with slower development • Meagher and Costich (1996) showed significant negative correlation between junk DNA content and calyx diameter in S. latifolia • Petrov and Hartl (1998) have shown that, at least in Drosophila species, functionless DNA is rapidly lost
Evidence for Functionality in Non-coding DNA • As early as 1981 (Shulman et al, 1981) statistical methods were published for obtaining coding sequences out of the morass of noncoding DNA. • More recently neural networks have been used to locate protein coding regions (Uberbacher and Mural, 1991). • Searls (1992, 1997) suggested that DNA exhibits all the characteristics of a language, including a grammar. • Mantegna et al (1994) applied a method for studying languages (Zipf approach) to DNA sequences and suggested “noncoding regions of DNA may carry biological information.” (This has not gone unchallenged, see Konopka and Martindale, 1995.)
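A Zipf-style analysis ranks "words" by how often they occur and looks at how frequency falls off with rank. The sketch below shows only that general idea, applied to overlapping k-letter words in a DNA string; it is not a reconstruction of the procedure in Mantegna et al. (1994), and the word length k = 3 and both function names are assumptions chosen for illustration.

```python
from collections import Counter
import math

def rank_frequency(seq, k=3):
    """Count overlapping k-letter 'words' and return the counts, highest first."""
    words = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return sorted(words.values(), reverse=True)

def zipf_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    Needs at least two distinct words; a slope near -1 is the classic
    Zipf pattern seen in natural-language word counts.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

counts = rank_frequency("ACACACGTGTACACGGTTACAC", k=2)
print(counts, zipf_slope(counts))
```

On a real genome the comparison of interest would be between the slopes obtained for coding and non-coding regions, which is where the published claim and its criticism both focus.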
Roles of Non-coding DNA Expressed as RNA • Introns - May contain genes expressed independently of the exons they fall between. • Many introns code for small nucleolar RNAs (snoRNAs). These accumulate in the nucleolus, and may play a role in ribosome assembly. Thus the introns cut out of pre-mRNA may play a role in producing, or regulating production of, machinery to translate the mRNA’s code • 3' Untranslated Regions - Play an important role in regulating some genes (Wickens and Takayama, 1994). • Heterogeneous nuclear RNA - Only speculation is possible, but with the discovery of ribozymes and RNAi it is possible these RNAs are playing an important role
Roles of Non-coding DNA • Satellite DNA: • Attachment sites of spindle fibers during cell division • Telomeres protect the ends of chromosomes • Mini and Microsatellites - Defects are associated with some types of cancer, Huntington’s disease and fragile X syndrome • May serve as sites for homologous recombination with the Alu SINE • A and T boxes resembling A-rich microsatellites are found associated with the nuclear scaffold • The AGAT minisatellite has a demonstrated function in regulation
Conclusions • Less and less non-coding DNA looks like junk • Some classes of non-coding DNA remain problematic, particularly processed pseudogenes • Discovery of important functions for non-coding DNA calls into question any support the idea of junk DNA provides Darwinism • Proponents of ID must be cautious in accepting the interpretation put on data by Darwinists • Darwinists need to consider the predictions made by their own theory before interpreting data to discredit ID when the interpretation is equally problematic in the context of natural selection
The End
The Globin Gene Family [Figure: hemoglobin schematic with two α and two β globin subunits around iron (Fe)] • Globin genes code for the protein portion of hemoglobin • In adults, hemoglobin is made up of an iron containing heme molecule surrounded by 4 globin proteins: 2 α globins and 2 β globins • During development, different globin genes are expressed which alter the oxygen affinity of embryonic and fetal hemoglobin
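Model For Evolution Of The Globin Gene Family [Figure: an ancestral globin gene undergoes duplication and mutation to give α and β genes; transposition places them on chromosome 16 (α) and chromosome 11 (β); further rounds of duplication and mutation give ζ and α on chromosome 16 and ε, γ and β on chromosome 11, and then the present-day clusters. Gene labels on chromosome 16: ζ, ψζ, ψα2, ψα1, α2, α1, ψθ. Gene labels on chromosome 11: ε, Gγ, Aγ, ψβ, δ, β. Expression stages are marked Embryo, Fetus and Adult.] • Pseudogenes (ψ) resemble genes, but may lack introns and, along with other differences, typically have stop codons that come soon after the start codons.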
Eukaryotic mRNA [Figure: mature mRNA drawn 5’ to 3’, with labels: 5’ Cap (G), 5’ Untranslated Region, Protein Coding Region (Exon 1, Exon 2, Exon 3), 3’ Untranslated Region, 3’ Poly A Tail (AAAAA)] • RNA processing achieves three things: • Removal of introns • Addition of a 5’ cap • Addition of a 3’ tail • This signals the mRNA is ready to move out of the nucleus and may control its life span in the cytoplasm
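The three processing steps can be mimicked on a string as a toy model. In the sketch below the transcript, the intron coordinates, the single "G" standing in for the 5’ cap and the five-base tail are all made up for illustration (the last two simply echo the labels in the figure); real caps and poly-A tails are chemically more involved, and the function name is hypothetical.

```python
def process_pre_mrna(transcript, introns, tail_len=5):
    """Toy RNA processing: splice out introns, add a cap marker and a poly-A tail.

    introns is a list of (start, end) index pairs, end exclusive.
    """
    pieces, pos = [], 0
    for start, end in sorted(introns):
        pieces.append(transcript[pos:start])  # keep the exon before this intron
        pos = end                             # skip over the intron
    pieces.append(transcript[pos:])           # keep the final exon
    return "G" + "".join(pieces) + "A" * tail_len

# Made-up pre-mRNA: exons in upper case, introns in lower case for visibility.
pre_mrna = "AUGGCC" + "guaag" + "UUUCCC" + "guuag" + "GGGUAA"
mrna = process_pre_mrna(pre_mrna, introns=[(6, 11), (17, 22)])
print(mrna)
```

The printed string contains the cap marker, the three exons joined in order, and the tail, with both lower-case intron segments gone.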
“Junk” DNA • It is common for only a small portion of a eukaryotic cell’s DNA to code for proteins • In humans, only about 3 % of DNA actually codes for the roughly 100,000 proteins produced by human cells • Non-coding DNA was once called “junk” DNA as it was thought to be the molecular debris left over from the process of evolution • We now know that much non-coding DNA is involved in important functions like regulating expression and maintaining the integrity of chromosomes
Eukaryotes Have Large Complex Genomes • The human genome is about 3 x 10^9 base pairs or ≈ 1 m of DNA • That’s a lot more than a typical bacterial genome • E. coli has 4.3 x 10^6 bases in its genome • Because humans are diploid, each nucleus contains 6 x 10^9 base pairs or ≈ 2 m of DNA • That is a lot to pack into a little nucleus!
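The size gap between the two genomes is easy to quantify. This short sketch simply divides the two figures quoted on the slide; the variable names are illustrative.

```python
# Ratio of the human haploid genome to the E. coli genome (slide values).
HUMAN_BP = 3e9
E_COLI_BP = 4.3e6

ratio = HUMAN_BP / E_COLI_BP
print(f"the human genome is about {ratio:.0f} times larger")
```

On these numbers the human haploid genome is roughly 700 times the size of the E. coli genome.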
Only a Subset of Genes is Expressed at any Given Time • It takes lots of energy to express genes • Thus it would be wasteful to express all genes all the time • By differential expression of genes, cells can respond to changes in the environment • Differential expression allows cells to specialize in multicellular organisms. • Differential expression also allows organisms to develop over time.
Highly Packaged DNA Cannot be Expressed • The most highly packaged form of DNA is “heterochromatin” • Heterochromatin cannot be transcribed, therefore expression of genes is prevented • Chromosome puffs on some insect chromosomes illustrate where active gene expression is going on
Logical Expression Control Points • The logical place to control expression is before the gene is transcribed • Control points, listed in order of increasing cost: • DNA packaging • Transcription • RNA processing • mRNA Export • mRNA masking/unmasking and/or modification • mRNA degradation • Translation • Protein modification • Protein transport • Protein degradation
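Enhancers [Figure: double-stranded DNA drawn 5’ to 3’ with an Enhancer, a Promoter and a Transcribed Region; transcription factors (TF) and RNA polymerase (RNA Pol.) are shown on the DNA; the enhancer lies many bases from the promoter; the RNA is labeled at its 5’ end]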