MOLECULAR MARKER TECHNOLOGIES. Training Workshop on Forest Biodiversity 5-16 June 2006. Lee Soon Leong Forest Research Institute Malaysia. Outlines. Organization and flow of genetic information Molecular techniques to reveal genetic variation Type of molecular markers
MOLECULAR MARKER TECHNOLOGIES Training Workshop on Forest Biodiversity 5-16 June 2006 Lee Soon Leong Forest Research Institute Malaysia
Outline
• Organization and flow of genetic information
• Molecular techniques to reveal genetic variation
• Types of molecular markers
• Which marker for what purpose
• Microsatellite markers
• Case study 1: using microsatellites to estimate gene flow via pollen
• Case study 2: using microsatellites for individual-specific DNA fingerprints
Deoxyribonucleic Acid (DNA): the molecule that encodes genetic information
• A pairs with T; C pairs with G
• A DNA molecule consists of two strands that wrap around each other to resemble a twisted ladder
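The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and reading the partner strand in reverse relies on the two strands being antiparallel, which the slide does not state explicitly.

```python
# Watson-Crick pairing rules from the slide: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIR[base] for base in strand)

def reverse_complement(strand):
    """Return the partner strand read in its own 5'->3' direction
    (assumes antiparallel strands)."""
    return complement(strand)[::-1]

example = "GATCCGAG"
print(complement(example))          # CTAGGCTC
print(reverse_complement(example))  # CTCGGATC
```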
Nuclear DNA: Diploid; biparentally inherited; recombination occurs. It can be viewed as a huge ocean of largely nongenic DNA, with some tens of thousands of genes and gene clusters scattered around like small islands and archipelagos. A high proportion of this apparently nonfunctional DNA consists of repeated motifs and may be considered junk DNA or selfish DNA.
Chloroplast DNA: Haploid; usually maternally inherited in angiosperms and paternally inherited in gymnosperms; typically 135 to 160 kb in size. It is packed with genes and thus resembles the streamlined configuration of its ancestral cyanobacterial genome.
Mitochondrial DNA: Haploid; typically maternally inherited; about 370 to 490 kb. About 10% of these sequences represent genes, and another 10 to 26% are made up of repetitive DNA, including retrotransposons. Thus, the majority of plant mtDNA sequences lack any obvious features of information.
• An organism's genomic DNA is subject to mutation as a result of normal cellular operations or interactions with the environment
The rate of mutation depends on:
• Biology of the organism
• Genome under consideration
• Type of mutation
• Mutations in genomic DNA can be classified into several categories:
Base substitution: GATCCGAGTATCGCAATTAGCA → GATCCGAGTGTCGCAATTAGCA
Deletion: GATCCGAGTATCGCAATTAGCA → GATCCGAGTAATTAGCA
Insertion: GATCCGAGTATCGCAATTAGCA → GATCCGAGTATCGCAGCATTAGCA
Duplication: GATCCGAGTATCGCAATTAGCA → GATCCGAGTATCTCGCAATTAGCA
Inversion: GATCCGAGTATCGCAATTAGCA → GATGCCAGTATCGCAATTAGCA
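The five categories can be reproduced as simple string operations on the slide's reference sequence. The sketch below is illustrative only: the helper names and the 0-based positions are assumptions worked out by comparing each pair of sequences on the slide, and the inversion is a plain reversal of the segment, as the slide's example shows, not a reverse complement.

```python
REF = "GATCCGAGTATCGCAATTAGCA"  # reference sequence from the slide

def substitute(seq, pos, base):
    """Replace the base at 0-based position pos."""
    return seq[:pos] + base + seq[pos + 1:]

def delete(seq, start, end):
    """Remove the bases in the half-open interval [start, end)."""
    return seq[:start] + seq[end:]

def insert(seq, pos, bases):
    """Insert bases before 0-based position pos."""
    return seq[:pos] + bases + seq[pos:]

def duplicate(seq, start, end):
    """Repeat the segment [start, end) immediately after itself."""
    return seq[:end] + seq[start:end] + seq[end:]

def invert(seq, start, end):
    """Reverse the segment [start, end) in place (simple reversal)."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]

mutants = {
    "base substitution": substitute(REF, 9, "G"),
    "deletion": delete(REF, 10, 15),
    "insertion": insert(REF, 15, "GC"),
    "duplication": duplicate(REF, 10, 12),
    "inversion": invert(REF, 3, 6),
}
for name, seq in mutants.items():
    print(f"{name:18s} {seq}")
```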
• Through long evolutionary accumulation, many different instances of the mutations mentioned above should exist in any given species
• The number and degree of the various types of mutations define the genetic diversity within a species
• It has been widely recognized that loss of genetic diversity is a major threat to the maintenance and adaptive potential of species
• Example: if genetic diversity is low, when a virulent form of a disease arises, many individuals may be susceptible and die
• But as a result of natural genetic diversity within local plant populations, there may be some individuals that are at least partially resistant and are thus able to survive and perpetuate the species
[Figure: with low genetic diversity, all individuals are susceptible (S) and all die; with high genetic diversity, resistant individuals (R) are scattered among susceptible ones (S) and the population is partially resistant]
• For many plant species, ex situ and in situ conservation strategies have been developed to safeguard the extant genetic diversity
• To manage this genetic diversity effectively, the ability to identify genetic diversity is indispensable
• In addition, for this variation to be useful, it must be heritable and discernible, either as recognizable phenotypic variation or as genetic mutation distinguishable through molecular marker technologies
Definition of molecular markers
• Mutation gives rise to genetic variation at the DNA level (DNA markers)
• Subsequently, genetic variation at the DNA level causes variation at the protein level (protein markers)
• A molecular marker is a sequence of DNA or protein that can be screened to reveal key attributes of its state or composition and can thus be used to reveal genetic variation
Four major molecular techniques are commonly applied to reveal genetic variation. These are:
• Polymerase chain reaction (PCR)
• Electrophoresis
• Hybridization
• DNA sequencing
POLYMERASE CHAIN REACTION
• PCR is a procedure used to amplify (make multiple copies of) a specific sequence of DNA
• The method was invented by Kary Banks Mullis in 1983, for which he received the Nobel Prize in Chemistry ten years later
• Each cycle consists of three temperature-controlled steps: denaturation, primer annealing and extension
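Because each PCR cycle can at most double the number of target molecules, amplification is roughly exponential in the number of cycles. The sketch below makes that arithmetic explicit. The function name and the `efficiency` parameter are assumptions for illustration, and real reactions plateau well before the idealised numbers shown here.

```python
def copies_after(cycles, start_copies=1, efficiency=1.0):
    """Expected copy number after a given number of PCR cycles.

    efficiency=1.0 means perfect doubling in every cycle, which is an
    idealised assumption; real reactions fall below this.
    """
    return start_copies * (1.0 + efficiency) ** cycles

print(copies_after(30))                  # 1073741824.0 (2**30) from one template
print(copies_after(30, efficiency=0.9))  # noticeably fewer copies at 90% efficiency
```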
ELECTROPHORESIS
• The term 'electrophoresis' literally means "to carry with electricity"
• A technique for separating the components of a mixture of charged molecules (proteins, DNAs, or RNAs) in an electric field within a gel or other support
• Migration rate depends on electrical charge and size
HYBRIDIZATION
• One of the most commonly used nucleic acid hybridization techniques is Southern blot hybridization
• Southern blotting was named after Edwin M. Southern, who developed the procedure at Edinburgh University in 1975
SEQUENCING
• The process of determining the order of the nucleotide bases along a DNA strand is called sequencing
• In 1977, 24 years after the discovery of the structure of DNA, two separate methods for sequencing DNA were developed: the chain termination method and the chemical degradation method
• Chain termination: chain elongation proceeds until, by chance, DNA polymerase inserts a dideoxynucleotide, blocking further elongation
• Principle: single-stranded DNA molecules that differ in length by just a single nucleotide can be separated from one another using polyacrylamide gel electrophoresis (PAGE)
Recent detection techniques
• TaqMan: a probe used to detect specific sequences in PCR products by employing the 5' to 3' exonuclease activity of Taq DNA polymerase
• Pyrosequencing: sequencing by synthesis, a simple-to-use technique for accurate analysis of DNA sequences
• Microarray technology: a high-throughput screening technique based on hybridization between oligonucleotide probes (genomic DNA or cDNA) and either DNA or mRNA
TYPES OF MOLECULAR MARKERS
• Due to rapid developments in the field of molecular genetics, a variety of molecular markers have emerged during the last few decades
Traditional marker systems
• Biochemical marker: Allozyme
• Non-PCR based marker: RFLP, Minisatellite (VNTR)
PCR generation (in vitro DNA amplification)
• PCR based marker: Microsatellite, RAPD, AFLP, CAPS (PCR-RFLP), ISSR, SSCP, SCAR, SNP, etc.
Allozyme (biochemical marker)
• The alternative forms of a particular protein, visualized on a gel as bands of different mobility
• Polymorphism arises when a mutation replaces an amino acid and thereby alters the net electric charge of the protein
• Techniques: electrophoresis and enzyme staining
RFLP (Non-PCR based marker)
• Targets variation in DNA restriction sites and in DNA restriction fragments
• Sequence variation affecting the occurrence (absence or presence) of endonuclease recognition sites is considered to be the main cause of length polymorphisms
• Techniques: electrophoresis and hybridization
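How the loss of a recognition site changes fragment lengths can be shown with a small in silico digest. This is a sketch under stated assumptions: the two allele sequences are invented for the example, the DNA is treated as linear, and the defaults model EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of the recognition site. Defaults model EcoRI (G^AATTC)."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical alleles: allele_b carries a point mutation (C -> T) that
# destroys the first recognition site.
allele_a = "ACGT" * 3 + "GAATTC" + "ACGT" * 5 + "GAATTC" + "ACGT" * 2
allele_b = allele_a.replace("GAATTC", "GAATTT", 1)

print(digest(allele_a))  # [13, 26, 13]
print(digest(allele_b))  # [39, 13]
```

The two alleles give different fragment patterns, which is what appears as a length polymorphism after electrophoresis and hybridization.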
RAPD (PCR-based marker)
• Uses primers of random sequence to amplify DNA fragments by PCR
• Polymorphisms are considered to be primarily due to variation in the primer annealing sites, but they can also be generated by length differences in the amplified sequence between primer annealing sites
• Techniques: PCR and electrophoresis
AFLP (PCR-based marker)
• A variant of RAPD
• Following restriction enzyme digestion of DNA, a subset of DNA fragments is selected for PCR amplification and visualization
• Techniques: PCR and electrophoresis
Microsatellite (PCR based marker)
• Targets tandem repeats of a short (1-6 base pairs) nucleotide motif
• Polymorphism is due to variation in the number of tandem repeats
• Techniques: PCR and electrophoresis
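A tandem repeat of a 1-6 bp motif can be located with a regular expression that uses a backreference. The sketch below is a minimal illustration and not a published SSR-mining tool: `find_ssrs` and its parameters are made-up names, the minimum of four repeats is an arbitrary threshold, and only perfect (uninterrupted) repeats are detected.

```python
import re

def find_ssrs(seq, min_repeats=4, max_motif=6):
    """Return (motif, repeat_count, start) for each perfect tandem repeat.

    The lazy quantifier makes the shortest possible motif win, so
    CTCTCTCT is reported as (CT)4 rather than (CTCT)2.
    """
    pattern = re.compile(
        rf"([ACGT]{{1,{max_motif}}}?)\1{{{min_repeats - 1},}}"
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((motif, len(m.group(0)) // len(motif), m.start()))
    return hits

print(find_ssrs("CTCTCTCTCTCT"))      # [('CT', 6, 0)]
print(find_ssrs("CTGCTGCTGCTG"))      # [('CTG', 4, 0)]
print(find_ssrs("ACTCACTCACTCACTC"))  # [('ACTC', 4, 0)]
```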
Other markers
• Cleaved Amplified Polymorphic Sequence (CAPS/PCR-RFLP)
• Inter Simple Sequence Repeat (ISSR)
• Single-Strand Conformation Polymorphism (SSCP)
• Sequence Characterized Amplified Region (SCAR)
More recent markers
• Single-Nucleotide Polymorphism (SNP)
• Retrotransposon-based markers: Sequence-Specific Amplified Polymorphism (S-SAP), Inter-Retrotransposon Amplified Polymorphism (IRAP), Retrotransposon-Microsatellite Amplified Polymorphism (REMAP), Retrotransposon-Based Insertional Polymorphism (RBIP)
Weising, K., Nybom, H., Wolff, K. and Kahl, G. 2005. DNA Fingerprinting in Plants: Principles, Methods, and Applications. 2nd Edition. CRC Press, Boca Raton, Florida, USA.
Spooner, D., van Treuren, R. and de Vicente, M.C. 2005. Molecular markers for genebank management. IPGRI Technical Bulletin No. 10. International Plant Genetic Resources Institute, Rome, Italy.
Henry, R.J. 2001. Plant Genotyping: The DNA Fingerprinting of Plants. CAB International Publishing, Wallingford, U.K.
Markers differ with respect to important features: • Genomic abundance • Polymorphism level • Locus specificity • Reproducibility • Technical requirements • Financial investment
Codominance or dominance
• Dominant marker: a marker that shows dominant inheritance, with homozygous dominant individuals indistinguishable from heterozygous individuals
• Codominant marker: a marker in which both alleles are expressed, so heterozygous individuals can be distinguished from either homozygous state
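The practical difference between the two marker classes can be sketched as two scoring functions applied to the same genotypes. The allele labels "A" and "a" and the function names are hypothetical, chosen only to illustrate what an observer can and cannot read from a gel.

```python
def score_codominant(genotype):
    """Codominant marker: both alleles are visible, so the genotype is read directly."""
    return "/".join(sorted(genotype))

def score_dominant(genotype, band_allele="A"):
    """Dominant marker: only presence or absence of the band is visible."""
    return "band present" if band_allele in genotype else "band absent"

for genotype in [("A", "A"), ("A", "a"), ("a", "a")]:
    print(genotype, score_codominant(genotype), score_dominant(genotype))
```

With the dominant marker, A/A and A/a both score as "band present", so heterozygotes cannot be told apart from dominant homozygotes; the codominant marker yields three distinct scores.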
None of the available techniques is superior to all others for a wide range of applications; the key question is rather which marker to use in which situation
• Intraspecific (among individuals) – markers target less conserved regions
• Interspecific (among species) – markers target more conserved regions
• Within and among population variation – Allozyme, SSR, AFLP and RAPD
• Mating system study – Allozyme or microsatellite
• Estimating gene flow via pollen and seed – Microsatellite (SSR)
• Phylogeography – cpSSR
• Clonal identification – AFLP or RAPD
• Polyploidy – multilocus dominant marker (AFLP)
• Genetic linkage mapping – AFLP, RAPD, Allozyme, RFLP, SSR, CAPS, SNP
• Phylogenetic study – regions conserved within species (DNA sequencing)
A framework for selecting appropriate techniques for plant genetic resources conservation is given in: Karp, A., Kresovich, S., Bhat, K.V., Ayad, W.G. and Hodgkin, T. 1997. Molecular Tools in Plant Genetic Resources Conservation: A Guide to the Technologies. IPGRI Technical Bulletin No. 2. International Plant Genetic Resources Institute, Rome, Italy.
Microsatellite marker
• What are microsatellites?
• Where are microsatellites found?
• How do microsatellites mutate?
• Abundance in genome
• Why do microsatellites exist?
• Models of mutation
• Development of microsatellite primers
• Genotyping procedure
• Advantages
• Disadvantages
• Applications
What are microsatellites?
• Tandemly repeated sequences with a repeat motif of 1-6 bp
• Dinucleotide (CT)6: CTCTCTCTCTCT
• Trinucleotide (CTG)4: CTGCTGCTGCTG
• Tetranucleotide (ACTC)4: ACTCACTCACTCACTC
• Synonymous with SSR (simple sequence repeat) and STR (short tandem repeat)
• Depending on the nature of the repeat tract, SSRs can be further divided into four categories [figure not reproduced]
Where are microsatellites found?
• The majority are in non-coding regions
How do microsatellites mutate?
• Mechanisms: DNA polymerase slippage; unequal crossing over
• Microsatellite alleles change rather quickly over time
• E. coli: 10⁻² events per locus per replication
• Drosophila: 6 × 10⁻⁶ events per locus per generation
• Human: 10⁻³ events per locus per generation
Abundance in genome
• Microsatellites have been found in every organism studied so far
• Most frequent in human > insect > plant > yeast > nematode
• Most common dinucleotide repeats:
• Human: CA/GT
• Conifer: GA/CT and CA/GT
• Dipterocarp: GA/CT
Why do microsatellites exist?
• The majority are found in non-coding regions and are thought to be under no selective pressure: are they "junk" DNA?
• They may regulate gene expression and protein function, e.g., human diseases such as fragile X syndrome and myotonic dystrophy are caused by expansions of polymorphic trinucleotide repeats in genes
• In plants, a high density of SSRs has been found in close proximity to coding regions, suggesting regulatory properties
• High level of polymorphism: a necessary source of genetic variation
Models of Mutation
• The mutation model is still unclear, but stepwise mutation appears to be the dominant force creating new alleles in the few model organisms studied to date
• Stepwise Mutation Model (SMM): when SSRs mutate, they gain or lose only one repeat
• Under the SMM, two alleles that differ by one repeat are more closely related than alleles that differ by many repeats
• Several statistics based on estimates of allele frequencies (e.g., Fst and Rst) rely explicitly on a mutation model
• Allele size matters when doing statistical tests of population substructuring
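The stepwise mutation model can be illustrated with a short simulation in which each mutation adds or removes exactly one repeat. Everything here is a sketch: the function names, the equal probability of gain and loss, and the floor of one repeat are assumptions made for the example. The squared difference in repeat number is shown because size-based statistics such as Rst are built on differences in allele size.

```python
import random

def simulate_smm(start_repeats, n_mutations, rng):
    """Apply n_mutations stepwise mutations to one allele.

    Each mutation gains or loses one repeat with equal probability
    (an assumption; real loci may be biased).
    """
    history = [start_repeats]
    for _ in range(n_mutations):
        step = rng.choice((-1, 1))
        # Keep at least one repeat so the locus does not vanish (assumption).
        history.append(max(1, history[-1] + step))
    return history

def squared_repeat_difference(a, b):
    """Squared difference in repeat number between two alleles."""
    return (a - b) ** 2

rng = random.Random(2006)  # fixed seed so the run is repeatable
print(simulate_smm(20, 10, rng))
print(squared_repeat_difference(20, 21), squared_repeat_difference(20, 26))  # 1 36
```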
Development of microsatellite primers
• Can be time-consuming and expensive. Microsatellites may be obtained by screening sequences in databases or by screening libraries of clones
• Standard method to isolate microsatellites from clones:
• Creation of a small-insert genomic library
• Library screening by hybridization
• DNA sequencing of positive clones
• Primer design and PCR analysis
• Identification of polymorphisms
• This approach can be extremely tedious and inefficient for species with low microsatellite frequencies
Alternative strategies to overcome this
• Selective hybridization using nylon membrane
• Selective hybridization using streptavidin-coated beads
• RAPD based
• Primer extension
Genotyping procedure
• PCR
• Electrophoresis: agarose, PAGE, denaturing PAGE, capillary
• Visualization: silver staining, SYBR Green staining, autoradiography, fluorescent dyes
• The use of fluorescently labeled primers, combined with automated electrophoresis systems, has greatly simplified the analysis of microsatellite allele sizes
[Figure: four loci (Locus 1 to Locus 4) amplified with Primer1 to Primer4]
• Stutter: numerous bands differing in size by 2 bp, caused by slippage of DNA polymerase
• Extra A: non-templated addition of an extra A to the 3' end of PCR products
• Example genotypes (allele sizes in bp): 120/120, 122/122, 120/122, 120/124, 120/126, 120/128
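Genotypes written as pairs of allele sizes, such as 120/126, can be interpreted programmatically. The sketch below is only illustrative: `describe` is a made-up helper, and the 2 bp motif length is an assumption taken from the 2 bp spacing of the stutter bands, i.e. a dinucleotide locus.

```python
def describe(genotype, motif_len=2):
    """Summarise a genotype given as 'size/size' in bp.

    motif_len=2 assumes a dinucleotide repeat locus.
    """
    a, b = sorted(int(size) for size in genotype.split("/"))
    return {
        "alleles": (a, b),
        "heterozygous": a != b,
        "repeat_difference": (b - a) // motif_len,
    }

for g in ["120/120", "122/122", "120/122", "120/124", "120/126", "120/128"]:
    print(g, describe(g))
```

For 120/126, for example, the two alleles differ by 6 bp, which corresponds to three repeat units at a dinucleotide locus.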
Advantages
• Low quantities of template DNA required (10-100 ng)
• High genomic abundance
• Random distribution throughout the genome
• High level of polymorphism
• Band profiles can be interpreted in terms of loci and alleles
• Codominance of alleles
• Allele sizes can be determined with an accuracy of 1 bp, allowing accurate comparison across different gels
• High reproducibility
• Different SSRs may be multiplexed in PCR or on gel
• Wide range of applications
• Amenable to automation
Disadvantages
• High development costs if primers are not yet available. Primers might be species specific
• Heterozygotes may be misclassified as homozygotes when null alleles occur due to mutation in the primer annealing sites
• Stutter bands on gels may complicate accurate scoring of polymorphisms
• The underlying mutation model (infinite alleles model or stepwise mutation model) is largely unknown
• Homoplasy due to different forward and backward mutations may cause genetic divergence to be underestimated
Applications
• Generally, their high mutation rate makes microsatellites informative and suitable for intraspecific studies but unsuitable for studies involving higher taxonomic levels
• Population genetics: investigations within a genus of centers of origin, genetic diversity, population structures and relationships among species
• Parentage analysis: seed orchard monitoring, mating systems and gene flow via pollen and seed
• Fingerprinting: clone confirmation and individual-specific fingerprints
• Genome mapping: constructing full coverage or QTL maps
• Comparative mapping: genome structure, framework maps, or transferring trait and marker data among species
Case study 1: Using microsatellites to estimate gene flow via pollen
Pollen flow distance? • Outcrossing rate? • Effective breeding unit?
Shorea parvifolia Shorea leprosula
Methodology
• Sample collection
• DNA extraction
• SSR development
• SSR analysis
• Data analysis
• Gene flow: exclusion and likelihood approaches
• Effective breeding unit: Nason et al. (1998)
• Model of pollen dispersal to obtain the maximum pollen flow distance
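The exclusion approach to pollen-mediated gene flow can be sketched as follows: given the known mother of a seed, the allele that must have come from the pollen parent is deduced at each locus, and any candidate tree that lacks it is excluded. This is a simplified illustration and not the analysis used in the case study. The locus names, tree labels and genotypes are invented, and real analyses also handle null alleles, genotyping errors and likelihood-based assignment.

```python
def paternal_alleles(mother, offspring):
    """Alleles the pollen parent could have contributed at one locus."""
    a, b = offspring
    options = set()
    if a in mother:   # a may be maternal, so b may be paternal
        options.add(b)
    if b in mother:   # b may be maternal, so a may be paternal
        options.add(a)
    return options

def is_excluded(candidate, mother, offspring):
    """True if the candidate lacks a compatible paternal allele at any locus.

    All three arguments map locus name -> (allele, allele).
    """
    for locus, off_genotype in offspring.items():
        allowed = paternal_alleles(mother[locus], off_genotype)
        if not allowed & set(candidate[locus]):
            return True
    return False

# Hypothetical genotypes (allele sizes in bp) at two loci.
mother = {"L1": (120, 124), "L2": (200, 204)}
seed = {"L1": (120, 126), "L2": (204, 208)}
candidates = {
    "T1": {"L1": (126, 128), "L2": (208, 210)},
    "T2": {"L1": (120, 122), "L2": (208, 208)},
    "T3": {"L1": (126, 126), "L2": (200, 202)},
}

non_excluded = [name for name, genotype in candidates.items()
                if not is_excluded(genotype, mother, seed)]
print(non_excluded)  # ['T1']
```

If exactly one candidate remains, the distance between it and the mother tree gives one pollen flow distance; if none remains, the pollen is inferred to have come from outside the sampled plot.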
Microsatellite Loci Lee, S.L. et al. 2004. Isolation and characterization of 21 microsatellite loci in an important tropical tree Shorea leprosula and their applicability to S. parvifolia. Molecular Ecology Notes 4: 222-225