BLAT: Molecular and Immunological Methods

BLAT: Molecular and Immunological Methods Lyle McMillen Contact: lylemcm@bigpond.net.au

Molecular and Immunological Methods • 2 basic approaches will be covered, both based on specific interactions found in vivo • Nucleic acid specificity (DNA and RNA binding) • Antibody recognition and interactions

Nucleic acid techniques • These techniques all depend upon the specific nature of nucleic acid interactions. • Namely, adenine forms 2 hydrogen bonds with thymine (DNA) or uracil (RNA), and guanine forms 3 hydrogen bonds with cytosine. • Purines hydrogen bond with pyrimidines. • So, A is complemented with T/U, while G is complemented with C. • Interactions outside of these specific pairings are not stable. • The specific nature of these pairings allows one strand of DNA or RNA to specify the nucleic acid sequence of a complementary strand.
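
The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it relies on the standard fact (not stated on the slide) that complementary strands run antiparallel, so the complement is reversed to read 5'->3'.

```python
# Watson-Crick pairing rules from the slide: A-T (or A-U in RNA) and G-C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement_strand(sequence: str, rna: bool = False) -> str:
    """Return the complementary strand, written 5'->3' (reverse complement)."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    # Complementary strands are antiparallel, so read the input backwards.
    return "".join(pairs[base] for base in reversed(sequence.upper()))


def hydrogen_bonds(sequence: str) -> int:
    """Count hydrogen bonds in the duplex: 3 per G-C pair, 2 per A-T/A-U pair."""
    return sum(3 if base in "GC" else 2 for base in sequence.upper())


print(complement_strand("ATGCCG"))
print(hydrogen_bonds("ATGCCG"))
```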

Base pairing

PCR – a quick review • In vivo, DNA is replicated by DNA polymerases and transcribed to RNA by RNA polymerases. These enzymes covalently bond single nucleotides (deoxynucleoside triphosphates dATP, dCTP, dGTP and dTTP for DNA; ribonucleoside triphosphates ATP, CTP, GTP and UTP for RNA) into a sequence complementary to the single stranded DNA template. • Each strand of DNA serves as a template for the synthesis of a second, complementary strand of DNA.

PCR – a quick review • The use of DNA polymerases allowed duplication of DNA when used in conjunction with a pair of primers complementary to the ends of the target DNA sequence. Unfortunately, the polymerase first used (isolated from E. coli) degraded rapidly at the high temperatures needed to denature the double stranded DNA produced and allow further DNA replication. • The discovery of a thermostable DNA polymerase in Thermus aquaticus allowed its inclusion in a series of repeated thermal cycles, in which the DNA was denatured to single strands, the primers annealed, and the Taq polymerase allowed to synthesise new complementary DNA. • Since the discovery of Taq DNA polymerase, a number of alternative thermostable DNA polymerases have been discovered or engineered to provide different characteristics and performance in PCR.

Taq polymerase • Taq polymerase activities: • Activity optimum at 75-80 ºC • 5’-3’ DNA polymerase (~100 bases/second) • No 3’-5’ exonuclease activity (i.e. no proofreading, and an error rate of about 1 in 9000 bases) • Low 5’-3’ exonuclease activity • Adds a single non-templated dA to the 3’ end of the product, creating a 3’-dA overhang Taq polymerase – 94 kDa monomer

PCR – a quick review

DNA sequencing – a variant PCR • As DNA polymerases synthesise a second strand of DNA complementary to the sequence of the template strand, dNTP’s are covalently linked to the growing polymer in a specific order. • A modified PCR reaction is used to determine the order in which these nucleotides are added to the DNA polymer – DNA sequencing. • Adding dideoxynucleotides (ddNTP’s, which lack the 3’-OH required for formation of the phosphodiester bond between 2 nucleotides) to the mix at a low concentration terminates extension of the DNA polymer at random points. This generates a series of fragments, each terminated at a random point in the DNA sequence.

DNA sequencing – a variant PCR

Key concept: Reporter molecules • DNA and RNA are fairly hard to see in a research environment, particularly in low concentrations. • A variety of reporter molecules, or labels, are used to make DNA/RNA easier to detect. • These fall into 2 broad categories: • Molecules which bind to nucleic acids and fluoresce (dyes) – used in agarose gels and some other applications. Examples: Ethidium bromide, GelRed, SYBR green • Modified nucleotides which carry an integral label and are incorporated into the DNA or RNA (labels). Examples: Radioisotope (35S or 32P) labelled dNTP’s, fluorescently tagged ddNTP’s

DNA sequencing – a variant PCR • Historically, radioactively labelled dATP was included in four separate sequencing mixes, each containing one of the four ddNTP’s (which terminate extension when incorporated) in low concentrations. This was the Sanger, or dideoxy terminator, method, published by Frederick Sanger and colleagues in the UK in 1977. • Each mix generates a population of radioactively labelled DNA’s of varying length, all of which start with the primer sequence. • These mixed populations could be separated on the basis of size (and therefore number of bases) by gel electrophoresis on a denaturing polyacrylamide-urea gel, and the different sized fragments visualised on an autoradiograph. • The terminal nucleotide of each fragment was determined by which ddNTP was included in that reaction.
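
The logic of reading a four-lane dideoxy gel can be sketched as follows. This is a toy model, not a simulation of the chemistry: the sequence "GATTACA", the 20-base primer length and both function names are hypothetical, and every possible termination point is assumed to be represented in the fragment population.

```python
def terminated_lengths(new_strand: str, dd_base: str, primer_len: int) -> list:
    """Fragment lengths expected in the mix containing one ddNTP.

    Extension can stop wherever dd_base is incorporated after the primer.
    """
    return [primer_len + i + 1 for i, base in enumerate(new_strand) if base == dd_base]


def read_gel(lanes: dict) -> str:
    """Read the four lanes together, from the shortest fragment to the longest."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)


synthesised = "GATTACA"  # hypothetical sequence extended from the primer
lanes = {base: terminated_lengths(synthesised, base, primer_len=20) for base in "ACGT"}
print(lanes)            # band positions (fragment lengths) in each lane
print(read_gel(lanes))  # sequence read off the gel
```

Each lane identifies where one base occurs; only by merging all four lanes in size order is the full sequence recovered, which is why the four datasets had to be integrated by hand.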

DNA sequencing – a variant PCR • A number of limitations arise from this technique. • 4 datasets per DNA fragment, which need to be integrated. • Data collected manually. • Short lengths of sequence data - generally 200-300 bases was as much as could be realistically achieved, although 500-800 bases were possible. • Radioisotopes present a hazard to researchers and a problem for waste disposal.

Reporter molecules – Fluorophores, fluorescent labels and dyes • A fluorophore is a portion of a molecule which causes that molecule to be fluorescent. It’s a functional group which absorbs a specific wavelength of light and re-emits the energy at a different, specific wavelength. • The wavelength absorbed is the excitation wavelength, while the wavelength emitted is the emission wavelength. • The wavelength shift is due to a loss of energy as heat, resulting in the emission of a longer wavelength (lower energy) photon. This is the Stokes shift. • Fluorescent labels include a fluorophore and bind specifically to a target nucleic acid sequence. • Fluorescent dyes bind to the target molecule type (e.g. all DNA, or all double stranded DNA), but binding is not dependent on the target sequence. Dyes also include a fluorophore functional group.

Reporter molecules – Fluorophores, fluorescent labels and dyes • Examples include: • Fluorescein and the derivative Fluorescein isothiocyanate: Excitation at 494 nm, emission at 521 nm. Fluorescent dye or fluorophore used in immunohistochemistry and Fluorescent In-Situ Hybridisation (FISH) • Ethidium Bromide (EtBr): A nucleic acid dye commonly used to stain DNA in electrophoresis. • SYBR green: A nucleic acid dye that fluoresces when intercalated in double-stranded DNA. Typically excited at one of three wavelengths (290 nm, 380 nm, and 497 nm), and emits at 520 nm. • Dichlororhodamine: A range of fluorophores with different emission spectra. Used to label ddNTP’s for sequencing. • 6-carboxyfluorescein (6-FAM): Fluorophore used to label oligonucleotides in real time PCR.
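
As a worked example of the Stokes shift, the sketch below converts the fluorescein excitation and emission wavelengths quoted above into photon energies using E = hc/λ. The function names are illustrative only; the physical constants are the standard SI values.

```python
PLANCK = 6.62607015e-34          # Planck constant, J s
LIGHT_SPEED = 2.99792458e8       # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19  # joules in one electronvolt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in electronvolts, from E = h * c / wavelength."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9) / JOULES_PER_EV


def stokes_shift(excitation_nm: float, emission_nm: float) -> tuple:
    """Return (shift in nm, energy lost per photon in eV)."""
    shift_nm = emission_nm - excitation_nm
    energy_lost = photon_energy_ev(excitation_nm) - photon_energy_ev(emission_nm)
    return shift_nm, energy_lost


shift_nm, energy_lost = stokes_shift(494, 521)  # fluorescein values from the slide
print(f"Stokes shift: {shift_nm} nm, energy lost per photon: {energy_lost:.3f} eV")
```

The emitted photon carries less energy than the absorbed one, which is why the emission wavelength is always the longer of the two.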

DNA sequencing – current technologies Dichlororhodamine dyes are used to label ddNTP’s in a dideoxy terminator reaction. Each ddNTP is labelled with a particular variant dye with a different emission wavelength (i.e. a different colour), so a single reaction generates random fragments, each labelled with the dye that corresponds to its terminal base. Dichlororhodamine dye

DNA sequencing – current technologies • These fragments can be separated on a gel or using capillary gel electrophoresis. Detection is via a laser filtered to the dye excitation wavelengths, with a corresponding emission wavelength filter to detect any fluorescence. • Generates a chromatographic trace of the four emission wavelengths (corresponding to the four labelled ddNTP’s).

DNA sequencing – current technologies This trace is easily interpreted, with each peak corresponding to the terminal base on the labelled DNA fragment.
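
A minimal sketch of how such a trace could be turned into a base sequence is shown below. It assumes the trace is available as one intensity list per dye channel, and calls a base wherever the strongest channel shows a local peak above a cut-off. The data, the `min_height` cut-off and the function name are all hypothetical; real base-calling software is considerably more sophisticated.

```python
def call_bases(trace: dict, min_height: float = 0.2) -> str:
    """Call one base per peak in a four-channel fluorescence trace.

    trace maps each base (dye channel) to intensities at successive scan positions.
    """
    n_points = len(next(iter(trace.values())))
    calls = []
    for i in range(1, n_points - 1):
        # Strongest channel at this scan position.
        base = max(trace, key=lambda b: trace[b][i])
        signal = trace[base]
        # A peak is higher than the point before and not lower than the point after.
        if signal[i] >= min_height and signal[i - 1] < signal[i] >= signal[i + 1]:
            calls.append(base)
    return "".join(calls)


example_trace = {  # hypothetical, idealised intensities
    "A": [0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "C": [0.0, 0.0, 0.0, 0.0, 0.1, 0.8, 0.1, 0.0, 0.0, 0.0, 0.0],
    "G": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.7, 0.1, 0.0],
    "T": [0.0] * 11,
}
print(call_bases(example_trace))
```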

DNA sequencing – current technologies • This technology presents a number of advantages compared to radioisotope labelling approaches: • Single tube reaction vs 4 reactions/sample. • Automated data collection, into a single data set vs manual data collection, collating 4 data sets. • Generally able to read 800-1200 bases/reaction vs 200-300/reaction. • No significant hazardous waste vs radioisotope waste. A number of high throughput sequencing technologies are being developed, with the goal of sequencing millions of bases very rapidly.

PCR – end-point analysis • Conventional PCR is typically analysed by electrophoresis and visualisation of the amplicon (PCR product) on an agarose gel. Visualisation is achieved through the use of a fluorescent dye such as ethidium bromide. • This occurs at the end of the PCR reaction. This is an end-point analysis.

PCR – end-point analysis

PCR kinetics • There are three distinct phases during a PCR reaction. • Exponential phase – exact doubling of product every cycle (assuming 100% efficiency). Very specific and precise. • Linear phase – highly variable. Reaction components are starting to be consumed, products are starting to degrade, and the reaction is slowing. The extent of slowing will vary from replicate to replicate. • Plateau/end-point – the reaction has stopped, and no more product is being made. Product may begin to degrade. Final yield will vary significantly between replicates.
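
The exponential phase can be written as N = N0 × (1 + E)^n, where N0 is the starting copy number, E the efficiency (1.0 for perfect doubling) and n the cycle number. The short sketch below evaluates this; the function name and example numbers are illustrative only, and the model deliberately ignores the linear and plateau phases.

```python
def amplicon_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after a number of cycles, valid only during the exponential phase.

    efficiency = 1.0 means exact doubling each cycle (100% efficiency).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


for eff in (1.0, 0.9, 0.8):
    copies = amplicon_copies(100, 25, eff)
    print(f"efficiency {eff:.0%}: {copies:.3g} copies after 25 cycles")
```

Small differences in efficiency compound over many cycles, which is one reason the final yield is a poor measure of the starting template.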

PCR kinetics

PCR kinetics • So, conventional PCR (via end-point analysis) is not an accurate way to quantitate the PCR template. It is also limited in its ability to distinguish different yields of amplicon by staining. • It would be preferable to measure the accumulation of amplicon during the exponential phase, when the rate limiting factors are the amount of template and the efficiency of amplification.

Real time PCR • Also called quantitative or kinetic PCR (but not RT-PCR, which is Reverse Transcriptase PCR). • Adds a reporter molecule to a PCR reaction, allowing detection of the amplicon through the course of the PCR. This is the most important difference to conventional PCR methodologies. These reporter molecules are attached to primers, oligonucleotide probes, or the amplicon, conferring fluorescent potential on these molecules. • Reporter molecules are fluorescent molecules, and are detected using a fluorescent spectrophotometer in the real time PCR platform. • Two broad categories of reporter molecule – they interact either specifically (labels) or non-specifically (dyes) with the amplicon’s nucleotide sequence. • Quantitative analysis is based on detection of the amplicon during the exponential phase of the PCR. • Data is presented as the thermal cycle at which the level of fluorescence reaches an arbitrary threshold, set within the exponential phase of the PCR. This is referred to as the CT value.
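
One way to extract a CT value from per-cycle fluorescence readings is sketched below. The linear interpolation between cycles, which gives a fractional CT, is an assumption made for this example; the function name and readings are hypothetical, and real platforms typically subtract a baseline first.

```python
from typing import Optional, Sequence


def cycle_threshold(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Return the (fractional) cycle at which fluorescence first crosses the threshold.

    fluorescence[i] is the reading taken at the end of cycle i + 1.
    Returns None if the threshold is never crossed between two readings.
    """
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev belongs to cycle i, curr to cycle i + 1; interpolate between them.
            return i + (threshold - prev) / (curr - prev)
    return None


readings = [1, 2, 4, 8, 16, 32]  # hypothetical readings, doubling each cycle
print(cycle_threshold(readings, threshold=12))
```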

Real time PCR • So, how does it work? • Two commonly used approaches. • Double stranded DNA detection • This approach utilises a fluorescent dye which binds specifically to double stranded DNA (an intercalating agent) – SYBR green, and later derivatives such as SYBR greener, LC green 1, SYTO 9, EVA Green. • The PCR proceeds as normal, and the dye intercalates into the double stranded amplicon. • The more amplicon is produced, the more dye is intercalated. • As these dyes intercalate, their emission intensity increases (over 100-fold for SYBR green), due to conformational changes on binding. • It is worth noting that SYBR green inhibits PCR, and is therefore used at extremely low concentrations. There are saturation dyes available that are not inhibitory, and can be used at higher concentrations, giving stronger fluorescence.

Real time PCR

Real time PCR • The second major approach utilises hydrolysis of a specific oligonucleotide containing a fluorescent label – often called the TaqMan method, but also called 5’ nuclease, Taq nuclease or dual-labelled probes. • Taq polymerase has a 5’-3’ exonuclease activity, which hydrolyses DNA annealed to the template in the path of the newly synthesised strand. • The oligonucleotide probe contains 2 functional groups: a 5’ fluorophore, and a 3’ quencher, which is either a second fluorophore (e.g. TAMRA) or a non-fluorescent quencher (NFQ). Energy generated by the excitation of the 5’ fluorophore is captured by the 3’ quencher, and emitted as fluorescence (second fluorophore) or heat (NFQ). If a second fluorophore is the quencher, its emission wavelength is different to that of the 5’ fluorophore. This process is called Fluorescence Resonance Energy Transfer, or FRET. • The probe anneals to the target region specifically. As the Taq polymerase synthesises DNA, it hydrolyses the probe. Cleavage of the 5’ fluorophore from the rest of the probe separates it from the quencher and enables it to emit fluorescence, which can be detected. • The level of fluorescence detected is proportional to the amount of probe hydrolysis, and therefore the amount of amplicon synthesised.

Real time PCR

Real time PCR – alternative probe strategies • There are a number of other probe strategies available, many of which are patented. • Hybridisation probing entails using two probes, each labelled with a different fluorophore (typically 6-FAM and a red fluorophore). • These probes hybridise within 1-5 bases of each other on the amplicon. • Excitation of the first fluorophore allows the excitation of the second fluorophore via FRET. • This leads to fluorescence at the second fluorophore’s emission wavelength (610, 640, 670 or 705 nm, depending on fluorophore), while exciting at the wavelength of the first (470 nm). • Detection occurs at the end of the annealing step. • Once detection is complete, an increase in temperature triggers DNA polymerase activity, displacing the probes and amplifying the target region. • Note that when not hybridised, the first fluorophore will emit fluorescence at its own emission wavelength (530 nm for 6-FAM).

Hybridisation probes [Figure: panels labelled Annealing, Denaturation, Extension, Completion]

Real time PCR platforms • The real time PCR platform consists of a few basic elements • Thermal cycler – basically a PCR machine, usually capable of rapid and precise variations in temperature (usually between 15 and 99 ºC). • Excitation wavelength emitter, capable of transmitting the excitation wavelength of the fluorescent reporter to each sample. • Emission detector, capable of precise quantitation of the amount of fluorescence being emitted by the sample at the fluorophore’s emission wavelength. • Data recorder, recording the fluorescence from each sample at the end point of each thermal cycle (end of extension step).

Real time PCR platforms • There are a few major types of real time PCR platform from a range of suppliers, but they all perform the same function. • Most are capable of managing multiple fluorophores simultaneously, allowing multiple amplicons to be probed in a multiplex assay (we’ll discuss this in more detail later). • All are also associated with sophisticated data management and analysis software, which makes data analysis easy, reliable and reproducible. • Raw data integrity is always protected – important for clinical and diagnostic applications.

Data output

Real time PCR applications • The most obvious application of real time PCR is for detection and quantitation of a specific DNA sequence. • May also be used for monitoring changes in gene expression, genotyping, or detection of genetic variations such as single nucleotide polymorphisms (SNP’s).

Quantitative real time PCR • Real time PCR data is presented as CT (Cycle threshold) values, defined as the thermal cycle at which the fluorescence reaches an arbitrary threshold. • If a series of samples with known concentrations of initial template DNA is included in the assay, a linear plot of CT vs log [initial template] may be generated. • These standards can be a known number of cells, a defined number of copies of a plasmid, or any other defined, quantifiable and reproducible number of target templates. • This plot permits linear regression analysis, allowing the calculation of the copy number of any unknown target relative to the standards. • The plot also indicates amplification efficiency (slope) and some indication of sensitivity (y-intercept).
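
The standard curve analysis can be sketched with an ordinary least-squares fit of CT against log10 of the initial template. The standards below are hypothetical values, and the function names are illustrative. The efficiency relation E = 10^(-1/slope) - 1 follows from the exponential model N = N0 × (1 + E)^n, so a slope of about -3.32 corresponds to 100% efficiency.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency_from_slope(slope: float) -> float:
    """Amplification efficiency (1.0 = 100%) implied by the standard curve slope."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Estimate the initial copy number of an unknown from its CT value."""
    return 10 ** ((ct - intercept) / slope)


standards = [1e2, 1e3, 1e4, 1e5]       # hypothetical plasmid copy numbers
standard_cts = [33.3, 30.0, 26.7, 23.4]  # hypothetical CT values
slope, intercept = fit_standard_curve(standards, standard_cts)
print(f"slope {slope:.2f}, intercept {intercept:.1f}")
print(f"efficiency {efficiency_from_slope(slope):.1%}")
print(f"unknown with CT 28.0: {copies_from_ct(28.0, slope, intercept):.0f} copies")
```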

Quantitative real time PCR

Quantitative real time PCR • There are 4 basic assumptions underlying quantitation by real time PCR: • 1. The initial template is double stranded. When analysing RNA, reverse transcriptase produces single stranded cDNA, which is made double stranded in the first amplification cycle, so true amplification begins in cycle 2. • 2. PCR efficiency is 100%, and both strands of all templates are copied into full length copies each cycle. In practice this never happens, due to inefficient primer hybridisation, template folding, and probe and dye interference. • 3. PCR efficiency is constant throughout the amplification process. Secondary structures may inhibit amplification from long templates such as genomic DNA, or from supercoiled plasmids and mitochondrial and bacterial genomes, so compare with standards based on the same starting material. • 4. Fluorescence is proportional to the amount of template. This depends on the dye used, the sequence amplified, the length of the amplicon, the optical properties of the platform, data acquisition and instrument settings.

Gene expression analysis • One application of quantitative real time PCR is analysis of gene expression in different tissues or under different treatment regimes. • mRNA expression from the gene of interest is quantitated from each sample using a real time RT-PCR. • These are real time PCR’s performed on cDNA, generated from RNA (extracted from the target tissue) by reverse transcriptase. The reverse transcriptase primer can be the same as one of the real time PCR primers, or be just outside the real time PCR amplicon. Only one primer is needed for the reverse transcription step. • Reverse transcriptase efficiency is a significant contributor to the variability observed in real time RT-PCR, and needs to be taken into account when developing any real time RT-PCR method. • Samples are normalised to the level of expression of a house-keeping gene. These are genes that are expressed at constant levels in each cell, and are thought to be involved in routine cellular metabolism, e.g. glyceraldehyde-3-phosphate dehydrogenase (G3PDH or GAPDH), beta actin, some ribosomal proteins. • This allows proportional comparison of target mRNA levels between samples.
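
One widely used way to do this normalisation is the comparative CT calculation sketched below. It is not named on the slide, so treat it as an assumption: it supposes the target and house-keeping genes amplify with the same efficiency, and all CT values and names are hypothetical.

```python
def relative_expression(
    target_ct_sample: float,
    reference_ct_sample: float,
    target_ct_control: float,
    reference_ct_control: float,
    efficiency: float = 1.0,
) -> float:
    """Fold change of the target gene in a sample relative to a control.

    Each target CT is first normalised to the house-keeping (reference) gene CT
    from the same sample; efficiency = 1.0 assumes perfect doubling per cycle.
    """
    delta_sample = target_ct_sample - reference_ct_sample
    delta_control = target_ct_control - reference_ct_control
    return (1.0 + efficiency) ** -(delta_sample - delta_control)


# Hypothetical CT values: the target crosses the threshold 2 cycles earlier in
# the treated sample, while the house-keeping gene is unchanged.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(f"fold change: {fold}")
```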

Genotyping • A range of real time PCR methods can be used to determine the genotype of a target amplicon. • The simplest approach relies upon determining the melting temperature of the amplicon using a melting curve. • The real time PCR is performed as normal, incorporating a non-hydrolysed probe or dye – typically performed with SYBR Green or a saturation dye such as SYTO 9 or LC Green 1. • Once the amplification program is complete (and quantitation data collected), the samples are heated through a gradient, with fluorescence data gathered at set temperature intervals (typically every 1 ºC, but can be as often as every 0.2 ºC in high resolution equipment). The gradient is typically from 50 ºC to 95 ºC, but can be refined to a narrower range. • As the temperature increases, the amplicon will denature, “unzipping” from double stranded to single stranded. The fluorescence of the dyes will decrease as more of the amplicon denatures. • The temperature at which this decrease in fluorescence is at its fastest is called the melting temperature (TM), and varies with the G+C% and sequence of the amplicon. It is determined by plotting the rate of decrease in fluorescence with temperature (-dF/dT) against temperature; the TM is the peak of this plot. • Different genotypes of the amplicon will have different TM’s. TM analysis is also used to ensure that the desired target has been amplified.
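
The TM determination can be sketched by taking finite differences of the melt data and locating the steepest drop. The readings and the function name below are hypothetical, and reporting the midpoint of the steepest interval is a simplification; instrument software usually smooths the curve before differentiating.

```python
def melting_temperature(temps, fluorescence):
    """Estimate TM as the temperature where -dF/dT is largest.

    temps and fluorescence are parallel lists from the melt gradient. The
    returned value is the midpoint of the interval with the steepest drop.
    """
    best_rate, best_temp = None, None
    for i in range(1, len(temps)):
        rate = -(fluorescence[i] - fluorescence[i - 1]) / (temps[i] - temps[i - 1])
        midpoint = (temps[i] + temps[i - 1]) / 2
        if best_rate is None or rate > best_rate:
            best_rate, best_temp = rate, midpoint
    return best_temp


melt_temps = [80, 81, 82, 83, 84, 85]    # ºC, hypothetical 1 ºC steps
melt_signal = [100, 98, 90, 50, 12, 10]  # hypothetical fluorescence readings
print(melting_temperature(melt_temps, melt_signal))
```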

Genotyping

SNP analysis – a specific form of genotyping • Single nucleotide polymorphisms (SNP’s) are the most common form of genetic variation. • SNP’s are a single base variation at a specific locus within a gene, usually consisting of two alleles. The rare allele is generally present in >1% of the population. • The SNP may be in the coding sequence, non-coding region, or intergenic regions, and can have impacts on polypeptide sequence, gene splicing, transcription factor binding or non-coding RNA sequence. • They can be detected through melting curve analysis, with specific patterns being generated for each homozygote (both diploid alleles containing the wild type or mutant) and for a heterozygote (one wild type and one mutant allele). • Heterozygote amplifications include both alleles in the one reaction. Strands from the two alleles will anneal to each other, but do not match perfectly (a heteroduplex). As a consequence, the TM of the heteroduplex will be lower than that of the amplicons from the homozygotes.

SNP analysis NB: This plot is normalised to the mutant allele melting curve.

SNP analysis • A second option involves the inclusion of hydrolysis probes for each allele in the one reaction. Each hydrolysis probe contains a different fluorophore with distinct excitation and emission wavelengths. This is an example of a multiplex assay. • If probe design is right and reaction conditions are stringent enough, a single base variation from the probe target will be sufficient to prevent annealing of the probe to the variant sequence. • Amplification of each allele will be reported by the probe specific for that allele. • Multiple SNP’s can be interrogated simultaneously, although this is limited by the number of distinct probes available, the optimisation of reaction conditions, and the capabilities of the real time PCR platform.

Real time PCR design considerations • In order to have the most efficient amplification and detection possible, a number of guidelines have been developed for optimum real time PCR assay design. • The length and structure of the amplicon are fundamental to good real time PCR design. In general, real time PCR amplicons are very short (70-300 bp) compared to conventional PCR amplicons. • G+C% of the amplicon should be between 30 and 80%, and runs of identical nucleotides, particularly 4 or more G’s, should be avoided. • Primers should not be complementary to themselves or each other, to avoid primer-dimer formation. • Primer TM’s should be within 2 ºC of each other (58-60 ºC is optimal), and primers should be 18-22 nucleotides long. • The number of G’s and C’s in the last 5 bases of the 3’ end of the primer should not exceed 2. • Hydrolysis probes should have a TM 10 ºC higher than the primers (but not above 75 ºC), and should anneal close to the primer on the same strand. • Hydrolysis probes should be 20-30 bases long, unless stabilised with a minor groove binder moiety (a tricyclic functional group that folds back into the minor groove of the probe-target duplex and stabilises the interaction). MGB probes may be only 13-18 bases long, and still achieve the desired TM. • The probe should not have a G at the 5’ end, to avoid unpaired G’s quenching fluorescence, and there should be more C’s than G’s (this increases the change in fluorescence when the probe is hydrolysed).
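
Some of these guidelines are easy to check automatically. The sketch below covers only the sequence-based rules for amplicons and primers; TM, primer-dimer and probe checks are left out because they need a thermodynamic model. The function names and example sequences are hypothetical.

```python
import re


def gc_percent(seq: str) -> float:
    """G+C content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def check_amplicon(seq: str) -> list:
    """Return the amplicon guidelines from the slide that the sequence breaks."""
    seq = seq.upper()
    problems = []
    if not 70 <= len(seq) <= 300:
        problems.append("length outside 70-300 bp")
    if not 30 <= gc_percent(seq) <= 80:
        problems.append("G+C% outside 30-80%")
    # Only G runs are tested here, as the slide singles them out.
    if re.search(r"G{4,}", seq):
        problems.append("run of 4 or more G's")
    return problems


def check_primer(seq: str) -> list:
    """Return the primer guidelines from the slide that the sequence breaks."""
    seq = seq.upper()
    problems = []
    if not 18 <= len(seq) <= 22:
        problems.append("length outside 18-22 nt")
    if sum(base in "GC" for base in seq[-5:]) > 2:
        problems.append("more than 2 G/C in the last 5 bases of the 3' end")
    return problems


print(check_primer("AGCTTAGCATCGATTGACTA"))  # hypothetical primer
print(check_primer("ACGTACGTGCGGC"))         # too short, G/C-rich 3' end
```

As the following slide notes, automated checks like these only narrow the options; candidate designs still need to be reviewed by the researcher.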

Real time PCR design considerations • Fortunately, there are a number of software tools available (e.g. PrimerExpress, Oligo 6.0, Vector NTI) which can generate a number of design options from a DNA sequence. These possible designs will meet the basic rules for assay design, and will need to be checked manually by the researcher to ensure they are suitable for the specific application.

Another use for fluorescent probes - microarrays • A DNA microarray is a series of microscopic spots of specific oligonucleotides (typically stretches of a gene) covalently bound to a matrix (i.e. a slide or chip). • Under high stringency conditions, only a complementary sequence will bind to these probes. • If the sample to be probed is fluorescently labelled, array sites containing probes that bind the sample will fluoresce.

Another use for fluorescent probes - microarrays • Typical uses include gene expression profiling, comparing genome content, and SNP detection. • In gene expression profiling, mRNA is isolated from two samples, and cDNA is prepared by reverse transcriptase. • During cDNA synthesis, fluorescently labelled nucleotides are incorporated – a different fluorophore is used for each sample. Cy3 (emission at 570 nm, or green) and Cy5 (emission at 670 nm, or red) are commonly used. • The cDNA samples are then hybridised to the microarray, which contains thousands of oligos specific to individual genes in known locations on the array. • Fluorescence is measured at each location on the array, and variations in expressed genes are identified. • Note that if both Cy3 and Cy5-labelled cDNA bind to an array location, the spot appears yellow. This gives four possible outcomes: • Black – no expression of that gene in either sample. • Red or green – expression of that gene in only one of the samples. • Yellow – expression of the gene in both samples. • Housekeeping genes are always included as a reference.
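
The four outcomes can be expressed as a simple classification of the two channel intensities at each spot. The background cut-off, the intensities and the function name are hypothetical; real analyses work with intensity ratios rather than a yes/no call.

```python
def spot_colour(cy3_signal: float, cy5_signal: float, background: float) -> str:
    """Classify a two-colour microarray spot from its Cy3 and Cy5 intensities.

    A channel counts as "on" when its signal exceeds the background cut-off.
    """
    green = cy3_signal > background  # Cy3-labelled cDNA bound
    red = cy5_signal > background    # Cy5-labelled cDNA bound
    if green and red:
        return "yellow"  # expressed in both samples
    if green:
        return "green"   # expressed only in the Cy3-labelled sample
    if red:
        return "red"     # expressed only in the Cy5-labelled sample
    return "black"       # not expressed in either sample


spots = {"gene_a": (900, 50), "gene_b": (40, 700), "gene_c": (800, 750), "gene_d": (30, 20)}
for gene, (cy3, cy5) in spots.items():
    print(gene, spot_colour(cy3, cy5, background=100))
```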

Microarrays

Microarrays A 40000 spot two-colour oligo microarray.
