Microarrays Jonathan Sage 5/29/2008 Bioinformatics 417
Contents
• Overview of Microarrays
• Types of Microarray
• Types of DNA Microarray
• Manufacture of Microarray Chips
• Bioinformatics and Microarrays
• Bioinformatics Problems
• Competing Technology (SuperSAGE)
Overview
• A loaded microarray is usually a glass or treated silicon chip covered in a grid of reporters (probes), each designed to bind to a specific substance in a sample. The sample is incubated on the chip and allowed to bind to these reporters.
• Designed to investigate complete cellular samples.
• Can be used to analyze the expression of many different types of biological molecules.
• Collects enormous amounts of data on a single chip (up to entire eukaryotic genomes).
• Can be used to show expression differences between normal and diseased samples.
Types of Microarray
• DNA Microarray: relies on oligonucleotide reporters that hybridize with genomic DNA, mRNA, or cDNA (most common).
• Protein Microarray: relies on protein reporters that complex with the proteins being assayed.
• Antibody Microarray: relies on antibody reporters that bind with proteins or
• Chemical Compound Microarray: relies on chemical compound reporters, usually potential drugs, that complex with the proteins being assayed.
These techniques vary only in the type of probe (also called a reporter) that is bound to the solid surface.
Types of DNA Microarray
• One Channel DNA Microarray
  • Used to observe only one cDNA source at a time.
  • Fluorophores are bound to the DNA and cause it to fluoresce at a specific wavelength, allowing visualization.
  • Used to determine absolute levels of RNA expression (measured as cDNA).
  • A cDNA spike can be used to normalize expression levels and to allow comparison of multiple arrays.
  • The primary advantage is that only a single sample is loaded, so contamination from a second sample cannot damage the results of the first.
• Two Channel DNA Microarray
  • Used to observe multiple cDNA sources at once.
  • Fluorophores that are excited at different wavelengths (and therefore appear as different colors) allow visualization of multiple cDNA sources at once.
  • Used to determine relative levels of cDNA expression (usually between diseased and normal sources).
  • Often still uses a cDNA spike to normalize the data obtained.
  • The primary advantage is that comparing the two sources on a single array is much easier.
Two Channel cDNA Microarray
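The relative measurement made by a two channel array can be sketched as a log2 ratio of the two channel intensities, after each channel is rescaled by its spike signal. This is only an illustration under stated assumptions: the gene names, intensity values, the expected spike intensity, and the function names are all hypothetical, and real pipelines apply more elaborate normalization.

```python
import math

def spike_scale(spike_observed, spike_expected):
    """Factor that rescales a channel so its spike reads the expected value."""
    return spike_expected / spike_observed

def log2_ratios(diseased, normal, diseased_spike, normal_spike, spike_expected=1000.0):
    """Return log2(diseased / normal) per gene after spike normalization."""
    d_scale = spike_scale(diseased_spike, spike_expected)
    n_scale = spike_scale(normal_spike, spike_expected)
    return {
        gene: math.log2((diseased[gene] * d_scale) / (normal[gene] * n_scale))
        for gene in diseased
    }

# Hypothetical raw intensities for the two fluorophore channels.
diseased = {"geneA": 4000.0, "geneB": 500.0, "geneC": 1000.0}
normal = {"geneA": 500.0, "geneB": 1000.0, "geneC": 500.0}

# The diseased channel reads the spike twice as brightly, so it is scaled down.
ratios = log2_ratios(diseased, normal, diseased_spike=2000.0, normal_spike=1000.0)

for gene, value in ratios.items():
    print(gene, round(value, 2))
```

A positive value means higher expression in the diseased source, a negative value means lower expression, and zero means no difference once the spike correction is applied.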
Manufacture
• Reporters (or probes) are bound to a glass or treated silicon chip.
• Many copies of different reporters can be bound to a single chip in a grid pattern. This can be accomplished by a number of different techniques.
Spotting uses a robotic arm that deposits pre-synthesized oligonucleotides or PCR fragments onto the chip in a grid pattern.
Oligonucleotide microarrays are another method. They use oligonucleotides just as spotting methods often do, but the oligonucleotides are synthesized directly on the chip, one nucleotide at a time.
Manufacture Continued
• When using DNA microarrays, the longer the probe, the more specific the array is.
• The complexity of arrays can be varied from 10 to 390,000 different reporters.
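One rough way to see why longer probes are more specific is to count how often a probe sequence would be expected to occur purely by chance in a target. The sketch below assumes a random target sequence with equal base frequencies, which real genomes do not satisfy, and it ignores partial hybridization; the function name and the target length are hypothetical.

```python
def expected_random_matches(probe_length, target_length):
    """Expected number of exact chance matches of one probe in a random target.

    Assumes independent bases with equal frequencies (1/4 each).
    """
    positions = max(target_length - probe_length + 1, 0)
    return positions / 4 ** probe_length

# Hypothetical target of one billion bases.
target_length = 10 ** 9

for probe_length in (10, 15, 20, 25):
    matches = expected_random_matches(probe_length, target_length)
    print(probe_length, matches)
```

Under these assumptions each added base divides the expected number of chance matches by about four, so a short probe is expected to match at many unrelated positions while a longer one is not.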
Bioinformatics and Microarrays
• Microarrays yield an enormous quantity of data on a single chip, which makes analysis by hand nearly impossible. At best, a small subset of the expressed genes can be viewed at once.
• The broad name for this technique is expression profiling.
• New bioinformatics approaches automate the interpretation of biological significance that is needed to categorize this information.
• Using a microarray, thousands of genes can be monitored at once. This is problematic because a result is conventionally called significant when the probability of observing it by chance is below 5%. With 10,000 genes being monitored, about 500 genes would be expected to be falsely identified as significant by chance alone. Increasing the stringency also causes problems, because truly significant genes can be overlooked, and the number of significant genes can often fall to zero before the false positives are completely eliminated.
• A number of different methods have been developed to determine data significance and combat these problems.
• Many microarray analysis techniques include machine learning, bootstrapping (statistics), or Monte Carlo methods.
• Once significant genes have been correctly identified, gene and protein databases are important for properly annotating which proteins are being differentially expressed.
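The arithmetic behind the false-positive problem, and the cost of simply raising the stringency, can be sketched as follows. The slides do not name a specific correction, so the Bonferroni threshold and the Benjamini-Hochberg procedure shown here are assumptions chosen as two widely used examples, and the p-values are made up for illustration.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of chance hits if no gene is truly different."""
    return n_genes * alpha

def bonferroni_threshold(n_genes, alpha):
    """Per-gene p-value cutoff under the Bonferroni correction."""
    return alpha / n_genes

def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of tests kept by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its stepped threshold.
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# The slide's example: 10,000 genes tested at the 5% level.
print(expected_false_positives(10_000, 0.05))
print(bonferroni_threshold(10_000, 0.05))

# Hypothetical p-values for ten genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

uncorrected = [i for i, p in enumerate(p_values) if p <= 0.05]
bonferroni = [i for i, p in enumerate(p_values)
              if p <= bonferroni_threshold(len(p_values), 0.05)]
bh = benjamini_hochberg(p_values, fdr=0.05)

print(len(uncorrected), len(bonferroni), len(bh))
```

In this made-up example the uncorrected cutoff keeps five genes, the strict Bonferroni cutoff keeps one, and the false discovery rate procedure keeps two, which shows the trade-off between false positives and overlooked genes described above.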
Bioinformatics Problems
The data generated from microarrays is often difficult to normalize and compare between different arrays. While this is not a huge problem with manufactured arrays from large companies like Affymetrix, Eppendorf, Agilent, and TeleChem, it is especially prevalent with in-house arrays. Several policies have been created to standardize publishable data.
MIAME (Minimum Information About a Microarray Experiment) is a checklist that several journals have adopted to enforce standards.
The MAQC (MicroArray Quality Control) Project is led by the FDA and aims to standardize microarray data so that it can be used in drug discovery and substance regulation.
The MGED (Microarray and Gene Expression Data) Group is working on a standard for representing data collected from microarrays.
Competing Techniques (SuperSAGE)
• Microarrays have a few key disadvantages
  • Manufactured microarrays are expensive.
  • In-house microarrays are much cheaper, but the data often can’t be normalized and therefore is not well suited for publication.
  • The correct reporters must be bound to the chip in order for expression to be shown. Expressed cDNA without a matching reporter on the chip will not bind.
• SuperSAGE
  • The newest SAGE (Serial Analysis of Gene Expression) technique. Developed and published in 2006.
  • Works by cleaving cDNAs at a specific location using an endonuclease to create a 26 base pair tag for each expressed gene.
  • Each tag is long enough to identify the gene it came from.
  • The tags are ligated together and cloned into a vector, which can then be sequenced.
  • The number of times a tag is counted in the sequence gives its absolute expression level.
  • This process is much faster than DNA microarray technology (since there is no need to synthesize the array).
  • Novel genes can be detected, as there is no need to choose the correct reporters in advance.
  • Currently more expensive than in-house microarrays, but this technology is the future of mRNA expression analysis.
SAGE Technique
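The counting step of SuperSAGE can be sketched as a simple tally of 26 base pair tags. This is a minimal illustration that assumes the tags have already been extracted from the sequenced clones; the tag sequences, gene names, lookup table, and function names are all hypothetical placeholders, not real data or a real tool's interface.

```python
from collections import Counter

TAG_LENGTH = 26  # tag size stated in the slides

def count_tags(tags):
    """Count each full-length tag, discarding fragments of the wrong size."""
    return Counter(t for t in tags if len(t) == TAG_LENGTH)

def expression_by_gene(tag_counts, tag_to_gene):
    """Sum tag counts per gene; unknown tags are kept as unannotated entries."""
    totals = Counter()
    for tag, n in tag_counts.items():
        totals[tag_to_gene.get(tag, "unannotated:" + tag)] += n
    return totals

# Placeholder 26-base tags (not real sequences).
tag_a = "CATG" + "A" * 22
tag_b = "CATG" + "C" * 22
tag_novel = "CATG" + "GT" * 11

# Hypothetical annotation table mapping known tags to genes.
tag_to_gene = {tag_a: "geneA", tag_b: "geneB"}

# Tags read from the sequence, including one truncated fragment.
tags = [tag_a, tag_a, tag_a, tag_b, tag_novel, tag_novel, "CATGAC"]

counts = count_tags(tags)
expression = expression_by_gene(counts, tag_to_gene)

for name, n in expression.most_common():
    print(name, n)
```

The tag with no entry in the lookup table is still counted, which mirrors the point that novel genes can be detected without a matching reporter.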